#✨│ai-help


restive kiln
#

A bit confusing, dunno what im looking at

wicked fractal
#

could someone probably help me out. I try to set up a local claude code with a free model on VSC but im getting an error if i type on the terminal; claude . Im surely missing a step im not a coder im just new in this space

#

i just watched a video and did everything that the guy said but i think im missing some basic stuff

wicked bane
#

ok

wicked fractal
wicked bane
wicked fractal
#

import what

proven hill
wicked fractal
#

bro is trolling my life

proven hill
wicked fractal
proven hill
wicked fractal
#

cause i needed help and u troll me

proven hill
wicked fractal
#

what do i need to import i cant read ur mind btw

#

chat gpt said i should install an ai extension so that vs code can connect with openrouter

wicked fractal
# proven hill yea

can i probably send u a video its just 5 min but im surely missing smthing

#

and maybe i can get ur opinion on it aswell maybe u know better options

low shard
#

W8ights isn’t a real server, this looks like a scam, don’t trust it, weights closed and there’s no official continuation

#

many staffers from there started bomb promoting here and got banned, i think i need to make an announcement on this later

#

rvc is retrieval-based-voice-conversion, not realtime voice changer

#

elaborate your pc gpu, your pc os, what are you trying to do and any tutorial link you’re using

low shard
proven hill
#

can i get an invite?

low shard
low shard
proven hill
restive kiln
gentle granite
#

not really but the filters on other generators are too restrictive

reef monolith
#

Hi guys, I've been trying for a week now to prompt a UGC character who speaks German without mistakes, but no matter what I do it doesn't work. Has anyone here faced this problem and can help me, please?

teal portal
low shard
low shard
desert notch
#

hello guys i need a help

#

is there anybody free

low shard
# desert notch hello guys i need a help

This is a General AI Discord Server, please elaborate:

  • your pc gpu
  • your pc os
  • what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
  • the tutorial link used
desert notch
#

ai

#

as i use voice tts model for text to speech

proven hill
desert notch
mellow oasis
#

Hello, I'm trying to start making videos on TikTok, with the character of Makise Kurisu, but I don't know where to find a voice of hers to put on the character, does anyone know how to find voices of anime characters for free? I have seen many people on TikTok who have them and don't pay anything

viral mason
#

But if you don't find anything you like you can make one yourself, it's free and easy

dusky island
#

Small question i want to isolate main lead vocals from a recording, what model can be used best to remove crowds and backing vocals in UVR?

viral mason
#

it's pretty good from what I've seen

dusky island
#

Ah okay, yeah im already in the queue atm XD

viral mason
#

mainly for bg vocals but I believe it works on crowds too

dusky island
#

okay how do you know which models are the correct ones to use, because i have to wait in line behind 231 other people. so you know if i know which one to do i wont waste time figuring it all out

#

is there somesort of guide that tells you what each model is for specifically or is it just kind of trying and see what works really?

#

in other words whats really the learning curve when it comes to picking models?

viral mason
#

it's free

#

anyway for vocal separation in general I use this

#

and for reverb I use these two, stereo separation first then room reverb

dusky island
#

thanks, although the wait time only lessens when you buy premium it seems

viral mason
#

I haven't paid

dusky island
#

yeah mostly in the morning here in europe there are only 1-20 online

#

but thank you so much for the help!

viral mason
#

one last thing, for echo I use this one

#

tho if it starts removing breaths I use deecho-dereverb

daring sail
#

I have a problem with the image, I use flux2 but in the place where I used inpainting I have a loss of quality, the picture in that place looks a little blurry, how can I fix this?

feral crater
viral mason
#

For cleaning a song I use the BS-Roformer vocal separation, karaoke (background vocals), stereo reverb, room reverb, then finally de-echo

#

@feral crater

gusty merlin
#

anyone have like a bunch of rapper rvc? willing to pay

viral mason
viral mason
mint snow
#

anyone got a website creator?

#

the best one (cheapest)

leaden iris
#

When training a voice model, is it better to include or exclude vocalizations that aren't words or phrases? like grunts, screams, mouthy noises and such?

viral mason
leaden iris
#

Fair fair, I figured as much. I just wasn't sure if it'd enhance or mess up the dataset

#

Ty! ^^

viral mason
viral mason
#

<@&1159293140440723499>

last wolf
#

Hello y’all,

I’m planning to make a personal FitzyVA model (mainly for Cyn/Cynessa and Mel LARPing, mainly the latter though), and I was wondering, which program is the better option for expressing emotions when using RVC, (regardless of the dataset used?)

Personally, I think Vonovox is really good and has significantly lower latency. Meanwhile, Applio is also a solid option and works as more of an all-in-one program, though it does have some flaws in certain areas. But even then. It's really good. And from what I was able to gather. Vonovox is superior for real-time emotional expression. I'm not too sure though. Cuz I did test out both and got mixed results

So which one is better for expressing emotions and stuff, not just basic talking?

analog obsidian
# last wolf Hello y’all, I’m planning to make a personal FitzyVA model (mainly for Cyn/Cyne...

none, because the model doesn't learn expressions in training, it learns linguistics, pitch and spectrograms
the outputs can sound a bit monotonous because the RVC core was primarily designed for TTS, although it did come with an option for voice conversion inference (without using F0, so it’s also quite monotonous)
rvc is a kind of hack that removed all of the tts stuff and improved voice conversion by adding f0 and a self-supervised embedder (contentvec/hubert), and by replacing the hifigan vocoder with nsf hifigan, although in older versions it is possible to use the standard hifigan, without f0 (it's worse)

tldr; rvc is kinda like an enhanced tts, but with the typical 2022 old tts flaws (not lying, the rvc core is that old), like not being able to be expressive enough
more modern TTS can do pretty good and realistic expressions because they learned them in their training, that tech did not exist when rvc was made/was too heavy to run in realtime
realtime clients like vonovox, applio, wokada etc can't fix architectural issues like these

scarlet thistle
#

I saw you joined and aren't there anymore

viral mason
#

that isn't a helper, ping the helper role for help lol

#

anyway what voice changer are u using, what gpu do u have (nvidia or AMD) and what are u using it for?

#

wdym?

viral mason
#

one without effects so it sounds just like the va

#

it's not public here but

#

yknow

last wolf
# viral mason I have a cyn model

thats kinda cool actually, i like when people include the actual va speaking for their voice models instead of just relying on the character clips

viral mason
#

well she did basically make all her voice lines she recorded public on her yt

#

plus the advertisements for merch

last wolf
#

yeah, thats actually smart

#

can extend the few lines cyn has in the show

viral mason
#

ye

viral mason
last wolf
#

oh wait i got dms off

#

add me rq

viral mason
#

I don't mind sharing my models

leaden iris
#

I just finished training a model with applio and I'm getting an index file the size of 500mb, every index i've seen from others was 50mb, anyone know what's up?

analog obsidian
#

now the index will be smaller at the cost of slightly less accuracy

leaden iris
#

Ooh I see, I'm fine with a slight loss, I just don't want a model file to be paired with an index 10x its size haha

#

Tyvm!

#

Huh.. generating an index with either Faiss or KMeans both results in a 500mb index file

analog obsidian
leaden iris
#

Could it be because I have a huuge dataset?

analog obsidian
leaden iris
#

The dataset is about 50mb total but the amount of individual files is eheh.. 1875

analog obsidian
#

im preprocessing some 3 hour dataset i have, let me see if i get a tiny index

leaden iris
#

Mhm yeah, both times i deleted the index, swapped the algorithm, then generated

analog obsidian
#

im looking the code 1 sec

analog obsidian
#

yep it's bugged

#

i fixed it

#

i love applio

#

when i clicked kmeans it was generating a simple faiss file

#

and now i got a 30mb file

#

bruh

analog obsidian
#

rvc > train > process

#

no need to restart the gui, after you replace the file click kmeans again and train the index

leaden iris
#

Oh huh that's silly, fair enough, thanks a ton!

analog obsidian
#

hope that works for u

#

worked for me

leaden iris
gentle granite
#

I just need less filters

low shard
low shard
proven hill
plucky crown
#

Hi

#

I7 13th gen 13700
Msi mobo mortar b760m
32gb DDR4
3090 GIGABYTE OC 24gb
2x1tb kingston mv3 ssd
1000w PSU

This will be the pc i will buy
Is it good

#

Is it enough need help asapp

#

Helpp

simple ore
simple ore
#

50mb index is a model created with like 5 min audio

#

big chonky model

analog obsidian
simple ore
#

i dunno why even 'Auto' is needed

#

just index_algorithm == "KMeans" or big_npy.shape[0] > 2e5

#

okay, made the fix

#

anyway, my point stands - big dataset, big index and it is actually good
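
The condition quoted above can be sketched in plain Python. This is only an illustration, not Applio's actual code: the function names are hypothetical, the 2e5 row threshold comes from the quoted line, and the size estimate assumes 768-dimensional float32 feature vectors stored flat with no index overhead, which is an assumption the chat does not confirm.

```python
# Hypothetical sketch of the rule quoted in the chat; not Applio's real code.
KMEANS_ROW_THRESHOLD = 2e5  # threshold taken from the quoted condition


def should_use_kmeans(index_algorithm: str, n_rows: int) -> bool:
    """Reduce the features with KMeans when it is selected explicitly,
    or when the feature matrix has more rows than the threshold."""
    return index_algorithm == "KMeans" or n_rows > KMEANS_ROW_THRESHOLD


def estimate_flat_index_mb(n_vectors: int, dim: int = 768) -> float:
    """Rough size of n_vectors float32 vectors in megabytes.
    Assumes 4 bytes per value and ignores any index overhead."""
    return n_vectors * dim * 4 / 1e6


if __name__ == "__main__":
    print(should_use_kmeans("KMeans", 1875))
    print(should_use_kmeans("Faiss", 1875))
    print(estimate_flat_index_mb(10_000))
```

Under those assumptions the index size scales with the number of stored vectors, which would fit the point made above that a big dataset gives a big index unless the vectors are first reduced to a smaller set of cluster centers.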

analog obsidian
#

glad it's fixed now

loud rivet
#

Hey.. everyone. I wanted to ask what is the best app for RVC conversions? I use replay.. and sometimes the voice just cracks up somewhere.

And I dont have the GPU required for some apps that I saw on YouTube and stuff.

So if anyone can help me out here.. I will be great!!

loud rivet
#

I dont really know the GPU

#

Because I am a noob... and I only ever used hugging face and uhh replay

#

Yea it's a Microsoft pc

#

You can google it.. because I cant upload the pic here..

viral mason
#

U can check the gpu there

loud rivet
#

Ok one sec

viral mason
#

If it's Intel you can't really run anything good locally

loud rivet
#

Wait sorry ppl I cant find it one moment

viral mason
#

I'd show u how but I'm not on my pc rn I'm still trying to wake up

low shard
loud rivet
low shard
#

if so, search task manager, go to the performance tab and check the GPU component

#

or is it a Mac or a Linux PC?

loud rivet
#

Ok found it

loud rivet
viral mason
#

Lol did that fire wolf guy get banned

low shard
low shard
loud rivet
viral mason
#

I think you misread that question

loud rivet
#

So it says utilization is 1% and shared GPU memory is 0.8/7.9 GB.

While the memory is 0.5/7.9 GB

#

So...what now..?

viral mason
#

?

low shard
low shard
loud rivet
#

Uhhhmmmm....

#

One moment again

#

How do I tell the name..?

#

I dont get it.. sorry

low shard
#

send a screenshot

loud rivet
#

Ok..

loud rivet
loud rivet
low shard
# loud rivet

You've got a PC with an integrated Intel GPU, it's not good for local AI. That's also why Replay was slow and having issues for you btw, since it runs on your own hardware

loud rivet
#

Oh.. I didnt know that.

#

So silly of me

#

So what should I do..?

low shard
#

It would be better if you use Cloud (Remote PC) alternatives with a limited Free GPU time

#

it would be faster but ofcourse not unlimited

#

what do you want to do? You could use the suggested cloud, or try to run on your PC CPU which will be slow ofc

loud rivet
low shard
# loud rivet What do u mean?

you can either still try Applio (RVC fork) on your PC's CPU, which will be unlimited but slow, or use the faster and more recommended cloud alternatives, which of course won't be unlimited for free because you're using someone else's GPU

low shard
loud rivet
#

Oh.. what do u recommend sir?

viral mason
low shard
loud rivet
#

Ohhhhhhhh I see..m

#

Got it I keep that in mind from now on!

low shard
low shard
#

read the guide and let me know

loud rivet
#

Ok! Thank you for ur time and patience with me!

low shard
loud rivet
# low shard you're welcome

oh and sorry to bother again mister. i wanted to ask the link that u shared me. does it also allow music mp3 files to be uploaded? because i make AI music. and want to make JJK characters sing them..

warped laurel
#

can anyone help me in ai

#

i always wanted to do something i do not know what to

#

i always wanted to make money with ai

#

pls anyone help

#

dm me

#

and message ai so i can understand

low shard
plucky crown
#

Help the seller sent me this for the pc
I7 13th gen 13700
Msi mobo mortar b760m
32gb DDR4
3090 GIGABYTE OC 24gb
2x1tb kingston mv3 ssd
1000w PSU

Should i get it?

viral mason
#

Is it a desktop or a laptop

plucky crown
#

Pc

viral mason
#

Intel for the CPU is kinda just ok, the other stuff I'm unfamiliar with as I'm not super into computers

#

How much will it cost?

plucky crown
#

Its 901usd

#

Equivalent in dollars for the whole pc

#

This is the 3090 gigabyte brand , its my first time with gpu pc so i need help

viral mason
#

Depends on what u want to do with it

plucky crown
#

Ai and video editing and fl studio

viral mason
#

Could you expand on what u mean by ai, there's tons of stuff like video, images, rvc, etc

plucky crown
#

Rvc

viral mason
#

Video editing and fl studio can be done for sure on a 3090 just fine

plucky crown
#

I only have 900 usd

#

I can add the 1 usd so i could buy it

#

Should i get it or nah i would regret for the rvc voice ai in this server

viral mason
#

Vonovox should run just fine on a 30 series, I have a 5070ti and it's also pretty good there

#

You want real-time right?

plucky crown
#

Yes

viral mason
#

Yea the PC you're buying should work just fine with everything you want to do

plucky crown
#

5070ti should i get it instead of 3090? But 3090 has higher vram wouldnt that be best for ai voice training

#

Since 5070ti is newer

viral mason
#

Hmm

#

I mean I train on cloud bc I'm lazy so I dunno how it is with training voices locally but I'm sure it works just fine

#

But I definitely think, if you can afford it, you should get a 50 series Nvidia gpu over a 30 series

plucky crown
#

But i also want a text ai or train ai they said u need higher vram for that

viral mason
#

Hm

plucky crown
#

I can only afford a 3090 for higher vram

viral mason
#

I'd ask Lyery as I'm not as smart with PCs and rvc as him

#

Might get better insight from him

plucky crown
#

@analog obsidian

#

We needHelpp @analog obsidian

viral mason
#

Just ping once

analog obsidian
# plucky crown We needHelpp @analog obsidian

you want it for both realtime and training? hmm, the 5070ti is super fast and should give you good performance in realtime

it's true that you need more vram if you want to train models, but in rvc you don't really want to go above batch size 8 (about 5gb in fp16 mode, 7gb in fp32)

realtime inference uses 2gb of vram iirc

hmm hard decision... the 3090 might be useful in cases where you play unoptimized games that use lots of vram
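
As a rough way to compare the two cards against the VRAM figures quoted above, here is a small Python sketch. The workload numbers are the approximate ones from this message (batch size 8 training at about 5 GB in fp16 or 7 GB in fp32, realtime at about 2 GB); the dictionary and function names are hypothetical, and the card capacities (24 GB for the RTX 3090, 16 GB for the RTX 5070 Ti) are the commonly listed specs rather than something stated in the chat.

```python
# Approximate figures from the chat; treat them as rough estimates only.
WORKLOAD_VRAM_GB = {
    "train_bs8_fp16": 5.0,
    "train_bs8_fp32": 7.0,
    "realtime": 2.0,
}

# Listed VRAM capacities (assumption: standard retail models).
GPU_VRAM_GB = {
    "RTX 3090": 24.0,
    "RTX 5070 Ti": 16.0,
}


def headroom_gb(gpu: str, workloads: list[str]) -> float:
    """VRAM left over after running the given workloads at the same time."""
    used = sum(WORKLOAD_VRAM_GB[name] for name in workloads)
    return GPU_VRAM_GB[gpu] - used


if __name__ == "__main__":
    for gpu in GPU_VRAM_GB:
        left = headroom_gb(gpu, ["train_bs8_fp32", "realtime"])
        print(f"{gpu}: about {left:.0f} GB free")
```

By these numbers both cards have room to spare for RVC itself, so the extra VRAM mostly matters for whatever else runs alongside it.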

plucky crown
#

Yes also the 3090 price for the whole pc the owner/seller is able to get it to 901 usd so i could buy it right now

analog obsidian
#

also, seems like faster gpus do get slightly less delay, so in theory the 5070ti is technically better for realtime, but yea, the vram

#

also it's way faster than the 3090 for training models

plucky crown
#

Its my first seeing in marketplace a whole 3090 pc less than 1000usd so should i get it, i only have 900usd the seller lowered it to match my savings

analog obsidian
#

more vram helps getting less stutters too

plucky crown
#

Yes

analog obsidian
#

yea tbh the 3090 is a good option despite being older just because it can provide a smoother experience with fewer stutters

plucky crown
#

Should i get it then i wont regret it for real time voice/ the ai voice thingy in this server @analog obsidian ?

analog obsidian
plucky crown
#

I cant afford a 5070ti pc i only have 900usd for a whole pc

analog obsidian
#

ah then the 3090 is enough dw

#

rvc was made in 2023

#

late 2022

plucky crown
#

So 3090 can handle it even if its older?

analog obsidian
#

yup

plucky crown
#

Thanks

plucky crown
analog obsidian
worldly plank
#

whats the directory / name of this program again

viral mason
leaden iris
#

Is it better to have separate audio files, or combine them all into single file for voice model training? Or does it not matter?

Does it matter if it's tons of files? Like hundreds or thousands?

mighty hinge
#

Hi, I make videos for YouTube, but I want to add the voices in applio, how can I use the TTS? Does someone have the link to use it?

left totem
#

The best AI model: hi, I've tried to find the best model overall, but I don't know what I should use. I don't like Claude at all, I've used it for 5 months, is there any other?

patent trellisBOT
hallow thistle
hallow thistle
# plucky crown Help the seller sent me this for the pc I7 13th gen 13700 Msi mobo mortar b760m...

NVIDIA GeForce RTX 3090 is already enough, potentially a bit overkill, for simple RVC inference, but for voice model training, there's an advantage. RTX 5070 Ti is way newer, but RTX 3090 has more VRAM. Kingston NV3 is a budget SSD, uses QLC NAND which gives more capacity but is slower than "TLC NAND", so if you were to do a lot of heavy file moving, either Kingston KC3000, WD Black SN850X or SN7100 might be better. Ensure that the motherboard's BIOS is up to date. Should you get that one? It's still your choice if you have that budget.

#

Another concern is the power supply. It is stated as 1000w, but always check its brand, model and health, because you don't want it to start smelling or shutting off suddenly during runs. The DDR4 RAM speed is not stated, but it's probably a bit slower than DDR5 anyway.

low shard
low shard
daring sail
#

Hi, I have a problem with Flux2 Klein. I'm trying to use inpainting to remove text, but in the areas where I applied it, there's a blur issue, it looks kind of smeared. The second problem is that in those areas the color changes compared to the surrounding environment. Is there any way to fix this?

young mango
daring sail
young mango
hallow thistle
#

Why direct message?

young mango
# daring sail Well, how can this be fixed?

Yeah, that is a common thing with inpainting. The blur usually comes from denoise being too high, so it over-rebuilds the area instead of blending it, and the color shift happens because the model is reinterpreting lighting in that masked region

#

Try lowering the denoise so it keeps more original detail, keep the mask tighter around the text, and do it in two passes: first remove the text, then a lighter pass to match texture and color

daring sail
daring sail
median crag
#

yo does anyone have a working rvc colab? i cant find one

lavish lantern
#

sorry to bother anyone, but is there an RVC model of Sash Lilac from Freedom Planet to download anywhere? I've been searching for so long, and the ones that were here are sadly gone since Weights had its farewell, and applio is giving me a 404

viral mason
lavish lantern
viral mason
lavish lantern
viral mason
#

ah I see, idk at all about getting applio working locally but I know how to use the interface and train models ect

#

best bet would be looking here

#

-rvc

patent trellisBOT
viral mason
#

it should be in the applio docs

viscid minnow
#

Any good, realistic models you’d recommend?

viscid minnow
viral mason
lavish lantern
#

i see i was hoping someone can share theirs before the it cant be downloaded anymore happened

lavish lantern
# viral mason wdym?

i found older downloads here but it all lead to 404 not found or just dead links for sash lilac so i was wondering if anyone have it they can share it with to me but its understandable if no one have it she isnt a very well known character and so is her game

viral mason
#

is this the character you're looking for?

#

I found a 13 minute voice line video of her that could be used to train a model

lavish lantern
lavish lantern
# viral mason is this the character you're looking for?

ill see how i can make one on my own if my brain can handle it if and if its easy and if not ill have to wait for someone to make it or can share the old one thanks for your time for helping me out and sorry i didnt mention the clear idea of who she was and is and from

viral mason
viral mason
# viral mason

I did skip making an acc and verifying but you do need to verify by phone or one of the other options it gives or the notebooks do not function

lavish lantern
mint snow
#

and i need one for domain

viral mason
#

<@&1159293140440723499> hacked account

viral mason
#

<@&1159293140440723499> promoting nsfw

hallow thistle
hallow thistle
kind ether
#

Following the RVC Applio tutorial on the ai hub docs,
I have a RX 6500 XT
Shader ISA: GFX10.3 (gfx1034)
Do i follow the top one or bottom one?

If its the bottom one do i follow this or the other gpus one?

kind ether
wintry scaffold
#

hey guys

#

im new to this ai stuff

#

i tried running llama 3.1 8b version on ollama locally on my system

#

i said "hi"

#

10 seconds later i heard my pc fucking screaming

#

then it shut down

#

what do i do?

#

Full GPU Name: NVIDIA GeForce GT 610
Operating System: windows 11

wind path
#

Hello! I just came here on one question: I have Projects (folder organization) on the ChatGPT Free plan, I also live in Poland 🇵🇱 so I don't have ads (yet, lucky me).. I do have more features, and I wonder - Should I move to Go or Plus? I usually run not a lot msgs per day, but I do give some kind of context (not a lot, but some details, I'm not an advanced user) and the weakest model is still enough for me I guess? I'm also using uBlock, I wonder if it works with ChatGPT lol

wind path
#

This GPU is so weak bro

hallow thistle
low shard
#

not sure if the GPU would even get detected, it's probably running on the CPU with RAM (and you probably have 8gb of RAM)

#

unless you try extremely small models, it's best you try cloud
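
To see why an 8B model is a poor fit for this hardware, a back-of-the-envelope estimate of the weight size helps. The sketch below only counts the model weights and ignores context cache and runtime overhead; the function names are hypothetical, 4-bit quantization is assumed as a typical local setup, and the roughly 1 GB of VRAM for a GT 610 is an assumption about the card, not something stated in the chat.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in gigabytes.
    params (in billions) * bits / 8 gives billions of bytes, i.e. GB."""
    return params_billion * bits_per_weight / 8


def fits_in_memory(model_gb: float, memory_gb: float) -> bool:
    """True if the weights alone would fit in the given memory."""
    return model_gb <= memory_gb


if __name__ == "__main__":
    size_4bit = weights_gb(8, 4)    # 8B parameters, assumed 4-bit quantization
    size_fp16 = weights_gb(8, 16)   # same model at 16-bit precision
    print(f"4-bit: ~{size_4bit:.0f} GB, fp16: ~{size_fp16:.0f} GB")
    print("fits in ~1 GB of VRAM:", fits_in_memory(size_4bit, 1.0))
    print("fits in 8 GB of RAM:", fits_in_memory(size_4bit, 8.0))
```

Even in the best case the weights would have to sit in system RAM and run on the CPU, which matches the suggestion above to use very small models or the cloud.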

wind path
low shard
# wind path Hello! I just came here on one question: I have Projects (folder organization) o...

uBlock Origin (and lite) probably should work fine

If you're not running into any issues with your current free tier, it might be best to keep using it

You should decide on it depending on your needs, because if you're doing just small things and it works fine, there isn't really a need

But if you needed more powerful models, for like coding or complex math tasks, or are running out of free messages, you might want to try other services like Gemini, Claude, or subscribe to chatgpt

Like for example, you wouldn't need a 5K super powerful pc if you're just going to use it for Microsoft Word, if you get the analogy

#

its your money and your needs, so it's best you make the decision by evaluating what you want to do, instead of blindly buying a 2nd car you might not even use because your 1st car works fine

#

btw I would suggest learning how to prompt engineer it better, giving it clearer instructions and info can help a lot with results, especially on complex tasks

wintry scaffold
#

I'm thinking about a PC upgrade

#

below 100,000 rupees tho ($1,053)

#

what should i get?

#

what specs should i upgrade

hallow thistle
# wintry scaffold what specs should i upgrade

If you mean to upgrade a GPU in your current PC, there's NVIDIA GeForce RTX 3060, but there's another better workaround. With $1000 budget, you might rather expect a complete newer mid-tier PC with more recent specs.

wind path
#

I'm not a programmer tho, I don't work with ChatGPT, it's rather a set of hobbies, not PRO work, so just chilling out.. want to improve myself

wind path
# low shard btw I would suggest to learn how to prompt engineer it better, giving it more cl...

Sounds interesting - I recently did ask my GPT for this topic, and he got me pretty satisfied with it.. He gave me that example, if you would like to rate this as a real person.. I want to objective feedback, another perspective 😊

NOT
What headphones to buy?

BETTER
I'm looking for headphones for work (long sessions, comfort, and good sound quality). What are some reasonable options for a mid-range budget, and what are the real differences?

#

===
I suffer due to ChatGPT prices in PL 😭 The Plus plan is 20$, it's going to be 72PLN by the currency calculations, but it says, like, 100PLN (which is 25$ in this price, and currency converter says 91PLN).. It's weird..

low shard
karmic trellis
#

guys whats this. this happens when i change index rate and when i click on start in vonovox beta

viral mason
inland carbon
#

Hi. I'm trying to run gemma4:26b through Continue on VSC

#

I can't add it to models tho no matter what's in the config

hardy yew
#

run this command in vonovox directory

vale harbor
#

Does anyone know which version this is? I reformatted my computer and can't find it.

viral mason
#

But what gpu do u have, there's an Nvidia version and AMD version

#

I feel like those tik toks where they go "what kind of fruit is this"

viral mason
#

You should try out Vonovox, it's much better than tg fork

#

I'll get the links rq

vale harbor
viral mason
vale harbor
#

but I liked the one I had before better

viral mason
#

U dont wanna just, try it out first?

vale harbor
#

I don't know how to use it—there are so many options, and I have dyslexia

viral mason
#

It's ok the only options you ever need to touch is the pitch

#

And block size (controls the delay and quality)

vale harbor
viral mason
#

I know it looks like a lot but most of that you don't need to even touch

#

Btw make sure to import the index

#

Right underneath the embedder

vale harbor
#

I don't have an index on a model I like

viral mason
#

O ok

#

So u do have a virtual audio cable right?

#

Like VB cable or vac lite

vale harbor
#

virtual cable

viral mason
#

Yah, it lets u use the voice changer on discord or in games

vale harbor
viral mason
vale harbor
#

What should I do?

viral mason
#

Download the second link, extract, then run setup64.exe, and then install driver

vale harbor
#

I already have the first one I mentioned—the virtual cable

viral mason
#

Oh

#

Nvm then

#

So for your output in vonovox select your virtual cable

visual dome
#

hi guys im new just wondering how do i use a voice changer

vale harbor
viral mason
#

Where it says output device, it's underneath the one that says audio device settings

viral mason
viral mason
#

It's ok take your time finding it

vale harbor
#

now what

viral mason
#

Send a screenshot rq so I can see what changed

visual dome
#

i wanna use this just in discord voice calls

viral mason
#

Nice

#

The important part tho is knowing your gpu, if it's Nvidia you can download Vonovox, if it's AMD you need wokada tg fork

vale harbor
#

It won't let me upload the index, but on the other one it did

#

I didn't like it

viral mason
# vale harbor

This seems like it's fine now. If the model is female and your voice is deep / you're a guy, try pitch 3-12, and the opposite if you're a girl

#

And if you're using a female voice and are female keep it at 0, same the other way around

viral mason
# vale harbor uh?

Ok so if you are a guy and using female voice pitch the voice up until it sounds good

If you're girl and using a guy voice pitch down until it sounds good

If guy using guy voice keep at 0

If girl using girl voice keep at 0
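
The rule of thumb above can be written as a tiny Python lookup. This is just a sketch of the advice in the chat, not a setting taken from Vonovox: the function name is hypothetical, the 3 to 12 range comes from the earlier message, the negative range for the opposite direction is inferred from "the opposite", and treating the pitch value as semitones is an assumption.

```python
def suggest_pitch_range(speaker: str, model: str) -> tuple[int, int]:
    """Return a (low, high) pitch range to try, in assumed semitones.
    speaker and model are each "male" or "female"."""
    if speaker == model:
        return (0, 0)        # same type of voice: keep pitch at 0
    if speaker == "male" and model == "female":
        return (3, 12)       # pitch up until it sounds good
    if speaker == "female" and model == "male":
        return (-12, -3)     # pitch down until it sounds good
    raise ValueError("speaker and model must be 'male' or 'female'")


if __name__ == "__main__":
    print(suggest_pitch_range("male", "female"))
    print(suggest_pitch_range("female", "female"))
```

The ranges are only a starting point; as the messages say, the final value is whatever sounds good by ear.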

viral mason
vale harbor
#

Can you send me the link to the first one I mentioned?

viral mason
#

:(

#

ok

#

you'll be missing out on the best voice changer but it's ok

vale harbor
#

link pls

viral mason
#

-rt

patent trellisBOT
# viral mason -rt
🔊 Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Vonovox

A Realtime Voice Changer with similar performance to Wokada Tg-Develop Fork, with extra features, but it supports only Nvidia GPUs on Windows 10/11, unlike other options that have wider support, and it has no cloud options

• Wokada Tg-Develop Fork

A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project.

• Applio Realtime

A Realtime Voice Changer with similar performance to Vonovox & Wokada Tg-Develop Fork, with extra features.

• Wokada Deiteris Fork

Deiteris' fork (modified version) of wokada that doesn't get updates anymore.

⛔ Outdated/Discouraged

These options are not recommended for use.

• Original Wokada

Not suggested, older versions in youtube tuts are even way worse. GUIDE

• RVC GUI Mainline Realtime

The program is worse compared to the ones above, and much less updated. GUIDE

viral mason
#

u need these two, download both, and extract the first one, then put the second one in the folder made by the first one

vale harbor
#

That's not it; I already downloaded it.

viral mason
#

wdym

viral mason
#

that's what I just sent

vale harbor
#

it's different

vale harbor
viral mason
#

you don't want that one

#

I didn't send u that one :(

vale harbor
#

I downloaded it from your link

viral mason
#

did u have anything else running when u opened the one I sent?

vale harbor
#

no

viral mason
#

hm

vale harbor
viral mason
#

I am confused now

vale harbor
#

now what

viral mason
#

close anything related to the voice changer then try one more time opening it

#

I may need to help you in a voice call if this happens again

#

that might help

vale harbor
#

TY

viral mason
#

do u need any more help?

sly elk
#

how do u use the text to speech models?

south wyvern
#

Im not sure where to post this question, but it's regarding a1111. Its the program i started with and know it well. I had an nvidia 3060ti (8gb) and it was slow but it worked. Ive since upgraded to a 5060ti (16gb), was so excited to jump back into a1111, but then i found misery and pain, lol.

Is there any way to get it to run on a 50xx series GPU? Ive found some discussion online but its all waaaaay outdated, seems like everyone just stopped trying maybe?!? I currently have a driver that has cuda 13.2 and can find absolutely NOTHING that helps... is there help? Is a1111 just dead?

#

thats a very fine website, unfortunately I dont feel like digging through endless stuff that seems irrelevant to me in hopes that maybe I might find something that maybe is what I need. I want to ask a person and have an answer. Im just old fashioned that way

south wyvern
#

its a webui, an older one

sly elk
#

how do i find models that are less heavy to run?

viral mason
white shadow
#

Whenever i try starting my notebook it gives this error in kaggle.
It does it everytime. I've tried making a new copy, making a new notebook, restarting my browser

nothings working mannn

white shadow
#

ok good, its not just me

cold rover
#

I dont know which program to run for the voice changer... im kinda dumb

viral mason
elder garden
#

how to use ai voice changer without using gpu?

hallow thistle
elder garden
hallow thistle
#

What would you use the voice changer for?

elder garden
#

i want to sound like kanye west

hallow thistle
hallow thistle
hallow thistle
# vale harbor

NVIDIA GeForce RTX 4090 or RTX 5070 Ti? You should set processing unit to your "GPU", extra time to 2.7 s and pitch extraction as rmvpe. Also, output and monitor audio devices look swapped but they'd still work.

hallow thistle
crimson geyser
#

where can i install voice changer nvidia

#

hi

gaunt saddle
#

i got a question: if i train a model using a song that has adlibs, will the model come out well, and will it know how to work with songs that have adlibs?

low shard
foggy dome
#

hii i want some advice how to find the perfect rvc model for me. like how do ppl find the natural ones that will fit them Cat_Stare

hallow thistle
quasi pagoda
#

A gpu without cuda is paperweight

viral mason
#

@waxen kayak

#

-rvc

patent trellisBOT
waxen kayak
#

ok

viral mason
#

Check the applio docs to see how applio works, there are many different options like using it locally on your pc or on the cloud with Kaggle or Google colab

waxen kayak
viral mason
#

No problem! If you need help ping the helpers role

white wedge
#

"Error: Pretrained model sample rate (40000 Hz) does not match dataset audio sample rate (32000 Hz)." both the dataset and pretrained model are 32k

white wedge
proven hill
#

link to the pretrain?

white wedge
#

idk, and it's not only that pretrain, i've tried other 32k ones and they all give the same error

viral mason
proven hill
#

did you select 32k in the interface? what are you using? applio?
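
One way to rule out the dataset side of a mismatch like this is to read the sample rate straight from the WAV header. A minimal sketch using Python's standard `wave` module; the helper names and the demo file are made up for illustration, and `wave` only reads plain PCM WAV files, so other formats would need a different tool:

```python
import os
import tempfile
import wave


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def matches_pretrain(path, pretrain_rate_hz):
    """True when the file's sample rate equals the rate the pretrain expects."""
    return wav_sample_rate(path) == pretrain_rate_hz


# Demo only: write one second of 16-bit mono silence at 32 kHz, then check it.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_32k.wav")
with wave.open(demo_path, "wb") as demo_file:
    demo_file.setnchannels(1)
    demo_file.setsampwidth(2)
    demo_file.setframerate(32000)
    demo_file.writeframes(b"\x00\x00" * 32000)

rate = wav_sample_rate(demo_path)
print(rate, matches_pretrain(demo_path, 32000), matches_pretrain(demo_path, 40000))
```

If the files really are 32k, the mismatch is more likely coming from the sample rate selected in the training interface than from the audio itself.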

feral sierra
viral mason
white wedge
#

actually wait

#

i might have a solution 1s

#

not for that, for my problem

feral sierra
main turret
proven hill
leaden iris
#

so i'm very much a beginner when it comes to training voice models. are there any big tips or things i should be doing when training? like certain settings, or things i can do before or after to improve the quality of models? currently i'm using applio with essentially default settings

untold marten
#

Heya, as I understand it, chunk size controls the delay of the converted voice from the rvc app, and higher values also mean better results. Are the OS (windows vs linux) and the developer (deiteris vs Tg) supposed to make such a big difference?

  • On windows, I need to set the chunksize to 192ms at minimum to have a stable perf and set extra to 2.7s
  • On linux, I can go as low as 24ms altho it's running tight (sometimes goes yellow and lag spikes in voice). Also 2.7s extra

I also tested different chunk sizes on linux (50, 100, 150, 200...) and they all sound mostly the same in quality to me. So does it help if I run it at around the same delay as I do on windows, e.g. 192 instead of 50 or 100? They do seem to sound the same but I can't be too sure. I understand I'm using 2 different versions of rvc from 2 different people, but I wanted to know if there's any benefit to running at the same delay even though I could go lower.
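
To put those numbers in perspective, a chunk size in milliseconds maps to a number of audio samples once a sample rate is fixed. A rough sketch, assuming a 48 kHz stream (the actual rate depends on the app and device settings, and the apps may round chunk sizes differently):

```python
ASSUMED_SAMPLE_RATE_HZ = 48000  # assumption for illustration, not read from either app


def ms_to_samples(duration_ms, sample_rate_hz=ASSUMED_SAMPLE_RATE_HZ):
    """Convert a duration in milliseconds to a whole number of samples."""
    return round(duration_ms * sample_rate_hz / 1000)


windows_chunk = ms_to_samples(192)   # chunk size reported as stable on windows
linux_chunk = ms_to_samples(24)      # lowest chunk size reported on linux
extra = ms_to_samples(2700)          # the 2.7 s "extra" setting used on both

print(windows_chunk, linux_chunk, extra)
```

At the assumed 48 kHz, 192 ms is 9216 samples per chunk and 24 ms is 1152, so the linux setup is processing chunks eight times smaller.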

feral sierra
#

hope it will work again soon, i like kaggle more than lightning ai

viral mason
#

I cannot even use lightning ai it's too confusing

#

and colab is dookie with only 4 hours max

feral sierra
#

their verification process is weird though

quasi condor
odd shale
quasi condor
odd shale
quasi condor
#

same speed on colab even tho its using two t4 gpus

viral mason
quasi condor
feral sierra
#

its really strange

odd shale
feral sierra
#

yeah i get the same popup

viral mason
#

hmpf

simple ore
#

when does the error happen?

viral mason
#

kaggle has an obvious power button to turn it off

odd shale
#

Tested it myself to see if i can recreate the error and it seems it's true.

simple ore
#

tf is power button

feral sierra
simple ore
#

????

#

tf you guys are smoking

simple ore
#

install runs

viral mason
simple ore
#

Why are you clicking it

#

instead of clicking install |>

quasi condor
viral mason
#

no matter what I press in kaggle I get that same error of it not working

#

start does it, that button as well

#

run all

simple ore
#

just run install first, when it is finished run the other cell

odd shale
viral mason
#

someone needs to call mr Applio

simple ore
#

dunno, I just tried and it is fine

#

create a new notebook, then import applio/main/ kaggle thing

viral mason
#

?

quasi condor
#

here's how u turn it off

viral mason
#

that's too confusing it should be always on the front without any extra steps

#

ty tho

quasi condor
viral mason
simple ore
#

maybe there's a space limit or something

viral mason
#

so I just do nothing and let my model die if something bad happens

odd shale
#

No errors popped after switching

viral mason
#

o

#

how odd

simple ore
#

nothing to do with Applio

odd shale
simple ore
#

make a support ticket or something

odd shale
odd shale
#

Myself i never had issues with that

viral mason
simple ore
#

@viral mason

simple ore
vale harbor
viral mason
vale harbor
viral mason
#

The main thing needed to change is pitch and block size

dire light
#

i followed the guide perfectly and the thing doesn't work (using lightning cloud w-okada)

edgy violet
#

Anyone got the TTS Ivy voice?

leaden iris
analog obsidian
# leaden iris oh i meant like tips for things to improve outputs i already know how to use the...

you can try merging epochs of different training runs
for example, you train the same dataset two times; one with batch size 4 and another with batch size 8, then you merge both of them and see if the quality increases

besides that, you could also try using different batch sizes for the same dataset, maybe batch size 8 is going to work better for that specific one than lets say 4

also, i've seen some use a lower lr for the discriminator, but honestly every time i change the default lr things don't sound as good as with 1e-4

perhaps a different pretrain can help too https://discord.com/channels/1159260121998827560/1235952130855010365
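
Merging two runs usually comes down to interpolating their weights parameter by parameter. A toy sketch of that idea in plain Python, with lists of floats standing in for tensors; the function name, parameter names and values are invented for illustration, and this is not Applio's actual merge code:

```python
def merge_weights(state_a, state_b, alpha=0.5):
    """Linearly interpolate two checkpoints that share the same parameter names.

    alpha is the share taken from state_a; (1 - alpha) comes from state_b.
    """
    if state_a.keys() != state_b.keys():
        raise ValueError("checkpoints have different parameter names")
    merged = {}
    for name, values_a in state_a.items():
        values_b = state_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [alpha * a + (1.0 - alpha) * b
                        for a, b in zip(values_a, values_b)]
    return merged


# Toy "checkpoints" standing in for a batch-size-4 run and a batch-size-8 run.
run_bs4 = {"dec.weight": [0.0, 2.0], "dec.bias": [1.0]}
run_bs8 = {"dec.weight": [1.0, 4.0], "dec.bias": [3.0]}

merged = merge_weights(run_bs4, run_bs8, alpha=0.5)
print(merged)
```

This only makes sense when both runs use the same architecture and sample rate, which is why the two runs start from the same dataset and pretrain.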

leaden iris
analog obsidian
# leaden iris ooh noted noted, tyvm! ill definitely be trying those i did see pretrains and th...

pretrains are RVC models trained using many hours of audio
each model learned to reconstruct a spectrogram in a different way; for example, the original pretrain learned to reconstruct the spectrogram based on the VCTK dataset combined with pitch shifting.
when you load the original pretrain, what you’re essentially doing is continuing the pretrain’s training with a new dataset.
so if the pretrain has a good understanding of the human voice/spectrogram, by continuing its training, your model will inherit all that previously learned knowledge.

well, when you load a pretrain different from the original, you’re continuing the training of another model that learned to reconstruct spectrograms in a different way, you get the idea

#

you can enable the usage of custom pretrains by enabling this

#

but first you have to download a pretrain G and D files (generator and discriminator)
and place them in rvc > models > pretraineds > custom

analog obsidian
#

u can try these since they're the most recent

#

everything has to match, like, for example, legacy core 1.6 requires you to select 32k, hifigan and contentvec in the training tab, otherwise it wont work

leaden iris
#

huh, that is really friggin cool actually! thank you for being so informative with everything

analog obsidian
#

lr is how fast the model learns

#

the discriminator is what forces the model to produce realistic audio

#

i personally don't like to change either of them (1.0 is fine imo)

analog obsidian
# analog obsidian the discriminator is what forces the model to produce realistic audio

sometimes it gets too strong and prevents the generator from producing better quality results, that's why decreasing how fast it learns can sometimes produce better results

i know everything i said is confusing but they're actual ways to potentially improve a model lol

one thing to keep in mind is that rvc doesn't work miracles: if a dataset sounds bad, the model is always going to sound bad after training on it

obtuse summit
#

hi

#

i need help making an rvc model. i have the latest version of applio, windows 11 home, 32 gb of ddr4 ram, a ryzen 5 5600x and an nvidia rtx 5060 from MSI. i'm trying to make an rvc model for caine from tadc, since there isn't an italian version of it in the rvc list. my dataset is 1:20 minutes long but i don't know what settings to use. i edited it in audacity to make it sound better and exported it at 48k

obtuse summit
#

wait, i'm cleaning the dataset

#

ok, done

obtuse summit
proven hill
obtuse summit
proven hill
obtuse summit
#

it says it has more quality

#

should i do 300 epochs and save every 30?

proven hill
proven hill
obtuse summit
#

i only have applio for making rvc models

proven hill
#

then that's fine

obtuse summit
#

i've started training it, hope it sounds good. my first rvc model, the one of N, sounded bad even after 5 different attempts

simple root
#

Does anyone have suggestions on running an AI channel?

#

which AI is best for animation / fictional characters?

#

<@&1159293204038955078>

delicate cradle
#

@quasi condor hi ^-^

#

i don't know how to get the full image :P

wraith flume
#

what is the software called for these voice changers

proven hill
wraith flume
#

3050

wraith flume
proven hill
#

follow this guide

wraith flume
#

k ty

wraith flume
proven hill
wraith flume
#

the vc cable thingy and stuff

proven hill
wraith flume
wraith flume
#

i watched a tutorial video

wraith flume
proven hill
#

you should only follow the text guide from the docs

wraith flume
proven hill
#

(in Italian) message me in dm

wraith flume
#

poquito

viral mason
#

bro was so confused they switched languages

proven hill
whole mason
#

crazy

viral mason
#

if you're still interested

boreal creek
#

Can I upload an rvc model I made?

viral mason
boreal creek
#

asks like if i used hifigan or refinegan

viral mason
#

you used Hifigan if it was a year ago

#

refinegan is somewhat new

boreal creek
#

alright i made a submission, was i supposed to upload the model somewhere for someone to test or

#

what exactly

#

oh

#

i got a message

#

im not sure what embedder i used lol

#

is the embedder model architecture version?

viral mason
#

embedder is most likely contentvec

boreal creek
viral mason
#

yea you should upload it to huggingface

#

make a sample as well

boreal creek
#

i gave it the link to my huggingface upload

viral mason
#

the model is not uploaded properly, pth and index should be in one zip file
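
Bundling the two files can be done by hand, but here is a small sketch with Python's standard `zipfile` module for anyone scripting it. The file names are placeholders and the demo writes dummy files, so swap in the real `.pth` and `.index` paths:

```python
import os
import tempfile
import zipfile


def bundle_model(pth_path, index_path, zip_path):
    """Put the .pth and .index files side by side at the root of one zip."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in (pth_path, index_path):
            # arcname drops the folder part so the zip has no nested directories
            archive.write(path, arcname=os.path.basename(path))
    return zip_path


# Demo only: placeholder files standing in for a real model and index.
work_dir = tempfile.mkdtemp()
pth_path = os.path.join(work_dir, "my_voice.pth")
index_path = os.path.join(work_dir, "my_voice.index")
for placeholder in (pth_path, index_path):
    with open(placeholder, "wb") as handle:
        handle.write(b"placeholder")

zip_path = bundle_model(pth_path, index_path, os.path.join(work_dir, "my_voice.zip"))
with zipfile.ZipFile(zip_path) as archive:
    bundled_names = sorted(archive.namelist())
print(bundled_names)
```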

boreal creek
#

lol

boreal creek
viral mason
boreal creek
viral mason
boreal creek
#

it worked

#

thank you for helping

viral mason
#

you're welcome!

viral mason
#

Like I said in chat, if ur gpu is Nvidia download Vonovox and if it's AMD download tg fork

#

Butt

#

Why don't u wanna say

#

You have no need to be embarrassed about why u want to use the voice changer to be a girl

#

No judgement here, unless it's to troll

#

Very judgemental there

#

So you're not gonna troll right?

#

Good

#

Here ya go

#

Was looking for the links

#

Virtual audio cable, connects the voice changer to games or discord

#

Nobody can hear it without that

#

You're welcome

viral mason
#

Nah, for the virtual audio cable just run setup64 then install driver

#

then Vonovox just run start

hallow thistle
#

start.bat

plucky crown
#

Helppp

#

So i got the 3090 pc

#

Its in windows 11

#

Which do the voice programs in the server work better with? Windows 11 or 10??

hallow thistle
#

I use Windows 11. KikuriHappy

low shard
# plucky crown Helppp

This is a General AI Discord Server, please elaborate:

  • what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
  • the tutorial link used
plucky crown
verbal fractal
low shard
verbal fractal
#

ah ok, because i grabbed it, then followed a YouTube tutorial telling me to go to a different place, and then I saw the warning at the top

low shard
#

it would be best you reply to the questions so i can help you get better tools

low shard
verbal fractal
#

That one, I'm pretty sure, has the legit way of doing it; the other one I just found through a Google search

low shard
#

the link you sent was about the original wokada, not wokada deiteris fork which the video talks about, i think you just clicked the first link

low shard
#

and wokada deiteris fork isn't suggested either, it's more "archived"

verbal fractal
low shard
#

those are needed, there isn't a one singular best app that works for everyone

verbal fractal
#

It's ok now that I know it's safe and I didn't screw up somehow

4060 Ti
win10
Roleplay/dnd stuff

low shard
plucky crown
verbal fractal
#

Roleplay for dnd stuff

low shard
plucky crown
#

that's two, do i pick one or are both needed

final bough
#

Hi, need some help: how do i stop hearing myself from the voice changer

low shard
low shard
proven hill
dire sluice
#

its not working for me

low shard
# dire sluice its not working for me

What's not working? This is a General AI Discord Server, please elaborate:

  • your pc gpu
  • your pc os
  • what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
  • the tutorial link used
boreal creek
#

i consistently heard people say it produces terrible models

low shard
#

Perhaps you're confusing it with an issue there was in the discriminator of an older version of Applio, or with the Overtraining Detector, which didn't work well

boreal creek
#

never tried with applio just because of what i heard

hardy yew
#

I wonder where you heard about applio being terrible because that seems just wrong to me

boreal creek
boreal creek
low shard
low shard
whole mason
minor bramble
#

why is the voice changer so laggy

edgy minnow
#

What's the audio cable I'm supposed to use? I just reinstalled windows and I do not remember.

low shard
low shard
minor bramble
low shard
bronze socket
#

Good morning. In rvc, has anyone had the issue of noise suppression 2 stopping all audio output? My second PC is offline now and for whatever reason i can't get it to output voice

#

I had it working a few days ago but idk what its doing now

bronze socket
#

It def works normally when I have suppression 2 disabled

proven hill
bronze socket
proven hill
bronze socket
#

I'm sorry I'm really not sure

#

Its the browser based one

proven hill
#

you can either download vonovox (better for nvidia gpus) or tg developed fork (more like okada)

#

-rt

patent trellisBOT
# proven hill -rt
🔊 Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Vonovox

A Realtime Voice Changer with similar performance to Wokada Tg-Develop Fork, with extra features. Unlike other options that have wider support, it only supports Nvidia GPUs on Windows 10/11 and has no cloud options.

• Wokada Tg-Develop Fork

A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project.

• Applio Realtime

A Realtime Voice Changer with similar performance to Vonovox & Wokada Tg-Develop Fork, with extra features.

• Wokada Deiteris Fork

Deiteris' fork (modified version) of wokada that doesn't get updates anymore.

⛔ Outdated/Discouraged

These options are not recommended for use.

• Original Wokada

Not suggested, older versions in youtube tuts are even way worse. GUIDE

• RVC GUI Mainline Realtime

The program is worse compared to the ones above, and much less updated. GUIDE

bronze socket
lilac peak
#

guys what does "epochs" mean

viral mason
viral mason
#

you're welcome!

proven hill
#

like in real life?

#

(what are you using? whats the gpu you got?)

#

please respond to the question i sent and provide me a screenshot of the program so i can guide you

#

yea dm me the screenshot

viral mason
#

that is extremely outdated

fringe heron
#

This isn't a basic usage question. i would like to share my idea of why converting a speaker (e.g. A) doesn't sound that good most of the time, no matter the dataset. i've read some docs and they don't seem to cover this that well. if i missed something i am sorry, please don't hate me 🙏

#

I’ve been digging into the RVC training code and experimenting with different pretrained models (like Titan), and I keep hitting the same wall

B to B conversion (same speaker) sounds fantastic (where b is target)
A to B conversion (my voice to target) sounds glitchy, unnatural, and full of artifacts in many cases, even when my voice is similar to the target.

I think I’ve figured out why (might be very wrong)

if i understood right, rvc's generator consists of four main parts: the TextEncoder takes phonemes + pitch and outputs a content latent; the PosteriorEncoder encodes the real mel spectrogram into a posterior latent; the flow transforms that latent to match the prior from the text encoder; and the decoder takes the latent and the speaker embedding to produce audio.
During training, all these blocks are updated together using the same optimizer.

when you fine-tune a generic pretrained model on a single target speaker B, every part of the generator adapts to that speaker's data, including the TextEncoder.
Even though the TextEncoder only sees phonemes, it learns to produce a latent representation that's subtly tailored to B's vocal tract. The flow and the decoder expect that latent to decode into B's voice. The encoder is never exposed to other speakers during fine-tuning, so it becomes speaker-biased.

The result:
B to B works great because the encoder's bias matches B.
A to B breaks: your voice A goes through an encoder that outputs a B-biased latent, creating a mismatch with the content, and the decoder struggles to produce clean audio.

The solution I'm planning:
Pretrain the generator on a large multi-speaker dataset, roughly 60 hours of my own voice (speaker A) and 50 hours from 15–16 speakers, forcing the encoder to extract only content,

then freeze the TextEncoder and fine-tune.
i still haven't tried this and i'm curious about the thoughts of people who are probably more experienced than me and maybe already tried this.

And why isnt freezing the text encoder a native option?

viral mason
#

the only pretrains I'd use currently are Legacy core 1.5 and 1.6, and the new PABP one although it is a bit new and could be noisy

fringe heron
viral mason
#

titan, ov2, rin3 are very old and cause harmonic distortions

fringe heron
#

its the way you fine tune it

fringe heron
viral mason
#

freezing the text encoder isn't native as it could cause issues to casual users who don't know much on rvc training

#

how do you clean your datasets btw and what are the usual lengths?

analog obsidian
#

freezing the te is not a bad idea, finetuning is a bit aggressive in rvc and you want to preserve a lot of things from the pretrain

#

i have tried freezing it before but i didn't notice improvements tho

fringe heron
viral mason
fringe heron
viral mason
#

ah kk

fringe heron
#

they almost always need no preprocessing

analog obsidian
#

multispeaker models have natural timbre leak sadly

#

the og pretrain and every other one leaks

fringe heron
analog obsidian
fringe heron
#

my idea was to train a multi speaker pretrain from scratch where 60% of the dataset is my voice

#

to see if it would sound less "ai fart" on my voice, since that comes from latent mismatch

#

(my first big explanation msg might be wrong so correct me if i say something wrong)

analog obsidian
#

even if your voice is dominant

#

it still sees the other speakers

fringe heron
#

but if you divide speakers by speaker id?

#

like is that bad

#

the leakage

analog obsidian
#

its quite bad sometimes yeah

#

when you finetune the pretrain that gets fixed, since every speaker in the pretrain gets overridden by that single speaker

#

but the actual pretrain itself leaks

fringe heron
#

and using only my voice wouldnt be such a better idea either i think

#

from scratch

analog obsidian
#

thats what im actually trying, training a singular speaker (ljspeech) from scratch, then finetune other speakers

#

but i havent finished the training bc i dont have money lol

fringe heron
#

but see, what i think is that if you do A to A alone, the latent leaks like crazy, no? to make the dec's job easier

analog obsidian
#

with that experiment i confirmed that yeah, without any other sids, ljspeech doesn't leak

odd flower
#

Hi, so is elevenlabs using the same RVC for speech to speech? its S2S is amazing with the same dataset.

analog obsidian
#

like, finetuning the same speaker found in the pretrain?

fringe heron
#

no

#

like

#

if i train a model from scratch with, let's say, 70 hours of my speech, then the latent is most likely leaking my timbre so that the dec has an easier time. so when i finetune it on a speaker b to get a better A to B, the dec has to fight the enc

viral mason
fringe heron
#

but its a possibility in my head

analog obsidian
#

but i haven't tried it with a fully trained model from 0

fringe heron
#

but i assume lj is different from your voice right

analog obsidian
#

im not using nsf-hifigan as the decoder/vocoder tho, and we found that vocoders do have an impact in leakage

analog obsidian
fringe heron
#

sure, feeding your own voice may not leak, but is it actually perceptually good and reliable?

analog obsidian
#

true hmmm

fringe heron
#

that's why i thought about my plan of training a single speaker or multi speaker from scratch, so the text enc is good on my voice and also hopefully invariant

#

so then i freeze it and fine tune the rest

#

the flow and dec are forced to map my voice to target

analog obsidian
#

you should try your first plan, even if the multispeaker has some leaks, it still knows your timbre

fringe heron
#

also you said that vocoders have their impacts, what do you suggest?

#

i tried refinegan but could never really say if it's better than the others

analog obsidian
fringe heron
#

okay

#

thanks

analog obsidian
#

no problem and good luck, i really never managed to ever train a functional nsf hifigan from scratch, the model always got stuck at some point where it refused to do any changes to the quality

#

so 1m steps and 1.5m basically sound the same

fringe heron
analog obsidian
fringe heron
#

i feel your pain

analog obsidian
#

the original pretrain was trained for about 2.5m steps

fringe heron
analog obsidian
#

and batch 16

fringe heron
#

okay

fringe heron
analog obsidian
# fringe heron solid

maybe you should try finetuning the original pretrain first and see if your new "finetuned" pretrain + freeze TextEncoder helps

fringe heron
fringe heron
proven hill
fringe heron
analog obsidian
fringe heron
fringe heron
#

i dont think that would help

#

maybe with a low lr

#

but that would make the enc biased to my voice, which is good, but then the enc might start leaking my timbre into the latent as that is the easiest path to reconstruct my voice

#

so the dec fights enc anyway

analog obsidian
fringe heron
#

it would need to undo my timbre then

#

maybe you can do a single speaker pretrain if you augment the data?

fringe heron
analog obsidian
#

never tried that, idk

fringe heron
#

i mean it makes sense

#

i don't know, add things like formant, eq and pitch shift (must pay attention with that one tho)

#

and also change sids if you augment

#

and with those experiments bye bye 50kW

proven hill
#

@analog obsidian have you ever tried my pretrain?

fringe heron
proven hill
fringe heron
#

i used it

proven hill
#

how was it

#

be honest

#

i know it wasnt good

fringe heron
#

its alright

#

depends though

#

i had some voices that were kinda trash

#

but never had leakage

#

honestly the main issue is still always the same: it's not reliable on my voice

proven hill
#

why?

fringe heron
#

not because it makes a wrong timbre but because of latent mismatch

fringe heron
#

holy i cant write right

#

sorry for any misspelled words

mild ridge
#

Heya, i'm new to this hub. i tried researching first before asking here. gpu: 4070ti, windows 11. my question: i have decent experience using rvc, trained my own models and used the old wokada for the longest time. after a decent number of tries i hit a plateau and i'm looking to improve my model. right now i'm trying to use applio to train my model instead of rvc v2. which are the best options to pick for sample rate, vocoder, embedder, custom pretrain, etc.?

cloud abyss
#

[gpu_processing] cv2.cuda.GpuMat exists but missing: createGaussianFilter, resize, cvtColor – falling back to CPU.

proven hill
#

well yes me of the past, hence why the pretrain theory

fringe heron
#

like

proven hill
fringe heron
#

oh

#

no

fringe heron
#

i use studio quality

proven hill
#

oh well

fringe heron
#

even tried datasets similar to my voice, similar accent and all

#

never fixed this issue

cloud abyss
# fringe heron no

deeplivecam error, its using my cpu instead of rtx 4070, i used to use this and get 30 fps now i get 3

fringe heron
#

indeed

proven hill
#

hes a catfisher

cloud abyss
#

wtf

viral mason
#

pretty obvious

cloud abyss
#

my gf is in my pfp

#

calm down

#

i just like being elon musk

fringe heron
cloud abyss
proven hill
fringe heron
#

i see

#

can't help though, i know nothing

proven hill
fringe heron
#

@analog obsidian for the multispeaker pretrain i might have an idea to avoid leakage and still train it to be very good at a specific voice. let's say i have 60 hours of voice A and, idk, 50 hours from 15 other speakers. since my goal is a pretrain that is very good at voice A but also invariant, maybe we modify the batches to load, idk, 1000 clips of A, 1000 of B and so on for all speakers, even if speaker F only has like 3 hours compared to A. since each batch has the same amount of each speaker, if the model tries to learn a specific speaker it would hurt the other speakers, and since the gradients have the same weight it might work better.
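
The balanced-batch idea can be sketched roughly like this: every batch draws the same number of clips from each speaker, sampling with replacement so a speaker with little data can still fill its share. This is only an illustration in plain Python; the function name, file names and counts are made up, and it is not how RVC's own data loader works:

```python
import random


def balanced_batches(clips_by_speaker, clips_per_speaker, num_batches, seed=0):
    """Yield batches holding the same number of clips from every speaker."""
    rng = random.Random(seed)
    for _ in range(num_batches):
        batch = []
        for speaker_id, clips in clips_by_speaker.items():
            # Sample with replacement so small speakers can still fill their quota.
            for _ in range(clips_per_speaker):
                batch.append((speaker_id, rng.choice(clips)))
        rng.shuffle(batch)  # avoid grouping clips by speaker inside the batch
        yield batch


# Hypothetical dataset: speaker A has far more clips than speaker F.
clips_by_speaker = {
    "A": [f"a_{i}.wav" for i in range(600)],
    "F": [f"f_{i}.wav" for i in range(30)],
}

batches = list(balanced_batches(clips_by_speaker, clips_per_speaker=4, num_batches=3))
for batch in batches:
    counts = {spk: sum(1 for s, _ in batch if s == spk) for spk in clips_by_speaker}
    print(counts)
```

The trade-off is that clips from the small speakers get repeated much more often than clips from A, so they are the ones most at risk of being memorised.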

analog obsidian
fringe heron
bronze socket
stone jolt
#

[VCClient] wait web server...340 http host wouldnt load

viral mason
gray dagger
#

im trying to train a model using replay (i have a 3060 12gb and im using windows 11) and i get this error

Error creating model: CUDA error: unspecified launch failure
Search for `cudaErrorLaunchFailure' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

#

how would i fix this

#

i have auto epochs turned on

#

replay still gives the same error with auto epochs turned off

#

im on version 8.7.2
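
The error text itself suggests `CUDA_LAUNCH_BLOCKING=1`, which makes CUDA report the failure at the call that caused it. A small sketch of how that variable could be passed to a training process from Python; the helper name is made up, and whether a packaged app like replay lets you launch its training step this way is an assumption, not something confirmed here:

```python
import os


def blocking_cuda_env(base_env=None):
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING switched on."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


debug_env = blocking_cuda_env()
# Pass debug_env as the env= argument of subprocess.run(...) together with
# whatever command starts the training, so the variable is set before CUDA loads.
print(debug_env["CUDA_LAUNCH_BLOCKING"])
```

It won't fix the crash, but the stack trace it produces should point at the real failing operation.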