#✨│ai-help


fringe heron
#

Accent?

plucky crown
#

Yes, he also has an American accent, and I'm speaking a different language, so his accent is seeping into my pronunciation

fringe heron
#

Wdym sleeping

#

Baked in?

plucky crown
#

Well, in the recording I'm speaking a different language, and when I convert it with the model and play the result, it sounds like the language is being spoken with a foreign accent instead of natively

fringe heron
#

What's your accent?

plucky crown
#

An American accent speaking my Southeast Asian language

plucky crown
#

It sounds like this

#

But in my recording im speaking like a native i have no accent

#

And the result has accent when i convert it

fringe heron
plucky crown
#

Now what hahaha im done for

fringe heron
#

You convert and get the target's accent

fringe heron
#

Maybe you are in luck

plucky crown
#

How

fringe heron
#

I am working on a way to fix this, or at least make it much less noticeable

#

But it requires quite some work

#

Manual work

fringe heron
#

But the goal would be to first teach the model how your voice is

#

And my method is not possible in vanilla Applio; I had to modify it myself

#

But without changing how the model works, so it can be run in any RVC client

#

I still need to fully test this

brazen holly
fringe heron
brazen holly
fringe heron
#

And speaking style if given enough dataset and time

fringe heron
brazen holly
#

ye

fringe heron
#

Even glitches

brazen holly
#

ye

#

how do I make it better then?

fringe heron
brazen holly
fringe heron
brazen holly
#

but I know Lua, a bit of c php java

#

I'm not new to programming

fringe heron
brazen holly
#

would gladly appreciate

fringe heron
#

So the model is not a single "brain"

#

The file you save is made out of 5 "brains"

#

The first one is called the text enc, and it sees pitch and phone (the actual linguistic content, plus speaking style, which is dominated by pitch)

plucky crown
fringe heron
#

The second one is called the post enc, and its job is to see the real target speech, so it has access to every speaker-dependent detail

plucky crown
#

I searched online, and most voice cloning projects on GitHub are text-to-speech, not speech-to-speech

#

But you can look into their GitHub code and point Google's AI at the folder path so it can read how they get accurate voice cloning from very little audio

fringe heron
#

The third is called the flow. During training, the flow is taught to convert the post enc output into the text enc one

#

The flow is reversible (at inference it runs backwards and converts the text enc output into something that matches what the post enc would have produced)
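To make "reversible" concrete, here is a minimal sketch of an affine coupling layer, the kind of invertible block normalizing flows are commonly built from. It is an illustration only, not RVC's or Applio's actual flow code: `toy_net` is a hypothetical stand-in for the small network that predicts shift and log-scale, and plain Python lists stand in for tensors (an even-length input is assumed).

```python
import math

def toy_net(v):
    """Hypothetical stand-in for the learned network: returns (shift, log_scale)."""
    return 0.5 * v, math.tanh(v)

def coupling_forward(x, net):
    """Keep the first half unchanged; affinely transform the second half."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    params = [net(v) for v in x_a]
    z_b = [b * math.exp(log_s) + t for b, (t, log_s) in zip(x_b, params)]
    return x_a + z_b

def coupling_inverse(z, net):
    """Undo coupling_forward exactly, using the untouched first half."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    params = [net(v) for v in z_a]
    x_b = [(b - t) * math.exp(-log_s) for b, (t, log_s) in zip(z_b, params)]
    return z_a + x_b

x = [0.3, -1.2, 2.0, 0.7]
z = coupling_forward(x, toy_net)      # one direction
x_back = coupling_inverse(z, toy_net)  # the same layer run backwards
```

Because the first half passes through untouched, the inverse can recompute the same shift and scale and undo the transform, which is why one set of weights can be used in both directions.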

#

Then there is the decoder, which takes the post enc output and tries to make audio out of it

plucky crown
fringe heron
#

Then there is emb g, which conditions everything like a switch: if it's 0, speak like person 1; if it's 1, speak like person 2; and so on

brazen holly
#

mhm

fringe heron
#

At inference the post enc is not used btw

#

As its not needed

#

Now there is a loss that forces the text enc and post enc to be close, so this means the text enc will change its behavior to match the post enc, which sees real speaker-dependent stuff

plucky crown
fringe heron
# brazen holly mhm

If trained for long enough, the text enc will learn to encode a different accent than the starting point
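As a rough illustration of that "closeness" loss, the sketch below computes the KL divergence between two diagonal Gaussians, one standing for the post enc's distribution and one for the text enc's, and then scales it by a weight. This is a simplified, assumed form: the real training code may use a sampled estimate instead, and the weight name `c_kl` is an assumption about the config key.

```python
import math

def gaussian_kl(m_q, logs_q, m_p, logs_p):
    """KL(q || p) for diagonal Gaussians, summed over dimensions.

    m_* are means and logs_* are log standard deviations.
    """
    total = 0.0
    for mq, lq, mp, lp in zip(m_q, logs_q, m_p, logs_p):
        total += lp - lq - 0.5 + 0.5 * (math.exp(2 * lq) + (mq - mp) ** 2) * math.exp(-2 * lp)
    return total

def weighted_kl(kl, c_kl):
    """Scale the KL term; a smaller c_kl pulls the two encoders together less."""
    return c_kl * kl

# Same distribution: no penalty. Shifted mean: penalty grows with the gap.
same = gaussian_kl([0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
shifted = gaussian_kl([1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

Lowering the weight does not remove the term; it only reduces how hard the text enc is pushed toward what the post enc sees.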

#

So to fix this, or partially fix this

#

We first train the model on our own voice, but not the way the original recipe wants: we lower the loss that keeps the text enc close to the post enc, and we sample the input for the decoder from the text enc without using the post enc

#

Now we can backpropagate all the losses through the text enc too (before we couldn't, because the decoder was getting the post enc output)

#

Now given time and data we will have a model pretty much 1:1 to our voice

#

Now time for target

#

Now we freeze the text enc, flow, and emb g, keep that loss lowered, and let only the post enc (which is useless in this setup) and the decoder update

#

So the decoder is forced to work only with how our own voice sounds and try its best to match the target

#

This won't fully remove the target accent, but at least it won't glitch the voice

#

Or have some weird mismatches

#

And very important: train with bf16 or fp32 precision

brazen holly
#

dang

plucky crown
#

So we have to create a new version with new code added ? So it adjusts no matter what language we have?

fringe heron
#

But still testing

fringe heron
plucky crown
#

Do we wait first, or do you want to fiddle with the code? I can give you Google Antigravity Pro free for a month if it helps get our problems fixed, because I looked up ElevenLabs and they don't accept voice files from arbitrary sources without verification

fringe heron
#

Ill go for some time so ill see yall later

plucky crown
#

Hopefully your version works aaa we done for

#

Who is the creator of vonovox and applio can they be contacted

#

Are they in this discord server

plucky crown
#

Or nah empty handed like we are

fringe heron
fringe heron
# brazen holly dang

To be clear, I wrote that in a hurry, so I'm sorry if it's not precise. It's also simplified 🙏

fringe heron
#

@brazen holly @plucky crown after finetuning the finetuned pretrain (finetuned on my voice) on a voice with a completely different accent, I can tell you that the voice still pretty much keeps the original accent (mine) and speaking style. Most importantly, it doesn't glitch

#

I listened to a sample now as i am back home

wicked fern
#

hello im searching for the best uncensored ai
i tried grock and its good but there arent a lot of free messages

fringe heron
#

:/

fringe heron
# brazen holly dang

Sorry for the bad explanation. Now that I have more time I reread it and it's rough, so let me give you a better one.

The first one is called the text enc, which sees pitch and phone: the actual linguistic content, plus speaking style, which is dominated by pitch.
The second one is called the post enc. Its job is to see the real target speech, so it has access to every speaker-dependent detail.
The third one is the flow. During training, the flow is taught to convert the post enc output into the text enc one. The flow is reversible, so later at inference you can run it backwards and turn the text enc output into something that looks like what the post enc would have produced.
Then there's the decoder. It takes the post enc output and tries to make audio out of it (but remember, at inference the post enc is not used; we feed the decoder from the text enc instead, after the flow has done its thing backwards).
Finally there's emb g, which conditions everything like a switch: if it's 0, speak like person 1; if it's 1, speak like person 2; and so on.
Now, there's a loss that forces the text enc and the post enc to stay close. Because of that loss, the text enc changes its behaviour to match the post enc, which sees real speaker-dependent stuff. If you train like this for long enough, the text enc learns to encode a different accent than the starting point.
So to fix this (or partially fix it) we do things differently (and all training must be done with bf16 or fp32).

#

We train the model on our own voice first, but not the original way. We lower the KL loss (manually, in the config), which is the loss that keeps the text enc and post enc close together, and instead of feeding the decoder from the post enc, we sample the decoder's input directly from the text enc without using the post enc at all.
Now we can backpropagate all the losses through the text enc too (before we couldn't, because the decoder was getting the post enc output).
Given enough time and data, this gives us a model pretty much 1:1 to our own voice.
Now time for the target.
We freeze the text enc, the flow, and emb g, keep the KL loss low (same value as before), and let only the post enc (which is pretty much useless in this setup) and the decoder update.
This way the decoder is forced to work only with how our own voice is and try its best to turn that unchangeable input into the target. This won't fully remove the target accent, but at least it won't glitch the voice or have weird mismatches.
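The two stages described above can be summarized as a freeze plan. The sketch below is only a bookkeeping illustration, not the actual modification: the module prefixes (`enc_p` for the text enc, `enc_q` for the post enc, `flow`, `dec`, `emb_g`) and the example parameter names are assumptions about how the model's parameters are named, and the stage names are hypothetical.

```python
# Which top-level modules receive gradient updates in each stage (assumed names).
STAGES = {
    # Stage 1: learn your own voice; everything trains, decoder fed from the text enc.
    "own_voice": {"enc_p", "enc_q", "flow", "dec", "emb_g"},
    # Stage 2: learn the target; only the post enc and decoder keep updating.
    "target": {"enc_q", "dec"},
}

def is_trainable(param_name, stage):
    """Decide from the leading module prefix whether a parameter is updated."""
    module = param_name.split(".", 1)[0]
    return module in STAGES[stage]

def freeze_plan(param_names, stage):
    """Map each parameter name to True (train) or False (frozen)."""
    return {name: is_trainable(name, stage) for name in param_names}

# Hypothetical parameter names, for illustration only.
names = [
    "enc_p.encoder.weight",
    "enc_q.pre.bias",
    "flow.flows.0.w",
    "dec.conv_pre.weight",
    "emb_g.weight",
]
stage_two = freeze_plan(names, "target")
```

In a PyTorch training script, a plan like this would typically be applied by setting `requires_grad` on each parameter before building the optimizer.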

viral mason
#

Vonovox only has two main settings, block size and pitch

#

Unless u don't have the most recent beta

west temple
viral mason
#

I've never had an issue with that on my end with any software

#

the 5070ti has 16gb

#

Vonovox shouldn't be too stressful on ur end either tho

hardy yew
#

TBH if Kuru wants to minimize VRAM usage for whatever reason, then sure, just pick the app that uses less VRAM

#

Deiteris/Tg develop are probably the same, but Vonovox has various stuff replaced, like different RMVPE etc

#

So I guess VRAM usage might differ. I wouldn't expect it to be a signifcant difference, but I've never checked

stoic moon
#

how do i make realistic anime videos

crystal rivet
#

-colab

patent trellisBOT
# crystal rivet -colab
📒 Google Colab Notebooks

Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

• **Applio**

by IA Hispano
Google Colab

• **UVR5 NO UI**

by Eddy
Google Colab

• **UVR5 UI**

by Eddy
Google Colab

• **Wokada Tg-Develop Fork**

by Tg-Develop
Google Colab

• **Wokada Deiteris Fork**

by Deiteris & Hina
Google Colab

• **RVC-AI-Cover-Maker-WebUI**

by Shiro & Eddy
Google Colab

• **FaceFusion UI**

by Nick088
Google Colab

• **FaceFusion NO UI**

by Nick088
Google Colab

• **Music Source Separation Training (Inference)**

by Jarredou & Makidanye
Google Colab

viral mason
winter mason
#

trying to cosplay micky mouse
1660 super
windows 11
need help in server settings audio

#

although the client one is not working

severe egret
#

I have a question for you lovely people. Does W-Okada still work for these RVC voice models?

viral mason
severe egret
viral mason
severe egret
#

Oh no so what do I do?

viral mason
#

you probably have an old voice changer. what GPU do u have? (nvidia or amd)

viral mason
#

is it like, a 3070, 1660, specifics would help better ^^

plucky crown
severe egret
viral mason
plucky crown
severe egret
plucky crown
#

Yeah im not sure maybe theres something wrong with how i put the audio file in but i dont think so?

severe egret
plucky crown
#

@viral mason is there a tutorial on vonovox maybe im doing it wrong

viral mason
plucky crown
#

Can i see your settings? Can u make a short Video tutorial pleaseee like what to import then settings? Huhuu

severe egret
plucky crown
#

@viral mason can u also send a video on how to do it for @severe egret

#

So i can watch it too and can be added whenever anyone asks about vonovox

viral mason
severe egret
viral mason
severe egret
fringe heron
fringe heron
viral mason
#

Oh

fringe heron
#

The converted voice sounds weird

viral mason
#

Yea that makes sense, heavy accents when speaking English or even an odd way of speaking can cause models to sound off

amber raft
#

ooh yall are figuring out sound ai? I need to get into that...

fringe heron
amber raft
#

tts?

viral mason
#

I'd say I'm pretty good at making and working with ai voice models

#

Personally what I make is for realtime use or singing but can also work for tts

amber raft
#

lol

fringe heron
amber raft
#

wait whats rvc XD

viral mason
#

The best tts software I have seen so far is between fish audio and eleven labs

amber raft
#

im new too

fringe heron
amber raft
#

yeah like born 2 mins ago new

viral mason
#

I love this person

fringe heron
amber raft
#

i mean im not new to ai tho

viral mason
#

Rvc is basically just voice cloning stuff

fringe heron
amber raft
viral mason
#

Training on audio of anything, like SpongeBob or just some random person or even sound effects, so that it can be used as a voice changer or to replace a singer in a song

amber raft
viral mason
#

Or for tts as well

amber raft
fringe heron
#

Pretty much

viral mason
#

Ye it does both

amber raft
#

how much vram for running like training (on a voice) and running an expressive tts model?

fringe heron
#

@viral mason may i ask you how you train your models?

viral mason
#

Oh I use Applio, specifically on the Kaggle website

#

Applio tho can be downloaded and used locally on your pc if it's good enough

amber raft
#

dear Sapphire... MY BAD 😭

viral mason
#

Ignore Sapphire it's just set to do that forever

amber raft
#

lol yea

viral mason
#

Unless u need actual help the bot is kinda just annoying

fringe heron
#

Or no

#

Settings

viral mason
#

But mostly I go for batch 4

#

8 for bigger models like 40+ minutes

#

Models that are just sound effects like the Minecraft ones I go for batch 1
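That rule of thumb can be written down as a tiny helper. This is only a sketch of the heuristic as stated in the messages above (batch 4 by default, 8 for datasets of roughly 40 minutes or more, 1 for sound-effect-only models); the function name and the exact 40-minute cutoff are assumptions, and it is not an Applio setting or API.

```python
def suggested_batch_size(dataset_minutes, sound_effects_only=False):
    """Return a starting batch size following the rule of thumb above."""
    if sound_effects_only:
        return 1  # sound-effect models, e.g. the Minecraft ones
    if dataset_minutes >= 40:
        return 8  # bigger datasets, roughly 40+ minutes
    return 4      # the usual default

default_choice = suggested_batch_size(15)
large_choice = suggested_batch_size(60)
sfx_choice = suggested_batch_size(5, sound_effects_only=True)
```

Treat the result as a starting point; available VRAM may force a smaller value.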

amber raft
viral mason
#

I'm not entirely sure but u can read up here on the whole thing

#

-rvc

patent trellisBOT
viral mason
#

The applio docs are quite nice

amber raft
#

lol thanks!

fringe heron
plucky crown
amber raft
#

well guys if yall ever need ai image help or anything lmk... i have like 40 more hours of prep before training my model, so have fun!

fringe heron
#

Once that is done the next one is loaded

amber raft
fringe heron
amber raft
#

cool

#

thanks!

plucky crown
#

Also im about to buy voicemod,voicewave,dubbing ai subscription cuz im about to give up

fringe heron
plucky crown
#

Give up on local ones

fringe heron
#

I believe

plucky crown
hallow thistle
plucky crown
#

Really

fringe heron
#

You need to teach the model your voice style first

plucky crown
viral mason
viral mason
fringe heron
plucky crown
# hallow thistle They are.

Did u try them or have a subscription i like joined these 3 voice changer's discord servers and asked on the chat to tell me if its convincing and no direct yes yet

fringe heron
#

A 7-hour dataset chewed through in 1 day, 1105 epochs

fringe heron
#

You can edit cool stuff

#

Such as segment, loss weight, hops and more

fringe heron
#

Thats step 1 no matter what

plucky crown
hallow thistle
#

I'm quite clueless.

fringe heron
#

You train a model on your voice

#

First

fringe heron
plucky crown
#

Then what next

hallow thistle
#

But I already know how to train a voice model. Ethel

plucky crown
#

So we train the applio with our voice and make a .pth file and index file?

fringe heron
#

Then the result you train it again over a target, in a specific way

plucky crown
hallow thistle
#

I run Applio RVC on Kaggle, by the way. The first try I tried to train a voice model, it worked for me. RyoMoffster

hallow thistle
plucky crown
#

Woah you recorded yourself for 78 hours?

fringe heron
plucky crown
#

What do we say in the recordings though

fringe heron
fringe heron
plucky crown
#

What are you saying in the recordings

fringe heron
hallow thistle
fringe heron
#

Be you

fringe heron
#

So I am asking: does the target, the speaker you train the model on, have the same accent as you

#

?

plucky crown
# fringe heron Random stuff

Do we need to be in many emotions like shouting,angry,screaming,happy etc etc in the recordings? Since in applio like u guys recommend the dataset should have many emotions

hallow thistle
#

Do you mean like "accent"?

fringe heron
hallow thistle
#

These are voice models I made, by the way.

fringe heron
fringe heron
hallow thistle
plucky crown
fringe heron
plucky crown
#

Since youre a support of this server i bet this is like very high quality

hallow thistle
hallow thistle
fringe heron
plucky crown
fringe heron
fringe heron
plucky crown
plucky crown
# fringe heron No

Btw you trained in applio too right ? Is it ok if my applio exe is in an external ssd? I saved up for a 2tb ssd and got it just for the local voice changer's files and dataset

hallow thistle
#

Bro.

fringe heron
plucky crown
#

My c drive is almost full though

fringe heron
plucky crown
fringe heron
hallow thistle
#

I was going to tell you about NVMe SSD, but I just doubt because you were focusing solely on RVC.

plucky crown
#

Kingston fury 2tb i got it for equivalent 65usd back then. Now i look up the prices it shot up

plucky crown
#

Its kindof very full now aaa

#

Well python is in my cdrive

fringe heron
#

By external at this point idk what you mean

fringe heron
#

The external

plucky crown
#

External

#

Wait ill take a pic of it

fringe heron
#

I remembered just now that applio has its own runtime

plucky crown
#

Here is the exact model

#

The one with the circle but i take it out of the enclosure and put a handheld fan over it its the only way to decrease the temp that is 77C

#

The ssd goes down to 42C if i do that

#

Thats also why i want it to stay to external ssd

fringe heron
plucky crown
#

Since i can lower the temp while its training since its running in external ssd with a handheld fan pointing to it

fringe heron
#

@hallow thistle over 70 degrees, SSDs go into thermal throttling, no?

hallow thistle
#

Applio RVC can run from any drive (I tried), as long as the batch file can detect PATH inside the folder. Beyond that, running Applio RVC from the C drive is perceived as more stable. NoaNervous

fringe heron
hallow thistle
# plucky crown

Is this just a closure? And which NVMe SSD did you put in it?

plucky crown
#

And also the dataset is here too when i was training but python is installed in c drive

fringe heron
hallow thistle
plucky crown
#

I dont get it is it not meant to be in external like it affects the voice aaa

hallow thistle
#

I asked you which SSD you put in that enclosure, something like a WD Blue SN5000 or Kingston Fury Renegade, and you answered me "it's where you put data to it". BocchiPlushStare

plucky crown
#

Kingston fury renegade yes

#

@fringe heron also what else did u noticee did it match your accent noww

#

After training your voice have you tested with short target voice like the latest tts voice clone that only needs small voice to clone it?

fringe heron
fringe heron
#

Of target

#

The accent was american and mine is italian

#

No problem

#

But I was limited by the quality of the voice model of myself, which was just trained only on 24 hours as a test

plucky crown
fringe heron
#

I still recommend at least 60 hours

fringe heron
#

I got just applio, changed 1 file, and trained

viral mason
fringe heron
#

Not the target

viral mason
#

Oh are you guys talking about pretrains?

fringe heron
#

Finetuning a pretrained on your voice

#

In a different method

viral mason
#

Oh ok, I got no clue on those since I've never made one

fringe heron
#

Wait

plucky crown
# fringe heron I got just applio, changed 1 file, and trained

Just try checking the training code of the open-source text-to-speech voice cloning projects on GitHub (the ones that need only a short amount of target audio). Maybe we can adapt it for voice-to-voice conversion in the Applio code, so it works with less target audio? There are recent releases; I'll send the links, hold on

viral mason
#

Long ah message that I'll read in the morning

fringe heron
#

Modify 1 file

#

And train

#

Thats it

hallow thistle
#

For some voice models I made, I used different "pretrain" models. arissip

plucky crown
fringe heron
plucky crown
#

Is the change you made like massive and suchh

fringe heron
#

100 lines

plucky crown
#

Also let me know, because maybe you can make it better. I can still give you a Google Antigravity Pro account if you plan to experiment more on the Applio code; I'll gladly help

hallow thistle
#

I don't need Nitro from you.

fringe heron
#

I have modified the code already

hallow thistle
#

Generally, I do not need to modify any Python file in any Python environment. I simply run as it, unless a fellow model maker or an engineer member tells me to tweak something. Meowl

fringe heron
#

Here is needed for people that need max quality for their voice

#

Like singing

plucky crown
#

I thought you were ai back then

fringe heron
#

:|

hallow thistle
plucky crown
#

Since like for months you refused to answer about my gpu question and i was second guessing maybe im talking to an ai lol

fringe heron
plucky crown
fringe heron
#

No

#

The only thing I did is modify 1 file

plucky crown
fringe heron
plucky crown
# fringe heron Like singing

Also if you sing in the recording can the voice model sing too? Since if u asked singing i assume the quality isnt good still?

fringe heron
#

I used krisp for noise

fringe heron
#

So yea later it should

fringe heron
#

You might want to do them without krisp

plucky crown
#

Did u try singing,shouting,etc what did it sound @fringe heron is it still convincing?

fringe heron
#

Vocal fry too

#

Random noise from mouth

plucky crown
#

How about burping or coughing

fringe heron
#

Any type of noise that is doable from you

fringe heron
#

But you need a lot for breaths and cough

plucky crown
#

In some voice changers, plain electric fan noise gets mistaken for a voice

#

It doesnt do this right like you tested

fringe heron
#

As long as its not in the training files it should be fine

fringe heron
#

I also had hiss in the back ground

#

In half of the dataset

#

Also fine but if possible remove it

fringe heron
#

So you dont talk to a wall

fringe heron
# plucky crown Me during recording

Also, let me tell you: so far it has worked for me, but you might get different results. This should improve the result a lot, but it might not be perfect, okay?

#

So if it turns out less perfect than you expect, I'm sorry; this is still in testing for me too

#

Should work very well and this to me is worth the try

void flume
#

The rules and policies on this Discord server require helpers to ask you "How are you planning on using the voice changer?". That is what the 'goal' refers to.

void flume
#

(edit: removed a number of posts)
I see

#

I'm sorry, I'm not allowed to help you any further, since we have rules against using AI to troll others.

fringe heron
#

@plucky crown if you haven't done so already, also download Audacity, you will need it

void flume
#

I feel like a jerk now. >.>;

#

sigh

fathom blade
#

do you think pharma actually has potential on AI?

#

because idk

#

privacy

fringe heron
fathom blade
#

because realistically

#

you put too much sensitive data on ai

fringe heron
fathom blade
#

sure

#

im not referring to that

#

they've trained models for years now

fringe heron
#

True

fathom blade
#

wait, are you Italian?

fringe heron
#

Yes

fringe heron
#

About privacy idk but i dont count on that

proven hill
sacred temple
#

Guys, it's been months. I was training voice models, and now rvc2 disconnected doesn't work on Google Colab. What should I do?

viral mason
#

Kaggle is the easiest option

#

Colab is ok but not as good as it only gives like 4 hours max

sacred temple
#

Not having gpu at the moment

proven hill
dim shuttle
#

What's up, guys. I'm looking for a real-time voice changer that is also open source and lets me load a voice model. Btw I have an integrated GPU, so any recommendations?

viral mason
#

integrated gpu means u basically can't use it locally, the only option is using one in the browser

dim shuttle
#

like google colab

#

?

viral mason
#

Kaggle is better but yea

#

this is the best I can offer since I have zero knowledge using any voice changer not locally

dim shuttle
#

thanks

viral mason
#

no problem! if u need help @ the helpers role

void flume
#

||hopefully its something normal... x.x;||

dim shuttle
#

oh just trolling

viral mason
dim shuttle
#

yeah something like that. I'm not like taking it professionally

viral mason
#

ok so ur normal

#

and don't want to use those weird egirl models

void flume
#

pranking is fine, so long it's not harassment, psychologically abusive or erp basically

dim shuttle
#

I'm not into those sadistic things

void flume
#

sorry, trolling can have a range of meanings with people and... unfortunately people do use voice changers for those reasons (and even post videos of their actions on socials)

#

Some just have fun with friends (sudden spongebob voice for example), others ... oof

viral mason
#

most people say trolling and mean they want to use the egirl models to mess with people and then either scam them or catfish them

#

it's basically required to ask what the person means when they just say "trolling"

#

due to the bad rep

dim shuttle
#

fair point

plain cloud
#

greetings to my fellow sentient clankers , i have questions regarding GPT-soVITS please ping me twice if you have any idea about it

fluid harbor
#

how can i use the voice models etc

marsh tapir
#

has anyone used Kimi on desktop? is there a way to get it to access my files please? for some reason it comes it cannot access my files

#
  • Goal (e.g., TTS, AI Covers, Roleplay): Get Kimi to access my filesystem
  • Specific Issue: Kimi is not using my filesystem
  • Full GPU Name: RTX 3080Ti
  • Operating System: Windows 11 Pro
  • Tutorial Link used: none.
viral mason
whole carbon
#

How to make an AI Cover model?

clear vine
#

hey guys is there a tutorial on how to use the voice models on a discord vc

grave oasis
#

whats the best settings for the Every Song of The Album "Thriller" By Michael Jackson
Link: http

plain cloud
#

I need more accurate sound

zinc sequoia
#

Hi guise can n1 tell me how use Kimchi on CloudCod

proven hill
plain cloud
proven hill
plain cloud
#

Huh, as I said, Vonovox does not sound accurate

#

So I am now going to use the best tts

fringe heron
plain cloud
fringe heron
plain cloud
#

Uh sorry but could you rephrase that uh

#

For games ? Like role play / cosplay type

fringe heron
#

Like singing

plain cloud
#

??

plain cloud
#

General talking

fringe heron
fringe heron
fringe heron
plain cloud
#

Yesss

fringe heron
#

May i ask you what the model was

plain cloud
#

Ultimately the input is the problem

fringe heron
plain cloud
#

I made them

fringe heron
plain cloud
#

I believe the GPT thing won’t be as fast as Vonovox, but for me quality beats speed

#

I just need to figure out what transcriber to use to convert my voice to text

#

Then constantly fill gpt with the text from the transcriber

#

And then it converts and outputs it on the fly

fringe heron
#

Please wait

#

Sorry

plain cloud
#

Take your time it’s alright

fringe heron
#

Will be quick

fringe heron
fringe heron
#

But I dont know for sure

drowsy hedge
#

Has mmvc been updated or is it still the same? i haven't used this for years and i don't know if i have to update it or not

fringe heron
fringe heron
drowsy hedge
fringe heron
#

I am not on pc so i cant just check

plain cloud
fringe heron
#

You get accent of target?

fringe heron
plain cloud
plain cloud
#

Quite different actually

#

That’s why i wish to switch tts

ocean maple
#

whats the best for amd gpus

#

i got a 9060xt

fringe heron
ocean maple
fringe heron
hallow thistle
ocean maple
#

i just wanna be able to use like all the models i find interesting

plain cloud
proud wing
#

hey

#

can yall help me set up my voice changer

#

im intel

void flume
# ocean maple i just wanna be able to use like all the models i find interesting

It is simply policy and part of the rules, something we basically have to ask: 'How are you planning on using the voice changer and those voice models?'
This rule is in place because people use the tool for notorious reasons, some even for hostile ones. We have rules against those. Feel free to glance at the rules in id:guide
If your plan is normal, it's fine.

iron aurora
#

Hey everyone, I'm searching for the right settings to get that AI 'Baby Voice' from Playboi Carti. I'm using the same model as BABY BOI, but when I upload my audio file, it sounds terrible. Any tips? I use Applio.
I use my MacBook M1 Max

viral mason
#

what kind of rp? the video is weird..

dawn gazelle
#

So I am now using the AI voice app called VoiceBox to make AI voices. Is it like with Weights and that I can make a model and use it safely just as long as I don't try to sell anything?

viral mason
#

use Applio to make voice models

dawn gazelle
viral mason
#

I've made a video guide on how to use it via Kaggle which is a browser site

#

are u wanting to make covers and stuff or realtime voice changing?

dawn gazelle
#

Kinda. Using it for content that I do not wanna sell.

viral mason
#

ah okok

dawn gazelle
#

Like having 2 different characters having a conversation.

viral mason
#

definitely Applio then

#

it can use already existing models for that

dawn gazelle
#

But can Applio make a voice model out of a character?

viral mason
dawn gazelle
#

I have the audio of the character and/or model.

viral mason
void flume
#

I guess I'll remove the posts, I just wanted to let you know you don't need to wait for them basically

#

Universe is fine at least ;p

viral mason
#

oki

tawdry shell
#
  • Goal (e.g., TTS, AI Covers, Roleplay): AI Cover
  • Specific Issue: Trying to figure out why my AICoverGen is doing this weird thing where it'll error out in the middle of processing despite no errors on the cmd side of things. It starts doing "connection errored out" after the error trying to process the result.
  • Full GPU Name: NVIDIA GeForce RTX 5070 Ti
  • Operating System: Windows 11
  • Tutorial Link used: https://www.youtube.com/watch?v=pdlhk4vVHQk
quartz moon
#

hi
i wanna use a voice changer but im not sure which one to choose(maybe if there is a cloud one for linux i know there is one for windows)

im on cachyos i have an rx 570 4gb and an i3-8100
uh my goal is just to play games with it

void flume
void flume
#

No

quartz moon
#

ah mb i misunderstood u

void flume
#

I mean the question may seem odd

#

There is a guide on AI-Hub Docs for linux users, but it is sorted per voice changer. So you just have to pick one.

void flume
#

If you're going for the cloud option, then any of them would do. Local, you can only pick one of the w okada ones due to having an AMD card.

quartz moon
#

hi
i ran this command
pw-loopback --playback-props='media.class=Audio/Source node.name=AI_Voice_Cable node.description="Virtual_Microphone"'

to make a virtual device but it doesnt appear in rvc(cloud)
im on linux cachyos
it does appear in discord

#

it also appears in other websites that use microphones so thats weird

#

it appears in input but not output?(even the monitors appear in input but not output)

#

nvm i used some other command and now it works but whenever i press start server it says select an audio input device

#

even tho i did select one

sacred mica
#

hey, I was wondering: if I want to do TTS, should I go with Fish s2? I'm currently running a gguf version of it on a 2080 Ti with 11 GB VRAM. I don't mind spending money on runpod or something to train models, but I'm trying to accurately create TTS voices with human emphasis, almost like voice acting
It's just that getting the prompt right for it seems impossible. Is that the best way to go about doing it?
also, is there a kind of reverse speech-to-text that can tell me which tags a given audio clip would fall under?
so i can learn wat i need to do
tyty

simple ore
#

for tts there are plenty, fast and decent - kokoro

#

if you need voice cloning there are other options

simple ore
#

google Whisper ASR

sacred mica
#

im trying to replicate a certain style w fish s2 model

#

so i was tryna see if

#

a stt model

#

could detect

#

the tags yk you can do

balmy cradle
#

If I wanted to use the ai model for a documentary/youtube channel sort of thing, which is the best to use? or just the best to use overall

tawdry shell
#

am wondering if the rtx 5070 is too new for most of the ai builds cuz rn am struggling to get the GPU to run a kernel for voice changing

rustic pumice
#

bruv most of the rvc models i want are from weights.gg

#

none of them are in this server

#

yes

proven hill
#

whats your goal

cyan spear
#

Im SO confused, im trying to get mmvc for my computer, I have an rtx 5070 so im trying to do the thing, everytime I extract the zip it says I

#

Ive tried extracting it to different places but it does the same thing or says

valid arch
#

I wanted to share my project in the project showcase. I know I'm just new here, but what do I need to do to get permission to share my project in the showcase forum?

valid arch
proven hill
valid arch
# proven hill chatting

Thanks. I think the template may not apply to my question either.

I’m not asking for technical support, troubleshooting, model setup help, GPU help, TTS, AI covers, roleplay assistance, or anything illegal/NSFW.

The project is a published Amazon maze book series created with a heavily AI-assisted human production pipeline: concepting, layout assistance, algorithmic maze generation, editing/proofing, cover/art workflow, and publishing prep.

#

🧘‍♂️

proven hill
#

what do you need it for? trolling? catfishing?

hardy yew
#

Smooth xd "if this purpose is against your rules, then I'm doing it for a different purpose"

#

Chill, I'm not even sharing my opinion whether that's OK or not, just pointing out it's quite funny

broken urchin
#

guys

#

im having some trouble

#

how can i make it so the virtual audio cable isnt heard through screenshare

#

whenever i share my screen on discord to someone and turn "Allow to hear sound"

sand iris
#

We still using RVC for voice conversion ? Nothing relevant appeared since ?

broken urchin
#

the Virtual Cable from the voice changer is heard through the desktop sounds

sand iris
#

damncat_seriously

shrewd whale
#

so I got an RVC folder in my PC but I forgot how to set it up

#

go-realtime-gui.bat or go-web.bat doesnt do anything

tawdry shell
#

hello! I tried installing uvr5 but it's been hanging every time I've tried to process anything on it(Like it would sit at 10% for 2hrs with no change while trying to harvest vocals). Any help would be appreciated!

GPU is rtx 5070ti, I don't plan on using the CPU to process anything.

shrewd whale
#

i reinstalled windows, i forgot how to set them up

hallow thistle
shrewd whale
#

last used RVC in 2024 iirc

shrewd whale
hallow thistle
shrewd whale
#

i think i just need to run the python stuff which i have forgot

shrewd whale
hardy yew
shrewd whale
#

i even have voice models already installed in the logs and weights folder

shrewd whale
shrewd whale
#

i use both before

#

i used to run them on 1060 3gb now im on 4060

tawdry shell
shrewd whale
shrewd whale
#

Apologies, I'm an older user of RVC, last used in 2024. I have just forgotten how to set it up, considering I reinstalled Windows, upgraded from Win10 to Win11, and haven't touched RVC again until now

hallow thistle
shrewd whale
#

I remember copying some code and running it with Python, I think, before downloading the stuff for RVC

hallow thistle
#

Nah, you should go for Vonovox if you mean solely realtime. Mangio RVC is outdated; Applio RVC is better as a non-realtime AI cover maker. Mahiro_stares

shrewd whale
#

Some of my voice models are custom trained too

shrewd whale
hallow thistle
#

Mate.

shrewd whale
#

Applio?

hallow thistle
#

Let's make things simple, ok? If your voice models are "RVC", then they should work in either Vonovox or Applio RVC anyway.

shrewd whale
#

and the rest of my downloaded ones are from the old ai hub server and some from weights.gg too

shrewd whale
shrewd whale
hallow thistle
#

-rvc

patent trellisBOT
shrewd whale
#

W-Okada or Vonovox for realtime, Applio RVC for non-realtime?

proven hill
shrewd whale
proven hill
#

what's your goal?

shrewd whale
west valley
#

Hi, where can I find the free TTS or voice cloning bot in this server to make a voice text-to-speech? Thanks!

void flume
shrewd whale
west valley
#

I want to upload a 14-second audio sample of a voice, and then type a text for the bot to speak it using that cloned voice (Text-to-Speech).

shrewd whale
#

Like is there any other way for me to use them besides realtime and voice conversion?

west valley
#

@Rumi Yes, exactly. I just want to clone a specific voice from a 14-second file and use Text-to-Speech. Is there a free bot inside the server that can do this now?

shrewd whale
#

beta 17 11 is the one to download as thats only from 3 months ago right? no need to download patch or fix?

void flume
#

depending on the use case, you may need different tools, etc

shrewd whale
#

i will also use the virtual audio cable

void flume
shrewd whale
#

Is it better to run the precompiled setup for nvidia gpu or the manual setup for vonovox?

void flume
#

precompiled is easier

shrewd whale
west valley
#

"Thanks for the honest answer, Rumi! I appreciate your time and help."

void flume
shrewd whale
void flume
shrewd whale
void flume
#

oh ok

#

yeah, sorry for asking. It's just that, because of widespread identity fraud, harassment, etc., it's basically against Discord's guidelines and our server rules. I'm just hoping people use it for normal reasons, and not those kinds.

shrewd whale
#

I suppose I can now delete my old RVC folder, but make a backup of my existing voice models, right?

#

and do i need to keep the files in both the logs and weights folders of my RVC?
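
For the backup question above, here is a minimal sketch. It assumes the usual RVC layout, where trained `.pth` files sit in the weights folder and `.index` files sit under logs (both folder names come from the messages above). The helper name `backup_models` and the example paths are made up for illustration, not part of any RVC tool.

```python
import shutil
from pathlib import Path


def backup_models(rvc_dir, backup_dir):
    """Copy every .pth and .index file under rvc_dir into backup_dir.

    The relative folder structure is kept so a model's .pth and .index
    can still be matched up later. Returns the list of copied files.
    """
    rvc_dir = Path(rvc_dir)
    backup_dir = Path(backup_dir)
    copied = []
    for src in sorted(rvc_dir.rglob("*")):
        if src.is_file() and src.suffix.lower() in {".pth", ".index"}:
            dest = backup_dir / src.relative_to(rvc_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # copy2 also keeps file timestamps
            copied.append(dest)
    return copied


# Hypothetical usage; keep the backup folder outside the RVC folder:
# backup_models(r"C:\RVC", r"D:\rvc_model_backup")
```

This copies every `.pth` it finds, which may include pretrained files and training checkpoints, so the backup can be larger than the voice models alone. Deleting the old folder only after checking the backup is the safer order.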

void flume
#

I never used vonovox, so idk.

shrewd whale
#

Okay, no worries! Ill just test them out

void flume
#

If you wait a little, someone else may be able to help out. (I should be checking the logs and such anyway)

hardy yew
shrewd whale
#

So I have run the start bat but it doesn't do anything? Or am I missing something?

#

I just downloaded the precompiled setup

hardy yew
#

probably need to wait a short moment at first startup

shrewd whale
#

Oh okay!

#

I dont need to install python or smth?

#

Ah, the GUI appears now

#

Thanks guys!

hardy yew
patent trellisBOT
#
AI HUB | Technical Support Desk
📋 Required Help Template

To receive assistance, you must provide your system details. Copy and paste the block below into your reply and fill it out.

⚠️ NO INFO = NO HELP

👉 Fill this form:
- Goal (e.g., TTS, AI Covers, Roleplay):
- Specific Issue:
- Full GPU Name:
- Operating System:
- Tutorial Link used:
📍 Quick Checklist Before Asking

Check Docs: Many fixes are in the AI Hub Docs.
Be Specific: Say "RTX 3060 12GB", not just "NVIDIA".
English Only: Keep all discussions in English.

⚖️ Community Guidelines

• No assistance for NSFW/Porn or ANY Illegal Activities.
• Read the [Full Guidelines](#1402790586028789830 message).

shrewd whale
proven hill
shrewd whale
proven hill
shrewd whale
shrewd whale
#

how would i know what to use in the embedder

#

The models that I'm using were made back then

#

also it seems that the pitch is always on 12

#

by default

hardy yew
#

and default pitch of 12 is just due to male->female conversion being the most common use case..

#

if you want a different pitch value, just change it
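
A small sketch of why 12 is a natural default, assuming the pitch setting is counted in semitones (as it commonly is in RVC tools): 12 semitones is one octave, which doubles the frequency. The function names are illustrative only.

```python
def semitones_to_ratio(semitones):
    """Frequency multiplier for a pitch shift in semitones (equal temperament)."""
    return 2.0 ** (semitones / 12.0)


def shifted_frequency(freq_hz, semitones):
    """Frequency in Hz after shifting freq_hz by the given number of semitones."""
    return freq_hz * semitones_to_ratio(semitones)


print(semitones_to_ratio(12))        # 2.0 (one octave up)
print(semitones_to_ratio(-12))       # 0.5 (one octave down)
print(shifted_frequency(110.0, 12))  # 220.0
```

For a male to male or female to female conversion, a value near 0 is the usual starting point, adjusted by ear from there.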

shrewd whale
#

that's like one of the earliest models back then

#

weights.gg isn't up anymore, it keeps prompting me to download replay

shrewd whale
proven hill
viral mason
#

Only place I know of since weights died

shrewd whale
viral mason
#

Neither

#

That's just how many times the AI went over the audio until the model sounded good

#

It's all up to listening and testing when training
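
To put a number on "how many times the AI went over the audio": one epoch is one full pass over the dataset. A rough sketch, assuming the dataset is cut into clips and processed in fixed-size batches; the clip count, batch size and epoch count below are invented for the example.

```python
import math


def total_training_steps(num_clips, batch_size, epochs):
    """Optimizer steps when every epoch walks the whole dataset once."""
    steps_per_epoch = math.ceil(num_clips / batch_size)
    return steps_per_epoch * epochs


print(total_training_steps(num_clips=200, batch_size=8, epochs=300))  # 7500
```

The count says nothing about quality on its own, which is why listening to test outputs along the way still decides when to stop.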

gusty moth
#

-colab

patent trellisBOT
# gusty moth -colab
📒 Google Colab Notebooks

Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

• **Applio**

by IA Hispano
Google Colab

• **UVR5 NO UI**

by Eddy
Google Colab

• **UVR5 UI**

by Eddy
Google Colab

• **Wokada Tg-Develop Fork**

by Tg-Develop
Google Colab

• **Wokada Deiteris Fork**

by Deiteris & Hina
Google Colab

• **RVC-AI-Cover-Maker-WebUI**

by Shiro & Eddy
Google Colab

• **FaceFusion UI**

by Nick088
Google Colab

• **FaceFusion NO UI**

by Nick088
Google Colab

• **Music Source Separation Training (Inference)**

by Jarredou & Makidanye
Google Colab

viral mason
shrewd whale
#

so some index files arent working? what do i do? just let the pth file do its thing?

#

warming up voice conversion takes awhile now?

#

i think my gui bugged out awhile ago i closed the app and then now its stuck on warming up

viral mason
#

Wdym some index files don't work?

viral mason
shrewd whale
viral mason
#

That's weird

#

All index files are the same besides the information that they contain

#

Shouldn't work any differently

tall bobcat
#

how to clone my own voice

viral mason
proven hill
viral mason
viral mason
# proven hill really?

though it has been fucked lately where it won't upload any of my models anymore automatically

#

and the quality of the outputs stinks

#

so there are major flaws

viral mason
proven hill
#

wow

viral mason
#

the discord server for it is very sad to look at, no helpers only one guy running it and he's super inactive

#

I dunno if he even runs the site tbh

proven hill
viral mason
#

at least we know how to help people make models and use the voice changer :D

#

we're not a total mess

proven hill
viral mason
shrewd whale
#

Why are some voices hard to match? Is it because of the mannerisms of the voice, or is it just that the pitch isn't matching (male to male)?

viral mason
shrewd whale
viral mason
#

true yea

nova isle
#

Dumb question but is custom pretrained in applio, gone or

viral mason
#

you may have glanced over it

nova isle
#

im using applio

viral mason
#

one sec lemme start it up

nova isle
#

alr

tacit ermine
#

I created a web GUI with Kaggle and ngrok; however, when I upload my first model I get this error: "No empty model slot available. Please clear a slot or manage existing ones." What should I do?

nova isle
#

nvm I saw it

#

brah

tacit ermine
viral mason
#

srry it took so long lol

viral mason
#

are u using applio? or maybe wokada tg fork?

#

such little info is confusing

tacit ermine
#

I cant upload screenshots but after I spun up the ngrok server i went to the url. I continued with the tut and clicked the plus on the left to import model and i entered the pth file in model file

#

But when i click save, it gives the error above "No empty model slot available. Please clear a slot or manage existing ones. "

viral mason
#

I'll assume you're using wokada tg fork then

#

what gpu does your pc have?

tacit ermine
tacit ermine
#

all the details

viral mason
#

yes but if you have a decent gpu it would be better to use Local instead

tacit ermine
#

my gpu is not decent lol

viral mason
#

I'm not familiar with the cloud versions of the voice changers btw so I can't really help here

#

<@&1159293204038955078> give this man some help

nova isle
# viral mason

how do I fix this

Pretrained model sample rate (40000 Hz) does not match dataset audio sample rate (32000 Hz).
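
One way to see what the dataset side of that message looks like is to read the sample rate out of the audio files directly. This is a minimal sketch using only Python's standard `wave` module, so it assumes plain PCM `.wav` files in a single folder. `dataset_sample_rates` and `matches_pretrained` are illustrative names, not Applio functions.

```python
import wave
from collections import Counter
from pathlib import Path


def dataset_sample_rates(dataset_dir):
    """Count how many .wav files in dataset_dir use each sample rate (Hz)."""
    rates = Counter()
    for path in sorted(Path(dataset_dir).glob("*.wav")):
        with wave.open(str(path), "rb") as wav_file:
            rates[wav_file.getframerate()] += 1
    return rates


def matches_pretrained(dataset_dir, pretrained_rate_hz):
    """True if the dataset has .wav files and all of them use pretrained_rate_hz."""
    rates = dataset_sample_rates(dataset_dir)
    return bool(rates) and set(rates) == {pretrained_rate_hz}
```

If the files really are 32000 Hz, the mismatch is more likely on the pretrained side, for example a 40k pretrained file selected by mistake.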

void flume
nova isle
#

My pretrained model is 32K bra

viral mason
nova isle
#

yeah

viral mason
#

hmm

nova isle
#

its 32000 Hz

viral mason
#

and you selected 32000 in applio?

nova isle
#

yes

void flume
#

oh ok, then it's fine. (sorry for the question -- just has to do with the rules)

tacit ermine
#

i wish i could upload screenshots but to describe it with text when i click upload model at the top right it says no empty model slot

viral mason
nova isle
viral mason
#

hmm

dense drift
#

How to make ai covers on mobile?

#

-rvc

patent trellisBOT
viral mason
dense drift
dense drift
viral mason
proven hill
dense drift
proven hill
dense drift
nova isle
proven hill
nova isle
#

wait for?

viral mason
#

someone who knows how to fix your issue

analog obsidian
analog obsidian
dense drift
proven hill
dense drift
proven hill
nova isle
#

oh wait wrong one

proven hill
nova isle
#

sorry

#

😭

proven hill
#

oh it was the download one lmao my bad

nova isle
analog obsidian
#

then click train and see what happens

nova isle
#

it works now

#

k ill sleep while training this bai

proven hill
#

not even a thanks misc_sob

nova isle
dawn gazelle
#

Well as I asked before. If I make an AI character voice model. And use it to make content without selling anything. Would I be safe?

tacit ermine
dawn gazelle
#
  • Goal:TTS
  • Specific Issue: Using character voice models. I wanna see if it's safe to make a model of a character and use it for content. I don't plan on selling anything.
  • Full GPU Name: NVIDIA GeForce RTX 3050 Ti Laptop GPU
  • Operating System: Windows 11
  • Tutorial Link used:
dawn gazelle
proven hill
#

you have bad intentions?

dawn gazelle
dawn gazelle
proven hill
#

use applio

dawn gazelle
dawn gazelle
proven hill
dawn gazelle
#

Sure it takes some space but still.

proven hill
tacit ermine
#

My cloud version is working, the only issue is the virtual cable. I get nothing audio-wise from it. When I switch the output to speaker, I hear the audio. When I switch it to my virtual cable to use it, the audio dies. So it's a virtual cable issue, does anyone have a fix?

unborn goblet
#

Can I improve the results I get from AI by speaking first-order predicate calculus instead of natural language?

patent trellisBOT
# proven hill -realtime
🔊 Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Vonovox

A Realtime Voice Changer with similar performance to Wokada Tg-Develop Fork, with extra features. It supports only Nvidia GPUs on Windows 10/11, unlike other options that have wider support, and it has no cloud options.

• Wokada Tg-Develop Fork

A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project.

• Applio Realtime

A Realtime Voice Changer with similar performance to Vonovox & Wokada Tg-Develop Fork, with extra features.

• Wokada Deiteris Fork

Deiteris' fork (modified version) of wokada that doesn't get updates anymore.

⛔ Outdated/Discouraged

These options are not recommended for use.

• Original Wokada

Not suggested, older versions in youtube tuts are even way worse. GUIDE

• RVC GUI Mainline Realtime

The program is worse compared to the ones above, and much less updated. GUIDE

proven hill
#

use tg developed okada

unborn goblet
unborn goblet
#

Okay. Thanks for your help anyway. I'll wait for another responder

unborn goblet
#

You're not allowed to use AI to cause harm to your friends in the form of trolling

#

That's a harmful behavior to troll

#

Well we still have to obey ethical principles

#

But you can use it for having fun with your friends. Or joking to your friends

#

Is that what you need it for?

#

Okay. I'm not a specialist in that area. Is there anyone who can help Zack

void flume
#

I need to process my disappointment again.

shrewd whale
#

another question, what should i use for training voices?

clever gull
#

Hey everyone, I'm 15 and I’ve been building with local LLMs lately. I got frustrated with coding agents leaving messy // TODO stubs and structural duplicates, so I built an AST-based CLI that hooks into Ollama to auto-patch that debt. I'd love to get some feedback from people who know more about LLM-integration than I do.
​Here's the repo: https://github.com/zenapta/BloatHunter

shrewd whale
#

is GTX 1660 applicable for the older nvidia gpu under vonovox like the extra steps for gtx 10 series?

nova isle
rotund dew
#

Hello! When trying to do inference this happens:

Exception in callback _ProactorBasePipeTransport._call_connection_lost(None)
handle: <Handle _ProactorBasePipeTransport._call_connection_lost(None)>
Traceback (most recent call last):
File "C:\Applio\env\Lib\asyncio\events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "C:\Applio\env\Lib\asyncio\proactor_events.py", line 165, in _call_connection_lost
self._sock.shutdown(socket.SHUT_RDWR)
ConnectionResetError: [WinError 10054]

rotund dew
nocturne field
#

Hi my gpu is 1660 super.
I'm using the RVC Deiteris fork
windows 11
Goal : roleplay

I'm getting my voice back with lag, and the other person can hear my audio, but lagged/echoed and in a robotic form...

Need help in it.

nocturne field
proven hill
nocturne field
#

no like theres a charac nezukos thats one

nocturne field
void flume
#

Oh ok

proven hill
#

no donations

#

what is your gpu?

#

whats your goal? catfish? trolling?

#

not allowed, we will not help, @void flume <@&1159293140440723499>

#

for trans people and roleplay

#

@viral mason this convo might interest you

viral mason
#

<@&1159293140440723499>

#

keep an eye for anyone weird joining

viral mason
void flume
#

I have role mentions enabled by the way

proven hill
proven hill
viral mason
proven hill
brisk trail
#

Goal : Roleplay
Specific Issue: Choppy, distorted and robotic voice

my audio keeps sounding choppy and distorted, and I'm pretty new to vonovox any help would be appreciated here's my settings my CPU is also Core i5 13400K
and idk what to do I want the most quality AND the least delay that I can get, what should I change?

broken urchin
#

yo

#

when will realtime voice changer become perfectly realistic with real emotions

#

i mean like laughing etc

#

i can't wait for that for real

hardy yew
#

not exactly, there are some people who do tons of experiments and put in lots of effort to improve it further

#

obviously probably still far from any breakthroughs

void flume
#

I see

void flume
broken urchin
#

what

#

so it won't progress anymore?

#

i don't believe you at all lmao

void flume
#

maybe I should have removed the post. According to capybar there is progress

hardy yew
#

I mean, I just clarified that there are some people who try to improve it. But RVC has been more or less the same for years now, so it's not like any changes are likely to happen in the near future

#

That's what I meant, sorry if the first comment was misleading

brisk trail
#

the character HUNK y'know

void flume
#

yeah

brisk trail
#

that one, i don't do erp at all so yeah 😃

void flume
#

sorry for the question. I'm glad it's something normal 🙂

brisk trail
#

that's completely fine friend! you're doing your job and I deeply respect that!

void flume
# brisk trail Goal : Roleplay Specific Issue: Choppy, distorted and robotic voice - Full GPU ...

Yeah, idk how to resolve the choppiness, but delay is mostly determined by Block Size. You want to make that as small as possible, but not too small (you'll notice when your resource usage goes up too much or it starts to get truly choppy). You can increase the Extra (ai-hub generally recommends 2.0) for better quality. I'm not too sure on vonovox to be honest, but I'm sure someone with more knowledge on it may pop up online soon
Anyway, also try to adjust the crossfade to be somewhere between 0.08 and 0.15 (higher is better quality-wise)
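
A rough sketch of the trade-off described above, assuming Block Size and crossfade are given in seconds and the audio device runs at 48000 Hz (both are assumptions, check what your client actually shows). The helper names are made up for illustration.

```python
def block_samples(block_seconds, sample_rate_hz):
    """Number of audio samples in one processing block."""
    return round(block_seconds * sample_rate_hz)


def minimum_delay_ms(block_seconds):
    """Lower bound on added delay: a block must be fully recorded before it
    can be converted. Real delay is higher (model inference, buffering)."""
    return block_seconds * 1000.0


def clamp_crossfade(crossfade_seconds, low=0.08, high=0.15):
    """Keep the crossfade inside the 0.08 to 0.15 range suggested above."""
    return max(low, min(high, crossfade_seconds))


print(block_samples(0.25, 48000))  # 12000
print(minimum_delay_ms(0.25))      # 250.0
print(clamp_crossfade(0.5))        # 0.15
```

Halving the block size halves that lower bound, but the GPU then has to finish each conversion in half the time, which is where the choppiness starts if it cannot keep up.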

#

The audio sounds more or less like it has been 'pushed up' to a limiter; which may be normal for the model, no clue.
the micro choppiness might be due to an aggressive noise filter

#

I recommend trying to see if it happens without any noise filter enabled, and also disable your noise gate.

analog obsidian
#

dr recommends using the default block size value of 0.3

dim wharf
#

yo i'm not much updated by how often okada devs update but anyone knows when is it gonna release support for rtx 50 series? mine is 5050

hardy yew
#

realtime or offline inference/training?
(anyway, in both cases RTX 50x0 are supported)

dim wharf
#

wait what? it didn't work for me though

hardy yew
#

realtime -> Vonovox/tg-develop w-okada fork
offline/training -> Applio

dim wharf
#

is there a guide how to install anywhere? i might have installed it the wrong way then

hardy yew
#

which one?

dim wharf
#

ill check rq

hardy yew
#

i mean, what purpose

dim wharf
#

prob realtime

hardy yew
dim wharf
#

alr ty

hardy yew
#

just download, extract, then just run start.bat, no setup needed

dim wharf
#

tysm bro

hardy yew
#

(except for the index fix as mentioned above)

dim wharf
#

W support

frigid ore
#

Hey all, I just installed hermet with codex and it keeps missing the first byte? Claude was also giving me grief

#

Sorry, Hermes*

#
  • Goal (e.g., TTS, AI Covers, Roleplay): I just wanna get Hermes working reliably on my T480
  • Specific Issue: Keeps missing the first byte?
  • Full GPU Name: Not relevant, it's the top model of the Lenovo T480 laptop
  • Operating System: Ubuntu
  • Tutorial Link used: Not relevant, haven't used one
#

File "/t...]
⚠️ No first byte from provider in 45s (codex stream, model: gpt-5.4). Reconnecting.
⚠️ API call failed (attempt 1/3): APIConnectionError
🔌 Provider: openai-codex Model: gpt-5.4
🌐 Endpoint: https://chatgpt.com/backend-api/codex
📝 Error: Connection error.
⏳ Retrying in 2.6s (attempt 1/3)...

manic hearth
#

hey can anyone help me pls

#

i got an ai conversion model

#

and it says

#

tuple index out of range

#

any idea on how to fix this

tall bobcat
#

is there a free software that can clean audio? bcs all these websites want subscriptions

proven hill
#

but you need a good gpu

#

what gpu do you have

proven hill
manic hearth
#

it was the one quinton used

proven hill
manic hearth
proven hill
manic hearth
#

They sound opposite and very robotic / static

manic hearth
#

Never knew that was a thing

proven hill
#

roleplay? whatever

manic hearth
proven hill
manic hearth
#

How do I get it working then

#

Also preciate the help

#

I been stuck on figuring this out

proven hill
manic hearth
#

Ty

#

Goat

latent kraken
#

heyo,
you guys know any other ai voice websites that isnt weights.GG

because I lowkey am struggling

grave light
latent kraken
#

i am on task manager

proven hill
latent kraken
#

I have an NVIDIA GeForce GT 1030

#

and if apps don't work then are there any websites i can use that will allow voice cloning and isnt a paid membership because i am broke

latent kraken
proven hill
proven hill