#✨│ai-help


simple ore
#

it has onnx in its name

tight bane
#

How can I look for a voice model?

forest vector
#

anyone know how to fix this ?
2025-07-03 22:02:27,643 ERROR [VoiceChangerManager] CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

running deitris' w-okada voice changer, on a gtx 1070.
happens at a random point in time, after 2-7 mins.

simple ore
#

running out of VRAM most likely
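
To narrow it down, the error text itself suggests CUDA_LAUNCH_BLOCKING=1. Below is a minimal sketch of starting a program with that variable set, so the stack trace points at the call that actually failed. The helper names are made up for illustration, and the executable path in the comment is an assumption; adjust it to your install.

```python
import os
import subprocess


def build_debug_env(base_env):
    """Return a copy of base_env with synchronous CUDA launches enabled."""
    env = dict(base_env)
    # With CUDA_LAUNCH_BLOCKING=1 every kernel launch waits for completion,
    # so errors are reported at the launch that caused them (at a speed cost).
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking_launches(command):
    """Run command (a list of strings) with the debug environment."""
    return subprocess.run(command, env=build_debug_env(os.environ)).returncode


# Hypothetical usage, assuming the server executable sits in the current folder:
# run_with_blocking_launches(["MMVCServerSIO.exe"])
```

Expect noticeably higher latency while this is on, so it is only useful for reproducing the crash, not for normal use.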

forest vector
#

I thought so too, but according to Task Manager it doesn't use more than 3 of the 8 GB of dedicated memory

#

i feel like it might be inhibited by something, but can't tell why or how

simple ore
#

anything else running with hardware acceleration? discord / browser / some 3d game?

forest vector
#

discord, 1 browser with a video, and voicemeeter

simple ore
#

any overclocking/undervolting?

forest vector
#

none

uncut eagle
#

can i get help for voice changer ?

outer frigate
#

i want to thank every one who helped me it works

#

@simple ore @hallow thistle

steady otter
#

I used voice.ai it was good enough for what I needed

past juniper
#

do i just leave the embedder on default (hubert_base_112) i also see contentvec and whisper?

#

I have a 4060 and a AMD Ryzen 9 7950X3D

kindred fern
#

My mic is picking up the audio but it says "pipeline is not initliazied"

keen coyote
#

any specific settings that make a huge difference in the voice model? can't tell if the model's bad or if it's my settings

smoky solstice
#

Trying to run Deiteris' W-Okada on an M1 MacBook Pro and getting the following error, even after applying the proposed fix of running "xattr -dr com.apple.quarantine". On Sequoia 15.2. Anyone have any ideas on what the issue is?

simple ore
#

I think

#

"
This attribute is added so that it can ask for user confirmation the first time the downloaded program is run, to help stop malware. Upon confirmation the attribute should be removed automatically, and then the program will run normally.
"

smoky solstice
simple ore
#

okay

#

but you did not post anything else

smoky solstice
#

I think the issue might be an outdated macOS version, as a previous person has posted here, but I'll post an update after I've updated.

smoky solstice
#

yeah that didn't work i fear 😞

opaque scarab
#

-colab

patent trellisBOT
# opaque scarab -colab
📒 Google Colab Notebooks

Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

• **Applio**

by IA Hispano
Google Colab

• **RVC Mainline**

by Hina
Google Colab

• **UVR5 NO UI**

by Eddy
Google Colab

• **UVR5 UI**

by Eddy
Google Colab

• **Wokada Deiteris Fork**

by Deiteris & Hina
Google Colab

• **Hina's Modified Original Wokada**
• **RVC-AI-Cover-Maker-WebUI**

by Shiro & Eddy
Google Colab

• **FaceFusion UI**

by Nick088
Google Colab

• **FaceFusion NO UI**

by Nick088
Google Colab

• **Music Source Separation Training (Inference)**

by Jarredou & Makidanye
Google Colab

knotty moth
#
  • extra and "force fp32" in advanced settings may give a fine-grained quality improvement, but don't expect much if the model itself doesn't sound good
#

check some models in mvsep.com like some multi stem SCnet & BS roformer, then the drumsep models

placid holly
#

do i need to download those 3 files (zip) for the RTX 5000 W-Okada?

i only know 7z usually uses 01 02 03, but i don't know about zip

simple ore
placid holly
viral mason
#

it's peak

fleet cedar
#

how do i install the voice changer

#

i use an rtx 5070

edgy topaz
#

How do I make an AI voice unnoticeable? Not in real time, I mean recorded in Weights

simple ore
edgy topaz
viral mason
#

anything else :3

reef yacht
#

can someone tell me why the voice changer is super delayed?

hasty gust
#

I had realtime voice changer client for 2 years now. is there a new update or a new client??

viral mason
stoic bronze
#

Hey, I’m using AMD and I’m wondering what I should be downloading. VCC is really acting slow on my pc and I don’t know how to fix this - it lags

#

Previously I downloaded a light and working VCC but I can’t remember where it’s at

fleet cedar
simple ore
#

Download all 3 files, then extract the .zip file; it will automatically combine ALL 3 FILES into one folder. Then open the MMVCServerSIO folder and run MMVCServerSIO.exe (shown as MMVCServerSIO if you have file extensions hidden).

viral mason
edgy topaz
knotty moth
edgy topaz
#

So Like I use the voice in weights

knotty moth
edgy topaz
#

So what else can I do to make it unnoticeable that it's AI, in a recording or anything?

knotty moth
knotty moth
#

there are some post processing effects you could do

#

so try searching it

naive glen
#

Hello, sorry for bothering you, and excuse my English, it is not my first language. I have a question about how LoRAs are made, possibly with Colab as an alternative, because I tried to install kohya locally using Pinokio. The install worked, but an error occurs: I set all the parameters and all the folders needed to create a LoRA, yet it tells me that the folder where the images go has not been found, or that the path does not exist. I checked in a thousand ways that the folder exists, and it does, but I don't know why it isn't picked up. So I don't know if anyone knows how to fix that error. Failing that, I don't know how to make LoRAs in Colab, because I couldn't find updated links or links that currently work; all the ones I found gave me an error or something like that. I would like to know if someone could help me or knows something.

edgy topaz
#

YASSSSSSSS

earnest forge
blissful lily
steady otter
blissful lily
pliant lily
#

Can someone help me? whenever I have to mix two models, I get an error

pliant lily
#

but I tried, with two 48k models, I tested with several models

#

Can you send me links to some that work?

#

I've already tried running it locally and via Google Colab

knotty moth
#

so try using mainline rvc

pliant lily
pliant lily
#

gg

hallow loom
#

'VoiceChanger' object has no attribute 'resampler_in' what does this mean?

wanton lion
#

every time i try to launch the start_http.bat file it crashes

hearty dome
#

guys, which ai voice changer is good? most i've seen are 1 year old, are there any up to date ones?

low shard
#

start_http.bat is a part of original wokada, ALL VIDEO TUTORIALS ARE OLD, DON'T TRUST THEM, wokada deiteris fork is better

low shard
low shard
glacial charm
#

Hello i just want to make a pokemon song but i want to change the lyrics how can i do that

jaunty marten
simple ore
low shard
# jaunty marten Hello, would it be possible to make a voice like this? https://youtu.be/JupFhvq3...

You can search rvc ai voice models at:

if there isnt one, you can:

earnest muskBOT
edgy topaz
#

What's the name?

viral mason
edgy topaz
#

I am buying a computer soon

viral mason
viral mason
edgy topaz
#

Yes

viral mason
#

Yeah you're pretty limited on options until u get the computer

viral mason
#

I don't know of any websites besides weights to record your voice and have it output as an ai voice model

#

Sorry.-.

#

Maybe some helpers or mods know tho

edgy topaz
#

Oh is okay

viral mason
#

Or some QC (idk what it stands for)

analog obsidian
viral mason
#

They're smart tho

#

Like uh

#

Noobies

#

Idk if they're even a QC lol

analog obsidian
viral mason
#

Red or Blu side

#

What's his load out

fair oasis
#

guys may i ask: my regular mic works fine but my virtual cable mic somehow has my computer audio bleeding into it. it's been messing up my voice ai as a result, what can I do to fix it?

solid sequoia
#

i have installed a model from #1175430844685484042, but no matter what i do the model won't use my rx 6600 xt and uses my cpu instead, which kills the performance a lot. any way i can fix this?

toxic zenith
#

How can I make a custom decadriver sound?

#

Like the announcer

low shard
low shard
#

what's ur pc gpu?

solid sequoia
low shard
#

!give-media-perms 1h @solid sequoia

solid sequoia
#

and for the tutorial link, are you referring to youtube links?

low shard
solid sequoia
#

realtime voice changer for calls and ingame voice chat

low shard
#

there are many different ai programs

#

RVC = Retrieval-based-Voice-Conversion, the best Few Shots Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models. Technically, Mainline RVC does have a go-realtime.bat (aka RVC-GUI), but it's pretty messy and outdated so it's extremely not suggested for realtime.

Wokada = uses RVC for realtime inference. There's 2 main versions, Original made by Wok, and the most suggested one is Deiteris Fork (modified version)

#

do you need wokada or rvc?

low shard
solid sequoia
low shard
#

that tutorial uses an over a year old original wokada lmfao

#

I wrote it in the comments

#

you just wasted time using that tutorial basically

#

delete the program, and delete vb audio cable too

solid sequoia
#

it seemed to work perfectly fine though, i just need it to use gpu instead of cpu

solid sequoia
#

which link can i find a newer version on

low shard
#

plus that version sucks for amd

low shard
patent trellisBOT
# low shard -realtime
💻 Local Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Wokada Deiteris Fork

Most suggested. GUIDE

• Original Wokada

ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE

• **RVC GUI Mainline Realtime**

Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated

low shard
#

read the 1st link

solid sequoia
#

alright thank you

low shard
#

wokada deiteris fork

low shard
solid sequoia
# low shard let me know

it works so much better now, low cpu and gpu usage and more responsive as well, thank you. 1 last question though: my "echo", "sup1" and "sup2" settings seem to be disabled, and i couldn't find anything about it in the audio setup page you sent me

pine star
#

youtube just keeps saying "Audio renderer error. Please restart your computer"

crude flame
pine star
#

yeah it works whenever i dont have voicemeeter potato on

#

im pretty sure that i followed every step it told me

crude flame
#

Try clicking a1 on both the things in voicemeeter

pine star
#

like this?

#

i clicked on a1 for voicemeeter input and aux1

crude flame
#

And in the first one if that doesn't work

pine star
#

it still doesn't work

#

i have light host on too, idk if thats the issue

crude flame
#

Huh that usually fixes it

pine star
#

should i remove the b1 and mono in stereo input 1?

crude flame
#

Try turning off the denoiser you have on in the first column

pine star
#

wait it picks up sound

#

but youtube is still broken lol

crude flame
#

So it's picking up sounds just not outputting them?

pine star
#

yeah

#

i have my system input and output set to my headset

crude flame
#

In hardware out a1 you have that set to your headphones right

pine star
#

yeee

#

wait how does the input work?

#

cuz when i talk it doesn't input anything

#

i dont see any instructions to put my headset mic input anywhere

crude flame
#

Not in voicemeeter

#

In w-okada you use your mic input

pine star
#

oh so the wokada output will be the line 1 (virtual cable)?

crude flame
#

Then the output gets put into voicemeeter then into light host then outputs into discord or whatever

pine star
#

Like this?

#

"Once you have completed all of the above steps you can now go into anything and set the mic input to "Voicemeeter Out B2"."

crude flame
#

No

#

Input is your headphones

#

Wait

pine star
#

oh i got confused with the instructions

crude flame
#

I'm confusing myself lol

pine star
#

this is where i got it from

crude flame
#

So input is your mic output is line 1 and monitor you can leave empty

#

I'm going to redo that guide when I get home

pine star
pine star
#

so people know what they are actually changing

#

maybe have a bracket next to the instructions saying what each one does

viral mason
#

are these still the best settings for the t-de-esser 2

pine star
#

@crude flame just realised that when i downloaded virtual cable lite it automatically set everything to line 1 and that's why everything is breaking lol

#

apparently on windows 11, there's system -> sound input output

#

and there's also system -> sound -> volume mixer input output

viral mason
#

that's confusing

pine star
#

yeah

#

so many inputs and outputs

carmine mica
#

program to use the templates?

low shard
low shard
# carmine mica program to use the templates?

This is a General AI Server, we won't be focused on voices anymore

Elaborate:

  • your PC GPU
  • your operating system
  • what you want to do
  • what tutorial link are you using
  • a screenshot of the program
carmine mica
#

for voice models to create TTS with rvc

swift thunder
steady otter
swift thunder
#

like that of Mvsep

steady otter
swift thunder
brittle wing
#

Hey i have a question, is there an AI tool for generating subtitles? I want to show something to my friend but he doesn't understand it cuz it's in my native language and not english

crude flame
# pine star yeah

updated the guide anime_pray also found a way to not route system audio to voicemeeter so you dont have to deal with missing audio

i just tested it on a fresh version of voicemeeter so it should work

simple ore
#

depending on the quality of the audio and language you may get something decent... or not

#

you can upload the audio to youtube and let it make a transcript

simple ore
#

norton antivirus?

fleet cedar
#

i don't have any antivirus

simple ore
#

weird

brisk grove
#

@simple ore what do u recommend for me to use for the ai voice in games?

fleet cedar
#

bitdefender

#

help

fleet cedar
simple ore
fleet cedar
simple ore
#

you need to download all 3 files, use 7-zip to unzip

#

dont use windows BS

#

it is the worst

fleet cedar
#

can i use winrar

brisk grove
#

what do i download for the voice changer?

simple ore
#

winrar should be able to handle the split archive too

#

or winzip

fleet cedar
#

ill get winrar

#

or 7zip

#

@simple ore now it works after i installed 7zip

brittle wing
#

how do i fix

#

this

brisk grove
#

@fleet cedar what do u have that u put the voice to?

brisk grove
#

to be able to use the voice

#

like voicemod

fleet cedar
#

@simple ore can u give me the right settings for my GPU

simple ore
brisk grove
fleet cedar
#

idk

fleet cedar
#

its obviously the AI rvc voice

brittle wing
brisk grove
#

@fleet cedar

simple ore
ionic oak
#

Hello, I don’t know if this is the right place to ask, but I don’t know how to change my voice in real time. Which app should I use? Because I have several models that people sent me, but I don’t know how or where to use them?

simple ore
#

-rt

patent trellisBOT
# simple ore -rt
💻 Local Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Wokada Deiteris Fork

Most suggested. GUIDE

• Original Wokada

ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE

• **RVC GUI Mainline Realtime**

Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated

brisk grove
#

@fleet cedar alr i got it set up, now how can i get it to actually work in game and in discord?

amber verge
#

some reason it doesn't open

dapper rain
#

Is this channel for dedicated RVC assistance?

low shard
low shard
# amber verge some reason it doesn't open

This is a General AI Server, we won't be focused on voices anymore

Elaborate:

  • your PC GPU
  • your operating system
  • what you want to do
  • what tutorial link are you using
  • a screenshot of the program
dapper rain
#

Gotcha but you guys will still keep up RVC troubleshooting support still?

low shard
#

if you need help, pls elaborate

plain pumice
#

Yo

#

My voice changer is stuttering

#

It sounds so bad

#

Its like tweaking

#

My words are transferring but it's static

low shard
# plain pumice My voice changer is stuttering

This is a General AI Server, we won't be focused on voices anymore

Elaborate:

  • your PC GPU
  • your operating system
  • what you want to do
  • what tutorial link are you using
  • a screenshot of the program
plain pumice
#

Oh

low shard
# plain pumice Oh

voice changer is too generic, there could be over 100 programs classified like that lol

#

you need to elaborate more to get help, else we dunno even how to help

plain pumice
#

XD

#

I cant screenshot here

#

Can I just dm?

low shard
#

!give-media-perms 1h @plain pumice

low shard
#

also it would be better u elaborate everything

#

all the infos i asked are crucial

plain pumice
#

So basically

low shard
plain pumice
#

Im trying to use a voice changer on

#

Yes

low shard
#

you're using an over a year old version of original wokada

#

its the same as using windows xp in 2050 basically

plain pumice
#

LOL

low shard
#

also vb audio cable has been reported to cause issues on windows

low shard
plain pumice
#

How do I still use it

low shard
#

you can simply uninstall everything and forget you even watched it basically

low shard
#

you need a better one

plain pumice
#

Like?

#

What's the new wokada

low shard
plain pumice
#

Gpu?

low shard
plain pumice
#

Intel core i5

low shard
#

cpu = central processing unit

plain pumice
#

Oh

low shard
#

gpu = graphics processing unit

#

this isn't chatgpt, it runs on your hardware and its way more intensive and complex

plain pumice
#

Oh

low shard
#

gpu does all the complex tasks like gaming, 3d, and AI

low shard
# plain pumice Oh

You can check your pc gpu on Windows via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU

#

so yeah, don't expect a 1 click experience, AI is not really user friendly

#

chatgpt runs on everything bc it runs on cloud, remote good pc

#

while this is a program that runs on your own hardware, not on powerful hardware owned by someone else

plain pumice
#

Intel (r) uhd graphics

low shard
#

check gpu 1 and gpu 0

#

the one you mentioned is integrated graphics, it's literally too weak to do any type of AI and to even get recognized as a GPU lol

plain pumice
#

Oh

low shard
#

don't expect AI to run on bad hardware, it's more intensive than gaming

low shard
plain pumice
#

How do I check?

low shard
#

you should have gpu 0 and gpu 1 maybe

plain pumice
#

Alr

low shard
plain pumice
#

Ok

low shard
plain pumice
#

I know I may be wasting ur time

#

(I definitely am but not on purpose)

#

I'm just, if you can call it that, a caveman when it comes to checking your pc and all that stuff

#

@low shard

low shard
# plain pumice

yep you don't got any other GPUs

You got 3 options:

About Cloud, there are different services:

#

so yeah that's why it was laggy, other than being an old version with worse performance, you also got bad hardware, so not a good combo

#

the best options are just either using cloud or buying a better pc

plain pumice
#

How do I use cloud?

low shard
#

reminder that it's more complex and you got limited free time btw

#

and you also need to give your phone number, as it's a google service and they dont want you to use alt accs

plain pumice
#

Uj

#

Ik

#

Google is always like that

#

Like a really secure software

low shard
#

This is a General AI Server, we won't be focused on voices anymore

Elaborate:

  • your PC GPU
  • your operating system
  • what you want to do
  • what tutorial link are you using
  • a screenshot of the program
#

rvc and wokada do 2 different things

#

RVC = Retrieval-based-Voice-Conversion, the best Few Shots Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models. Technically, Mainline RVC does have a go-realtime.bat (aka RVC-GUI), but it's pretty messy and outdated so it's extremely not suggested for realtime.

Wokada = uses RVC for realtime inference. There's 2 main versions, Original made by Wok, and the most suggested one is Deiteris Fork (modified version)

#

they do not want u to use alt accs lol

exotic elm
low shard
#

-realtime

patent trellisBOT
# low shard -realtime
💻 Local Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Wokada Deiteris Fork

Most suggested. GUIDE

• Original Wokada

ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE

• **RVC GUI Mainline Realtime**

Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated

low shard
#

1st link, wokada deiteris fork

long obsidian
#

Anyone know realistic ai photo generator which can use your face to make a picture

toxic zenith
# low shard what? u mean model?

yeah, I mean custom text to speech, I see decade driver models here but i don't know how to convert text to speech into my style

long obsidian
#

Thank youu

long obsidian
#

And can i feed it data to create pictures from it

runic thunder
#

hii

#

how can I create an ai voice?

long obsidian
#

@simple ore and also which out of these do u think is the best: omnigen2/flux/SDXL

raven barn
#

literally have no idea how to train with a 5070

#

it just doesnt wanna work

toxic zenith
#

how can I make my own text to speech?

knotty moth
#

and do manual install with latest pytorch (2.7) and cuda 12.8

#

the original rvc and even mangio won't work at all

raven barn
#

i did that

#

like so many times

#

i dont know what i did wrong

knotty moth
#

if not the latest release

#

then just double click run-install.bat

#

it should include torch 2.7.1 which is needed for RTX 50-series

raven barn
#

okay ill try this

long obsidian
#

@knotty moth can u check dms for a sec i asked u a question there i can send it here too if u want

long obsidian
#

I saw flux and omnigen2 are good ones but which would you recommend me

raven barn
#

just incase i may have

raven barn
#

i think i found it

toxic zenith
#

where can I train AI voice using google colab

long obsidian
#

@knotty moth so what do you think

raven barn
#

currently trying it rn

#

downloaded pytorch 2.7 and cuda 12.8

#

didnt do anything

#

proof

#

idk but im tempted on just getting out my 4070 and using that

#

cuz ik it will work

#

is it more or less the same thing?

#

ill look into it when i wake up

#

been trying to get this to work for ab 10 hours

#

rvc doesnt work

#

applio doesn't work

#

i hope but thxsob_pray

simple ore
simple ore
long obsidian
simple ore
simple ore
raven barn
raven barn
#

I got that line straight from their website

#

Even installed cuda 12.8 or wtv

simple ore
#

i require a screenshot of how you did it

#

because there are 20+ who said it works fine

raven barn
#

The only version of Python I could get to work with that version of PyTorch was 3.11.9

silver lynx
#

hey is there still like a list of what settings to use with which gpus

raven barn
silver lynx
knotty moth
#

you were trying on 2.7.0

raven barn
#

Oh

knotty moth
# raven barn Oh

if you ran it through run-install.bat it should have installed torch 2.7.1

raven barn
#

I’ll look at it again when I wake up later

simple ore
#

RVC1006Nvidia requires a small manual fix

low shard
# runic thunder how can I create an ai voice?

This is a General AI Server, we won't be focused on voices anymore

Elaborate:

  • your PC GPU
  • your operating system
  • what you want to do
  • what tutorial link are you using
  • a screenshot of the program
hybrid flame
#

Hello, can you please advise if there is an up to date guide to install and use RVC on a pc with AMD graphics card (7800xt) for real time voice changing? Thanks

limber siren
#

Hi, did Weights remove the option to log in with github?

hybrid flame
#

I don't understand, since I'm not a developer, I'm just here to use it.

hybrid flame
willow cipher
#

why can't echo and suppression be turned on?

#

nvm figured out

long obsidian
#

@simple ore hi, i've got a question about the installation. my gpu is an nvidia 5060ti and my cuda version is 12.9; on the pytorch site the newest one is 12.8, but flash attention doesn't support pytorch 2.6.0 with this cuda version. what should i do? Should I download the 12.4 one and the standard flash attention? (this is about omnigen2.) will it work like that despite my gpu being a newer generation?

simple ore
#

as long as you have torch cu128 either 2.7.0 or 2.7.1 it is fine

#

download cp version that matches your python install (3.10, 3.11, 3.12)

long obsidian
#

@simple ore so i need to install with this: pip install torch==2.7.1+cu128 --index-url https://download.pytorch.org/whl/cu128 and install one of the flash attn versions you've listed, the one corresponding to my python (3.11)?

simple ore
#

you install torch and torchvision: pip install torch torchvision --upgrade --index-url https://download.pytorch.org/whl/cu128

#

then you download cp311 flash wheel and use pip install the_name_of_the_downloaded_file.whl
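
If you're unsure which wheel matches, here is a rough sketch that reads the cpXYZ tag out of a wheel filename and compares it with a Python version. The function names are made up for illustration and the filename in the comment is a hypothetical placeholder, not a real release name.

```python
import re
import sys


def wheel_python_tag(filename):
    """Extract the CPython tag (e.g. 'cp311') from a wheel filename, or None."""
    match = re.search(r"-(cp\d+)-", filename)
    return match.group(1) if match else None


def matches_interpreter(filename, version_info=sys.version_info):
    """True if the wheel's cp tag matches the given Python major.minor."""
    expected = f"cp{version_info[0]}{version_info[1]}"
    return wheel_python_tag(filename) == expected


# Hypothetical filename, for illustration only:
# matches_interpreter("flash_attn-0.0.0-cp311-cp311-win_amd64.whl")
```

Run it with the same interpreter that lives in your venv, otherwise the comparison is against the wrong Python.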

long obsidian
#

i just found out that the python version i've downloaded is 3.13. if i download 3.11 will it work?

simple ore
#

nope

#

3.13 is not recommended for anything, too many incompatible libraries out there

long obsidian
#

how can i swap it to 3.11, just delete it?

simple ore
#

install 3.11, make sure you select 'add to path' checkbox on the install screen

long obsidian
#

alr imma do this rn thanks

simple ore
#

then remake the virtual environment using 3.11

#

py -3.11 -m venv venv

long obsidian
#

there are multiple versions of 3.11 (after the 11 which version should i get)

long obsidian
#

ty im downloading the stuff rn

long obsidian
#

is it normal to be this slow for a simple image

#

also @simple ore how can i train models for it (can i actually do it)

#

im seeing this for the past 5 min

simple ore
#

did you make sure you have cuda torch installed?

#

check task manager / performance / memory and vram use

long obsidian
sinful mango
#

Hello all !
I am a newbie here and not a developer at all; I dabble a bit and mostly just surf, read, and do lots of trial and error.
I am currently working on mods for Cyberpunk 2077 for my private usage (not for sharing on nexusmods or elsewhere, because of copyright issues).
I saw a lot of "voice ai" swaps for the main character and wanted to create my own with a voice actor I really appreciate (a French dubber for a character in a tv show).
I recently tried Zonos to create TTS audio with a sample of the French dubber and the result is quite good.
But that is just the beginning; now I am facing the hardest part:
Take all the audio files of the character in the game, and create modified versions of those source files with the cloned voice I get from Zonos.
So I have two options:
Either I get all the text of those audio files and script something with Python to batch-generate the audio files using Zonos,
or I find a tool that does audio-to-audio conversion using a cloned voice (is that even possible, and does it exist?)
My configuration is : i9 14900 / 64gb ram / RTX 4090 / Windows 11 Pro

Any help/pointers would be deeply appreciated ^^ (and I repeat : I am not a developer, I can dabble and am willing to learn but consider me an utter noob)

long obsidian
#

from here i have done the steps till 2

#

then i deleted the pip files and followed ur instructions

#

i didnt do the 3.2 tho

#

i will try doing that rn

long obsidian
#

also this is the requirements txt file

torch==2.6.0
torchvision==0.21.0
timm
einops
accelerate
transformers==4.51.3
diffusers
opencv-python-headless
scipy
wandb
matplotlib
Pillow
tqdm
omegaconf
python-dotenv
ninja
ipykernel
wheel
triton-windows; sys_platform == "win32"

simple ore
#

install the requirements, then upgrade torch

viral mason
long obsidian
#

i changed the text file to

torch==2.7.1
torchvision==0.22.1
timm
einops
accelerate
transformers==4.51.3
diffusers
opencv-python-headless
scipy
wandb
matplotlib
Pillow
tqdm
omegaconf
python-dotenv
ninja
ipykernel
wheel
triton-windows; sys_platform == "win32"

and i will check if it works now

#

i copied the torch and torchvision from the cmd from the installing step before it

#

its still sitting on 0/50

#

and doesnt move

#

can u type step by step what i should do to fix this (sorry if i'm being too annoying)

simple ore
#

You likely have cpu torch

#

reinstall cu128
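
A quick way to tell which one you have: wheels from download.pytorch.org usually carry a +cpu or +cuXXX suffix in torch.__version__. A minimal sketch, assuming that suffix convention holds for your install (the helper names are made up for illustration):

```python
def describe_torch_build(version):
    """Classify a torch version string by its local build suffix."""
    _, _, local = version.partition("+")
    if local.startswith("cu"):
        return f"CUDA build ({local})"
    if local == "cpu":
        return "CPU-only build"
    return "unknown build (no +cpu/+cuXXX suffix)"


def report_installed_torch():
    """Describe the torch installed in the active environment, if any."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    status = describe_torch_build(torch.__version__)
    available = torch.cuda.is_available()
    return f"{torch.__version__}: {status}, cuda available: {available}"
```

Call report_installed_torch() from inside the activated environment; a CPU-only build would be consistent with the progress bar sitting at 0/50.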

long obsidian
#

with this command? pip install torch torchvision --upgrade --index-url https://download.pytorch.org/whl/cu128

simple ore
#

activate the environment

#

then run that

long obsidian
#

okey

#

it says this

simple ore
#

or venv\scripts\python -m pip install torch torchvision --upgrade --index-url https://download.pytorch.org/whl/cu128
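
The point of calling the venv's own python is that pip then installs into that environment no matter what is activated. A small sketch of building that same command as a list, assuming the venv folder is named venv as above (the function names are made up for illustration):

```python
from pathlib import PurePosixPath, PureWindowsPath

CU128_INDEX = "https://download.pytorch.org/whl/cu128"  # URL quoted in this thread


def venv_python(venv_dir, windows=True):
    """Path to the interpreter inside a virtual environment folder."""
    if windows:
        return str(PureWindowsPath(venv_dir) / "Scripts" / "python.exe")
    return str(PurePosixPath(venv_dir) / "bin" / "python")


def torch_install_command(venv_dir="venv", windows=True):
    """Command list equivalent to the pip line above, bound to one venv."""
    return [
        venv_python(venv_dir, windows),
        "-m", "pip", "install", "torch", "torchvision",
        "--upgrade", "--index-url", CU128_INDEX,
    ]
```

The resulting list can be passed to subprocess.run(cmd, check=True) if you want to script the install.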

#

i dunno why you're using conda

long obsidian
#

should i delete it

#

and start anew

simple ore
#

regular python venv

long obsidian
#

and if i delete conda and the file itself (conda create -n omnigen2 python=3.11
conda activate omnigen2) i change the conda part here to venv?

simple ore
long obsidian
#

oh okey

#

and after making this environment do i have to activate it or just go on?

simple ore
#

after that is done, pip install torch torchvision --upgrade --index-url https://download.pytorch.org/whl/cu128

#

and pip install flash.whl

long obsidian
#

i will try that rn

sinful mango
viral mason
#

Alrighty I can help out in dms since these two are doing nerd stuff 👍

toxic zenith
#

I mean using custom models, not pretrain models

viral mason
#

I think they're talking about something like uberduck

#

Like how it had TF2 tts

long obsidian
toxic zenith
#

yeah, for text to speech

#

i want to make my own sound on decade driver

viral mason
#

I dunno how the rvc works I just know how to clean datasets and how to read graph

#

Trust

#

Whoever is typing RN your name is all rectangles lmao

#

It's scary

#

I am proof that anyone can make a voice model as long as they try

analog obsidian
#

TTS are zero shot, no dataset needed, only a few seconds
rvc requires training, 10 mins minimum for increased consistency
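
To check a dataset against that 10 minute rule of thumb, here is a minimal sketch that adds up the length of the WAV files in a folder. It assumes plain PCM .wav files sitting directly in the folder (the standard wave module may not read other encodings), and the function names are made up for illustration.

```python
import wave
from pathlib import Path

MIN_SECONDS = 10 * 60  # the "10 mins minimum" rule of thumb from this thread


def wav_seconds(path):
    """Duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def dataset_seconds(folder):
    """Total duration of all .wav files directly inside folder."""
    return sum(wav_seconds(p) for p in sorted(Path(folder).glob("*.wav")))


def is_long_enough(folder, minimum=MIN_SECONDS):
    """True if the folder holds at least `minimum` seconds of audio."""
    return dataset_seconds(folder) >= minimum
```

This only measures length; it says nothing about how clean the audio is, which matters just as much.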

#

fun fact: rvc core component (vits) is 'hacked' in order to do speech to speech conversion instead of tts

#

yup actually rvc first "name" was just vits

toxic zenith
#

ok

#

so how can I create an AI voice model?

#

rvc pls

analog obsidian
long obsidian
#

@simple ore i tried reinstalling and following your order again but it stays the same just 0/50 0%

hallow thistle
analog obsidian
simple ore
long obsidian
#

rn im trying running it without the flash-attn to see if it will work

simple ore
#

would it be easier for you to just use comfyUI?

tribal cosmos
#

for live voice changers, is w okada still the best or is there something new?

patent trellisBOT
# simple ore -rt
💻 Local Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Wokada Deiteris Fork

Most suggested. GUIDE

• Original Wokada

ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE

• **RVC GUI Mainline Realtime**

Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated

long obsidian
#

i will try rq to reboot pc and see if it will work

tribal cosmos
simple ore
#

read the guide above

tribal cosmos
#

oh thank you

long obsidian
tribal cosmos
simple ore
#

use that

tribal cosmos
hallow thistle
tribal cosmos
hallow thistle
#

Don't.

hallow thistle
tribal cosmos
toxic zenith
#

ok

#

maybe i'll try to do the tts in another app

tribal cosmos
toxic zenith
#

then use rvc to emulate decadriver

hallow thistle
toxic zenith
#

I think none of u all know about kr decade

hallow thistle
toxic zenith
hallow thistle
#

While Applio has a built-in TTS feature, that's edge-tts; RVC itself isn't TTS.

toxic zenith
#

by training models using sounds from csm toys

#

and tv series using UVR5

hallow thistle
hallow thistle
toxic zenith
#

i don't mean audio driver fr

#

i mean changing sound in my decade driver bootleg toy model

tribal cosmos
#

or do i just straight do the exe

viral mason
#

The exe

tribal cosmos
#

oh ok thx

#

also

#

how do i uninstall my old one

#

i had w okada

#

like 2 years ago

viral mason
#

Just delete all files related to it

tribal cosmos
#

thats it??

viral mason
#

Should just be in that folder unless they got replaced by the new stuff

viral mason
tribal cosmos
viral mason
#

I don't think so

tribal cosmos
#

oh wtf

viral mason
#

Pretty sure the files would've already been replaced by default unless you chose to skip or smth or didn't get that option after extraction

tribal cosmos
viral mason
#

Wdym

#

I'm slow

#

:3

tribal cosmos
#

nahh its okay lmaooo

#

yo i have another question

viral mason
#

Yah?

tribal cosmos
#

i havent done any ai voice stuff in years

#

but when i was making models

#

i used to use applio

#

is there a new one ppl are using now or is it still applio

#

because i feel like a lot has probably changed

viral mason
#

There's applio and mainline

#

And local rvc stuff which idk anything about

#

But applio still exists ya

#

Nothing new I know about

long obsidian
#

@simple ore i tried doing the comfyui one u suggested but im always getting this error

simple ore
long obsidian
#

i did that

tribal cosmos
tribal cosmos
long obsidian
#

here for an example

tribal cosmos
#

is this bad?

viral mason
#

Uhh

simple ore
#

sweeet mother of god

viral mason
#

Try running it again

tribal cosmos
#

okk imma try

long obsidian
simple ore
#

works fine

long obsidian
simple ore
#

with the files placed into right places

long obsidian
#

i have same settings same stuff

silver lynx
tribal cosmos
viral mason
#

Btw if u wanted we could move the convo to dms

tribal cosmos
#

ohh okayyy

viral mason
#

Ye

long obsidian
#

@simple ore is it possible for me to screenshare for us to fix the omnigen2 local version in vc here?

#

pls i wanna kym at this point 😭

simple ore
# long obsidian <@155030383648440320> is it possible for me to screenshare for us to fix the omn...

```
cd OmniGen2
py -3.11 -m venv venv
venv/scripts/activate
pip install -r requirements.txt
pip install torch torchvision --upgrade --index-url https://download.pytorch.org/whl/cu128
pip install https://huggingface.co/lldacing/flash-attention-windows-wheel/resolve/main/flash_attn-2.7.4.post1%2Bcu128torch2.7.0cxx11abiFALSE-cp311-cp311-win_amd64.whl
python inference.py --model_path "OmniGen2/OmniGen2" --num_inference_step 50 --height 1024 --width 1024 --text_guidance_scale 4.0 --instruction "The sun rises slightly, the dew on the rose petals in the garden is clear, a crystal ladybug is crawling to the dew, the background is the early morning garden, macro lens." --output_image_path outputs/output_t2i.png --num_images_per_prompt 1
```
long obsidian
#

i will try it now

simple ore
#

show vram use

#

add --enable_model_cpu_offload parameter to inference call

long obsidian
simple ore
#

that parameter should help

long obsidian
# simple ore add `--enable_model_cpu_offload` parameter to inference call

let me try this, so i add it at the back of this python inference.py --model_path "OmniGen2/OmniGen2" --num_inference_step 50 --height 1024 --width 1024 --text_guidance_scale 4.0 --instruction "The sun rises slightly, the dew on the rose petals in the garden is clear, a crystal ladybug is crawling to the dew, the background is the early morning garden, macro lens." --output_image_path outputs/output_t2i.png --num_images_per_prompt 1

simple ore
#

at the end

long obsidian
#

alr

long obsidian
simple ore
#

yes

#

can try the other too

long obsidian
#

can u send it

#

as code

shadow orbit
#

help

#

mic is working but i can't hear myself

#

on okada voice changer

#

:((

forest vector
#

for some reason, the "2025-07-03 22:02:27,643 ERROR [VoiceChangerManager] CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
"- error is fixed when I run a game of sorts on the same PC.

forest vector
simple ore
#

and/or --enable_sequential_cpu_offload
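
For reference, here is a rough sketch of the same inference call with an offload flag tacked on at the end. It only builds and prints the command line, it does not run anything. The helper name `build_inference_command` is made up for illustration; the script name and flags are the ones quoted in this chat.

```python
import subprocess

# Arguments copied from the inference call posted earlier in this chat.
BASE_ARGS = [
    "python", "inference.py",
    "--model_path", "OmniGen2/OmniGen2",
    "--num_inference_step", "50",
    "--height", "1024",
    "--width", "1024",
    "--text_guidance_scale", "4.0",
    "--instruction",
    "The sun rises slightly, the dew on the rose petals in the garden is clear, "
    "a crystal ladybug is crawling to the dew, the background is the early "
    "morning garden, macro lens.",
    "--output_image_path", "outputs/output_t2i.png",
    "--num_images_per_prompt", "1",
]


def build_inference_command(offload_flags=()):
    """Return the argument list with any offload flags placed at the very end."""
    return BASE_ARGS + list(offload_flags)


# Hypothetical usage: try model offload first, swap in the sequential flag if needed.
cmd = build_inference_command(["--enable_model_cpu_offload"])

# list2cmdline quotes arguments that contain spaces, so the result can be pasted
# into a Windows prompt inside the activated venv.
print(subprocess.list2cmdline(cmd))
```

To try the other option, pass `["--enable_sequential_cpu_offload"]` instead, or both flags if you want to test them together.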

shadow orbit
#

no sound at all

#

when i hit passthru i hear myself tho

forest vector
#

probably the voice isnt converting

shadow orbit
long obsidian
hushed thicket
#

how do i add a voice model

low shard
#

This is a general AI server, we can't know which program you're talking about, so that's why we need more information on the questions I asked you please

hushed thicket
#

oh i downloaded the realtime voice changer and a voice that I want and im confused on how to add it

#

is there a tutorial video?

low shard
#

and your PC's GPU and operating system are crucial too

low shard
hushed thicket
#

yea i followed an old one

low shard
#

AI runs at sonic speed, youtube tuts aren't the best for ai programs, they get outdated easily

low shard
#

it's best you also forget everything they tell you in it, they also tell outdated info like using "crepe"

low shard
#

also, we don't endorse anything that duckus does

#

duckus is an horrible person

hushed thicket
#

ooh i didnt know that

#

i just wanted a simple tut

low shard
#

he makes money off catfishing people for the pure fun of it

#

AI should be used for good and fun, not for catfishing

hushed thicket
#

yeah i agree

#

I want to use it to troll my friends and some randoms

low shard
patent trellisBOT
# low shard -realtime
💻 Local Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Wokada Deiteris Fork

Most suggested. GUIDE

• Original Wokada

ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE

• **RVC GUI Mainline Realtime**

Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated

low shard
#

read up the 1st link, wokada deiteris fork

#

this is the only updated tutorial

#

wokada deiteris fork got various improvements in performance and quality

hushed thicket
#

ok i installed it

#

and the virtual cable

low shard
hushed thicket
#

how do i input a voice

low shard
hushed thicket
#

yeah

low shard
#

also, if you share a screenshot of your wokada, i can help you with settings

shadow flax
#

can someone give me a simple video on how to install RVC :,) ?
(NVIDIA, WIN11)

low shard
shadow flax
hushed thicket
low shard
#

RVC = Retrieval-based Voice Conversion, the best few-shot speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models. Technically, Mainline RVC does have a go-realtime.bat (aka RVC-GUI), but it's pretty messy and outdated, so it's strongly discouraged for realtime.

Wokada = uses RVC for realtime inference. There are 2 main versions: the Original made by Wok, and the Deiteris Fork (a modified version), which is the most suggested one.

#

I think what you want is actually wokada deiteris fork, right?

low shard
hushed thicket
shadow flax
low shard
#

-realtime

patent trellisBOT
# low shard -realtime
💻 Local Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Wokada Deiteris Fork

Most suggested. GUIDE

• Original Wokada

ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE

• **RVC GUI Mainline Realtime**

Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated

low shard
#

read up the 1st link, wokada deiteris fork

#

it has better quality and performance

#

and it supports the RTX 50 series

shadow flax
low shard
#

AI moves at sonic speed, dont trust yt tuts for everything

low shard
hushed thicket
golden walrus
low shard
hushed thicket
golden walrus
#

I have EQ band and compressor enabled

low shard
golden walrus
shadow flax
#

do i need to download all 3?

low shard
#

also, you didn't set up the settings, show an entire screenshot

shadow flax
golden walrus
#

ye, sharp decline in quality

low shard
analog obsidian
golden walrus
#

emoji_141 i know it has something to do with the bitrate

hushed thicket
#

what settings do u recomend?

golden walrus
#

on it, sire

low shard
golden walrus
#

wait, is it the same with bad mic effect?

mystic sky
#

What is the most reliable or up-to-date version of this AI in real-time?

analog obsidian
low shard
golden walrus
analog obsidian
golden walrus
analog obsidian
#

but the docs recommend one app

#

1 sec

shadow orbit
golden walrus
#

ah Kiloheart

analog obsidian
#

yea just get kiloheart bitcrusher

golden walrus
#

cat_pawbite hmmm, i hope i can listen playback so i can adjust the stat to get cleaner

shadow orbit
#

can someone help with models upload..

shadow orbit
shadow flax
golden walrus
#

POPOcat damn, the playback sound so good but discord just bonk me hard

shadow flax
#

whats this even meaaan 😭

#

im actually gonna crashout

#

why can i unzip like everything but not this particular zip

#

dw i got it extracting using a different tool

viral mason
#

that simple

#

no need to read

#

just

#

do that

#

in fact

#

that video should be pinned

#

it's easy

#

videos like these should be in the server for simple people like me

raven barn
#

i already have it installed

#

but it still doesnt work

analog obsidian
viral mason
raven barn
viral mason
#

lemme look

shadow flax
simple ore
viral mason
simple ore
#

although I dont like your ( ) folder

shadow flax
raven barn
viral mason
shadow flax
#

where can i find the command line?

analog obsidian
raven barn
#

ive never had this issue

shadow flax
#

whys everything not worky

#

idk if i leak my IP by sending that or not lol

viral mason
#

let's go to this guy's house and fix it for him personally

raven barn
#

okay i got it to work

simple ore
raven barn
#

i fixed it

simple ore
raven barn
#

is this even doing anything wtf

simple ore
simple ore
shadow flax
raven barn
simple ore
raven barn
#

both of them?

#

or just the last one

simple ore
#

or just use a different path c:\training\data

#

put your files there

shadow flax
#

why does it not want to take the model? (does it have to be in the actual WebUI/app directory?)

#

it literally just does not want to take the model

#

its even in the model -> 0 something directory

simple ore
viral mason
simple ore
# shadow flax

illegal combination is using WASAPI input and MME output

#

use both WASAPI

shadow flax
raven barn
shadow flax
simple ore
#

input and output devices

#

use both WASAPI type

shadow flax
#

also how can i delete the "saved" audio

long obsidian
#

if i have a 32k sample rate voice, can i use text to speech and save the wav file to extract the voice and train it at a higher sample rate?

long obsidian
long obsidian
simple ore
#

just use 32k audio

#

no, use a proper denoise / vocal extraction

long obsidian
#

whats the best app that can isolate podcast sounds and make it pure voice audio

#

and whats the best app to download youtube videos as audio files

brittle wing
#

how can I fuse 2 models in applio ?

narrow sun
#

I have so much ms can someone help me

neat grove
#

ok so will a gtx 1650 super + ryzen 3 3100 be good for w-okada? i hear a lot of background noise even when i use the steelseries gg mic. also when i talk in my native language, i don't think the model is trained on it, because there are some words the ai can't say and it will be obvious to anyone that it's ai

forest vector
#

maybe if you do male to male conversion, and u already sound a bit like the model itself

pine star
#

where do you guys usually get sources from to train ai for rvc?

neat grove
neat grove
forest vector
#

what setting do you do your crossfade at

torpid turtle
#

hii all

#

just found a perfect live wallpaper but its not well looped

#

you can clearly see that the vid replays each time

#

is there an ai that can help with this?

shadow flax
#

could someone share with me what the optimal settings are for Wokada Deiteris Fork?

viral mason
#

kaggle applio keeps doing this

#

it's pissing me off

neat grove
forest vector
#

id say 0,12-0,15s

#

depending on the results u get
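
To make the crossfade number a bit more concrete, here is a minimal sketch of what a crossfade between two converted audio chunks does. This is a generic linear crossfade written for illustration, not the actual code of the Deiteris fork, and the names `crossfade`, `prev_chunk`, `next_chunk` and `fade_seconds` are all made up. The default of 0.12 s is just the low end of the range suggested above.

```python
def crossfade(prev_chunk, next_chunk, sample_rate, fade_seconds=0.12):
    """Join two chunks, blending the overlap linearly over `fade_seconds`."""
    # Number of samples that overlap; never more than either chunk holds.
    n = min(round(sample_rate * fade_seconds), len(prev_chunk), len(next_chunk))
    if n <= 0:
        return list(prev_chunk) + list(next_chunk)

    tail_start = len(prev_chunk) - n
    blended = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # weight of the incoming chunk, rises towards 1
        blended.append(prev_chunk[tail_start + i] * (1.0 - w) + next_chunk[i] * w)

    # Untouched start of the old chunk + blended overlap + rest of the new chunk.
    return list(prev_chunk[:tail_start]) + blended + list(next_chunk[n:])


# Toy example: a loud chunk fading into silence at a pretend 100 Hz sample rate.
out = crossfade([1.0] * 50, [0.0] * 50, sample_rate=100, fade_seconds=0.1)
print(len(out))
```

A longer fade hides the seams between chunks better but smears more audio together, which is why the value is something to tune by ear.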

forest vector
#

pretty sure, or everything I heard till now wasnt the best there is

umbral frigate
#

Tbf i do think the model sounds realistic

forest vector
#

are breathing, and other misc sounds also realistic ?

umbral frigate
#

But anything other than that is cooked LOL

forest vector
#

ahh ... okay thought I missed something

umbral frigate
#

So in my experience ive learned to just adapt to it

forest vector
#

that helps alot ye

umbral frigate
#

Bc ive been using the same one for a while i kinda know how to speak with it

neat grove
forest vector
umbral frigate
forest vector
#

stage fright huh

umbral frigate
forest vector
umbral frigate
forest vector
analog obsidian
#

obviously people that dont know about ai will not immediately fall for this, but give them a couple of minutes talking with u and they're gonna spot its rvc quickly

pastel oak
brittle wing
analog obsidian
#

that was literally how i learned about rvc, some random dude was talking with me and i started to notice his voice was weird

pastel oak
#

i can giggle and moan with ai u guys got nothing on me

umbral frigate
#

Idk why but my microphone works fine but it wont convert my voice everythings silent

torpid turtle
#

chat

#

help please

umbral frigate
torpid turtle
#

just found a perfect live wallpaper but its not well looped
you can clearly see that the vid replays each time
is there an ai that can help with this?

umbral frigate
#

like hahaha?

neat grove
forest vector
low shard
umbral frigate
analog obsidian
umbral frigate
#

ive been told if you make the mic shittier it sounds much more believable

analog obsidian
#

younger me would also notice it's ai

crude flame
crude flame
umbral frigate
umbral frigate
analog obsidian
umbral frigate
#

Shes speaking like a voice actor

analog obsidian
#

i only heard the second one and i assumed the first one was also rvc but unedited lmao

crude flame
brittle wing
forest vector
neat grove
analog obsidian
crude flame
umbral frigate
#

LOL

crude flame
#

one of them is real

#

and dont skip to the laugh

#

that will make it too easy

forest vector
#

if the 2nd one was rvc id say we actually should give up trying to recognize

analog obsidian
#

sorry i cheated

#

XD

brittle wing
crude flame
#

smhhh

analog obsidian
pastel oak
# crude flame vewy easy

dunno if you pay attention to it like i do but i can tell the 2nd is the real one based on the pronunciation of "I have" in the beginning joe_weird

#

and if the first one is real then i question my existence

analog obsidian
#

people have to learn rvc can't be realistic no matter what u do

forest vector
analog obsidian
#

razer always share that audio

forest vector
analog obsidian
#

rvc is 2023 tech

#

its not a voice cloning ai

#

what the ai is trying to do is literally reproducing mel specs and pitches

#

is not trying to clone expressions or shit because its not meant for that

#

all results are flat asf

crude flame
#

vits2 rvc when?

analog obsidian
#

good rvc models can fool the more casual side of the internet

#

but they will find out it's ai, rvc will glitch in any moment

forest vector
analog obsidian
#

i have been training models since 2023

low shard
analog obsidian
#

i know whats inside rvc, i know how it does the conversion

#

and i know it cant do everything

forest vector
#

maybe not that specific way u have in mind it cant, I agree.

crude flame
#

pretty much anything non verbal rvc cant do

analog obsidian
#

the embedder is trash

forest vector
#

but somewhere down the line I believe it will become hard to recognize

crude flame
forest vector
#

now its ... meh

analog obsidian
umbral frigate
forest vector
#

but really impressive compared to what we had back in the day

crude flame
#

no one cares about ai audio so nothing is happening to it 😔

forest vector
#

but when I hear real voice actors im like "what am I doing 💀"

analog obsidian
#

we do have real voice cloning ai

narrow sun
analog obsidian
crude flame
umbral frigate
analog obsidian
# umbral frigate elevenlabs?

eleven, chatterbox, yeah they're real voice cloning ai because the ai is actually learning and reproducing expressions

#

rvc doesn't learn expressions

umbral frigate
#

Would be pretty nuts

analog obsidian
#

when you give rvc an audio, it'll extract the mel spec and the pitch data alongside the features of it

#

then it'll try to reproduce them afterwards
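
To give a feel for what "pitch data" means here, below is a toy pitch estimator. It is plain autocorrelation in pure Python and is only an illustration under that assumption; it is not the pitch extractor RVC actually uses, and `estimate_pitch_hz` and its parameters are made-up names.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=50.0, fmax=500.0):
    """Return the frequency whose lag gives the highest autocorrelation."""
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    if min_lag > max_lag:
        raise ValueError("not enough samples for the requested pitch range")

    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # How well the signal lines up with a copy of itself shifted by `lag`.
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Toy input: a pure 220 Hz tone, 2048 samples at 16 kHz.
sr = 16000
tone = [math.sin(2 * math.pi * 220 * i / sr) for i in range(2048)]
estimate = estimate_pitch_hz(tone, sr)
print(round(estimate, 1))  # lands close to 220, limited by whole-sample lags
```

A contour like this only says how high or low the voice is at each moment, which fits the point above that the model reproduces pitch and spectrum rather than expression.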

analog obsidian
umbral frigate
#

Is there even an estimate for how far in the future until we get updates on rvc

analog obsidian
#

rvc is SOTA

umbral frigate
#

SOTA?

analog obsidian
#

like the best sts

umbral frigate
#

like no updates

analog obsidian
#

the reason why it doesn't learn emotions is because it's using a pretty old architecture named vits

crude flame
umbral frigate
#

Why tf

analog obsidian
#

because tts are superior

#

arch wise

#

they can learn emotions and non verbal sounds better

umbral frigate
#

yeah but that means sts is neglected

#

if everyone on rvc went to tts

crude flame
#

no one cares enough about sts to update it

umbral frigate
#

i think sts is cool

analog obsidian
#

updating sts is a really hard task

analog obsidian
#

rvc internally is a hack of a tts architecture actually

#

rvc-boss took vits, and did a couple of changes in order to "convert" it to sts

#

if we remove those changes, it'll be regular tts vits

umbral frigate
#

So if there were to be strides in sts would they have to build it from the ground up?

#

Or can rvc as it is rn be improved

#

and its just shitty architecture

analog obsidian
#

yea no the arch is too shit to be updated

umbral frigate
#

yeah damn

analog obsidian
#

some devs here have tried