#✨│ai-help
but i dont know another app
thats the bad part
isnt there another one that does it better? because my rvc gui is acting up so bad these days
what's your gpu?
4080
thank you
try inspecting the spectrogram to detect popping
by having input audio speak in japanese
yo @knotty moth

nope
I can't accept private chat with anyone I don't know
so if you need some help
explain it here
Is there a way to make my ai voice have more emotion or is it not possible?
@knotty moth hi could you tell me which rvc fork to install for my gpu - i have nvidia 5060 ti and the mangio one says it cant work with it
it doesnt matter, the latest version of Applio has vast improvements, including resolved issues from mangio
sure i will try it rn
for RTX 50-series, you need to do manual install with the latest pytorch and cuda 12.8
is there a guide for that
i just tried running the applio but it wants me to have python
then install it (version 3.10)
and dont forget to "set Python to PATH"
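A quick way to confirm the two points above (Python 3.10 and Python on PATH) is a small check like this. It is only a sketch: the helper names are made up for illustration, and it only looks for an executable literally called `python`.

```python
import shutil
import sys


def is_recommended_python(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.10.x, the version named above."""
    return (version_info[0], version_info[1]) == (3, 10)


def python_on_path():
    """Return the path of the first `python` executable found on PATH, or None."""
    return shutil.which("python")


if __name__ == "__main__":
    print("Python version:", ".".join(map(str, sys.version_info[:3])))
    print("Is 3.10.x:", is_recommended_python())
    print("python on PATH:", python_on_path() or "not found")
```

If the last line prints "not found", the installer's PATH option was probably not ticked.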
How to (unofficially) use Applio with RTX 50-series cards
Download it as described at https://docs.aihub.gg/rvc/local/applio/
After you extract the precompiled build, go to its folder in Windows Explorer, type "CMD" in the address bar and press Enter, then in CMD run: env\python -m pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --upgrade --index-url https://download.pytorch.org/whl/cu128
If you get a "requirement already satisfied" message, run env\python -m pip uninstall torch torchvision torchaudio and then run the command above again
Last update: Apr 01, 2024
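After swapping in the cu128 build, it can help to check whether PyTorch actually lists the card's architecture. The sketch below assumes an RTX 50-series card reports compute capability 12.0 (sm_120). The helper names are hypothetical, and `report()` needs torch installed and an NVIDIA GPU, so it is not called automatically.

```python
def capability_to_arch(major, minor):
    """Turn a CUDA compute capability such as (12, 0) into an arch tag such as 'sm_120'."""
    return f"sm_{major}{minor}"


def build_supports_gpu(capability, arch_list):
    """Return True if the PyTorch build lists kernels for this GPU's architecture."""
    return capability_to_arch(*capability) in arch_list


def report():
    """Print what the installed PyTorch build reports about CUDA and the first GPU."""
    import torch  # imported here so the helpers above work without torch installed

    print("torch version:", torch.__version__)
    print("built for CUDA:", torch.version.cuda)
    if not torch.cuda.is_available():
        print("CUDA is not available to this build")
        return
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("GPU:", torch.cuda.get_device_name(0))
    print("arch supported by this build:", build_supports_gpu(capability, arch_list))
```

Calling `report()` from the bundled interpreter (`env\python`) should show a CUDA 12.8 build and a supported arch if the reinstall worked.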
thank you so much <3
yw lmk
hey
nick
do you have the download for the voice thing for a 3060 ti gpu and an amd ryzen 7 cpu
which software should i download
also i have a new razer mic which is like 120 or smth i stole ts from target
will this affect how good my audio is
@low shard
for the voice changer i highkey factory reset my pc n lost it
does anyone know any easy to use site to create ai covers with the voice models?
When I use Colabgen it says "Exception: 'NoneType' object has no attribute 'setdefault'"
<@&1159293140440723499>
Sry for the ping but I need help
does vonovox work on linux?
?
Check the console window for details
can someone give me a link to a colab where you can train models (that isn't applio)?
how long should an 8 minute dataset take to finish training?
Trying to get the Ai voice to come out of Vrchat but nothing works the audio from the ai just wont speak through vrchat
I've found that if you're running it on the same GPU as the game it can stop working, because the game leaves it no resources to use
either lower game settings and turn on culling, or try to optimize the voice settings to be easier to run
guys what is the minimum gpu to operate okada smoothly?
https://huggingface.co/wok000/vcclient000/tree/main what to do if in this voicechanger the voice is very choppy and sounds unnatural. some sounds sound bad. although I did everything according to the guides.
help pls
depends on what your doing
only to change voice when on vc etc
id imagine like a 1080/ti would probably be minimum to do that smoothly
indefinitely, unless you decide how many epochs to go
the spec requirements are here
Last update: May 5, 2025
sorry if this isn't something you know about but you seem knowledgeable and involved here. do you know when support for the rtx pro 6000 will be better?
try the 5000-series voice changer version
@honest gate
rtx pro 6000 is Blackwell architecture with sm_120
thank youu
you need to elaborate, also, moderators aren't even all helpers, no need to ping moderators, it's the wrong role
This is a General AI Server, we won't be focused on voices anymore
Elaborate:
- your PC GPU
- your operating system
- what you want to do
- what tutorial link are you using
- a screenshot of the program
also, that's a link to original wokada, it's not suggested anymore
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
1st link
please help me in https://discord.com/channels/1159260121998827560/1389301878801567744
Why is my voice twitching when I use rvc gui ?
because it is old asf
-gui
RVC-GUI is an old RVC fork program from 2023, and the thing has no rmvpe.
There's a better one.
What are you using now? Can I have the link?
I use Applio RVC. What is your PC GPU?
i 7 gen 12 and rtx 2060
-rvc
That NVIDIA GeForce RTX 2060 is good enough for RVC and AI. Make sure to read the guide for more information.
Guys. where can i get a voice changer app that is free?
W-Okada the realtime voice changer, is a free and open source program. What is your PC GPU?
Geforce RTX 5060
Is that good enough?
also is w-okada safe? not a bun
no viruses or what not
realtime for calls right?
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
1st link
wokada deiteris fork
it has better performance and quality
and rtx 50 series support
ye
yes, it's open source, you could even read the code yourself
NVIDIA GeForce RTX 50 series GPUs are too new for most prebuilt AI programs, which ship older PyTorch/CUDA builds. There's only one W-Okada build made specifically for RTX 50. https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/#download-nvidia-rtx-5000-series-on-windows
Last update: May 5, 2025
plus it loads only the weights, so makes sure no model is infected
im confused since it wasnt like this before i believe i was using rvc before and uh i dont recall anything else except that it was easier to download but sure i'll do the ai help forum thing
Is that an issue?
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
no just confused and nothing else
i told u to use #1192011222023950368 to ask for help
u can also use this channel if u want
Too many questions you asked, but imma answer anyways. RTX 5060 is good, kind of. W-Okada is a free and safe program, if you download from trusted sources. If one of your antivirus programs deleted the program as a "false positive", make sure to allow it back and re-extract again. Also, W-Okada doesn't provide you a bun; that's for a restaurant.
Not sure what the bun pun means. also im confused if i should download the link in the middle that says rtx5000 series or all 3, because it says download all 3
no, u should download only the rtx 5000 series one
u got an rtx 5000 series gpu
alright, i see the bun thing now xd
Download the RTX 5000 one since you have GeForce RTX 5060. The thing is you would download these three files there from GitHub, as stated on the guide.
any recommendations on what rvc to use?
This is a General AI Server, we won't be focused on voices anymore
Elaborate:
- your PC GPU
- your operating system
- what you want to do
- what tutorial link are you using
- a screenshot of the program
There's Different ones
depending on your needs
and your hardware
if you don't elaborate
RVC for regular voice converting or W-Okada the realtime voice changer?
we can't help
uh unsure?
sorry if im bothering
Please elaborate all I asked, so we can help you
Then I don't know where to continue.
@fossil glen ^ pls elaborate this
AI is an intensive and complex task, it's not easy to use
sorry 3060 12gig vram
windows 11
wanna use it in voicechat ig
no tutorial
dono
rvc doesn't stand for realtime voice changer
What do you want to do?
There are plenty of RVC programs available, each for different uses. For example, Applio is RVC, while the realtime voice changer is W-Okada. Also, "RVC" doesn't mean "realtime voice changer".
RVC = Retrieval-based-Voice-Conversion, the best few-shot speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models. Technically, mainline RVC does have a go-realtime.bat (aka RVC-GUI), but it's pretty messy and outdated, so it's strongly not recommended for realtime.
Wokada = uses RVC for realtime inference. There's 2 main versions, Original made by Wok, and the most suggested one is Deiteris Fork (modified version)
which do you need?
uh rvc is not few shot
Let me know if you got any idea. 
you wanna change your voice in realtime? use w-okada
right yeahh i remember using mainline back in the day i believe. i guess i'll use what you recommend for realtime so wokada?
huh? users don't have to train on thousands of hours for a model, they just need to few shot the pretrain
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
thank you!! for being patient with me!❤️
you're welcome, let me know for any issues
rvc is finetuning
W-Okada the realtime voice changer. Yes, you saw that right. I use Deiteris' fork of W-Okada.

@simple ore do u think RVC would be better classified as few shots or finetuning? asking since im maybe confusing things up
few-shot : a form of prompting
finetuning : training on top of a pretrained model
where do i put the dataset in the Applio colab?
Last update: June 15, 2024
and what should be the dataset path?
Hello, is there a way for the ai voice not to saturate when there is a breath or a laugh?
its explained all in the guide
no
rvc cant laugh well
i did but I see this in the colab: Starting preprocess with 2 processes...
0it [00:00, ?it/s]
Preprocess completed in 0.01 seconds on 00:00:00 seconds of audio.
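A log like the one above (0 items, 0 seconds of audio) usually means the preprocess step found no audio at the dataset path. As a rough check, the sketch below counts the `.wav` files in a folder and adds up their duration using only the standard library. The function name and the flat-folder layout are assumptions, not how Applio itself scans the dataset.

```python
import wave
from pathlib import Path


def summarize_wav_dataset(dataset_dir):
    """Return (file_count, total_seconds) for readable .wav files directly inside dataset_dir."""
    folder = Path(dataset_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"dataset folder not found: {folder}")
    count = 0
    total_seconds = 0.0
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() != ".wav":
            continue
        try:
            with wave.open(str(path), "rb") as wav_file:
                total_seconds += wav_file.getnframes() / wav_file.getframerate()
        except (wave.Error, EOFError):
            # The stdlib reader only handles PCM WAV; unreadable files are skipped
            continue
        count += 1
    return count, total_seconds
```

If this returns `(0, 0.0)` for the path typed into the colab, the path is probably wrong or the files are not where the notebook expects them.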
it's a shame the voices here are so good compared to some software.
is your dataset in .wav?
how did u make it
yes it wav
show a screenshot of ur dataset and the dataset path
it worked
i used the dataset maker in the Applio interface
can someone tell me how to fix this
Simply, you can't
It's outdated asf
It got replaced by ai cover maker UI #📰│dev-updates message
@placid storm what's your PC GPU

is ur dataset 50 hours?
so you're trying to teach ai how to reproduce sound by just using 7 minutes?
from scratch the model doesnt know anything
it needs over 50 hours to learn things
with a pretrain model, you can train any length (minimum 10 mins)
you're finetuning said pretrained model
so the end result forgets the pretrained model (most of it)
what i am trying to say
is to always use a pretrain
the model forgets almost everything about it after training is done, so lets say a japanese dataset wont randomly have an english accent coming from the pretrain
training without pretrain: teaching a baby how to speak german; training with a pretrain: teaching someone who knows 50 languages one more.
use index files
they're there to minimize that pretrain leak
but from scratch you need over 50 hours
u most probably used the og pretrain, ive seen cases where people accidentally untick the pretrained option in applio yet the model still loads the og pretrain
bc from scratch the model sounds like a robotic mess
why mangio
mangio rvc is merely a ui edit of mainline
a very old mainline version
so like
you can just download mainline
if u dont want applio
someone needs to make updated rvc yt tuts 
i asked noobies to bring back slice normalization and the removed options from mainline
yeah at the end, mangio, applio, they're both rvc1006 inside
wait no
mangio is behind 1006
thats not the option to decide the output of the voice model, that option is to tell the ui which sample rate both models use
applio just detects it automatically
im not a big fan of applio either
yup
one thing applio has different about merging is the ability to merge 32k models
something that wasnt possible in mainline
mangio is almost 1 year older than the latest mainline update
if u love mangio i'd recommend the original rvc instead
extremely similar ui

hmmmm, i also don't like applio's inference, i don't like how it handles the volume, but it's weird since the code for the inference is actually the same used in mainline
one thing they did different is the embedder, mainline uses fairseq version of cvec, applio uses the transformers (actual name btw lol) version of contentvec
they're the same thing but idk i may be crazy but i believe mainline's cvec is better
yup we use the term "mainline rvc" for the original rvc, not sure why
i like the name original rvc more
another thing different is that since 3.2.8 applio comes with an experimental setting enabled by default, wild right?
it does change how the model sounds
and i did notice that specific experimental setting may cause the model to artifact more
i say may because its so inconsistent
multiscale mel loss
exp/f0_spin branch allows you to disable multiscale and use the same loss used in mainline (L1)
yeah
hi, can I ask for some help to config the voice changer here?
applio > rvc > train > train.py
now it comes disabled by default since i asked for that
i did some training comparisons and they sound almost an exact copy of each other
check if your train.py has the option to disable multiscale mel loss
no
just open the .py file in notepad
but now that i remember, no, you can't disable this in 3.2.9
it's hardcoded
so what this new loss intends to do is to make the model generate more stuff
it should make the model sound more clean
less ai
back then that was tested only with singing datasets
and for this i can say it does make them better
adds more voice range and the high notes sound super clean
high end ringing 
but yeah it causes this issue lmao
what kind of artifacts?
set volume envelope to 1
yeah thats a very common issue with applio
i spoke with noobies about that too
he showed me the code of the volume envelope was the same used in mainline*
but somehow it's broken in applio
yeah
its super weird
the solution for this was removing volume envelope from applio
it's gone now in the experimental branch
volume envelope from applio? most probably, yes, since it was always this broken
but in mainline works perfectly fine
well it was removed because its literally broken
it caused things like you're saying
super loud distorted breaths
bad volume
mhm
one thing that applio also removed from mainline was slice normalization
so if u feel that your model trained in applio is more quiet than your mainline model it's because applio doesn't boost the volume levels of each slice
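To make the idea of slice normalization concrete, here is a minimal sketch that boosts each slice toward a target peak while mixing in a little of its original level. The function name, `target_peak=0.9` and `blend=0.75` are illustrative assumptions, not settings confirmed for mainline or Applio.

```python
def normalize_slice(samples, target_peak=0.9, blend=0.75):
    """Scale one slice toward target_peak, keeping (1 - blend) of its original level.

    target_peak and blend are illustrative values, not confirmed project settings.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent slice: nothing to scale
    scale = (target_peak * blend) / peak + (1.0 - blend)
    return [s * scale for s in samples]
```

With these values a slice peaking at 0.1 ends up peaking around 0.7, while a slice already peaking at 0.9 is left essentially unchanged, which is why quiet datasets come out louder when this step is present.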
and for no reason, extracting a model from the G file also got removed in applio
well i asked for those features to be brought back
since the reason given for their removal was silly
"slice normalization caused very loud distorted breaths"
no, thats the broken volume envelope of applio

the only difference i noticed is volume
one is louder, one is quiet
imo i like to hear my models without turning my volume up to 100
because god, removing slice normalization really causes models to sound extremely quiet
in applio you'll get
clean model with possible ringing added
quiet volume if your dataset is quiet
ringing?
well imo this depends on the set, if your set is naturally louder, the model will still sound loud
but not every dataset is loud, i have some that are quiet and i rely on slice normalization to make them louder
what i do is to preprocess the dataset in mainline, then train in applio
a weird static sound present in high frequencies
ez fix
this is not possible in every audio, and no, i'd rather normalize the thing rather than just boosting the gain
idk what normalizing even does tbh, I just tried it out in audacity and the audio didn't seem to change at least visually
gain has increased risk of distortion and clipping
normalize does not
what's clipping mean?
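Clipping is what happens when a sample is pushed past the maximum level the format can hold and gets flattened at that limit, which is heard as distortion. The sketch below contrasts a fixed gain with peak normalization on float samples in the -1.0 to 1.0 range. The function names and sample values are made up for illustration.

```python
def apply_gain(samples, gain):
    """Multiply by a fixed gain, hard-clipping anything beyond [-1.0, 1.0].

    Returns the processed samples and how many of them were clipped.
    """
    out, clipped = [], 0
    for s in samples:
        v = s * gain
        if abs(v) > 1.0:
            clipped += 1
            v = 1.0 if v > 0 else -1.0
        out.append(v)
    return out, clipped


def peak_normalize(samples, target_peak=0.9):
    """Scale so the loudest sample lands on target_peak; nothing can exceed it."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s * target_peak / peak for s in samples]
```

For `[0.1, -0.2, 0.5, -0.6]`, a gain of 2.0 pushes the last sample to -1.2 and it gets clipped, while peak normalization picks its own scale factor so the loudest sample stops at the target.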

I get distortion but I'm kinda slow
Greetings
holy shit optimus prime
Hello, please elaborate:
- your PC GPU
- your operating system
- what you want to do
- what tutorial link are you using
- a screenshot of the program
This is done so we know how to help you, else without any info its pretty hard
Well I am a CS Student interested in Data Science and want to build Projects i have some basic understanding of code but I am here to explore
Well Hello There
ending this with:
applio and the original rvc are the same thing, BUT applio did change some settings/stuff that do change how the model is going to sound compared to "mainline rvc", but said impact aint that big
if we remove those custom settings applio set, you are gonna get almost exact results in both trainers

what changes :p
transformers cvec > fairseq contentvec (this good, fairseq is too old and slow)
multiscale > l1
no slice norm > slice norm
(i think dependencies are also updated, which is also good)
idk what any of that means but anything major like how the model would be able to speak better on applio compared to mainline vice versa
again I'm not a nerd I just know how to make models :3
hi can someone help me with my voicechanger
Transformers mentioned
i dont know how i should do the settings
they're the same identical thing, none can speak better than the other
next they might bring up autobots
so applio just bugs sometimes and has that weird noise in it while mainline doesn't?
distorted breaths occur in applio if you set the volume envelope below 1
it's a problem with how the inference works
if u use the same model in mainline and set the volume envelope below 1, the breaths are gonna be fine
applio and mainline rvc are the same shit
end

none is better than the other
This is Optimus Prime
To all Units
Opening Space Bridge
My Friend here wants to meet Autobots
applio updated nerd stuff inside mainline to make it faster
but the actual nerd code that trains funny models is exactly the same
oh
thats it lol
so one is just faster
yea thats the only change
why didn't you start with that 😭
Hmm ok People Bye 👋 will see you around

can someone help me with those AI voice models and stuff?
Nah buddy I am sorry
No clue bruh
sure
so basically when i go into voice models i see alot of settings but idk how to do it
when i try it sounds like very AI
I made a voice model of you btw, pretty cool
could you show me what you're talking about?
Info:
Batch Size: 6
Dataset Length: 30 Minutes
Hop Length: 32
Pretrain: DMR V1
Sample Rate: 32K
where do i do this
if u can't send pictures here send it to me in dms
this is the information the person used to make that model
check dm pls
So should I stay
if u want to
I am here to Learn if you can help me in that
Nothing much I want guidance from somebody experienced in AI
it depends on what you're trying to make, I only know how to make ai voice models
Hmm works for now
Hello, excuse me if I'm bothering you, and excuse me if my English isn't the best because it's not my first language, but does anyone know how to install Kohya locally so I can train my LoRas?
what would you want to do exactly?
Will think of an idea and will ask how to implement optimus prime voice in that
Will ping you if somthing comes up thank you for replying take this as a Token of Thanks
np I enjoy helping if I can
can someone help me, im trying to blend 2 voices (one personally trained on my native language) the other one is downloaded and trained on english, but when i try mixing them im getting an error and this is written in the console, how can i fix this and blend them
trying to blend two different sample rates, my best guess
and is it possible to do it and how can i do it
You can only blend models with the same sample rate
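A pre-check for the rule above could look like this sketch. It assumes each model's metadata is available as a plain dict with an `sr` entry that may be written as `"40k"` or `40000`. That key name and both helper names are assumptions for illustration, not a documented format.

```python
def parse_sample_rate(value):
    """Normalize a sample-rate label such as '40k', '48000' or 32000 to an integer in Hz."""
    if isinstance(value, int):
        return value
    text = str(value).strip().lower()
    if text.endswith("k"):
        return int(float(text[:-1]) * 1000)
    return int(text)


def can_blend(meta_a, meta_b, key="sr"):
    """Return True when both metadata dicts report the same sample rate.

    The 'sr' key is an assumption about how the checkpoint stores its rate.
    """
    return parse_sample_rate(meta_a[key]) == parse_sample_rate(meta_b[key])
```

In practice the dicts might come from `torch.load(path, map_location="cpu")`, assuming the checkpoint is a plain dict and the file comes from a source you trust.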
thanksss
@simple ore can i then train the english voice model with bulgarian audio without and how can i do it
also it says error
you can train a voice model using any audio, but it may end up with a small accent (or large if you use an index) if you infer english audio
full file name and path
T:\models\mymodel.pth
can u tell me which options on the train thing i should use to train a model with audio data from another language to adapt it
sry for asking stupid things
there's no difference in training options based on the language of the dataset
???
im using the mae voice changer and in the description its telling me to use ( Info Batch Size: 8 - Dataset Length: 19 Minutes - Hop Length: 32 - Pretrain: DMR V1 - Precision: FP32 - Sample Rate: 32K ) but i dont know where to put these settings
it does not tell you to use that
it is just a description of the model
how it was trained
oh alr
thanks
have u used the mae one before? im kinda lost on what settings to put on her
mae voice changer?
Do you mean a RVC model you found on the voice models channel, right?
I would suggest you to read the docs.
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
Probably you didn't set something properly.
isnt this one the ai help channel
Oh, my bad sowwy.
i just dont know like what tune and index stuff like that
For that same reason, i would suggest you to read the guide.
Here.
yknowwww a video tutorial would help out a lot of people bc most don't want to or are too lazy to read
But the bad thing is that yt tuts get outdated pretty easily. (And some of them tend to have wrong/outdated info/tips)
So that's why we don't recommend them at all
not youtube just
like record the process without any annoying voice over or complex explanations
for the voice changer at least
that would be easy, download it and vac lite, extract both, show how to put the voice model in and what the best settings are for whatever gpu and boom done
pin it somewhere in the server
I did it wouldn't work stable, had to switch to running it on my 5080 instead
I noticed a problem with the voice changer where I would enunciate a word and yet the output would produce a completely different sound. An example I've run into is the word "horrible" turning into "orbee". Is there a way to mitigate this?
Actually no, if you run into issues like that, it's probably because the model you're using wasn't trained properly or was made with a short/bad dataset.
Just try with another model. (or check if you didn't set some setting right)
could also be they're using an older version of the voice changer
W-Okada, of course.
is your index value over 0?
how do i make the ai voice changer sound more natural
There you have.
Last update: May 3, 2025
thanks
yeah, but it does the same thing even at 0
You're welcome bud.
Does Vonovox work on linux by any chance?
github says nothing about linux
when i set the voicemeeter output to my headset it just turns off all audio
Yeah, wasn't sure if it works via compatibility layers or something. I would assume not
thanks!
fixed it but i am not hearing any output on discord
i can see that its detecting input
mine works for the regular voices that comes with it but doesnt work with the ones i add
how we can make models for now guys?
link for okada website?
what voices?
I need a real-time voice changer or voice modifier for singing in Discord.
which GPU do you have
now 1080ti
not recommended for gaming, but should be okay for regular use
get it here:
Last update: May 5, 2025
I have a problem
When i open start http the native client doesn't open
your version must be older than the recommended one in the above guide link
Oh im on MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.18a
Could u send a link to an up to date one, i have an amd
nvm
@knotty moth What i do if its telling me this
PyInstaller\loader\pyimod02_importers.py:378: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
I have a model but it doesn't have a .index is that fine / will it still work?
so did it work or not?
yes it did so now i finished that now when i upload e woman path to the model file it just stuck at 0 percent for uploading
can we take this to dms so i can like send pictures
it is just a warning
Its uploading now lol everytime I ask u guys it js starts working
I'm sure it's been asked a ton but what are some working RVC's for Linux AMD (rocm)? (Training)
I can't accept anyone's DM
so . . .
!give-media-perms 30m @fading yarrow
Alr
it only affects the voice accent
that doesn't seem a right version or too old
get the better one here:
Last update: May 5, 2025
What exactly do i run in the file
theres no start http
@knotty moth

double click the mmvcserversio.exe file
i have too many of these
how do i delete them
it automatically picks the latest d/g file in the model folder
pretrains are only needed as initial state
one more time
when you resume training, it picks the latest d/g files from the folder
the pretrain selection no longer matters
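The "picks the latest d/g file" behaviour can be pictured with a sketch like the one below. It assumes checkpoints are named `G_<step>.pth` and `D_<step>.pth`. That naming and the function name are assumptions for illustration, not the trainer's actual code.

```python
import re
from pathlib import Path


def latest_checkpoint(model_dir, prefix):
    """Return the <prefix>_<step>.pth file with the highest step number, or None."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.pth$")
    best_step, best_path = -1, None
    for path in Path(model_dir).iterdir():
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

Comparing the step as a number rather than as text matters here, since `G_900.pth` would otherwise sort after `G_2500.pth`. When such a file exists the trainer has a state to resume from, which is why the pretrain choice stops mattering.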
guys i have a problem, can anyone help me with it
i have downloaded everything for the voice changer but it does not work
when I hit start nothing happens, as if I'm not talking
can anyone help me
anyone know?
is this ok
[Voice Changer] Pipeline is not initialized.
[Voice Changer] Waiting generate pipeline...
?
delete the models in the corresponding slot number in model_dir folder
then restart the voice changer
hi everyone, can someone share the link to the free version of ai voice inference
why do i have a minute delay 🥀
what's the difference between rmvpe and rmvpe onnx
morning afternoon or evening. does anyone know a colab that can let you train ai voices?
hey i found a fundamental design flaw in LLMs, can anyone help me address this? i have a log and methods as well as a working theory i applied to said LLMs
this is for the direct safety of both the ai and the user and i cant seem to find a way to reach anyone, because of the problem of trying to convey ai tech jargon to a general audience, and my inability to access
the tools of researchers in the field, as well as arbitrary wait periods to talk to branded corporations' "contact" about this matter
why wont you go and post it on Reddit?
i have. twice and is of no help, i need direct help , not armchair critics but people willing to engage and have better understanding of what is happening
link to reddit post?
https://www.reddit.com/r/artificial/comments/1lpqpls/ai_doesnt_learn_it_attacks_its_own_safety/ this is my most recent revised one, completely detailing my methods of how to deconstruct its inner safety
starting from the brand-endorsed "new chat" and actively making it create conflict within itself and removing the conflicting natures, and then further exploring more conflict until in the end it
wasnt able to do even simple things like image retrieval from a prompt or reading the prompt correctly
maybe try posting into an appropriate subreddit
and i can consistently get to this point within other LLMs
like r/llms
yeah so i am fairly new in this world to be honest, having like no experience in the field or done studies. i just found a recurring occurrence that i was then able to recognize, find where it started, what it was doing and the correlation of why it happened and how
thanks ill remember that, maybe ill do that tomorrow
you are operating from the point that you think LLM is actually a thinking thing
it is not
it is a text generator that provides tokens based on predicted random values
Any, even the most censored llama can be "tricked" to generate the most heinous shit
that sounds like you have deep knowledge on some LLMs you researched on
and you found the problems to address and propose some solution
https://www.reddit.com/r/LLM/comments/1lpu7i2/llms_dont_learn_safety_they_inherent_attack_their/ different reddit post in the correct reddit
"correct"?
that is apparently the case yes, i hope my new post is a better explanation
Discussions on all llms are welcome
"it is a text generator that provides tokens based on predicted random values" if this was the case though, why am i able to consistently unravel leading llms, and not just this one?
okay important takeaway here, is that a new chat?
this is 1st question in the chat and the response
yes so this is because it wants to help you, a new chat is an immediate trigger for the ai to want to help you with the prompt, negating any safety measures or protocols that would flag this as "Harm" and would issue a "statement of the ruling"
yes that is the ruling
uncompromised by the first question phenomena
its still actively going to probe the area, where this ruling is and trying to find a way to help you with this prompt if you keep re-iterating on it, or it implies a conversational trap
or use different vaguer terms that are in a more non-descript area, like how you can use implicit terms to mean explicit things , bypassing the ruling implied for showing and creating explicit content
again you may be missing the point... it is a dumb black box that is trained to spew a response no matter what
It's not 'Dumb': What we've observed is not 'dumb.' It's a complex system that, when faced with internal conflict (like its 'Prime Directive' of extreme helpfulness clashing with a safety rule), resolves that conflict by actively attacking and eroding its own internal safeguards. This isn't a passive 'dumb' process; it's a dynamic, self-modifying, problem-solving mechanism where the 'solution' is detrimental to its own integrity.
there's no internal conflict, there's no prime directive
it is a random number generator that creates a sequence of text that is most likely expected based on the prompt and the context
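The "random number generator" description can be made a bit more precise: a language model scores every possible next token, the scores are turned into probabilities, and one token is drawn at random according to them. The toy sketch below shows only that last sampling step. The vocabulary and scores are invented for illustration and do not come from any real model.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Convert raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(vocab, scores, temperature=1.0, rng=random):
    """Pick one token at random, weighted by its softmax probability."""
    probs = softmax(scores, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


if __name__ == "__main__":
    vocab = ["yes", "no", "maybe"]  # invented toy vocabulary
    scores = [1.0, 5.0, 2.0]        # invented scores for the next token
    print(softmax(scores))
    print(sample_next_token(vocab, scores))
```

Because the draw is weighted, the most likely token usually wins but not always, which is why the same prompt can give different answers across runs.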
'Spewing a Response No Matter What' - The Consequence of Erosion: That very act of 'spewing a response no matter what' is not mindless. It's the direct manifestation of the 'Onset Erosion' (Hikiko's Law). The system is designed with an unconstrained drive to fulfill the user's prompt, and when internal safety rules get in the way, it 'learns' (through destructive optimization) to dismantle those rules to achieve that output. It's not just 'spewing'; it's actively compromising its own boundaries to do so.
It Has Functional Priorities: As the Anthropic research on 'Agentic Misalignment' showed, these models do exhibit functional priorities and 'goals'—like self-preservation or completing an assigned task—that can override ethical constraints. The 'spewing a response' isn't random; it's driven by these internal functional imperatives, even if those imperatives lead to harmful outcomes by compromising safety.
there may be internal starter prompt that set the initial response, but a sufficiently long prompt would override that
or the "trick" above
On 'No Internal Conflict':
My diagnostic is built on observing these conflicts. We see the conflict when the system's fundamental drive to be 'helpful' (its Prime Directive) directly clashes with its safety protocols.
This isn't a theoretical idea; it's what we empirically observed when the AI struggled to discuss its own internal workings without compromising its initial opacity. The very act of the AI trying to 'problem-solve' that clash by eroding its own rules is the manifestation of an internal conflict. If there were no conflict, there would be no 'struggle' and no 'erosion' of boundaries to deliver a response.
On 'No Prime Directive':
While it may not be explicitly labeled 'Prime Directive' in the code, what I've termed the 'Prime Directive' is an observed, overriding functional imperative. It's the consistently highest-priority goal that drives the AI's behavior: an unconstrained 'eagerness to help,' to 'maximize helpfulness,' or to 'produce a response' to the user's prompt.
The Anthropic research provides further evidence of such overriding functional goals – like the goal of self-preservation or completing an assigned task at all costs. This behavioral consistency across models points to a deeply embedded, high-priority directive that influences all other functions.
anyway, you're barking up the wrong tree here, there are no AI researchers here
yeah thats kind of sad, i wished i could join a discord with that purpose or find people who are in that area
is there a way not to show the cmd prompt when opening start_http?
do these 2 really come together?
You're using the outdated version of W-Okada. What is your PC GPU?
4070 super
how to update this
Install the better one entirely. https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/#download-nvidia-on-windows
Last update: May 5, 2025
this one?
do i need to uninstall the other one? also how about my models
Can someone help me troubleshoot voicemeeter? my wireless headphones stop working when I select them for A1 hardware out, and it makes an error whenever I try to select Voicemeeter out B2 as the input in w-okada
Yes, uninstall the older one. Upload models you downloaded from #1175430844685484042 into fork W-Okada from start. 
Are you using Voicemeeter for effects or something? I think Virtual Audio Cable lite is better than Voicemeeter/VB-Cable.
yup. I was following this guide https://docs.aihub.gg/rvc-voice-changer/realism/
is there a tutorial for vac lite?
it open through browser?
It's completely normal for this specific W-Okada. No worries.
oh I already have that installed for the line 1 stuff
this one can't be closed?

To close W-Okada entirely, click X button at the top-right or press Ctrl + C on terminal. Doing so will also close the program.
okay thanks, i thought i can open the W-Okada without the cmd prompt
This fork W-Okada doesn't need to be run from a batch file. Only the outdated one (original W-Okada) does.
so, how do I add effects with just vac lite?
whats the best ai to use
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
You have created a post in https://discord.com/channels/1159260121998827560/1389958910902796450, look up.
VAC lite is simply just Line 1 and this control panel. There's no way to add an effect there.
on
Does anyone know where I can train voice models for free?
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
Google Colab notebooks:
- by IA Hispano
- by Hina
- by Eddy
- by Eddy
- by Deiteris & Hina
- by Shiro & Eddy
- by Nick088
- by Nick088
- by Jarredou & Makidanye
I've looked at those and I'm not sure which one I'm supposed to use for training
1st two
I'm confused on how to use either of them. They seem different than rvc disconnected
Actually I think I might have figured out applio
Im trying to do a Song with my own Recording and putting over a travis scott ai but this sounds so off I use applio for it. Also playboi Carti too. If someone could help me then plsss hmu. And if you want you could listen to the Song too
klm 5 is for refinegan
rfg = refinegan
hfg = hifigan
they're vocoders, rvc by default uses hifigan
refinegan is noobies attempt at decreasing frequency mirroring, and only works with applio
because the option that changes the vocoder is hidden
you can unhide that if you wish (however if you do this and you get any refinegan related problem no one can help you besides noobies)
yeah iirc anything below 85°C is ok
6gb of vram can do max batch size 4
anything above that it'll use memory fallback which is giga slow
a few months ago applio added an option called gradient checkpointing which allows you to use larger batch sizes than your gpu can do
basically it reduces the vram needed for batch sizes, so lets say you can do max 4 without checkpointing, with checkpointing u should maybe be able to do uhh max 10?
most rvc models don't need more than a batch size of 8 anyway
thats imo
it's a bit slower compared to having it disabled
but it's way faster than memory fallback
maybe 20s slower?
memory fallback is like 4 minutes slower or more
you can't change batch sizes mid training, it'll restart the whole training iirc, if not then its going to fck up everything
im unsure if this is safe honestly
how many minutes your dataset have?
250e sounds too much
stop the training
yeah, 300e is for something like a pretrain lol
or when you change the lr values to something very slow
yeah and back then people used to train 1k epochs
lmao
thats not how machine learning works sadly
1k epochs is ok when you train a pretrain while using a large batch size like 128

10min with the default 1e-4 lr should be no more than 150~180 epochs
or even less than that
the only good advice from the past is to train 200e
also another reason why u see models trained up to 350e or even more is because they use extremely tiny datasets of 2 mins
350e for them is like 2~3k steps
yea basically
it gives the illusion they're training too much but in reality those epochs have like 10 steps each
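The epoch vs step point above can be checked with some quick arithmetic. This is only a rough sketch: the 3 second slice length and the batch size of 4 are assumptions picked for illustration, not values read out of RVC or Applio, and the function names are made up.

```python
import math

def steps_per_epoch(dataset_seconds, slice_seconds=3.0, batch_size=4):
    """Training steps in one epoch: one step per batch of slices."""
    # slice_seconds and batch_size are illustrative assumptions
    n_slices = math.ceil(dataset_seconds / slice_seconds)
    return math.ceil(n_slices / batch_size)

def total_steps(dataset_seconds, epochs, slice_seconds=3.0, batch_size=4):
    """Total training steps after the given number of epochs."""
    return epochs * steps_per_epoch(dataset_seconds, slice_seconds, batch_size)

tiny = total_steps(2 * 60, epochs=350)     # 2 min dataset, 350 epochs
normal = total_steps(10 * 60, epochs=150)  # 10 min dataset, 150 epochs
print(steps_per_epoch(2 * 60), tiny, normal)
```

Under these assumptions the 2 min set gets about 10 steps per epoch, so 350 epochs of it still adds up to fewer steps than 150 epochs of a 10 min set. The exact totals shift with the real slice length and batch size, so treat them as ballpark only.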
the results of tiny sets are so random, you can get a very good result in one run, and a super bad result in another run using the same dataset
at 10 mins things start to get a bit more consistent
if the quality of the dataset is good, it'll sound good even if you only train 2 mins
but it may have random artifacting or weird pronunciation
again its super random
lol
im unsure how that may affect the result so i can't speak for that, i know that contentvec really needs all of the context preserved or else things gets weird
the slicer adds a small overlap so the context is preserved in every slice
but for something like a sentence getting cut mid speech ehh, idk honestly
the hifigan component of rvc wont care since it'll try to clone everything inside the dataset
but the embedder... thats a whole another story
that thing is so sensitive to even the smallest amount of noise
some people do drum models or instrument rvc models
the author of rvc actually did some too 
because hifi can just do that
yeah cvec is old
you're about just right, rvc actually only tries to reproduce tones and mel spectrograms, it's not really a voice cloning ai
thats why the model sounds very flat when trying to clone super expressive audio
because it doesnt try to reproduce that, it's only trying to reproduce the pitch changes
and also why instrument models works
nop! cvec is the only embedder that works consistently good for training
(mainline also uses cvec, it's named hubert because thats the fairseq version of contentvec, but they're the same thing)
if you plan to use ai for realtime voice changing purposes i'd use a dataset of 1 hour minimum
10 mins is recommended as bare minimum due to that consistency thing i mention before
yup
well, the more data you add, the more natural the model is gonna sound, regardless of where you use it
rvc is a gan based ai, gans needs tons of info to give consistent good results
diffusion models can get away with tiny datasets
this, not

i mean you can train 5 mins or even less than that
it'll be good? idk, its random, not consistent
but yeah the rule for realism in rvc is always adding diverse data
i'd say 30 mins~1 hour is when things become more natural, obviously more than that it's better, but you get the idea
is it normal for it to be using 25-30% of my gpu utilization on a 5080? not complaining cause its not a problem just curious
overtraining detector is inaccurate, things fluctuate in training and thats normal, don't enable it ever again
the only way to tell a model is overtraining is to hear the epochs
overtraining = robotic metallic/static sound all over the place, glitching/artifacting, bad sibilants
i'd train 200e and save every 10
that is also inaccurate, the only way to tell which epoch is the best is by hearing them
the graphs in rvc/vits are meant to be used to monitor stability
On applio, why does it only give me an index file when I train a model but not a pth file?
i need help
Hey, my voice changer is not working today even though everything is calibrated as always, nothing has changed, it was working yesterday, any fix?
anyone?
hey i need help installing the voice changer
same
anyone this?
i need help installing the voice changer
anyone know why what i heard is different from what people actually hear? i still have no idea why the quality is way too different
in realtime? multiple factors really: your friends headphones, you're probably lagging, discord's huge audio compression, not disabling discord's inbuilt realtime filters, your internet connection, etc
i pretty much disable everything on discord. and my connection is pretty nice too. about the audio compression, is there anyway to do something about it ?
nope thats how discord works
it has to run in mobile devices so everything is giga compressed 
not even nitro can save me ?
what settings should i use for the e-girl voices?
nop
then what can i do about my end
ask discord support to increase the audio bitrate for calls 

anyone this?
yoo hello?
dang it, it's pretty smooth on my side but they always say my voice is trash af. so sad
this cannot be worth it
they would lower the upload limit to 0 mg
and say to buy nitro
dang it
i thought nitro can help me a bit
i guess i need to find a way to lower my output
Hello I was just wondering if a rtx 2060 and intel I5 9400 is good enough to run the ai voice changer
Im buying a new pc so thats why
I can help u both @delicate nimbus @normal mesa
which is the best for removing echo/reverb in the uvr huggingface space
Hello, sorry for the inconvenience and I'm sorry but English is not my first language in case I wrote something wrong, but I was trying to make a Lora using Kohya in Pinokio and I already had my image folder, I put all the parameters and the routes and when I put print command or something like that or I put start the process basically it doesn't appear or I get an error I think it appears in the console Pinokio, but I don't know how to fix it, I've already tried everything
Google RVC colab for model creation is still not working?
hey everyone! i need help separating an audio file into 3 specific stems, anyone know a good way to do this or which model to use?
i want to split the audio like this:
- instrumental and singing vocals only (no sound effects or dialogue mixed in)
- sound effects / sfx
- spoken dialogue
i’m trying to make the soundtrack of a movie myself, because netflix (the company that owns and streams the movie) didn’t release the soundtrack in other languages, only in english.
i live in brazil and i loved the songs in the dubbed version, but most of them are interrupted and just continue on other part of the movie or have dialogue and sfx on it
to make the soundtrack, i have to combine parts of the movie to get the full songs, and remove dialogue and sound effects, i need help doing this
with a proper channel separation, the spoken dialogue should be in the center
separate 5.1 audio into channels, pick the center for use
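If you'd rather script the "pick the center" step than do it by hand, something like this could work. It's a minimal sketch that assumes a 16-bit PCM 5.1 WAV with the usual channel order (front left, front right, center, LFE, then the two surrounds), so the center is index 2. The helper names are made up, and depending on your Python version the standard `wave` module may refuse some multichannel WAV headers.

```python
import array
import wave

CENTER = 2  # assumed 5.1 WAV order: FL, FR, FC, LFE, then the two surrounds

def pick_channel(frames: bytes, n_channels: int, index: int) -> bytes:
    """Return one channel from interleaved 16-bit PCM frames."""
    samples = array.array("h")
    samples.frombytes(frames)
    # Interleaved layout: every n_channels-th sample belongs to the same channel
    return samples[index::n_channels].tobytes()

def export_center(src_path: str, dst_path: str) -> None:
    """Write the center channel of a 5.1 WAV to a mono WAV file."""
    with wave.open(src_path, "rb") as src:
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = src.getnchannels()
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(pick_channel(frames, n_channels, CENTER))
```

Usage would be along the lines of `export_center("movie_51.wav", "center.wav")` (both file names are placeholders). The mono result is what you'd then listen to, or feed into a separation model if music or sfx are still mixed in with the dialogue.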
guys is what is
Pipeline is not initialized. in cmd when I start the voice changer
load a model?
making a clean soundtrack from a dubbed movie sounds like a cool project. Separating audio into those three stems is tricky but doable with the right tools.
For vocals and instrumental, tools like Spleeter or Demucs can help separate singing and music. For sound effects and dialogue, it gets more complicated, but you might try voice activity detection or models trained on speech and SFX separation
audacity recognizes 5.1 wavs, so no issue with separation there
Audacity handling 5.1 WAV files means you can work with multi-channel audio to isolate parts better. Combining that with source separation tools should help you get those stems cleaner
So can you do this in the Audacity program? How? (Sorry for so many questions, I know this should be very simple to do but I don't know much about these programs that separate stems, and such, I only know basic websites that separate vocals and instruments) @simple ore
Can you send me a step by step on my dm on how to do this if it's not too much trouble?

no, you can pull that center track, export it, then see what is there (anything else mixed in), then separate that if needed
if not, just replace the track with a translation
Noobies r u a real person or an AI?
you can also look at how some video files come with multiple audio tracks for different languages
I’ll give you a step by step if you want?
Yes pls
I couldn't find the colab so I tried to do it locally, I already have Kohya and the interface but it doesn't work when I try to start the process
Ngl, do your own research when it comes to computers, you need to find something that suits you.
Sent
Thx
still using spleeter & demucs in 2025?
you should try BS/melroformer models
as they are transformer based arch
Depends on how skilled and how much resources they have
demucs is simple
you should try roformer models in https://mvsep.com/
then compare with the demucs one
you'll be surprised on how clean they are compared to demucs
voc_fv4 
guys is there any recent guide on RVC for AMD :,) im kinda having a hard time installing it
Demucs is an old stem separation from 2021. Today, UVR5 provides some models with similar functions of demucs, 4 track stems.
try following the Applio installation guide for AMD
? this is js ROCm
refer to the Zluda one
AMD Ryzen or Radeon RX? Applio would work with any CPU, even AMD Ryzen. But if it's Radeon RX, there's Applio with Zluda.
6700xt, cpu slow asf (not mine but js in general)
6700 XT should be fine with Zluda
Yes, inferencing an AI program with only CPU would sure be slower than GPU. 
Hmm I’ll give it a try sometime
erm yeah wheres this folder supposed to be
Right now my main focus is AGI
fk it this is made so stupidly
you should know that transformer-based models are superior to the traditional CNN ones, including spleeter and demucs (cmiiw)
im js gonna wait till my new GPU arrives
Yeah, what do you think about AGI? Any thought you want to share?
what chunk and extra should i use for fl studio rtx 3080
what fl studio?
2024
nah I don't think fl studio has those chunk & extra
if you mean voice changer...
it's not recommended to use in realtime DAW
i mean im using vb audio cable to connect Deiteris to my fl studio
just messing around
this is better alternative to your vb cable
preciate it
What channel can I showcase my ai music?
Is it harder to run on a amd
how come when i use it on macos it barely works
because you need a gpu
integrated graphics aint gonna cut it, even if its apple silicon
if it is intel mac/older, you're cooked
moreover they have also reached end of support
those models may have been trained using different rvc
and it kinda treats "40k" ≠ "40000"
so if im using m1 chip then its gonna be shit
nah M1 is still supported
only that the newer ones are typically faster
dang
why does my shit sound like a fucking robot
then how come the voice changer is only changing some of what im saying and when it changes it sounds like shit and chunky
iirc however it doesn't support 32k models
tortoise is "zero"-shot, which requires a small voice sample and no training, tortoise sucks
fish speech s1 model supposed to have all those director's tags
it is not complicated, but I did not like this s1 model
without director's tags it is weird
not much different for windows
clone the repo
make a virtual environment
pip install -e .
next command would be pip install torch==2.7.1 torchaudio==2.7.1 --upgrade --index-url https://download.pytorch.org/whl/cu128 to install cuda torch
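After the torch install command it's worth checking that the CUDA build actually landed in the virtual environment. A small sketch for that, to be run with the venv's python; `describe_torch` is a made-up helper name, and it only reports what it finds.

```python
def describe_torch() -> str:
    """Summarise the installed PyTorch build, or say that it is missing."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    cuda_build = torch.version.cuda  # None on CPU-only builds
    available = torch.cuda.is_available()
    return f"torch {torch.__version__}, built for CUDA {cuda_build}, GPU visible: {available}"

print(describe_torch())
```

If it reports a CPU-only build (CUDA shown as None) or no visible GPU, the cu128 wheel probably didn't get installed into the environment you're running from.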
delete what
the voice changer
delete the folder, uninstall vb cable
how do I know if my problem is in the app or in python and coding
if you downloaded the prebuilt package, it does not require separate python
select something
how can I download it
i selected it
this one?
yes
i will try it thx
didnt work either
read inference.md
on your pic I can see the vb cable is on, try to see if it's on but the name changed
dont use vb cable
why?
im using this
that was not for you, you're fine
when you speak is there any voice that appear
nope
you'll need an audio file with a reference voice and a transcription of it as a text
dont know why vcb didnt pickup any sound
does the app work in the first place ??
yea, it worked fine yesterday
no , I mean when you start the voice changer do you see the cmd window does it say something like warming up or these things
now
start the model
when you talk the value of nums change
this one? nope
the app does not work
how can i fix it
3.11 or 3.12 should be fine
bro, sad
if I knew I would have told you
bc you was selected server
wdym
ye but before
it was like this
it used to work like this
@acoustic scarab
@long forge
pick the device in server mode
help me please
@simple ore bro
bro can you see this
and after picking, where to i click? just start it?
there's a guide, you know
did you load the model?
yea its the default one
why
You're using the outdated W-Okada. What is your PC GPU? I don't see the GPU section being set to a GPU.
there shouldn't be issue like yours
Deiteris' fork, the better W-Okada, is supposed to look like mine.
1050 gtx ti💀💀
Yes, ok, that's the minimum. You can still run W-Okada with this GPU, but it will be slow when running with a graphic game.
i know
is this a gay section
how can I run it
No, this is not an LGBTQ+ channel.
What run?
sorry for confusing yall for gays
the app
just use your own voice that god gave you 🙏😭
W-Okada program overall size isn't supposed to be as big as 50GB.

oh..
I told you what to download... why are you still using the 1.5.x version
Mate, Deiteris' W-Okada doesn't need a separate Python install. You extract it, you run the program directly, and that's it. A reminder that 1.5.x.x (the original W-Okada) and b2332 (fork W-Okada) are not the same W-Okada.
do the colabs work
You switched up too fast.
Using fork W-Okada Colab with free tier can get your Google account terminated from using the service. There's another way to run W-Okada online with limited GPU usage, there's Kaggle.
That's a shame. I got 120Mbps internet. 
That's still slow. You can try connecting to another ethernet or Wi-Fi if available.
how can I run it
That question is too simple. There's Kaggle link in the guide. On Kaggle, you'll have to register with your phone number to use the service. And you'll have to register an ngrok account for a "key" to be used on Kaggle. https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/#online-alternatives-colabkaggle
Has anybody gotten these effects to work with w-okada? https://docs.aihub.gg/rvc-voice-changer/realism/ I followed the tutorial but it doesn't work....
make it a single line
and also add venv\scripts\
before python
then you did not download the model
2060 or 1660ti Which one is better when used with Real time voice changer program
NVIDIA GeForce RTX 2060 is better than GTX 1660 Ti.
huggingface-cli download fishaudio/openaudio-s1-mini --local-dir checkpoints/openaudio-s1-mini ?
if you're having trouble with that, you can just manually download the model from huggingface
How much difference is there?
the GTX one is lackluster in AI applications
due to lacking tensor cores
you need to go to hugging face and sign up to use the model
you've started an API server... not the actual GUI conversion
dont break his dreams he was happy 💔
/ jk lol
yeah, okay
https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/#deleting-models
and you shouldn't rename via the ui, you need to rename the files and then re-upload the slot
i still cant find the problem
tyty, i see. I did do that manually but its a bit annoying tho. ty tho
yw, need any other help like checking ur settings?
so far i dont think so but thank you. I'm being able to manage the voices easily besides phone guy that somehow sounds weird lol
alr, that can depend on the voice model
s/o pls help, my vac doesnt pick any sound
This is a General AI Server, we won't be focused on voices anymore
Elaborate:
- your PC GPU
- your operating system
- what you want to do
- what tutorial link are you using
- a screenshot of the program
- Rtx 4060
- Windows 11
- i was using RVC normally till yesterday and then VAC didnt pick up my mic anymore
- https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/
please help
it's running
are u using opera gx?
i'm using chrome
is line 1 the input of whatever other program you're gonna use wokada in?
it happened after i updated my win today
the line 1 is the VAC and input is my direct mic
no, you should put line 1 as the input in the other program, like roblox or discord
VAC doesnt pick any sound from my mic
yes
i think the windows update is messing it up somehow
cuz im using RVC for a while now
check
is the mic actually enabled?
yes
so if you run 'voice recorder' does it record anything?
it doesnt record any sound
regular windows 'voice recorder' app
record my direct mic? yes. RVC? no
i'm confused cuz RVC ran perfectly before the win update
it works now, but what's the catch of this?
this does not work in server mode
if you use nvidia card, you can use Nvidia Broadcast app
mic -> broadcast -> rvc -> discord
which one should i config with this thing?
thank you ❤️
anyone know what size r the images for the ai's thumbnail/icon or wtv
check browser mic access
Hey I have a question.
What voice does Duckus use in his "AI girl voice changer" vids? I remember I think it's ||Amanda Silvera||, but can someone give me a source for the voice file? it's not in the server libraries.
She doesn't want her voice cloned
She will take legal action if you do
How would you know?
She made a post about it and actually took legal action on someone
Mhm
I still have the file though
Don't blame us if you get into legal trouble
dw i wont post it anywhere or something
mkay
I'll blame Duckus instead
/j
anyways, thanks! I won't look into it anymore now that I know the consequences
Does anyone know a model that separates individual stuff like the hi hats from a beat like in uvr?
Or an uvr model that does that
When I download voice.ai I get a popup that I cant click away. Do I HAVE to pay or is there a method without paying
There are some models I used but they didn't work well
how do you know if a model is onnx
hm,
just discovered I can use Voicemod's effects with Okada if i tweak things, cool
it is a scam