#🧬│ai-chat
HII
Ayo? @elder willow level 2 !!! 
Anyone from Brazil?
Are you using the webui
i'm using vst3
i guess vst3 have better control and speed
VST3 plugins are not GPU accelerated unless it's a GPU audio plugin. And are you using Beatrice v2?
The vst3 is only for inference not creating

Just use mvsep bs reformer viperx
but i'm using vst3 for inference
vst3 is not for training, what you mean?
Then what are you using for training
beatrice trainer
Hello epta
anybody here know any programs similar to Wav2Lip for easy deepfake lip syncing using the voice of the person in the original video? Wav2Lip has some issues when I get to the last step
Ayo? @elder willow level 3 !!! 
Heyy I am f please
Interesting
what the fuck is even happening in this channel dude
Idk
Ayo? @grand relic level 2 !!! 
So the webui version
Facefusion is good for free but still not as good compared to the paid ways
AI music isn't the real art, it is a form of AI art. 
I see, what would be the best paid versions?
Talkingavatar.ai https://www.talkingavatar.ai/r/HZNCL5RO is the best I have used. It's overpriced but good. For free open source, go with FaceFusion.
Awesome thanks
Ayo? @brave holly level 1 !!! 
hiya
Hey guys, are we allowed to share prompts, new here. Looking to share some prompt and/or learn some new ones
hiii
No
Then you are not doing it the correct way then
🤣
Because the colab way is not as good and if you have a Nvidia gpu then local webui way is the best
I will give it a try, but I haven't seen many changes from the original Beatrice trainer. He says that it runs on 3rd gen Nvidia cards, but he set the batch size to 32. The maximum I got to run well was 10; at 12 it starts spilling into shared memory. It's weird for it to be set so high, but I will give it a try and see what happens.
🔥 💳
his trainer is based on v2 beta 0 which is old
п
He uses an rtx 4090, so he was able to set the batch size to 32
hi every one! i'm new in there
Is it
yeah, it was updated four days ago
nvidia 4090s aint it
Yeah but also colab is older
Why
I thought he uses 4080
i just dont think they're that good compared to other alternatives
What alternatives
the colab version pulls the latest version from Hugging Face. I tested the colab version yesterday, and it uses the latest Beatrice code.
Okay but colab is not good local is the way is there any local webui with latest
I don't think so. People don't even bother with Beatrice
Yeah because they don't like switching even when something might be better
I would do it if they included more languages in the pretrained model
hello
Mind if I self promo? ✅ Walker Extreme yo
No only in #1159290752195633273
Talkingavatar seems to be super new? I can hardly find any info about it online besides from them
.
Yeah it is super new and not that known I have examples I have made
is it allowed here to post telegram links?
No
Hi guys, can someone suggest an AI voice for a journalism YouTube channel?
Sure would be curious to see them
Dm'd you all them
better avoid posting possibly illegal links
he left 😭
bro joined just for promo
telegram piracy 🙏
Anyone else having problems with the Applio Colab (NoUI) and Pytorch?
What’s ur pc gpu?
It’s on Google Colab, my PC GPU shouldn’t matter, I think
It just started happening yesterday
yes it doesn’t as its cloud, but many people got a good pc and don’t know they can do it locally
so that’s why im asking that first
there are people with an rtx 4070 using colab, and people with a miserable integrated graphics trying to do it locally 😭
integrated gpus suck
yeah but im not talking about doing this with an igpu practically
I think it would be just using the cpu
I got an intel cpu so igpu
intel igpus suck but i dont think ryzen igpus suck as much
still doesn’t have cuda cores, and im not sure if zluda could work with igpu
@chilly lake would zluda work for ryzen igpu?
im not really thinking of doing it but im just wondering out of curiosity since ryzen igpus are not super super super terrible
100% u can use cpu
but they are not super super good either
whats the new AI hub doc link
yo
would any1 like a bsky code
what is that
bluesky
meaning
send
There is a video that remixes the existing MR with just tags, does anyone know what AI it is?
@covert lake
Please don't ping me much
I dunno about that AI unfortunately
I have a link to the video on YouTube, so would it be helpful to show it to you?
sorry
It's fine dw
From what I understood it's about remixing music, but I fr don't know any ai that does that
I understand, but thank you for your help
Wish you good luck to find one, only I know is https://replicate.com/sakemin/musicgen-remixer (paid only) and idk if Suno maybe does that
But never tried both
Oh, this is a big piece of information, thank you
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
- Italian Guides by Ilaria
You're welcome
It's not in the docs
Well..not yet since...
how to make pictures with ai
lol no
nobody was crazy enough to compile libraries for it
ugh... i was wrong.. someone was, but would not recommend
Uh where is useful-links?
infer would be fine, but training very unlikely
Lmfao
i'ma gonna test it
Let's just talk in the help forum post u made, no need to ping me about the same thing twice in 2 channels
"C:\Users\Administrator" - dont, use a normal user
how do i fix the error?
1
nope, the library set for igpu does not work
1
Rip then
Are AMD CPUs really better btw?
Tbh i rather use colabs
it is a choice.. paying for max performance at max power consumption vs lil less for much less. Also DDR5 is better on Intel, better memory controller, usually can use 4 sticks of RAM without issues
but again 13k and 14k have the degradation and corruption issue, probably fixed by now
Damn, I didn't know that
if you had those 13900k for some time, they are probably fked
I see
okay ! killing myself
I got the tools, now I just need a better computer lol
Making clean audio for the model training is gonna be delish
arrow lake incoming
Who? standard trainer is beta.2
hello
I plan to post overwatch full characters today, idk if i will
what are you gonna do with it without a motherboard?
I'm not sure if Z890 mobos should be launched together with the cpu
thats nice but i wonder if people will start to train other model on beatrice
I can train any model for you for a few bucks
im gonna try out weights.gg model training!!! what's the minimum duration for the regular training?
-docs
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
- Italian Guides by Ilaria
is there a channel that allows sending voice messages here?


U need to get level 5 for those perms
RuntimeError('Repository root is not found.')
hi! does anyone know if nova app is legit or n
im gay
Ayo? @dim mural level 1 !!! 
real 👅
hi
Imagine playing a gacha game at 3am, and suddenly one of Kendrick Lamar's songs plays. 


i need him
Ellen Joe Blahaj. 
I have a request for the player of the fire emblem series. I have a request for a word from Lyn's voice lines, does anyone know the words and can do it correctly?
Does anyone know someone who wants to hire an editor?
Maybe in #1159290752195633273 would be the right channel
Tysm
hello there general chat!
@orchid slate Hi, can you please help with the setup and with the ai voice version?
Hello 👋
Have you downloaded RVC? And do you have a GPU?
well, I have a video card, just which version to download 1.5.3.1 or v.2.0.65-beta
+how can you set up your voice well so that it is not robotic and does not lag
Are you using real time voice changer
yes
Ayo? @mint nymph level 1 !!! 
Which GPU do you have.
GeForce RTX 2060 super
⠀
Settings for Nvidia GPUs :nvidiagpu:
F0 Det.: rmvpe (suggested for all series)
RTX 40-series: 80-96 chunk | +16384 extra
RTX 30-series: 96-112 chunk | +16384 extra
RTX 20-series: 112-128 chunk | +16384 extra
GTX 16-series: 128-192 chunk | +8192 extra
GTX 10-series: 128-192 chunk | +8192 extra
Advanced Settings
Protocol : Sio or Rest
Crossfade: 4096 start 0.2 end 0.8
Trancate: 300
Silencefront: Off
Protect: 0.5
RVC Quality: Low
⠀
Follow the RTX 20-Series
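The settings card above can be read as a lookup table. A minimal sketch, assuming the card's values are current; the `suggest_settings` helper and its regex are made up for illustration and are not part of the voice changer:

```python
import re

# Values copied from the settings card above: (min_chunk, max_chunk, extra)
SETTINGS_BY_SERIES = {
    "RTX 40": (80, 96, 16384),
    "RTX 30": (96, 112, 16384),
    "RTX 20": (112, 128, 16384),
    "GTX 16": (128, 192, 8192),
    "GTX 10": (128, 192, 8192),
}


def suggest_settings(gpu_name):
    """Return (min_chunk, max_chunk, extra) for an Nvidia GPU name, or None."""
    # Match e.g. "RTX 2060" or "GTX 1660"; the first two digits give the series.
    match = re.search(r"\b(RTX|GTX)\s*(\d{2})\d{2}(?!\d)", gpu_name, re.IGNORECASE)
    if match is None:
        return None
    key = f"{match.group(1).upper()} {match.group(2)}"
    return SETTINGS_BY_SERIES.get(key)


print(suggest_settings("GeForce RTX 2060 super"))
```

Names that do not look like a four-digit RTX/GTX model return `None`, so the card's ranges are never guessed for GPUs it does not list.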
can I write to you in the bos? I'll send you a screenshot
kinda weird how the server is called "AI hub", yet theres no llm channels
Ayo? @pseudo salmon level 1 !!! 
Bos ?
what? I just have a strange version and there are no such chunks, there are 2400, 3840, etc.
which version should I use?
We work on voice-to-voice conversion, we have some helpers for Stable Diffusion, and we also cover FaceFusion. That's why it's called AI Hub: it has many artificial intelligence channels.
Use The Fork
Don't know about this GPU.
What does this mean, I'm not good at English
⠀
Download for Nvidia GPUs :nvidiagpu:
Version 18a cuda
Download for AMD GPUs :amdgpu:
Version 18a directml
Download for Intel GPUs :intelgpu:
Version 18a directml
Download for Mac :macgpu:
Version 17b Mac
⠀
sps, I just used the 2nd generation version
Basically the fork is a modified version of the original program that gives the best quality on low specs
Have you downloaded it from the following link ?
Now yes
Ayo? @mint nymph level 2 !!! 
Maybe your GPU is not powerful enough for it. I don't know much about your GPU
-realtime
This interaction has expired, use the command
/guides if you wish to see it again.
Download first one @mint nymph
Ayo? @unborn jacinth level 1 !!! 
It's good to ask in #🔍│help-w-okada
Well, I downloaded the topmost one
ok
That's not great. Use fork instead
Would you guys recommend Okada or RVC?
And where can I get it? I thought it was a fork
Basically RVC is a program used for voice-to-voice conversion, and Okada is a live version of it. You can do voice-to-voice conversion in real time using Okada. RVC is the program, and Okada works on the principle of RVC.
I sent you the link
Hey, I have a question for you guys.
Yes please ask
can you create an generative AI from If - else statements alone?
noted, i appreciate it
@mint nymph the first one (fork)
Yes, I've already found it.
deiteris-Fork
/
voice-changer-windows-nvidia-b2309.zip I'm rocking this one
No. It's a very complex process to create an AI model. If-else statements alone are not enough; you have to use libraries and frameworks such as TensorFlow.
Basically if else is a kind of decision making statements. You can use it to make decisions.
For example
If value == 1
Turn the light on
Else
Turn the light off
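The light example above, written as runnable Python. This is only a sketch of a plain if/else decision; `light_command` is a made-up name:

```python
def light_command(value):
    """Pick an action with a plain if/else decision."""
    if value == 1:
        return "turn the light on"
    else:
        return "turn the light off"


print(light_command(1))
print(light_command(0))
```

Every branch is fixed when the code is written, which is the "rigid" part mentioned in the chat: nothing here is learned from data.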
Yes it's correct. Follow the guide. And if you have any trouble use #🔍│help-w-okada
This is only for general chat
❤️
I'm aware it's solely decision-based and rigid, and that's why I'm going lock myself in a room and drink lots of coffee and figure out how to do it. I like making impossibilities into possibilities.
Ayo? @half gust level 1 !!! 
Great. I recommend you learn Python to create your own AI stuff. It's a very long process, so work on it day by day. You can't achieve it in one day. I appreciate your confidence. Good luck buddy
I've already started learning Python for months now, and by the way, I'm doing this to make AI(s) much easier to grasp by beginners and also because I hate calculus (not Maths, just calculus).
I'm not a developer but I hope you will become one
that's not really how it works, if u wanna check the big messy code u can see https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/
And yea RVC code is pretty messy and left to rot, still the best tool for speech to speech models tho
the creator is working on GPT-SoVITS instead of rvc so rip
For learning python u could check https://www.w3schools.com/python/
For learning about AI coding (with python) u could check the HuggingFace (biggest ai platform ever) docs https://huggingface.co/docs. rvc itself is a VITS-based model built on PyTorch tho, it doesn't use Diffusers like Stable Diffusion does
No one has found any quality improvement in RVC for a year, and I can assure you its code is spaghetti held together by duct tape lol
This is also known as mainline, the original project
there's also forks (modified versions) like Applio which are slightly better & faster https://github.com/IAHispano/Applio (still no quality change and ofc the models are interchangeables)
Haha. Enough nick. He got his answer already.
well, yea it's mostly just RVC & Wokada
has an help channel for ai art too tho
but mostly Speech To Speech conversion
Why are you replying same questions which I have answered? ?
oh lol, just wanted to be specific
🤕
wtf discord changed the notification sound
my bad i was just scrolling the channel
Never listened that 🤣
were u thinking of being an helper?
Can I be ?
i mean u gotta /jointeam
hii
i asked as u seemed to wanted to help people
And someone told me they will check my activity in server.
oh cool
wish u goodluck that u get it then
i just answered to be more specific and that there aren't llm channels here unfortunately
Ya I love to do that but in my free time. I'm student. And don’t have much time for it. Whenever I'm free I love to help individuals
Hello
Understandable, i'm a student too
We should find few helpers for LLMs that will make this server more useful
Agreed,
I just meant to say that most stuff here is just rvc & wokada as u can see
Well I have mentioned facefusion too. I remember you are the only one who helps with facefusion
yea i appreciate that, unfortunately i can help only for cloud as i got a bad pc
Your specs ?
cpu =i3 10th gen
ram = 8gb
gpu = integrated graphics
Decent for some gaming but shitty for AI
See the facefusion I have mentioned.
plus i got only 256gb of storage 
Anyone knows how to install Tensorboard in the Vanilla RVC?
Oh no. I recently upgraded to Intel i7 12700k
32 gb ddr 5 RAM
2TB ssd storage and a RTX 4060
I mean i could do CPU inference locally (tried fastsdcpu sometimes) but i don't wanna wait hours 
tensorboard can be installed independently of RVC
and then you just run it and points to a log folder from command line
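A minimal sketch of the steps described above, assuming TensorBoard was installed separately (for example with `pip install tensorboard`). The `--logdir` and `--port` options are TensorBoard's own; the helper names and the `logs/my_model` path are placeholders:

```python
import shutil
import subprocess


def tensorboard_command(log_dir, port=6006):
    """Build the argv for launching TensorBoard on a log folder."""
    return ["tensorboard", "--logdir", str(log_dir), "--port", str(port)]


def launch_tensorboard(log_dir):
    """Start TensorBoard if it is on PATH; return the process or None."""
    if shutil.which("tensorboard") is None:
        print("tensorboard not found; install it with: pip install tensorboard")
        return None
    return subprocess.Popen(tensorboard_command(log_dir))


# "logs/my_model" is a placeholder; point it at your own training log folder.
print(" ".join(tensorboard_command("logs/my_model")))
```

Calling `launch_tensorboard("logs/my_model")` starts the server in the background; it can then be opened in a browser on the chosen port.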
damn u lucky fr
my pc is good atleast for some gaming tho
i mostly play indies so
Like portal, undertale, some emulation (3ds)
fortunately i can do that atleast
I earned. I saved and I make it.
Ye im glad for u
If you need any help or want to use my resources. You are free to do that 😊
Can I send you friend req ?
Got it
Ayo? @ebon ravine level 4 !!! 
Use your resources? wow i honestly thought u were mad at me bc i replied to the same question u did 😭
sure lol
No I respect my seniors and you have more knowledge than me
Ayo? @queen kernel level 27 !!! 
Dunno about using someone's resources as i kinda feel bad about that but ofc u can send me a friend req
How old are u btw?
Cus im just 15
i thought u were 16
Honestly, I thought you are 23. But I'm shocked
damn do i sound old 😭
Don't want to reveal in server. Lemme DM you
Understood
damn, i've joined a kindergarden it seems
Most people here are minors
Are u 20 or 30
I mean i would have guessed that tbh
But yes most people here are just 14 or smt
most staff is minor too iirc
What about admins. Tea ? I guess ilaria is 27.
Dunno their age but surely older more than 18
I dunno ilaria's precise age but should be close to what u said
Check her profile. You will get it
Oh 25 yea
that's my professional programming experience lol
get off my lawn, junior

I knew u know lot of shit of coding
but didn't think that much
not that old
well, welcome here
btw W portal pfp
i just finished portal 1
how much disk space should i have for ai cover programs?
what's different? does it have an option to save .wav as file in firefox? 🙂
dunno about that as i'm on chrome, but it uses SSR instead of SPA
meaning way faster loading
and ofc they changed the default theme colors a bit
someone PRed me about it https://huggingface.co/spaces/Nick088/Audio-SR/discussions/3 and tbh doesn't look as bad as u showed me in that screenshot
u might wanna retry on applio
its not a dev version anymore, its officially out so
there's a Gradio 5 update coming, I think
how to make ai covers and how much space i need?
Ayo? @loud wraith level 1 !!! 
Oh wait i might have confused u with @sturdy elbow #✦│chat message
my bad, confused u with someone else
we are on the same team
oh ye u work on Applio too right?
!aicover
cool
yeah
⠀
Google Colabs :colab:
⠀
AICoverGen-WebUI
Useful for making quick covers, by Hina.
AICoverGen-NoWebUI
Useful for making covers, doesn't include a UI, by Ardha, by Eddy, Hina and Gdr.
RVC Disconnected
To train new voice models, by Kit Lemonfoot.
EasyGUI
The OG interface, by Rejects.
⠀
damn
what's your pc GPU first of all?
You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
NVIDIA GeForce RTX 3050 Laptop GPU
What's the GPU memory?
where i can check?
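Besides Task Manager, on Nvidia cards the name and total memory can be read from `nvidia-smi`, which ships with the driver. A hedged sketch; the helper names are made up, and the list comes back empty when the tool is missing:

```python
import shutil
import subprocess


def parse_gpu_line(line):
    """Split one 'name, memory' CSV line from nvidia-smi into (name, MiB)."""
    name, memory = [part.strip() for part in line.rsplit(",", 1)]
    value = memory.split()[0]
    return name, int(value) if value.isdigit() else None


def list_nvidia_gpus():
    """Return [(name, total_memory_mib)] or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return [parse_gpu_line(l) for l in result.stdout.splitlines() if l.strip()]


print(list_nvidia_gpus())
```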
i cannot post screen
U can only in help channels bc ur level 1
Do it #✨│ai-help
i did it i even pinged you
what does RVC stand for?
some think it is realtime voice changer... but it is not, it is actually Retrieval-based Voice Conversion

Ayo? @elder willow level 2 !!! 
hello
Ayo? @wispy dock level 1 !!! 
hi
@earnest dragon and @polar flax i have to agree with you about RVC and Beatrice, RVC is much better, but Beatrice is not trash, it's good for normal voices, more for female voices. Hope they improve Beatrice soon
hello huzz
do not call me huzz please
Ramattra from Overwatch 2
now i just need a RVC trainer CLI
to put a queue on characters
Ayo? @tired jasper level 11 !!! 
i have seen people complaining about multi voice is trash
i guess can be good if merge after solo train
Ayo? @quaint jewel level 1 !!! 
Cogvideo
Also pyramid flow
I tried to run locally but they are very heavy
Yet good quality
nice, thanks
im actually a noob at ai, anywhere i can learn it ?
or do u think exposure is my best friend ?
Can I dm you @tired jasper
Yes
You need at least Python language basic knowledge to go deep, but you can always follow tutorials about install GUI versions
then i am done, dont even know wtf is python lmao
wazaa
the beatrice drawback is the 24k sample rate which is less desirable by audio engineers (don't even rely on the imperfect AudioSR, Apollo, or other existing upscalers so far)
also would better discuss it in #🔊│ai-development
The problem in my opinion is not totally about sample rate, but the beatrice input is 16k, too low
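To put the two sample rates mentioned above in perspective: by the Nyquist limit, audio sampled at a given rate can only carry frequencies up to half that rate. A small sketch using the 24k output and 16k input figures quoted in the chat:

```python
def max_frequency_hz(sample_rate_hz):
    """Nyquist limit: the highest frequency a sample rate can represent."""
    return sample_rate_hz / 2


for label, rate in [("Beatrice output", 24_000), ("Beatrice input", 16_000)]:
    print(f"{label}: {rate} Hz sampling -> up to {max_frequency_hz(rate):.0f} Hz")
```

So a 16 kHz input carries nothing above 8 kHz, whatever the output rate is.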
there's RVC in comfyui though haven't tried it
How do I get rid of background vocals in UVR
With Demucs or manually removing noise with audacity
Alright thanks
Oh, sorry, I didn't see "UVR", but i guess you can get rid of anything that is not the main vocal with Demucs, it is also multi band, can separate multi instruments
This ain't no self-promotion - someone else uploaded this compilation! https://youtu.be/Yq-huwHv120?si=lbqQwn1-H3NZRTHg
can i find some tutorial for GUI at huggingface ?
I don't see a good reason for using it in ComfyUI
Maybe on youtube, huggingface is most for just store models
got it, thanks
There's a zillion demucs, which should I download on UVR @tired jasper
Idk, i use the original by command-line
Does anyone here know a good local TTS so I can clone voices and make x famous person speak, so he can say what I want?
try GPT-SoVITS and F5 TTS
And those programs, in addition to having the typical Asian languages and English, how many more languages do they have to be able to make TTs?
GPT-SoVITS v2 supports English, Chinese, Japanese, Korean and Cantonese while F5 supports English and Chinese.
wait it's installed but not in my list?
Ayo? @buoyant star level 3 !!! 
nvm i see it
for some languages that are not supported by common TTS you either have Edge TTS from MS Edge screen reader, or you need to train a TTS from scratch for your language
Good idea, I'm going to try it
there's a limit on speakers in the free Edge TTS though
well, there's of course 11Labs
No, that doesn't work for me, since that is based on an already created voice that is neutral, and then my voice model when making inference, sounds like the neutral voice, with the same accent, and what I am looking for is something more like my model having its own voice, like what ElevenLabs has, but local on my PC.
can someone recommend a realistic female voice
Any voice can be yours for few bucks
how many bucks we talking
20
i dmd you
Ayo? @keen jackal level 1 !!! 
hi guys
I l. realb video too ...
There are different Text To Speech (TTS) AIs:
GPT So Vits: RVC isn't as good as GPT So Vits for tts, but gpt so vits (few shot tts, which means needs just a lil training for models) can't use rvc models (and viceversa), and its only limited to: english, chinese & japanese, if you wanna check gpt so vits instead, read https://docs.aihub.wtf/tts/gpt-sovits/
Freemium 11labs: A easy way to do TTS is https://elevenlabs.io/, you can't use RVC model on this but its a mostly premium easy way for good quality TTS
FishSpeech: FishSpeech is a 0 shot (no explicit training needed) TTS, if you got a good pc you can use it locally else use their site
With RVC Models:
RVC is natively for Speech To Speech, but forks such as ilaria rvc mainline & applio have built in tts (using Microsoft Edge TTS to make a generated tts audio, which i suggest you to choose a tts model that is the same gender and language of the rvc model you wanna use, and then convert it with rvc)
If you wanna do tts locally with RVC Voice Models (if you got a good pc):
If you don't got a good pc you can do tts with RVC Voice Models on cloud:
-
Ilaria RVC Zero (Running on A100 GPU, free fastest rvc on cloud) and the guide
-
Use Applio UI Colab (with google colab T4 free daily limit gpu)
-
if you don't wanna use edge tts, you could try another tts ai from our tts index and use the output as an input in rvc
If you want actually good tts, use a tts program like gpt so vits or fish speech 1.4
RVC is made for Speech To Speech, no tts
I'm interested in fish-speech, but isn't there an easy way to install it or do I have to put code in Python?
what’s ur pc gpu btw?
u should be able to do it following https://speech.fish.audio/
Targeting SOTA TTS solutions.
whats?
i did a typo, but i meant to ask what’s ur pc gpu
You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
RTX 3070TI 8vram
yea that’s good enough
try following this guide
how do i make an ai cover
It's a shame that all TTS applications require Python code and there is no executable yet.
- Make ai cover
- Upload it to youtube
- RIAA comes to ur house

Well actually it’s pretty normal for almost every single AI project, unfortunately not everything can be 1 click
Anyways they already tell you the code to run so you don’t need any type of coding knowledge to run it
okay but what if i live in Rwanada do i have to follow the US laws
if you go into github, basically no ai project has an executable file
at most they have .bat or .sh files that just run the same commands that u should
Rwanada? When i search it shows a french toast 😭
How annoying to have to program to be able to use fish-speech
u mean Python?
well, its an Open Source AI program so
its not that bad, you just have to copy paste code
Tbh, never seen an AI Open Source project with a .exe and don’t think you will ever see one either
Python or cython is the way to go exes are stupid for ai applications unless it's a whole program
Or github projects in general
It’s also because it’s easier also for security to read the code while for a .exe it would be hard, and .exe are technically runnable only on Windows (well you can still run them on linux with Wine, even if it could have some issues, but it’s better to just give the users the code to install it)
Ayo? @covert lake level 113 !!! 
never seen a .exe on github
I have before but on github there is always the source code usually
Oh lol
i think people should know more that almost no ai project has a .exe
Python 🙏
Cython is the best but not as much stuff uses it
Yea never used it either ngl
What programming languages do u know btw?
Really only basic python because most of the time someone else did the coding for me
ah its basically same python code, but it translates to C so it’s faster
i would be okay with it by just knowing python right?
oh understandable
Yeah so we need w-okada cython and applio cython
Im just a student but i know python & html
Probably
Yeah so someone should convert those 2 to cython instead of worrying about messing with rvc models optimizers and stuff
@chilly lake what do u think about this ? May u know more about cython
The difficult thing is that rvc is spaghetti
Yeah that's why applio and w-okada not default rvc
I already mentioned that python automatically compiles code with CPython since quite a while ago. You do not have to write CPython manually unless you are crazy enough to do that
https://peps.python.org/pep-3147/
huh python uses that by default?
At least since 2010
You can notice that when you run python code, it creates pycache folder with pyc files for each file (and for all its imports as well in their own locations)
That's compiled bytecode with CPython
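The `__pycache__` behaviour described above can be reproduced with the standard library. A minimal sketch: `py_compile` and `importlib.util.cache_from_source` are real stdlib functions, while `compile_and_locate` and the `hello.py` file are just for illustration:

```python
import importlib.util
import py_compile
import tempfile
from pathlib import Path


def compile_and_locate(source_path):
    """Byte-compile a .py file and return the path of the cached .pyc."""
    return Path(py_compile.compile(str(source_path), doraise=True))


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "hello.py"
    source.write_text("print('hi')\n")
    pyc = compile_and_locate(source)
    # PEP 3147 layout: __pycache__/<module>.<interpreter-tag>.pyc
    print(pyc.parent.name, pyc.name)
    # The interpreter computes the same location when it imports the module.
    print(pyc == Path(importlib.util.cache_from_source(str(source))))
```

The `.pyc` holds bytecode for the interpreter, not machine code, which is why it does not make the program itself run faster; it only skips the compile step on the next import.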
What just happened, I left for a second and they started a tremendous conversation
And there is no difference between Cython & CPython?
its a sign u have to learn python 
I should, but I don't want to, it's too much work.
how can i use instant dataset creator
im looking for a dataset creator that can filter out the vocals and cut silence and everything else
Oh, wait, cython. No, that's different, it actually translates python code into C code that is then compiled with C compiler. So it is as if you'd write a C/C++ library with bindings for python, but in a more convenient/python-specific way
There's alot of work that goes into restoration, adobe is coming out or came out with an updated ai tool for cleaning up audio, it's still far from perfect but it's not too bad
Ayo? @loud kiln level 1 !!! 
u could see w3school tuts tbh
i dont think there’s that, u have to use UVR to clean the dataset
ok
fair enough
So it’s better than CPython?
In terms of performance, yes, because code produced by Cython is not python, it's a pure C code. But you will tradeoff portability of such code since it has to be compiled separately with C compiler for each python version
from what i understood CPython compiles it in bytecode
While Cython actually translates it in C and compiles it right?
oh damn
Did you ever try it out?
it’s interesting
Personally didn't, but not sure if it will be an easy or useful rewrite in combination with torch. Even though rvc model code is python, it's compiled and optimized by torch already
And wrappers like realtime/offline inference mostly use numpy arrays/torch tensors which are, again, C code already
the better way to do is the guide like this https://rentry.co/RVC-dataset-RX11
hi
.wk xaviersobased
👑 adiiose - 5284 plays
2. #igrok - 4388 plays
3. remy - 3999 plays
4. exan - 3174 plays
5. luminal - 2823 plays
6. yeresin - 2555 plays
7. sqwab - 1990 plays
8. vice - 1808 plays
9. wg - 1346 plays
10. osc - 1191 plays
11. boshia - 1105 plays
12. Teckmek - 1032 plays
13. Markus - 1030 plays
14. buga - 1026 plays
f
you mean program to use for speech to speech models?
RVC
what’s ur pc gpu and what are u looking for?
Hello
hi
Yo
Do what?
hi
Nick do you know how to do this!!!!!
😭
i dont know why some directly ask how to do this
its not the first time i see smt like that
And then they disappear into the void forever
fr
Training all Overwatch2 tanks as single RVC model
Hello everyone, there is some kind of lightweight version for AMD video cards, in the version MMVCServerSIO_win_onnxdirectML-cuda_v.1.5.3.15 my voice is either normal or broken
i wanna know what realtime voice changer works best for me
ihave
15 440
16ram
rx 470 4gb
windows11
iwanna it simple like ui and this stuff and good
If you live in the US, check your chicken, it might need to be thrown out. There's a recall on a lot of frozen chicken brands relating to listeria
Frozen waffles and pancakes, and fish in some places, were also recalled.
(I'm sorry for the off topic just wanted to warn everyone)
maybe someone knows
its gonna sound like shit
i think you are talking about wokada, well there is a fork (modified version) which has better performance if that’s what ur looking for
-rt
1st link
Why?
Is RVC pretrained (170 voices as single model) like shit?
can you give me a stable lightweight version? I will be very grateful
ur gonna mix multiple voices into a single model
but that’s used as a pretrain, not a model
read the 1st guide
-rt
thank you
yw
Pretrained model is the same we use, bro
you’re talking about using multiple voices in one tho, u can’t like use each voice
I'm not mixing, they are in each id
Yes i can
iirc speaker id didn’t work, unless im wrong
That's 12 speakers in one model
I already tested, bro, they are all correct in the model
Just need to train more now
huh fr?
@chilly lake speaker id works all fine?
You used one of many voices from a mixed voice model? Where did you get this information from?
last time i heard of someone tryna use it i remember it should have been set to just 0 as it didn’t work
Not mixed, just changing speaker_id.
In webui? No. They are lazy to implement, that's why i don't like GUI
So is it like many voices packed in one model like a ZIP? Because I've never heard about it being used generally.
Background: I wanted to run ComfyUI with FLUX on Colab servers, so I installed all the required files like unet, clip, etc. When I clicked to generate the image, it took about 2 minutes and then the UI just shut down. Other models work normally, so I guess Colab's 12 GB of RAM isn't enough
So I decided to run this on Kaggle because it has ~32 GB of RAM. I rewrote the same code for Kaggle, but when I ran the last cell, it returned an error
Oh i was talking about UI, there is an option but doesnt work
are you using the rvc cli fork by blaise?
Rvc kinda supports multiple speakers, but I'm not sure anyone uses it. Wokada allows choosing speaker id and it should actually work
Yes, like a zip behavior, but the model size is the same as single voice
Yes
any ideas?
Last time i tried didn't work, but works in RVC webui, idk why yet
for Applio the model needs record of speaker IDs, then you can use them
Even blaise is broken, i had to fix many parts
Fork or original? I don't remember if I ported speaker choice properly from mainline but I never tested
Something doesn't exist in Torchvision? Have you tried reinstalling Torchvision?
Fork, changing speaker id doesn't change the voice
Then it might be that I either removed part of rvc code responsible for speaker choice or forgot to pass it to model
I can test again later, could be a false positive since I didn't try all voices
But VC kinda stuck with my model
changing speaker does change the voice, I tested on OG pretrain
So I'd just need to make a dataset with a folder for each speaker, train it and id be able to use that feature?
Maybe make sure you compare between fork and original to make sure which code was added and missing. 
If you have a working model, I can try fix this
yes
i really need to take a look and change the way the speakers are enabled on UI
What version you tested?
any, if you pass speaker id it works, default is 0
I'm wondering. Who will train anime twin voices into one single model? 
Ayo? @solar torrent level 13 !!! 
Yes, you need custom feature extractor and custom dataset preprocess as well
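A rough sketch of the bookkeeping side of that idea: one subfolder per speaker, each given an integer speaker id. The folder layout and `assign_speaker_ids` are assumptions for illustration only; this is not the RVC preprocessing or feature extraction code, which would still have to be adapted as noted above:

```python
import tempfile
from pathlib import Path

AUDIO_SUFFIXES = {".wav", ".flac"}


def assign_speaker_ids(dataset_root):
    """Map each speaker subfolder to an integer id and list its audio files."""
    root = Path(dataset_root)
    speakers = sorted(p for p in root.iterdir() if p.is_dir())
    table = {}
    for speaker_id, folder in enumerate(speakers):
        files = sorted(
            f for f in folder.rglob("*") if f.suffix.lower() in AUDIO_SUFFIXES
        )
        table[folder.name] = {"id": speaker_id, "files": files}
    return table


# Demo on a throwaway folder with two made-up speakers.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("speaker_a", "speaker_b"):
        (Path(tmp) / name).mkdir()
        (Path(tmp) / name / "clip_001.wav").write_bytes(b"")
    for name, info in assign_speaker_ids(tmp).items():
        print(name, info["id"], len(info["files"]))
```

Sorting the folder names keeps the ids stable between runs, as long as no speaker folder is added or renamed.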
Where to send it to you?
Idk if i trust
😂
extract your own then 🙂
can u share? would like to try
Yeah
awsm
yeah, help me out, damn it
It's not finished, also don't know what to do with it yet, i'm considering to sell or something
emb_g.weight torch.Size([109, 256]) torch.float16
this is a tensor with speakers - 109
id rather pay for a tutorial on how youve done the preprocessing lol
How would others know which ID is which voice btw
If you know Python you can just take a look at rvc-cli from blaise
install a compatible set - torch, torchvision, torchaudio
ill have a look thx man
I mean it should be fixed in Applio UI too
And ye Ik python, it's just I do things in cloud
you dont even need rvc-cli, just use env\python core.py infer
This is the failure, people need to attach a text file to model path. In beatrice they solved it using toml file
You can send a link in dm to google drive for example
I only heard about Applio once, idk what exactly it is
would this also mean that it requires less computing training several models at once?
This can potentially be solved by including id-name mapping in model metadata, but no one does that currently
I will send multipart zip ok?
Sure
RVC fork
Made by Blaise and the Applio team
Like Vidal too
Including a text file would make so it doesn't work when someone tries to download it directly on cloud tho
Theyd have to extract it on their PC and use the manual uploader
Doesn't it work in the web UI too?
it only works from UI if it was trained and recorded
since all models trained with og pretrain have the same tensor, they kinda all have all the original VCTK speakers
maybe need to delete this weight when training a new model
would work if the cloud read it, is just a reference index. Using Beatrice you can zip all the models and import zip in VC, it will load voice names.
you also can put index in model dict
but you would need to tell all people you are doing it
idk why they didn't implement it from the beginning
Any unnecessary files would make the downloader not download 
it's necessary when you have more than 5 voices in one model, but they didn't think about it in the beginning, now someone needs to create a convention
I understand, I'm just saying doing like that would just make more people confused as it wouldn't work
If i want to train a model for a specific language, do i have to fine-tune the pretrained model to get it to sound better? sry, i only started about 1 week ago
how do i delete models
It would have been better also if instead of the Id it could use a name
yeah, best way is to put it on model dict
Which language are you looking for
@icy pendant
idk even know why they did faiss index separately
Is there a way I could do that?
Btw, how's Beatrice going?
you need to code it
not so good, but not bad at all lol
the metallic noise is annoying
but not all voices have it
Overwrite the slot with a new model
Or go to model_dir delete the slot number
some you can work around with EQ and by adjusting the way you speak
So it's like a "cheaper" RVC as of rn?
ok thanks. btw whats okada?
Thai
no, it's good, but general quality is worse. I guess their aim is to make it faster for realtime, in this way beatrice is much superior than RVC
So a faster but less quality RVC yea
Voice Changer
The Okada is a newer voice changer that works just like RVC.
if they fix the metallic noise it will not be far from RVC to be honest.
whats the difference
Ayo? @mortal sail level 1 !!! 
Ah, you could try with the OG pretrain
You could also make a pretrain if you want but it's hard to do so https://discord.com/channels/1159260121998827560/1233407331405004954
Does Beatrice get regularly updated?
w-okada is the maintainer of the repo voice-changer, which is a tool to run RVC and Beatrice models
The difference being the resource usage and supports more models I think. 
idk, most people here don't even know that there is a new version: v2.beta2
i guess it updates slow
but it's not dead
Okada is recently maintained, while RVC is kinda outdated since its last update was last year. 
not that outdated, RVC v2 is pretty good, no need to update too soon

Go down, go down, go down, yeah
Ayo? @tired jasper level 12 !!! 
Getting graph like a slider. 
Where do I get the realtime voice changer
-rt
This interaction has expired, use the command /guides if you wish to see it again.
Thx
Afternoon
don't use drugs, it's morning, joke
Thanks also, i have dataset that speaks thai , korean , jp and a bit english
When i speak thai and jp it sounds good, but with english it struggles to pronounce words
Is this because there is too little english in the dataset?
Roadhog from Overwatch2 in the multi speaker model
the model learns 1) voice 2) pronunciation of specific sounds
if your dataset lacks in specific sounds, then it wont be able to reproduce them correctly
for that purpose you can use a pretrain and train your voice on top of it
a voice model can fix some accents, but it cant fix bad ones
Thank you.
Ayo? @vital snow level 1 !!! 
also if you use an index, it brings pronunciation from the trained dataset, so that may result in an accent. if you dont use the index it keeps the original
that's the 'retrieval' part of rvc
model loss seems to be starting to stabilize =(, some of the voices are still robotic, maybe i should not include voices with very different pitch in the dataset for a single model, or increase the dataset, idk
how much data you use?
Imagine the AI Hub becomes the Rizz Hub. 

Where do i make ai art
Hello guys! I'm new here. Can you help me which program I have to use for voice change?
idk
At #🏙│ai-images, there's a latest pinned message that includes some of most known AI generation websites.
What blud typing
Thanks
Can i create?
Ans
And what do i say for the bot or what it is to make the art
Each AI has its own way to generate. For Stable Diffusion, you type some words in prompt box, click generate, and it will generate an image you typed. 
Give me an example
Ayo? @hazy ferry level 1 !!! 
you can go to #🤖│bots and ping the bot to ask it to generate an image
that's probably the easiest thing to do if you don't wanna install anything
A chicken at a farm
I can just say : generate me an image of a dog?
yis
o k a y
Does GPT-SoVITS support other languages like portuguese? If not which tts solutions support it?
What bot

In the guide https://docs.ai-hub.wtf/tts/gpt-sovits/
It says:
GPT-SoVITS is an open-source repository focused on TTS & cross-language inference, with a Colab port coming soon. Credits to RVC-Boss.
but then says
Currently it only supports Chinese, English & Japanese. More languages are coming soon.
So does it support training in other languages or not?
guys how tf do I generate images
Ayo? @silent flame level 1 !!! 
@elder willow
To generate images for free, either:
- Use @elder willow in #🤖│bots (It's powered by DALLE3, from ChatGPT+)
- Use Open Source Models like stable diffusion & flux that could be a bit harder but better, what's ur pc gpu?
ah alr bet
I
Ayo? @hazy ferry level 2 !!! 
most ai things are in python tbh
Helloo
Ayo? @ashen blade level 1 !!! 
Hello beautiful people
Helo
ello

Bro I, did it. I created the world's first entirely If - Else based neuron.😅
"Unlike with normal models having the dataset for a pretrain too high quality might cause it to be more sensitive to noise when using it to create a new model so I would recommend making it a finetune of the base RVC pretrains.
To make the pretrained model from a scratch, audio with low-mid quality should be considered."
how it will be different from normal high quality dataset?
so i only need to remove the instruments and keep the reverb, noise and echo? sry im dumb
you will not get anything good by training from scratch
not unless you use a bunch of high end cards and 100+ hours of audio
i just want to tune from original one
guys how can i use the model ?
yeah... my rtx a5000 isnt on the list maybe i just give up and work more on my dataset and train other model
do you have any trick to cut out noise without a suppressor? all my models have this problem
having a slight noise in the audios is okay
having distortions / bad mic / someone revving a motorcycle outside is not
with a little noise the model is more adaptable to handling the noise in the inferred audios
use UVR5
Ahh my house is near highway so there are a lot of vehicles noise
Thanks for advice! I fixed my model's english by get more data
Luckily he does speak many languages
UVR can remove some noise, sure, as long as it does not damage the actual voice
k
Wazaaaaaaa
is this server under china's support?
💀
No, this server under the support of the shadow government
then another server will replace its place
no way
the shadow gov aint messing around with these e girl voice models
hmmmmm
shhh they must not find out
Multispeaker actually works, it's just that the difference between some speakers is not big. I didn't notice a difference between speaker 0 and 1, but 0 and 12 are completely different. Though there are a few problems with the UI
Like you cannot add speaker IDs (I guess they were supposed to be populated based on model metadata), and metadata has "yet another field" for number of speakers and the code is not aware of it
you can run inference from CLI and just provide a new SID
every trained model has 110 original speakers, well... 109
0 is the one that gets replaced by regular training
Hello!
The problem is GUI and standardization. I understand that you can just specify an arbitrary SID and expect it to work and that's how I ended up testing it, but that's not how it is supposed to work. And I see that Applio, for example, provides the number of trained speakers (not pretrained) in speaker_id
https://github.com/IAHispano/Applio/blob/1f743e9ab0cc80c8238afafed259e7665fca713f/rvc/train/extract/preparing_files.py#L70-L74
yeah, we've started collecting and storing them recently
May need to do some extra work though, that was just an experiment to check whether it actually works
Hey there! Is it possible to convert multiple files at the same time on 1 GPU?
DDPN RVC apparently provided speaker_info with SID as key and string as a value. At least wokada voice changer was taking it into account
https://github.com/ddPn08/rvc-webui/blob/b71742809a24cd89eb18081b831c0b1ac11ccb2a/lib/rvc/checkpoints.py#L113-L114
seems like they had a json with speaker names?
But oh well, yet another dead fork with its own peculiarities
so what I thought about all models having all speakers is not exactly right
there are different results if you run inference, but they are not original
i need to try a different sample
Having all speakers is not a problem. The problem is that there should be a standard way to at least understand how many trained voices are in the model and which ones you may choose
Sure, but there's no other RVC clone that does multi-speaker training that I know of...
And that's unfortunate, because everyone was claiming that it just doesn't work. But anyway, adding an object with speakers would be nicer since you'd be able to also label each speaker
Just knowing the number of speakers based on speaker_id is okay-ish, but you'd either have to figure it out and label them yourself or the model maker must provide a description
would probably help to just delete speaker weights so when you train a new model only the trained ones are available
so it would make it easier to just list the speaker dimension as max
and then the model maker can provide a metadata file with speakers
Can someone link me the google workspace RVC tool? That or whatever is the commonly used one these days.
Already in the model dict, no need for a separate file
I mean names, etc
Can be stored as object or whatever is easier to serialize/deserialize. Even csv would work in this case. Model config is already stored as stringified array
it can be just a normal dictionary, no issues storing them
safetensors supports only string as key and value, but it's not like a big issue
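A minimal sketch of working around the string-only limit by JSON-encoding the speaker map into a single metadata value. The `speakers` key and both helper names are assumptions for illustration, there is no agreed convention for this yet:

```python
import json

def pack_speakers(speakers):
    """Serialize {sid: name} into a str -> str metadata dict."""
    # JSON object keys are always strings, so integer ids are stringified here.
    return {"speakers": json.dumps({str(k): v for k, v in speakers.items()})}

def unpack_speakers(metadata):
    """Recover {int sid: name}; returns an empty dict if nothing was stored."""
    raw = metadata.get("speakers")
    if raw is None:
        return {}
    return {int(k): v for k, v in json.loads(raw).items()}
```

The packed dict only contains strings, so it should fit a string-to-string metadata header, and models without the key simply come back with no labels.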
someone can help me?
with?
Im trying to use the AI Voice Changer and it isnt working, even when i try other channels it doesnt work
for pth this works just fine:
import torch
m = torch.load(r"x:\test\model.pth", map_location="cpu")
s = {"0": "Arnold", "1": "Bill", "2": "Mike"}
m["speakers"] = s
torch.save(m, r"x:\test\model2.pth")
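A minimal sketch of the reading side, assuming the loaded model dict carries the `speakers` key from the snippet above. That key is not an established convention and `speaker_name` is a made-up helper, so a UI would need a fallback for models that don't have it:

```python
def speaker_name(model, sid):
    """Return the label stored for a speaker id, or a generic fallback."""
    speakers = model.get("speakers") or {}
    # Keys were saved as strings above, so try str(sid) before the raw id.
    return speakers.get(str(sid), speakers.get(sid, f"speaker {sid}"))
```

With a dict like that, `speaker_name(m, 1)` would give the stored name, and models saved without labels just show the numeric id.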
Hi
Ayo? @vestal temple level 1 !!! 
Does anyone know how to use the new voice changer?
a video tutorial?
Can someone tell me how my voice changer sounds
wdym new voice changer
The voice changer does not recognize me
When I put the file it tells me, "You need to specify a configuration file" appears.
miss the old peak days
how do i make one of those ai text to speech vids
hey guys, does someone with nitro who got a xbox gamepass gift trial and not use it could send it for me in dm ? really thanks if you do it 🖤
0 is female, 1 is male 🙂 so the difference is on transpose
Does using an old version of Snap ban your account?
yes
Ayo? @fleet hawk level 1 !!! 
Not new I am old but i never really show up that much
ok brah
hi
Really?
Ayo? @spare lark level 1 !!! 
hello
Why would you use an older version
Ayo? @ornate ore level 1 !!! 
it's usable in #🤖│bots lolz

wut are u offering?
you sub to channle

done
Ayo? @worn mesa level 1 !!! 
subscribed
hey anyone who is good at voice cloning and wants to get paid dm me
@final cairn hey, could you also whip up a Reprintsev voice?
dose any one whant to subscribe
Ayo? @ornate ore level 2 !!! 
please
no, unless you share it in #1159290752195633273
any reason to include multiple voices in a model? probably?
- switching to another voice on the fly (for like RP-ing purpose), I don't 100% agree to this tho
- the same speaker in different expressions, e.g. normal speaking, screaming, sobbing, whispering. may be useful if the inference could automatically select one of them that matches each part of input voice part.
idk to be honest, it's an experiment, i want to compare with solo, and i'm building a script right now to test many models
i want to know if it's worth it
also it can save much money when training
i have 12 voices in one model, it took around 160 epochs for all 12 voices, while each solo model took around 90 epochs
i want to know where to find the highest quality speech audio available right now
so it will be fair test
I'm not sure how it works for that situation, maybe like? training a LoRA with multiple trigger words for each character outfit, rather than training different LoRAs for each character outfit
a specific voice does not take all that long to train
after ~5-6 epochs it already sounds okay
.
Goated server
fr it gives support more than my dad
Ayo? @vital snow level 2 !!! 
CRAZY ...
waht
erm hi
it does not take that long
Hi
Oh hey you're here too
?
lol
💔
Hey devs! 👋
I'm looking for a developer partner to join me in building Agelix, an AI-driven platform that helps creators and businesses with everything from social media content to sustainable practices. If you’re into hands-on projects, brainstorming new features, and making tools people actually love using, let’s chat!
Shoot me a message if you’re interested or just want to know more.
#generate a picture of a dog going to the park with her children
what
for very clean voice input i can't notice any difference between train solo speaker and multi speaker
Hello.
what should i do if the voice is squeaky, but gets distorted when i lower the pitch
Do time stretching instead
meeheeehehe
anyone wanna vc
I wanna test something
Are there RVC v2 Colabs with pitch min and max (rmvpe+) that work
Ayo? @hollow summit level 1 !!! 
Anyone got a anais tawog AI voice?
You can search rvc ai voice models at:
- #1175430844685484042
- In #🔍│find-models :
- Send " @atomic vector search (name of the model)", without the ()
- Do /find with @hidden grotto
- https://weights.gg/ (login required)
- https://huggingface.co/models (but watch out cus in hugging face there arent only rvc ai voice models)
- https://applio.org/models
- https://voice-models.com/
- https://thevoicemodels.com/ (for Turkish Models, login required with discord and level 2 on their server)
if there isnt one, you can:
- #1159289738314919936
- #1191429836321849435
- make it yourself with our docs guides https://docs.aihub.wtf/essentials/how-to-make-voice-models/
that rmvpe+ is only for inference (using models) btw.
There's:
- Ilaria RVC Zero (ZeroGPU, A100, HuggingFace Space, faster than colab)
- Applio Colab
I see but you can’t adjust the frequency of the pitch like in the older versions
- Creating Datasets for RVC using iZotope RX11, by Cauthess
- Gathering and Isolating Audio, by SCRFilms ❄
- Instrumental and vocal & stems separation & mastering guide, by deton24
- Vocal Mixing Tutorial, by Roomie
- https://mvsep.com/
it may be a dumb question, but if i use only a singing dataset to train a speech model, will it work? I always mix both speech and singing so idk
why is my ms at 41000 (im new to this)
what r u using?
"voice changer client demo" im tryinna download villager voice
what gpu are you using?
nvidia mx250
i dont know if that gpu can do much, try lowering your chunk
damn gpu so old the official specs page on Nvidia site is deleted 😭
I searched it online and it seems to have just 2gb of vram
the minimum would be at least a gtx 1650
You could try the Fork which has way better performance and even a CPU model
Guide style is in the same as Blanc_dot's. Thanks Blanc_dot for corrections. Most technical information comes from deiteris.
Last update October 6th, 2024: Multi PC setup explanation added
Translations added for:
German: https://rentry.co/ForkVoiceChangerGuide_de
Turkish: https://rentry.co/ForkVo...
not sure if that would work decently tho
In case it doesn't, there's Cloud (remote good pc for those who don't have a good one, YOU CANT DO THIS ON MOBILE):
- Google Colabs (4 hours daily of free T4 gpu, easy to use, require only a google account) :
- Kaggles (30 hours weekly of better GPUs, T4x2 & P100, harder to use, requires an account and a phone number)
i have only integrated graphics too
I do most things on Cloud
still a decent pc for some indie gaming tho
i3 10th gen with 8gb of ram
ohhh same here
it's still good but i struggle with my ram, 8gb isnt enough for me now
real 
Does anyone have a female voice model for someone with a low pitched voice? I wanted to mess around with my friends.
yo
poyo*
hi
Hi
I use the wokada fork with an rtx a5000. I noticed it uses very little gpu, but when i raise the settings it shows red perf while gpu usage stays low. What am i doing wrong? Or should i change my gpu
Ayo? @vital snow level 3 !!! 
My spec
AMD EPYC 7543
16 GB RAM
RTX A5000 24 GB
What's this server about...
I dont know about the compatibility with an A5000, I dont know much about that gpu. But what number does the perf display, and what f0, chunk and extra are you using
AI, mostly RVC & Wokada but u can also find other things such as FaceFusion & TTS
Im here because i gave a bot permission about " join servers for you" ...
And now it threw me here
Leaving thanks a lot ! :))
120ms, 2.7s extra, rmvpe
My perf shows 70
It uses like 4%
Of my gpu
AI HUB Docs


🔥🐪