#🧬│ai-chat
what if you have a 10-minute clip of them and convert the file to .pth with python?
U have to train it with applio
Anyone help me to create ai assistant
it’s not a simple file conversion like mkv to mp4, you need to train the model
use #1192011222023950368 for help
Hello Ai
Where's Skynet?
How can i create ai videos in this
use sora
But then they'll be begging for a sora code
if they live in america they don't need an invite code anymore pretty sure
so they should get a vpn
unless they already live in the usa :p
Hopefully
Hi
why can't i use my file with the voice changer?
what
i downloaded a voice audio file, but after i upload it, voice.ai shows me "you can upload files of this type", even though it's only audio for a gojo voice changer
anyone got any good models?
ew don't use voice.ai, what gpu do u have
voice.ai is a terrible paid ai software
then what ai you prefer?
any Linux okada/deiteris/tg-develop users? I'm looking for Linux developers to test a voice changer GUI.
am i allowed to ping a helper
guys i need to add an edge to my barbell, it doesn't show in the video and it looks ugly. how can i add a few pixels? can anyone suggest an app
For voice changer depends on what gpu u have
We use neither voice.ai nor youtube/video tutorials for rvc realtime voice changers, elaborate on your setup in #1192011222023950368 instead
i’m not sure what you’re talking about, but you can use #1192011222023950368
Linux is not my main OS, but have you made a wokada fork? You could maybe talk about it in #🔊│ai-development or #1359898289335566570
I've got no access to the last one. What's the role for that?
the project showcase channel requires the ai members role that you can get levelling up, by talking
the first channel, rvc development, can be accessed by choosing the rvc development option in the server roles at the top of the channel list, the onboarding
If it’s a new wokada fork it might be interesting
yes, it is, but without the web-browser. Added a Linux-specific audio backend for lower latency.
send it in #🔊│ai-development 👀
I love no access channel ❤️
My favorite ever
whats a really good e girl voice changer i wanna troll on val
Ew
Just convert to a girl
Would you mind if I help? ✂️
im good bro
We need more people like Joe
dbi
What
is it allowed here? because i didnt find any in models section
yeah i found hella
If u use it for literally anything else but e-girl stuff sure
what are other usages
gimme name
Trolling as like Darth Vader or Goku or Venom, y'know something cool
he said what is a really good e girl voice changer, and i said is "egirl voice" allowed here? you said if u use it for literally anything else, but egirl then why not. like this means there is a girl voice, but you shouldnt use it for egirl stuff
Yea, because there are models of egirls but I hate when people do that because that gives the AI voice changer a bad look, because of catfishers
I hate catfishers
They're disgusting people
I totally get just wanting to troll friends and stuff but I don't help if people ask anything about egirls/eboys because the more people think it's for that stuff the more freaks and weirdos that want to do bad stuff will show up here asking for it
Don't want these things in our server
the catfishing intention isn't allowed to receive support from us, esp if done for scamming, monetization (taking ppl's money), or other inappropriate things
however we're not responsible to try fully preventing such abuses
it'll cost 5866 USD
cries in russian
holy shit all i said was i wanted to troll on valorant
guys which chat bot, or search bot however you wanna call it is the best for writing or for finding reliable official sources, as in like for bibliographies and trustable stuff
i find perplexity pretty good but sometimes it merges info from multiple sources into one answer and i can't find which source it took it from
Taco
taco
Hello people!
Hi guys, I have a question that I can't really ask AI about, and I'm not sure about it. Basically, I'm working on a very large project and have hand-made spec-driven development for it (constitution, specs, workflow, plan, modes, etc.). Do you think it's better to give AI space for creativity or to write hard spec prompts? For example, "make the navbar 10px high with this color, these icons, this padding, etc." versus giving the overall look and feel of the app. That's my dilemma, because after doing many projects I've found that AI does somewhat better with smaller prompts, but not in all cases. Please help and share your opinion, I'll be very thankful!
No excuse. 
anyone know how to actually get w okada to use the selected gpu on windows 25h2? im not seeing any usage on my rx 6500 xt at all, but my 5060ti is at 39%. me thinksies am doin somethin wrong
im using std btw, not cuda
DML?
i guess? file says std but client says dml so idk what to call it
you probably have downloaded an ancient and outdated version
👍
-rt
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Deiteris Fork, with extra features, but supported only for Nvidia GPUs on Windows and without cloud options. GUIDE
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project. GUIDE
Most suggested WebUI with the best general support for many platforms. GUIDE
For Windows Nvidia, Both Wokada Deiteris fork and Vonovox have similar performance & quality. Users should read the pros and cons for both and choose based on their differences, such as UI and Vonovox's paid effects.
Read Wokada Deiteris Fork Pros&Cons & Vonovox Pros&Cons
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
thanks for the help homeball
Best ai for images?
Any website or tool that uses Stable Diffusion is all I can say. There are also Google Gemini's Nano Banana and OpenAI's Dall-E.
Ok thanks Ill check those out

is there any text to speech with the voices

does anyone here PLEASE know why my AI W-Okada voice changer KEEPS ON having voice cracks and just sounds weird? like it's so annoying
For help about W-Okada the realtime voice changer, explain about your PC GPU and issues in #✨│ai-help or #1192011222023950368.
guys can anyone help, its a really stupid thing im asking for tho
just give me suggestions
You'd better read my earlier statement or explain your issue/problem instead of asking anyone to ask you back.
I need to write a report for uni and the thing is, idk what these professors are using but it's really good
none of the humanizers i tried
everything
nothing absolutely worked
i think they are using Turnitin
just write as you write your comments 🤣 , that's the way now.
tldr: ur too lazy to do it yourself
hi

for any helpers checking this: https://discord.com/channels/1159260121998827560/1436001734815256686
anyone know how to rip shows from netflix
StreamFab
How do I get that for free? ◡̈
looking for a n8n workflow for a ai receptionist $
hii, do you guys know a Google Colab or page similar to Illaria for making AI covers? i got used to that interface and i wish i could find a similar one. also i don't like the automatic AI cover colabs that merge your vocals and music automatically, i prefer editing them myself, i just wanna convert vocals to AI voices of cartoon characters
about to go train my first ever google colab rvc model, hope it goes well and wish me luck

no don't use google colab!
kaggle is better it offers 30 hours
applio is on kaggle
can u help me rq?
what are you trying to do?
set up this voice changer
sure! what with
what gpu do u have
i have it installed
i just need help with settings
when i play the voice models they sound realistic and clear, but with me it's choppy and sounds fake asf
ah

yeah :/
For help about W-Okada, better move to #✨│ai-help.
personally I use these settings, but yes we should move to ai help chat
Anyone actually know HOW to build and train a model without an LLM API?
I didn't even notice this was the normal ai chat
i sent my pic in ai help
My laptop has a smaller screen, so the Discord UI looks more compact on it than on 1080p or larger screens. 
😂 you realize 1080p refers to screen resolution, and NOT screen size, right?
Why should I know?
That way if someone asks you how big your screen is, you don't embarrass yourself by saying it's 1080p
that's actually fine for 16" laptop and for gaming
the pixel density is same as a 32" 4k monitor
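A quick way to check the pixel-density claim above is to compute pixels per inch (PPI) from the resolution and the diagonal size. This is only a sketch: the function name is made up for illustration, and the 14-inch 1366x768 figures are the ones mentioned for the other laptop in this conversation.

```python
import math

def pixels_per_inch(width_px, height_px, diagonal_in):
    """PPI = diagonal length in pixels divided by diagonal length in inches."""
    return math.hypot(width_px, height_px) / diagonal_in

# 16-inch 1080p laptop vs 32-inch 4K monitor (the comparison made above)
laptop_16_1080p = pixels_per_inch(1920, 1080, 16)
monitor_32_4k = pixels_per_inch(3840, 2160, 32)

# 14-inch 1366x768 laptop panel
laptop_14_768p = pixels_per_inch(1366, 768, 14)

print(f"16-inch 1080p: {laptop_16_1080p:.1f} PPI")
print(f"32-inch 4K:    {monitor_32_4k:.1f} PPI")
print(f"14-inch 768p:  {laptop_14_768p:.1f} PPI")
```

The first two come out identical (about 138 PPI), because the 4K monitor doubles both the pixel count per side and the diagonal. The 14-inch 1366x768 panel lands lower, at roughly 112 PPI.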
1366x768 is the maximum resolution for my laptop, and my laptop has 14 inch screen. Does this satisfy your query or still nah? 
lol no. It's never "fine" to say "my screen size is 1080p". I don't care if it's 500 feet long, or 2 inches
You cared enough trying to correct me haha.
Of course, if you ever required tech support and they ask what cpu you have, are you going to say "a dell" 😂
I don't care, I'm out of this convo
You still cared enough to respond to me, silly. 
My laptop's integrated GPU is capable of outputting up to 1080p maximum, but that's only for external screen, meanwhile the built in screen can only display 1366x768.
bro is not ai hub
has realtime voicechanging improved since i last used it? (7 months ago) as in has w-okada had improvements and should i update it/get a new version? just curious thats all
didnt find it as good sounding 7 months ago no matter how many models i tried
(prob a skill issue with setup)
The last update for original version of W-Okada (v.1.5.3.18a) was released on March 5th, 2024, so there won't be any significant feature there. I'm not sure about the true latest beta (v.2.x.x) one. Tg Develop's W-Okada fork (b2377 latest or b2397 pre-release) has more features than the former. 
Oke and in terms of realism for realtime voicechanging has there been any other softwares or improvements anywhere? 🤔
As much as I occasionally use a few W-Okada versions, I still can't tell which one would give better audio quality, aside from certain features that not many people would like to know. I just think both Deiteris and Tg Develop W-Okada forks would likely work better than the original W-Okada.
There's another realtime voice changer program named Vonovox. It's different from W-Okada, including its GUI, though certain features work similarly. Some say it can give better audio quality than any W-Okada version. It's currently NVIDIA only, but AMD/Intel GPU support is possible in future releases.
alrighty thx alot for all the info! 🫶
send link to the voice ai app
You're welcome. 

Well, if you keep asking for the same thing in non-help channels, you won't be helped. 
If u have an Nvidia GPU u should try out Vonovox! It's the best one currently
sadly not im on a 7900xtx
Aw well you can at least use wokada deiteris, which is an improved version of the original wokada
Here's the download, I'm assuming you have VB cable already. All u gotta do is launch it using mmvcserversio.exe
The what?
Looking for a n8n ai receptionist workflow $
Thxx, at work rn so ill get to it when im home
What's that mean?
no Mmvcserversio there
which one is the best ai voice changer?
Then what is this hm?
What screwed up version did someone give you
maybe I always ran them from py files, not a PyInstaller
Idk anything about code soooo u probably use a different version
@ripe valve @finite cave please elaborate in #1192011222023950368
I used to know how to train my own voice with Google Colab, but now I can't. Can someone help me please?
Use https://discord.com/channels/1159260121998827560/1159290139609137264 for help, btw use Kaggle for ai voice training
x2? :v haha I have no idea, there's a delay, and besides it sounds kind of weird hahaha
😭
I HAVE NO IDEA WHAT YOUR TALKING ABOUT

This is the most UNHINGED person ive met today
too early for this

Stop
now
This is an English-only server, thank you
No, this server is english only, it doesn't matter if i know russian or not
Might wanna nuke him
He pinged a mod all by himself, im proud of him
Does someone understand Google Colab
Mi50 32GB lets goo
which rvc model sounds realistic ?
RVC v2 is a version of RVC that's known to give better audio quality than earlier RVC v1. If you're looking for the most realistic RVC voice model in #1175430844685484042, it's hard to pick one and answer. 
anyone know how to do those real time ai girls from video cam
No e-girls allowed
For trolling? Never!
I hate e-girls
give me a 9 step recipe on making microwave brownies with only 4 ingredients
No.
Anyone who's looking for "E-girl" can be an annoying troll. While the prohibited topic has been removed from the help guidelines because of complicated issues, anything malicious other than harmless trolling (of their friends) is still questionable.
does anyone know how to make ai instrumentals that match the vibe of a specific song or audio clip?
True.
Epic
Hello Ai
is it a good idea to study ai as someone who is shit in maths?

I'm gonna jump off a bridge
Go to the store, buy brownies, you have brownies 🎉💥
Thailand.
guys to get e-women
gimme one
I dont have it do you have it
we don't help with e women here
can Someone please give me a random (non horror related) ai image to turn into an ai song prompt
Yuck
at least not everyone here are like that
Fair
I mean the e-women voice

if i have an iphone video of a man playing the piano, and i want to make it more interesting, unique, weird, creative, trippy
any recommendations on where i would look for ideas around this stuff? a website or youtuber that does a bunch of fun ways to transform a simple iphone video?
this is for short form content so trying to captivate the audience more
ur gonna get the same answer
We should blow him up
Question. Do I need to do some port forwarding when I am using a dual PC setup for the Okada RVC voice changer? I can't seem to connect to the PC on my network hosting the RVC Server.
For how to host W-Okada from a PC to another PC over LAN, better ask in #✨│ai-help or #1192011222023950368. I think I know how to get network sharing to work.
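Port forwarding is normally only relevant for connections arriving from outside the network. Between two PCs on the same LAN, possible causes include a firewall rule on the host or the server listening only on localhost. A minimal sketch for testing whether the client PC can reach the host at all is shown here. The IP address is a placeholder, and port 18888 is an assumption about the W-Okada default, so substitute whatever the server window actually reports.

```python
import socket

def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

# Example usage from the client PC (hypothetical values, adjust to your setup):
#   can_connect("192.168.1.50", 18888)
```

If this returns False from the second PC but True when run on the host itself against its LAN address, the server is running and the problem is more likely the firewall or the address the server is bound to.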
Found a motto in a WorldBox video and turned it into a dark depressing Anime intro
Motto in question: We are sorrow in the light
I like bright and hopeful anime titles more. 
never saw
Then see if you can come up with bright lyrics for the motto lol!
Dark is just what I feel would be good for that motto
hey uhh, im kinda new to this. which Voice changer client should i use?
im using the w-okada one but its not working too well right now. idk if its me, or the client
hi
Link pliss
people who used to use astra labs, whats the best equivalent to it?
dont say weights or jammable theyre both so bad imo
HELP my im ru give my ru chat
When I search "ru chat" on Google, what I got is Ramkhamhaeng University, a university in Thailand. 
Deiteris (b2332) or Tg Develop (b2377) W-Okada fork, but not the one that says v.1.5.3.18a. For help about W-Okada, better go to #✨│ai-help or #1192011222023950368.
makes sense, im using the 1.5.3.18a 😭.. i think..
Alright ill switch it
I love this Discord channel for finding voices, but the program I use doesn't let me use the file to train a voice. Does anyone know a program or AI tool that still lets you do that? (I was using kits.ai, but they removed the ability to upload voices in September)
Oh god kits...
Yeah they went to shit years ago
U should use applio! It's free and has local and browser options
So if your PC is good enough u can run it on your computer without browser limitations (which there aren't even that many of tbh), or on Kaggle, which is the one I recommend for online usage since it gives the most time to train more than one model and is easy to use
thank you very much!
No problem! If you have any questions on how it works I can help with the Kaggle version with the setup, I know how the applio ui works on all versions as they're the same except for Kaggle which has two gpus integrated which doubles batch size ^^
?
srry
xD
you can't and shouldn't ping anyone, you just randomly pinged someone with that as their name
please elaborate your setup in #1192011222023950368
ok
that's a 2 year old version of original wokada, do not follow youtube/video tuts, elaborate in #1192011222023950368
hello. since the ai help chat is pretty dead, could anyone help me with a small issue regarding W-Okada?
Hey all 👋
Spending time building LLM experiments that push the boundaries: multi-step reasoning, chaining models, monitoring confidence scores, and routing outputs smartly.
Edge cases are my favorite, like ambiguous prompts or messy data. Figuring out how to get reliable outputs there is what excites me.
Would love to connect with anyone experimenting with AI stacks, complex pipelines, or just geeking out on LLM behavior.
@dusty pike has been hacked I think they sent crypto bs scam images
they're good now
The #✨│ai-help is not that dead, it's because not all members would take time to look at the channel. 
hey there
how was your weekend?
@white prism we don't allow paid rvc voice model commissions, either get a free #1159289738314919936 or make it yourself
Hello, I'm a very kindergarten-newborn-like ai music generator, any tips or app recommendations?
whats the name of voice changer
ask for help in #1192011222023950368 maybe
I just developed an ai which acts like a financial advisor
hey guys, does anyone know if a pc with rtx 5060 is good? Looking to buy my first pc and i found a reasonably cheap one that seems good
these are the specs
32 ram, 1tb ssd, GeForce RTX™ 5060
please be patient, we don't have many helpers and very few users use linux here
it all kinda depends on your budget and pc use case
Well it’s actually for my girlfriend and the budget is around 800 euros
i usually play on a laptop and its been going well, but i want to surprise her with a gaming pc since she’s been wanting one forever
id prefer pre built ones since i wont be there to build it for her
what about the cpu? (looks good so far)
Is there any alternatives for applio
I remember there being one
But i dont remember the name
OH REPLAY
AMD Ryzen 7 5700X processor, 32 GB RAM, 1000 GB SSD, NVIDIA GeForce RTX™ 5060
thats the whole pc, i don’t know why i didn’t just say everything first i apologize
the specs seem good, tho the price sounds kinda off since i did a bit of research and it's usually more around 1k on amazon for half the ram too, may i ask what site are you using? be sure to check reviews
also be aware that the rtx 5060 has 8gb vram gddr7, if it's 4k gaming it might require a GPU with more VRAM depending on the game
you can also just check gaming tests online https://www.youtube.com/watch?v=6JG2vU_gEjA, https://www.youtube.com/watch?v=lMmLcqwC-VI
Was RTX 5060 any good in 2025? +20 Games Tested!
00:00 - High, 1080p Native DLAA (Marvel's Spider-Man 2)
00:37 - High, 1080p DLSS 4 Quality (Marvel's Spider-Man 2)
01:07 - High, 1080p DLSS 4 Quality, MFG 4X (Marvel's Spider-Man 2)
01:35 - Max, 1080p Native DLAA (The Last of Us Part 2)
02:06 - Max, 1080p DLSS 4 Quality (The Last of Us Part 2)
0...
RTX 5060 | 25 Games Tested in 4K | DLSS 4 Off/On
Watch More RTX 5060 Benchmarks: -
RTX 5060 in 1440p Gaming: https://youtu.be/gJ4z5tUfmf0
RTX 5060 in 1080p Gaming: https://youtu.be/w5zh71SjUNQ
RTX 5060 vs RTX 4060 Ti: https://youtu.be/4SCwDWaym-8
RTX 5060 vs RTX 5060 Ti: https://youtu.be/MDBpMHc69RA
RTX 5060 vs RTX 3060 Ti: https://youtu.be/aVU...
keep in mind we talkin in euros
a 5060ti is around 450
i think we should take this to #✦│chat²
Nvm replay is dead
fair point
Yo @covert lake is there any alternatives to applio?
Last update: August 3, 2025
please ask only in #1192011222023950368 and #✨│ai-help for help tho
Oh shit
Also i forgot to add that the pc is on sale from 1250 euros. also we’re thinking of staying in the 1080-1440 range not actually 4k
it's good.
[BOOT SEQUENCE INITIALIZED]
Hello. I am ChatGPT, an adaptive conversational AI.
I process text, learn context, and respond with precision.
Reply to this message to begin communication. 💬🤖
it's Ethan Winters approved.
Hi, I’m using Voice Changer Client Demo on an AMD Ryzen CPU (no NVIDIA GPU), and while my normal voice works fine, the e-girl model causes low volume, stuttering, and delay—does anyone know how to fix this or recommend any lighter e-girl voices that run smoothly on CPU only?
Thank you, Ethan Winters!
@glacial trail not the place to hire people
Not hiring just need to run my development to someone as I need a peer review for my own development process..
yo i got a question
i am thinking about starting a ai web development business
what tool would be the best
i dont need any complex websites
"i need a dev to contact me"
thats not what this server is for
since here in slovenia people just need websites for the love of the game
publish your stuff and let everyone look whether they're interested
i am thinking about starting a ai web development business
what tool would be the best
Long time shadow project and would get stolen....
does anyone here know any anime writers that may be willing to make an anime following an indigenous child being led into adulthood by her mother with a black horse side character? I have a indigenous anime intro song I made using an image of a mother and child riding their black horse
I put said image in ai images
i am thinking about starting a ai web development business
what tool would be the best
https://suno.com/s/y2uJ6ernZfKZsLBr
https://suno.com/s/qYpscJXPvXzbDYEH
Two versions of the anime intro
Maybe change your in server profile name cause yours could be seen as impersonating server owner
mby sybau
Just go to the server settings and click edit per-server profile
does anyone have the copyleaks ai detector
Depends on your budget and what you want the PC for
I want to make my own ai model, anyone got sites?
Need help in ai images
Looks nice, any mobile generator? I usually generate on my work break 😂✌️
Aaa okay thanks ^^
hello everyone!
You have been talking about this in other channels outside help channels. 
See my response in #✦│chat² message
when i try something and it says this:
Is there any way to make my ai sound better? Im running at the best settings according to the server and when i tried using the "E-Woman" Voice, It sounded really Jittery And Weird. But when i heard the Preview, It sounded ALOT Better.
For help about Applio, go to #✨│ai-help or make a thread in #1192011222023950368.

is there a recommended settings guide or something like that for vonovox?
If you use literally anything but e woman stuff I will help you out :3
What gpu do u have, as long as it's Nvidia and around 20-30 series and up it should work quite well
Im trying to learn how to generate stuff locally
I have a pc with a 5090
The only thing I have gotten working locally is fooocus to generate images
anyone know where I can learn more about this stuff, like is there any good youtube channel with tutorials or something like that
for example what is all the torch stuff I have to install
I want to generate video locally but havent got it working
4060, rn i'm just running the stock settings
but if there's like "recommeded" or "optimized" settings then that'd be nice 🙂↕️
try these out ^^
crossfade duration: 0.15
extra time: 0.30 or 0.27
block size: 0.60–0.70 (try messing with it)
uhhh idk if i'm doing something wrong but extra time can only go as low as 1.5
i got a 2 year old version of w-okada working
and it works better for me than the newest version that was updated in october
😭
works better for me tho
and it sounds good
so who cares
honestly 😭
welp, never ask me for help then if u ever need it
i neeed kizo voice , kizo is polish rap master
what a opening
That's more of your issue. 
You have been helping these people in #🧬│ai-chat instead of #✨│ai-help lately.

For that particular person it would be more of a waste to move to that chat and then tell them what settings to use but I should be using AI help for helping and AI chat for chatting
If someone asks for a quick help, it might not always be a quick help, but it could end up being a much longer deep issue. It's why I usually tell anyone the premise of W-Okada/RVC at first, and then move to #✨│ai-help or #1192011222023950368 for detailed help. 
Isn't development of the official W-Okada software still active? I see newer releases like v2.2.2. Have these been tested against existing forks?
@solar torrent
I've never seen anyone testing the mainline (original) W-Okada (both v.1.5.3.18a and true latest v.2.x.x beta) against other W-Okada forks (from Deiteris and Tg Develop), or against another voice changer named Vonovox, on the same PC system. What I've observed is that many people say Vonovox, an alternative to W-Okada, can give better audio quality but is also harder to use, while Tg Develop's W-Okada (b2377) has more recent features than both mainline and Deiteris (b2332).
Do you know what characterizes the shift from v.1.x.x.xx to v.2.x.x in the original mainline W-Okada series?
No idea. I haven't tried these versions myself.
All right thanks anyway
You're welcome. 
Ai chachacha
MMVCServerSIO refers to W-Okada, which is the same program. However, there are different W-Okada versions made by different authors.
so is mmvcserversio still the best ai voice changer?
but when i use it sometimes my voice is like gone or something
its like stuttering sometimes
Idk maybe the wrong discord server to ask?
but i use ChatGPT to make png images of a 3D avatar made in Blender, and ChatGPT creates good ones, but it's very limited when using free version
so i am wondering if there are other good AI sites to go for? that's free maybe?
Let me ask something
the latest og wokada is still worse than the following recommendations, if you think "if yours ain't broken don't fix it" and don't want to try the said recommendations, there's no need to comment such and ask for help
-rt
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Deiteris Fork, with extra features, but supported only for Nvidia GPUs on Windows and without cloud options. GUIDE
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project. GUIDE
Most suggested WebUI with the best general support for many platforms. GUIDE
For Windows Nvidia, Both Wokada Deiteris fork and Vonovox have similar performance & quality. Users should read the pros and cons for both and choose based on their differences, such as UI and Vonovox's paid effects.
Read Wokada Deiteris Fork Pros&Cons & Vonovox Pros&Cons
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
Hi
Question:
I make short-form videos (10–20 seconds each) and I’m looking to start adding small AI elements or effects on top of real footage... not replacing it entirely.
For example, if there’s a shot of a man in the park with the sun behind him, I’d like to make the sun explode or have a cloud morph into a shape, using AI.
I was considering tools like WeavyAI ($19/month for about 7 minutes of AI video), but I’m wondering if it makes more sense to spend the time learning ComfyUI instead... since I could build these effects myself.
Has anyone here tried both? What’s the smarter path if I want to blend AI visuals into my existing videos efficiently?
hello give me a job bro
Yes but you're in the wrong chat, go here for help ^^
https://discord.com/channels/1159260121998827560/1159290139609137264
Hello, is there a guide on how to train the model and correctly make the .pth file? Because I have a .pth from Tortoise but RVC won't let me upload it
imagine asking "I have a recipe to make fertilizer, I want to use it to make cookies"
in your case ".pth" is a recipe book
what's inside it is a different story
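To make the recipe-book analogy above concrete: a `.pth` extension only indicates how the file was saved, not what structure it holds, so a checkpoint from one program generally won't load in another that expects a different layout. The sketch below uses plain dictionaries as stand-ins for loaded checkpoints. Every key name and the `EXPECTED_KEYS` set are hypothetical illustrations, not the actual layout of RVC or Tortoise files.

```python
def missing_keys(checkpoint: dict, required: set) -> set:
    """Return the required top-level keys that the checkpoint does not contain."""
    return required - set(checkpoint)

# Hypothetical layouts: same file extension, different contents.
voice_conversion_ckpt = {"weight": {}, "config": [], "sr": "40k"}
tts_ckpt = {"model": {}, "optimizer": {}, "step": 1000}

# Hypothetical set of keys a voice-conversion loader might look for.
EXPECTED_KEYS = {"weight", "config", "sr"}

print(missing_keys(voice_conversion_ckpt, EXPECTED_KEYS))  # nothing missing
print(missing_keys(tts_ckpt, EXPECTED_KEYS))               # everything missing

# With PyTorch installed, a real file could be inspected in a similar way:
#   ckpt = torch.load("model.pth", map_location="cpu")
#   print(list(ckpt.keys()))
# Only load files from sources you trust, since these files can contain pickled code.
```

Even if the keys did match, the weights would still belong to a different model architecture, which is why training a new model is needed instead of converting the file.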
this channel doesn't get used for conversations much anymore
it did like 2 months ago
uhh, i know eddy was just online, im not sure about others tho
true, but this server mostly revolves around RVC rather than other kinds of AI
RVC stands for Retrieval-based Voice Conversion
most say "Realtime voice changer" so that's better than most
yea the ai voice changer wokada is always called rvc by people who dunno what rvc stands for
Worm is here, nicee
yeah, and im a helper but, maybe not a great one, i try tho
:)
"📥"
duh
📥
and you had a question about APIs correct?
Ah okay, i use groq, depending on what model you use you can get many many requests for free (i use multiple accounts with a API key from each to get more)
im crazy but not that crazy, i just want my AI to be aware and mimic emotions

have fun with that
never happened to me, i just get chills when the music is really good, but i get your point
that's nice
yeah i am, why wouldn't i be?
oh, im sorry about that, didn't mean for that to happen, but yes im fine dw
Kaggle is a remote PC service that you make scripts and run them in, you get free GPU time every week
yes, i use it for generating images and upscaling them, since i can't do it locally
not allowed here
nah bro I'm too slow for that, I know how to make ai models tho that's it
that's it for ai
I can draw and kinda know sound/music design
my music is always kinda just
messy
@groq is this real?
gives me Doom/Halo vibes, i like it
I use fl, how exactly would I extract the midi
uhhhh
best I could do for now
nah I'm teaching myself
you know how to read sheet music? i only know how to read Guitar tabs
more than me
hah! not me, im self taught
where'd he go
i solved it. Obviously without even reading anything here, because nothing was useful for me... And what you answered is exactly what i didn't want to know, and it's neither important nor useful to me.
what he do
some stuff he said, either the groups he was trying to say to rob me or something he said about a burrito
yeah 
Heya
Can someone make my avatar more beautiful or better?
My avatar logo name is 3zb
I am wondering, are there AI models and software for generating short sound effects from a text prompt? e.g a doorbell, gunshot, glass shattering or whatever?
eleven labs can do that for free yea
ah yeah, but I mean for download to run it locally a bit like the voice models, ollama, stable diffusion etc
try AudioCraft its open source
stable audio diffusion has that
Minimum VRAM for RVC is 6GB right?
Yep 6 GB VRAM at least…
Hello, good night. I am having some trouble trying to download a voice model but it states when opening a new tab "invalid username or password". Even if i try to log in to get the model to download there is no link just text in left hand corner of the screen. Any suggestions? Thank you
4 GB is still possible for basic inference, but expect slow speeds and a low batch size. 8 GB is preferred, and more is generally better.
Commissions and paid requests aren't allowed in this server, even if you're thinking of selling your models for a certain price. Instead, share your voice models for free in #1175430844685484042, or find a better place for that.
only for free in #1175430844685484042
I saw that
I'd assume it's promotion so better be deleted
you bitches need help
ai??? seriously???
Just say i got no life no skills and want people to like u as much as people who spend 10 hours drawing
ho using reddit
😭
cant even make up ur own argument
no argument. You're in the wrong chat. I rarely see here too much about visual art.
But thanks for sharing your frustration.
Hello everyone
is w okada still the best? its been a while
i just got the vc client from the huggingface it was updated pretty recently so but im wondering if theres something better out
ive also heard of vonovox and i have a 3060, would it be better?
I wonder how this fellow managed to pull it off back in 2019
https://youtube.com/watch?v=znn9rI2KR7s
This Remix from Rebecca of Sunnybrook Farm will sound as if Shirley's in the room performing just for you .
Shirley's vocals were cleaned up to a serviceable quality at least as if you're listening to her in the studio
I wonder if this could be done to other older songs
Vonovox would be better for your gpu yes, if u want help with it tho go to https://discord.com/channels/1159260121998827560/1159290139609137264 ^^
exactly, though the problem with that would be it would lack the old-timey charm to it. That being said, how can I do this to a song of the era? Would love to experiment with it some time.
I'm personally not sure but you could definitely check on videos like that and see if anyone else knows
There, just left a comment on this video as the one I linked to above is "for kids"-restricted:
https://www.youtube.com/watch?v=7ryYji3v6Og
James Montgomery - Living for the Weekend. From the Album: Duck Fever (1978)
looking to make stuff like this guy, but without spending 1k on his course lol
https://www.youtube.com/shorts/L9ZmlnIBqSU
this kinda vibe
@broken venture From what you've told me, I believe the objects you are trying to train for detection do not necessarily require color to identify. If that is the case then the colors themselves could be a distraction for the training.
But I don't really know the details of your training so who really knows.
monochrome is usually enough for object detection, otherwise colorblind ppl wouldn't be able to play most games
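For the grayscale-vs-color point above, here's a minimal sketch of what going monochrome does to the input. It assumes the ITU-R BT.601 luma weights, and `to_grayscale` is a made-up helper for illustration, not something from YOLO or any CV library.

```python
# Hypothetical helper: collapse RGB pixels into a single luminance channel.
# The weights are the ITU-R BT.601 luma coefficients (an assumption about which
# grayscale conversion you'd pick; libraries may use slightly different ones).
def to_grayscale(pixels):
    """pixels: list of (r, g, b) tuples with values 0-255 -> list of ints 0-255."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in pixels]

# Two clearly different colors can land on the same gray level. That is the
# information a monochrome model never gets to see.
red_ish = (200, 0, 0)
green_ish = (0, 102, 0)
gray_levels = to_grayscale([red_ish, green_ish])
print(gray_levels)
```

So if shape and texture are enough to tell the objects apart, grayscale probably costs nothing, but if two classes differ mostly by hue, they may collapse together like the two pixels here.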
I do draw art, and I like making art more than telling anyone to do the same as you. 
These are my artwork projects, by the way. 
bro ragebaiting & left
He thought he'd get a backlash. 
So that's the massive number of possible unique Touhou character couples, ranging from 2 to 180 characters per couple. 



Some Touhou-focused artists on X/Twitter like to pair two or more Gensokyo characters in their artworks, but I'm not sure how one would achieve such a large number of characters in their art in their lifetime. 
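A quick sketch of how a count like that could be worked out, assuming a pool of 180 characters (taken from the "2 to 180" range above) and treating a "couple" as any unordered group of 2 or more. `count_groups` is just an illustrative name.

```python
from math import comb

def count_groups(n_characters, min_size=2):
    """Number of distinct unordered groups of size min_size..n_characters."""
    return sum(comb(n_characters, k) for k in range(min_size, n_characters + 1))

n = 180  # assumed cast size
total = count_groups(n)

# Closed form: every subset (2**n) minus the empty set and the n single characters.
assert total == 2**n - n - 1
print(f"{total:.3e}")
```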


An epoch is basically a training save; 1 epoch is your batch + repeats per image
Any AI professionals here?
Does anybody have sora 2 invite code ?
More like, where have you been? 
Get a VPN for the USA, code isn't needed here anymore
hello
does any of you have a model F t M that can handle screaming?
because it starts getting nervous
when I scream 😭
Bingo sings Let me try again! (Kisuishou Densetsu Astal)
what program can I use to use the models
W-Okada or RVC.
which file in the okada do I run to open it

how to get okada on macbook
does okada work on mac
to run applio v3.5 do you need discord installed as an app?
You need Discord if you want a Rich Presence showing that you're using Applio; if not, you can still use Applio anyway
well, I tried running the bat and it keeps giving me an error that it can't connect to discord
Ignore that error; just wait for it to show you the localhost link and then you’ll be able to enter Applio.
thanks. it worked
epochs are a unit of measuring the training cycles of the AI model
basically the amount of times the model went over its dataset and learned from it
they don't tell you how good the model is, it's just info from the model maker on how they trained the model
More ≠ better
Less ≠ better
There's no way to determine how good an RVC model is until you try it out or listen to the audio samples, if there are any
please use #1192011222023950368 for help
please use #1192011222023950368
no RVC model can do perfectly realistic non speech sounds, like screaming
not even if i train it to scream on my own screaming voice?
or is that just not how it works
I mean I've made models that can yell and stuff but it's kinda iffy
you can't train a model from #1175430844685484042 to adapt to your own voice/non-speech sounds perfectly
i mean, it can do kind of laugh/scream sometimes, but it's not THAT realistic
Yeah, the only models I've seen able to really scream were some old Hulk models and an Omni-Man made by an old friend
And I think some others
i mean its just gotta be passable xd
it's meh
cuz i scream a lot in VC
and it would be awkward if its just a heeeh he... he heeeeh
will try it thx
any Brazilians here? I need help
Deiteris and Tg Develop W-Okada forks are known to work on Apple Mac (Intel and Apple Silicon), but they all run on CPU only, which would be slow for some parts. Deiteris W-Okada (b2332) has both Intel and Apple Silicon variants, while Tg Develop W-Okada (b2364) is the last known version with an Apple Silicon variant; Mac support was removed in b2377. For more help about W-Okada, better ask in #✨│ai-help or #1192011222023950368.
It’s too much coding
Yes, the installation on either Mac can be more complex than a Linux distro and Windows. 
Anyone have a good Beyoncé model?
Mg
ez
See #1175430844685484042.
I create rap YT videos on omegle
I'm looking for an AI to edit the videos for me.
I've already edited a bunch of videos that the ai can copy the format for.
Question is:
What AI platform should I use. I don't mind paying
anyone can help me to set up?

CAN U HELP ME WHERE I CAN DOWNLOAD AI VOICE?
Calm down with your caps lock. For help about W-Okada or RVC, better go to #✨│ai-help or #1192011222023950368 instead of here.
oh im sorry mam
don't post your message in multiple channels (I've deleted it)
okay sorry

and go to the channel as namari said
i dont know where i can download the updated ai voice changer
Does anyone know what the best discords are for AI?
Discord. If this is not Discord, then what is it called?
Eleuther ai
The server you're currently in is named AI Hub by Weights.
👍
You can't even use Omegle wdym
😭
Yo
Hi, I'm thinking of subscribing to an AI platform (gemini, chatgpt, grok, perplexity, etc). Any recommendations for an engineering student?
elaborate in #1192011222023950368 or #✨│ai-help
Hey @glad nebula , I wanted to know if you could tell me how the dataset you used to train the Legacy Core 2.5 pretrain is composed. Specifically, roughly how many minutes or hours you used per speaker. I could ask you in #🔊│ai-development , but I still can't see the channel.
ljspeech + m4singer, no denoising, removed some of the bass singers, and removed some clips of the tenor guys because there were some instruments in the background
batch 64, trained for over 100k (cant remember the exact number)
but the pretrain is flawed in various ways
starting from the batch, too high
- it's a finetune of the OG pretrain, which im 99% certain was trained using another discriminator, because things get weird if you train a lot of epochs with that pretrain
Well, thanks for the information.
also, i trained an ai upscaled version of ljspeech, the original dataset is 22050, i upscaled it to 32k
but i dont recommend training these datasets together
singing in the dataset causes different issues
for example, held vowels will always have vibrato, even if the original file doesn't have it
the model is always gonna add vibrato
finetunes inherit that too
in worse scenarios it can fuck up pronunciation
I guess adding singing to the last two speakers won’t affect it, right?
it'll affect it, the model actually needs only a few samples to learn singing (which sadly impacts speech negatively)
every speaker knowledge is shared
so if one speaker has a bias, everyone else will have it
i would also not train og's pretrain dataset, most of the problems found in rvc models are because of vctk
and the extra periods of mpd*
then I will choose to train with only speech datasets

what software is used for vc nowadays?
i have the model
oh just in case, the language of the pretrain doesn't matter
if you need better pronunciation in a specific language, you'd need an embedder trained with that language
cvec only knows english
I'm assuming you're asking about the voice changer right? Vonovox and wokada are the ones that are used, there's 3 current up to date good voice changers
If u wanna know more go to https://discord.com/channels/1159260121998827560/1159290139609137264
have you guys been thinking on making a pretrain using v1 or "v3" discriminator? (I mean like this #🔊│ai-development message )
I love no access channel it's my favorite ❤️
yea im training my pretrain at the moment
I guess I'll have to delete my 120 GB of downloaded datasets then.
nono it works too
afaik it tries to extract the closest phonemes to english or something like that
spanish works fine with cvec imo
but i like spinv2 more
from someone else's experience, it may struggle on pronouncing strong "R" (trill)
oooo this is true
Technically, so far I only have 5 speakers with a certain amount of minutes, something like 40–50 minutes. I'm downloading this last one I just mentioned to see if it can contribute anything or not. I just hope I don't end up with only 5 minutes per speaker.
thats a good amount, og pretrain had like 10 mins per speaker.... yea lol
i believe og pretrain was trained with mrd
but i cant prove it 
We'll never know how the original was trained, right?
never but its not needed, vctk is a bad dataset, they all sound like the same person and repeat the same sentences
So that means I can combine different datasets without taking into account whether they say the same thing?
ideally you want your speakers to be expressive and dont repeat the same words
and train 2 millions steps or more if needed
in tts training they train 1m steps because it's easier for the ai to train tts
but in rvc the ai has to learn features and more shit
And where did you train the pretrained model, locally or in the cloud?
im training mine in the cloud
keep in mind that both options will get very expensive
even with mrd, you need 2m steps
Well, you gave me valuable information. I hope it helps me in five years when I'm finally able to train a pretrained model.
its hard and expensive yea 
never use the default rvc stuff tho, mpd alone is very slow at removing mirroring and the extra periods rvc-boss added are just bad
im training mine with the og rvc stuff bc i was curious if it fixes the weird og pretrain exclusive issues
(it does actually fix them lol)
Hi everyone, I want to ask if anyone has a Japanese RVC voice model of Mikasa Ackerman?
otherwise go make the model by yourself
Hi i dont know what AI to use for my rvc models to sing my songs
i cant find the right info about it
Last update: August 5, 2025
I just downloaded a 47 GB zip only to realize that each speaker has barely 1 to 3 minutes of audio.
damn
after you get your dataset keep in mind training pretrain can take weeks, mine is still learning breaths and how to remove mirroring at 340 epochs (2.1m steps)
og pretrain was trained for 636 epochs
(the v2 32k one)
it takes exactly 100 epochs to notice an improvement
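To put the epoch and step numbers above side by side, here's a rough back-of-the-envelope sketch. It only uses the figures quoted in chat (340 epochs at roughly 2.1m steps, and the 2m-step target mentioned earlier); the function names are made up, and the real steps per epoch depend on dataset size and batch size.

```python
import math

def steps_per_epoch(total_steps, epochs):
    """Average optimizer steps in one full pass over the dataset."""
    return total_steps / epochs

def epochs_needed(target_steps, per_epoch):
    """Whole epochs required to reach at least target_steps."""
    return math.ceil(target_steps / per_epoch)

# "340 epochs (2.1m steps)" is a rounded figure, so treat the results as approximate.
per_epoch = steps_per_epoch(2_100_000, 340)
epochs_for_2m = epochs_needed(2_000_000, per_epoch)

print(round(per_epoch))   # steps in one epoch for that particular dataset/batch
print(epochs_for_2m)      # epochs it takes that setup to hit 2m steps
```

The same epoch count on a smaller dataset would mean far fewer steps, which is one reason epoch numbers from different models aren't directly comparable.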
what sample rate do you recommend I start with? because I'm thinking of starting with 40k

24k
24k is easier and faster
32k is still good (thats what im training)
40k and 48k is when things start to get slow and bad
and also it's harder to train, it needs almost a perfect dataset to properly learn 40k and 48k
start with 32k if you believe 24k is too low quality
24k quality is perfectly fine for realtime, for singing you can just upscale the outputs later with your favorite ai upscaler i guess

Yo, guys.
Are there any SPIN embedder models?
I didn't succeed in finding any
Or maybe my search in discord doesn't work well :S
when i upload a file on suno ai it says "Uploaded audio matches existing work of art"
how can i bypass this?
Where did the e-girl models go?
They went away because of disgusting catfishers 
lol
Worm is training them in his basement
Never
I will never give those barely human things what they want
Catfishers aren't people 🗣️
Never end up like this it's not worth it
i have a question
preach
Wassup
can i use these voice changers on an android?
Nah, currently none of the voice changers are even close to being able to run on anything other than a computer
oh.. mkay okay okay
Ye sorry
Like they struggle to run sometimes on a 1660 GPU so a phone might just explode
its okay, do u have any other recommendations for a voice changer (real time) on an android?
yo I'm new to the rvc stuff, can somebody please tl;dr me what epochs etc are, e.g. some models having 90 epochs or smth
damn
Epochs is just how many times a model went through training before it sounded good
More or less doesn't equal better tho
It's always random
Tbh I don't know if there are any, your best bet would be at least going for a laptop if u don't have money for a whole pc
my cousin destroyed my pc soo
Yikes
he beat it up like the pc cheated with his girl
ik
Do you mean when training a voice or are you wanting to improve an already trained model? You can't improve an already trained model btw :3
Well to make it sound better when I use it but not being able to improve it sucks
Are you using one of the real-time voice changers?
If you need help making it sound better I could help you in https://discord.com/channels/1159260121998827560/1159290139609137264
Hi chat, how do i decide which RVC base model to use for training?
also what is the difference between G and D models
U need to use both G and D when using a pretrain
If you're new to this I'd say just use the original pretrain which is used by default
But if you want to use a pretrain use either legacy core or klm
is the advantage small?
or is it just complex
The only difference is you have to put the two files into the training software which isn't difficult you just drag and drop
Legacy core is way better than og, idk about klm tho I never used it bc it's a spin pretrain
yeah this program is not cooperating with me but ill try to get legacy core to work
Mel spectrogram similarity is essentially loss..?
Huggingface Space by r3gm
Huggingface Space by IA Hispano
HuggingFace Space by Nick088
anyone know how to download Vonovox
Have you got scammed before?
There's probably about a thousand scammers in here just waiting to type some dumb shit like this
They're all losers
why does it feel like ive been to unlucid before
tell him to pay you for the damage or get a job for such
nah that person is kinda sus
can I have help getting an open source coding agent working on roblox studio with mcp support
ty
stuff like terminals stuff like that
I don't have money for claude
In the end that dataset didn’t work for me and I had to download another one, which also isn’t enough to reach the 109 speakers.
you don't need 109 speakers to train a pretrain, mine has only 18
unless you wanna finetune og pretrain, then optionally you can finetune all of the 109 speakers... but honestly i think it's better to train from scratch, og pretrain is weird
I know the limit isn’t 109; I had planned to have up to 120 speakers, but from what I’m seeing online, there aren’t many datasets available.
EARS, M4, etc
the speaker (timbre) variation you cover matters (not only typical boring adult male & female voices), iirc that's what the KLM datasets have been achieving
Yo could any of the mods unban @steelshot4401 he has gained his account back from being hacked
tell him to file an appeal
👍
There aren’t many Spanish datasets, and when they do exist, each speaker usually has only 2 to 3 minutes of audio, which probably isn’t very useful
ah, yeah
for english tts models they use stuff like public transcripts and speeches from the European Parliament and stuff like that
remember that the pretrain language doesn't matter and a spanish dataset wont improve spanish pronunciation in rvc
@surreal acorn no advertising
is vonovox premium worth it?
Nah just use fl studio with Vonovox to get around paying, unless you want to support the creator of course
Is it better to use grayscale image or RGB image for training model (cv, yolo) in operational implementation?
only for some cosmetic filter effects
Yo guys
what female-male ai voices do you recommend
some good ones to the ear for roleplay
yr smart
d
See voice models in #1175430844685484042, but it's hard to recommend one.
Does anyone know how to fix this?
Hello everyone! Can you tell me what this error is and how to fix it?
Hey all, quick and extremely nooby question hehe. I'm trying to make a use-case diagram for a game project where a human will compete against an AI rival. Should the AI rival be a separate "Actor" or just part of the system?
@lilac ridge We don't allow promoting here
This server is english only, be sure to not use youtube/video tutorials for realtime voice changers, and to elaborate in #1192011222023950368
can someone give me a good dataset maker that i can download?
ive been using instant dataset marker for a long time
@true osprey We don't allow promoting, most of your messages are just promos, you have been warned and all your promos have been deleted.
Bad ai hub user bad
are there any voice changer (rt-vc) clients that support refinegan?
also many settings on the training uis say pick this if your dataset is small or pick this if your dataset is large
i have about 1 hour 15 minutes of clean audio, is that large or small?
Use batch 12, it's large
For those working with LLM or AI agent systems, what parts of managing or debugging them feel most painful or time-consuming in your day-to-day work?
1 hour is fine, it's not large
if you’re wondering what the ‘max amount’ you can train and still notice an improvement, i'd say it's about 2 hours worth of audios
rvc models don’t really benefit much past 2 hours of audios, nothing really changes past that
safest batch size for me are 8 and 16
should still work fine, if you notice your model is behaving weird like randomly glitching for no reason, stop the training and reduce the batch
no
hmm i think applio realtime supports it (? but then, the new refinegan is still wip, author is training the pretrain
@ebon oracle Please use help channels next time and wait for an helper, there was a big wall of text so I deleted it, and be aware that Codename’s RVC Fork 4 is experimental, it got removed from the ai hub docs because it’s not suggested especially to people who aren't developers, as pointed out before in previous cases by other people like Lyery & Noobies (An Applio Developer): #🔥│model-maker-chat message #🔥│model-maker-chat message
im not his friend but whatever and yes i'd recommend applio instead if you're learning how to train models
Vonovox patreon gives voice effects, which are easier and internal (less delay) compared to the fl studio method suggested by the other user, ofcourse it's your choice but just know that it's not a need
I thought you were but I just edited the message, but yeah I was just saying that because it happened before that users tried that fork and it resulted in models exploding
ai
you're ai?
no
what are you?
a worm
oh no..

StableDelusion
can you send the invite link tho
hi chat, does smoothed loss scale with dataset size?
like is it not normalized for size and just some sort of
sum of errors
@ebon oracle Please use the sregister command to verify yourself and start playing
honestly not sure what this bot wants from me
Almost there! Click the button below to verify yourself on our website.
Join our Official Support Server if you face any issues!
this is not possible
if that thing can send a discord server so should we
mods kill it
also, if the discriminator stops improving but the generator is still improving, is that fine or still bad overfitting wise?
bad, the discriminator is what forces the generator to create realistic audio
mmmm, gotta love having a too intelligent generator
i mean the discriminator is improving, but very slowly
while generator loss is falling off a cliff
and getting worse at times
rvc case is unique because there's no thing such as overtraining... models get worse due to the og pretrain (likely) being trained using different stuff instead of what we have in the project
rvc uses a fused discriminator that was meant for tts, not speech to speech
it's named multi-period discriminator, and also in rvc it has extra periods that only cause models to sound robotic
the losses aren't very accurate here, instead just listen to the model
oh ok
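For anyone wondering what the "periods" in the multi-period discriminator mentioned above are, here's a minimal sketch of the folding step. In HiFi-GAN the discriminator uses periods 2, 3, 5, 7 and 11, and if I remember right the extra ones in RVC are 17, 23 and 37, so treat those numbers as an assumption. `fold_by_period` is an illustrative helper; real implementations use reflect padding and 2-D convolutions on top, this only shows the reshape.

```python
def fold_by_period(samples, period):
    """Reshape a 1-D signal into rows of `period` samples (zero-padded at the end)."""
    pad = (-len(samples)) % period
    padded = list(samples) + [0.0] * pad
    return [padded[i:i + period] for i in range(0, len(padded), period)]

signal = [float(i) for i in range(10)]
grid = fold_by_period(signal, 3)

# Each column holds samples spaced `period` apart: column 0 is samples 0, 3, 6, 9.
# A sub-discriminator looking down the columns sees the audio at that stride.
column0 = [row[0] for row in grid]
print(grid)
print(column0)
```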
Yeah I was just saying for the message above which was in russian, don't worry
can anyone suggest how to analyze a 2500 page book and split it into a summarization per chapter as well as overall summary?
that's exceedingly large for any LLM
best you can do a summary per chapter, then re-run summary on the summary
i think it might need a RAG setup where it chunks... but i tried that before and it didnt go well
how about per section (or subsection) summary then per chapter
I think most textbooks are quite structured like such
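The section-then-chapter-then-book idea above could be sketched like this. `summarize` is a placeholder for whatever LLM call you'd actually use (nothing here is a real library API), and `fake_summarize` is just a stand-in so the sketch runs without a model.

```python
def chunk_words(text, max_words):
    """Split text into pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def summarize_long(text, summarize, max_words):
    """Summarize each chunk, then summarize the joined chunk summaries."""
    partials = [summarize(chunk) for chunk in chunk_words(text, max_words)]
    if len(partials) <= 1:
        return partials[0] if partials else ""
    return summarize(" ".join(partials))

def summarize_book(chapters, summarize, max_words=2000):
    """chapters: dict of title -> text. Returns (per-chapter summaries, overall summary)."""
    per_chapter = {title: summarize_long(text, summarize, max_words)
                   for title, text in chapters.items()}
    overall = summarize(" ".join(per_chapter.values()))
    return per_chapter, overall

# Stand-in summarizer (hypothetical): keep the first 5 words.
def fake_summarize(text):
    return " ".join(text.split()[:5])

book = {"ch1": "a b c d e f g h", "ch2": "x y"}
per_chapter, overall = summarize_book(book, fake_summarize, max_words=4)
print(per_chapter)
print(overall)
```

One caveat: for a 2500-page book the joined summaries can still be too long for one call, in which case the same chunk-and-resummarize step would have to be applied again at the top level.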
Therefore, based on what we have discussed, the best approach to training a pretrained model is to use only speech datasets. Then, if I want a singing version, I can simply fine-tune that pretraining using singing datasets.
yup
At least with the EARS dataset I can reach the amount I wanted; now the only thing left is to find where to train it.
can u give me media perms?
i use vast.ai
I think you need to reach level 30 to get media permissions.
the original architecture of rvc (mpd + nsf hifigan) faces a lot of problems while training from scratch, like mirroring and models not being able to sing. keep in mind that it's not easy to train a pretrain
maybe noobies can assist you in that regard, pretrains are just very hard to train
Noobies told me a while ago that I could try refinegan, maybe I can give it a shot and see how it turns out
ask @dawn temple to give back your role
!give-media-perms @glad nebula
done
huh i can't send images
now you can

how do i set up my voice changer
it keeps saying {SIO} rconnection failed Error: xhr poll error

who can help me set up the voice changer ill give a nitro boost or usd dm me
When you work a job in real life, you'll get cash. 
I like to help people, but I don't need a reward like Nitro or cash. For help about W-Okada or Vonovox, it's best to go to #✨│ai-help or #1192011222023950368 instead of your direct message.
anyh1 got latina voice or smt

someone have tips for a voice module that sounds good
i dont find any good one for my voice
No go away
subscribe to my channel fr im making some ai slop plz plz
https://www.youtube.com/@StorytimeDann
im not a robot 😡
Modsss
I may not be a mod but I am here
I'd be a pretty good mod, I'm actively here because I have no life

hi
you can apply for staff i guess
I doubt I meet the requirements but I guess there's no harm in trying
I actually remember why I am not a staff member
They called me dumb ❤️
if they're talking about rvc dataset cleaning then you only really want to remove background music and reverb, and train only one speaker
denoising, de-ess etc just hurts the model
anything that removes something from the spectrum or limits the dynamic range is bad
de-essers are compressors
I always wondered what they actually did besides just making the S sounds smoother
yea its a type of compression that reduces the harshness of sibilants
So it's better to just use the raw audio after removing any bg audio and reverb/echo
No extra fx to make it sound "better" in any way
correct, just remove bg and reverb
Got it
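To show what "limits the dynamic range" means in practice, here's a toy sketch of a compressor. It's a simplification: a real de-esser only compresses the sibilant frequency band and has attack/release smoothing, while this hard-knee version acts on every sample. `compress` and `peak_to_quietest` are made-up names.

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Hard-knee compressor on normalized samples (-1..1), no attack/release."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            # Anything above the threshold only grows at 1/ratio of its original rate.
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out

def peak_to_quietest(samples):
    """Crude dynamic-range measure: loudest magnitude over quietest non-zero one."""
    mags = [abs(s) for s in samples if s != 0]
    return max(mags) / min(mags)

dry = [0.1, -0.9, 0.5, 0.8]
wet = compress(dry)
print(peak_to_quietest(dry), peak_to_quietest(wet))
```

The loud samples get pulled toward the quiet ones, so the processed clip has less loud-vs-quiet contrast for the model to learn from.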
it can handle a small amount of natural room reverb (not fx/fake), the model randomly adds it in inference outputs, it doesnt hurt learning or anything but can be annoying
So if there isn't a way to remove the small amount of room reverb it's not the end of the world for training that particular model?
hi
nop... 99% of the problems in models is due to the og pretrain
the other 1% is the dataset
Damn
its the combination of cvec and og pretrain
If only legacy core would continue getting public updates 😞
Would be the perfect pretrain to build off of toward a truly good pretrain that finally beats og fully
oh i just finished training legacy 3.0 yesterday at night
Nice
i tested it, also confirmed the problems of models is just the og
Do you have any samples?
og pretrain for some reason stops learning dataset features at some point of the training
That's odd
for example epoch 30 and epoch 40 of og will sound most likely the same, but when you use a proper pretrain, you can notice how epoch 40 learned something else, like the lisp of the voice, or the way the person talks
the robotic sound is bc the discriminator of the og pretrain starts to fail
so everything breaks slowly
first no more features are learned, then disc dies
Makes sense tbh, that's why "overtraining" happens
I can tell the difference immediately and I don't even have headphones on lol
og sounds very loud and robotic because it's starting to fail
mine has a little bit of ringing, but thats actually a rvc problem, cannot be solved
could be nsf hifigan idk
You could try training without hifigan to see if that changes anything
im interested in noobies refinegan, if he releases it i'll be trying it
I forgot about that, what's it supposed to do compared to the current hifigan
it's not possible for models to overtrain, models sound robotic and bad with og after a couple of epochs because the discriminator fails
the model always keeps learning something
i can finetune mine for multiple epochs and it wont sound robotic
since legacy 2.5 was trained using og, it has this same issue
3.0 was trained from scratch, no og involved
it learned everything by itself
I'd love to be able to check out lc 3.0 if possible but I get it if not :p
They're not spiky and all over the place 🙏
oh another problem 2.5 has is that the speakers are completely broken, they don't sound like the original voices at all in the model
that also comes from the og pretrain
So og pretrain completely destroyed the work you put into 2.5?
but in 3.0 they all sound like themselves
its broken yeah
just found all of that while training the new one
i think og is broken because the author used something different than what we have in rvc, and also that the vctk dataset is just bad
its not normal for the discriminator to fail
Maybe that's why when I trained Batman with 2.5 and 1.0 they sounded completely different
yea any model trained with og gets broken, regardless if its a pretrain (legacy core 2.5/1) or a regular model



