#🧬│ai-chat
what happened to em?
Anybody got sora ai codes
sora ai code should be a banned word
it is fking annoying
make it unable to be said like the k + the y + the s
its allowed if its banter
hello everyone
Here's your Sora, as you wished. https://static.wikitide.net/bluearchivewiki/9/99/Sora.png
that's a big forehead
That still counts as begging, by the way.
i could land a plane on that forehead
anyone have a sora 2 code?
hey
I need my model maker back mannn, yall can't keep me waiting for over a day, i'm fiending to post rn 😭
we got 2 still
and we are thinking of removing qcs soon
@dawn temple btw when are we going to change the qc system?
my voice changer keeps cutting
like im talking then it stutters
Soon, I'll start working on the bot next week
can you help me with my settings
For W-Okada, better ask in #✨│ai-help or #1192011222023950368.
I no longer have Weights Pro, but you can tell some of the staff behind the website might have been underpaid at some point, so they put some ads there to generate revenue.
I've known that this website had a few hidden ads before, but not as many as now, and they were less obvious for most people to spot on the frontend or client side.
Why would you pay them a subscription?
The thing is, I never paid them. The "free" Weights Premium or Pro was given to me by Bea.
Interesting
Didn't know u could get given a subscription
Is it like someone else paid for a month of a subscription or something like that, like how on discord u can gift nitro
But for weights
I helped a bit with something on the "Weights" website 11 months ago, although I only got the actual prize some time in early 2025.
That's neat
what you thinking about gemini-3.0-pro
I use it everyday
It's awesome ngl
I just got a 5090, what kind of cool things can I do with it? Besides gaming and pcvr
ai voice changer will run basically perfectly
you could do so many things on that thing
help me set it up
I know. But what are some
honestly ive never thought about the things you could do with a 5090, but you could, run a ChatGPT level model and have a local no limit chatgpt, you could generate 8K images in only a few seconds, honestly soo much
@novel scarab
who r u ?
zain nro
chatgpt model will never go public to host locally, only through cloud API access, and you mean hundred bil param models which need a workstation with more than 96 GB vram
ahh, wait yeah youre right, thanks for correcting me
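To put rough numbers on the VRAM point above: the memory needed just to hold a model's weights is about (parameter count) x (bytes per parameter). This is only a back-of-the-envelope sketch; the function name is made up for illustration, the 100B figure is a hypothetical example size, and it ignores the KV cache and activations, which need extra memory on top.

```python
def weights_vram_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory needed just to store the weights, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same precision.
    KV cache, activations and framework overhead are NOT included.
    """
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


GPU_VRAM_GB = 32  # the 32 GB figure mentioned in the chat for a 5090

for bits in (16, 8, 4):
    need = weights_vram_gb(100, bits)  # hypothetical 100B-parameter model
    verdict = "fits" if need <= GPU_VRAM_GB else "does not fit"
    print(f"100B params at {bits}-bit: about {need:.0f} GB of weights, {verdict} in {GPU_VRAM_GB} GB")
```

So even at 4-bit, a hundred-billion-parameter model's weights alone are bigger than a single 32 GB card, while a much smaller model (for example around 30B at 4-bit, roughly 15 GB) leaves room to spare.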
there are many AI applications that benefit from up to 32 GB vram, though the 5090 is technically moddable to 96 GB vram (much like the RTX Pro 6000)
note that only some skilled ppl like paulo gomez (search in youtube) and some chinese ppl can do vram modding & bios tweaking
u can give it to me
That would be cool because i need to get rid of my 4080 super
just wait for 5080 super 24 GB, while the 4090 price hasn't significantly dropped
Okay, anyone online who does that where i can pay them to mod it?
I upgraded from a 4080, overclocked to run like a 4080 ti super, if that was a thing oc
Bro i upgraded from a 1650 to a 4060 to 4080 super
it's still risky even for such professional modders
1050 mobile > 3070> 4080>5090
Literally
also there may be no modders available in your region, maybe only in china
Aw man
I saw a 4090 upgrade vram kit thing
You think it will make its way to the 5090 soonish
Heyo does anyone know if there's any ai autotune that can run live?
make me a picture of a fish
for a 5090 should i use W-Okada or Vonovox for best quality
vonovox
Hey everyone, Grant here 🖐️hope the week’s been good so far. Studied Law & Finance at LSE, but dropped the theory in 2020 to focus on algo trading… I study market structure and build automated risk systems mainly how ETFs distort volatility and how to design code that absorbs that chaos instead of getting wrecked by it. Excited to see what the builders in here are working on
Vonovox can give much better audio quality, as some other people are saying, but its UI is less friendly and more professional than b2335 W-Okada.
It looks like it auto adjusts the delay, or can I manually change that
The v.1.5.x.x and older versions of mainline W-Okada, and also Deiteris/Tg Develop W-Okada forks, aren't known to work with GeForce RTX 50 series.
I'm wondering if Nvidia NeRF has been used by anyone here?
soon
is this really the best voice changer there is?
i have 5090 and 5070
i see stuff like veo 3 wan 2.5 and wonder how voice changers that focus on voices cant get that good
but a image model can? so i just wanna know if this is still the best or not
can run anything really
because veo 3 and wan 2.5 are made by multibillion dollar companies and the voice changers are made by individuals/small teams
ahh ig ur right
does make sense
you haven't seen the best and the worst of voice models and image loras you could ever find
can anyone help me with the voice changer?
oh ive seen some pretty bad voice models but have i seen the best voice models. no
ive made a few
ive checked a bunch
i make loras on every new model that comes out that i think is worth it for me
i just hope they release wan 2.5
i pray
I'm pretty sure it's not that hard to find bad SD 1.5 and flux loras
though SD 1.5 tends to be worse
its not, i just go on civitai, but i also make stuff that suits my own needs, its not that hard
i use flux only
bc i do realism mostly
sdxl etc just not consistent
so i dont use it much
RVC seems to still have limitations for average to mediocre quality datasets
but have you tried TTS cloning with chatterbox?
i never heard of any of that
yea i just run clown fish with it for the most part
i mean from samples rvc sounds better ngl😭
only if ur dataset good though so could be why
U can find the worst of the worst if you just go on weights.gg and search through the models directly made on weights
is there a way to hear it back when you said something because it works but I cant hear it back myself
Get voicemod and turn on the hear yourself button
omd I dont know why I didnt think of that thats rlly obv 😭 ty
I'd rather take that as sarcasm and not some practical solution. 
Im not being sarcastic its acc real 😭

Using voicemod is the easiest and least tech-savvy way of being able to hear yourself with the voice changer
Hello, are there any voice changers that work on Windows 11 on ARM with Snapdragon CPUs?
Hello everyone,
I specialize in smart contract development (Solidity, Rust) and full-stack dApp architecture using technologies like Next.js, Node.js, and PostgreSQL. I’ve built and integrated onchain systems across Ethereum, Solana, and L2 networks including zkSync, Arbitrum, and Optimism.
My toolkit includes Viem, Ethers.js, Wagmi, TailwindCSS, OpenZeppelin, Foundry, Hardhat, and Anchor. I’m also familiar with IPFS, The Graph, LayerZero, and cross-chain messaging protocols.
I’m focused on building secure, scalable, and user-friendly Web3 applications.
If you have a new idea and need a developer, feel free to reach out—I’d be happy to collaborate.
Thanks!
hi does anyone here have a girl slovak voice model?
read the rules
it's way too common so just making sure
alr
yo wont ai voice changer break my mic?
nope
as long as you're using a trusted one like Vonovox, or one of the two wokada forks
i'm using w-okada i think
i just want to do some voice chat trolls and i don't wanna break my mic
trolling of what kinda exactly
it is yes
btw if u need help pls ask here ^^
https://discord.com/channels/1159260121998827560/1159290139609137264
ok thanks
np
what voice are u using?
idk i tried a bunch of voices they all sound bad and my settings r good
it could be your mic maybe, is it picking your voice up good
s
s
what's that
oh ur mic sorry
I forgor that was what we were talking about
are you able to record a sample of what it sounds like when u use any model?
btw how close or far away are you from your mic
that's ok
like normal talking
I meant like are you physically close to it or
maybe try getting your head a little farther back? see if that helps
its the same
hm
wait can't u just record with snipping tool for the voice changer to show how it sounds
there should be a feature in it to record
wait
where can i send?
u can send here pretty sure but you'd have to ask a mod if u can't send images etc
i can't send here
there's 6 regular mods online and one old man/j
can't i just send it on your dms?
sure
available red dead redemption ultimate edition account full access price 20$
wsp

i ate a live dolphin
this message was sent with a delay of 3 seconds
it was?
yeah, this one too
yeah it was
i can record it 😭
on your end
ur responding in like 3 seconds
unless u type and react at the speed of light
nuh uh
on your end its fast but it took a few seconds to be uploaded to discord
and i just type fast
this message was sent with a delay of 6 seconds
yeah, no it wasn't, yours was sent with a delay of 3 tho
this message was sent with a delay of 16 seconds
no it wasnt
yeh it was
no, im just doing the same thing u did to me
trolling
lol
no, i can actually see the delay
thats y u wer responding in 3 seconds but also saying it takes 3 seconds right
also the message on ur screen doesnt go white until its public
u can visually see when it sends
no, the delay from me is typing, and there is a delay of when it gets sent to the cloud for everyone to see
but i can see the delay
discord does not tell you the delay of a message natively
when ur message goes from 🩶 to 🤍
:D
glad u learned sumthing new
thats when it gets sent to the cloud which then can take a second before its made public
but i can see the delay
cause of the way i have discord set-up
does anyone sound like a real girl here and can test with me during a call?
dude, so many people are gonna be lining up with their AI voice changers
You're number 69420 asking for this stuff, read the rulesssss
Hey can u help me with something
depends
Voice models seem to be all the hype here🤔
could anyone help
Doesn't seem as interesting as other uses of AI, why is it so popular👀
some one got good e boy voice ai
Because other common uses are dominated by other services and sites, so a lot of people that come here come for the voice models and applications thereof.
That is interesting, when I first came here I didn't think it would be a hub for a niche
I don't have a GPU for ai covers, only CPU
You cared about the delay? Well, it's because I use my hands to type on keyboard and then send one. 
die
For help with W-Okada, better ask in #✨│ai-help or #1192011222023950368
YOOO CAN SOMEONE MAKE A GHOST FACE VOICE PLSS
To request a voice model, make a thread in #1159289738314919936.
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
by IA Hispano
Google Colab
by Hina
Google Colab
by Eddy
Google Colab
by Eddy
Google Colab
by Deiteris & Hina
Google Colab
by Shiro & Eddy
Google Colab
by Nick088
Google Colab
by Nick088
Google Colab
by Jarredou & Makidanye
Google Colab
how do I change my voice with a neural network, not in real time, but by uploading a file and converting the voice in it with an AI voice changer?
Hi
Guys, is it safe to use a random sora2 code I've found on the internet?
Hi, you can try LunaBloom AI if you want to create social content or even professional videos for business purposes. Content can be created in 50+ languages, and custom AI avatars are also available. It's a text-to-video generator. HeyGen is also good for creating professional AI-avatar-driven videos.
This question sounds stupid. For voice models, find one in #1175430844685484042.
why are you replying to messages from 4 months ago
recommend one
arent u a helper
do ur job
😐
You asked as if I know every model in the entire library.
nvm i figured it out
-huggingface
Huggingface Space by r3gm
Huggingface Space by IA Hispano
HuggingFace Space by Nick088
Rules, read em
Guys, does anyone know why I try different voice models with RVC but most of them sound like, the mic cutting off for no reason? Im talking very clearly, mic close to mouth. Seems strange to me.
what voice changer are u using? did u get it from a yt tutorial?
no, im using Real Time Voice Changer Client, the official one
there is no official one
I do have a model that works much better than the others somehow, I wonder how
I mean the github Real Time Voice Changer Client one everyone shows
ooh
and the original which is outdated
i cannot share image but, it says on top v.1.5.3.18a - onnxdirectML - cuda (Im using AMD CPU and GPU)
1 model does work good the others cut my microphone off, seems weird
we should probably move to https://discord.com/channels/1159260121998827560/1159290139609137264
okey ^^
the og version is garbage for AMD gpus
that's like an over-a-year-old version of original wokada
dw I gave him deiteris fork, I had a friend who used the tg fork one and it's confusing and he said buggy
they have amd
It's not buggy at all, it's only "confusing" because it's a newer personalizable interface, every bug that's in the wokada tg-develop fork is also in deiteris fork lol, like crashing when you edit the model's name via the UI
I never got to ask him about it before he deleted it but he said it was buggy, idk what bugs he encountered tho
smh adapt to new UIs 
he can ask me for help in #1192011222023950368 if he wants to try it
but i checked the commits and i assure you that deiteris has the same exact bugs
the only differences between wokada deiteris fork and tg-develop are:
- UI
- spin model support
- voice effects
damn don't eat ur friend 😔
I didn't eat him 
it's because you're super awesome
Anyone got a stackwopo voice
Not quite found in #1175430844685484042. To get one, you either open a request in #1159289738314919936 or train one by yourself.
What is the best way to get the sound effects from the instrumental?

When you talk a bit in this server, your name will turn green/blue. 
¿?
Hi everyone
I’ve been working on a small AI project related to agriculture.
I trained a DenseNet121 model on Google Colab using my own dataset,
but the accuracy is only around 43% so far.
I’m not an AI expert, but I’d love to learn how to improve the results
any suggestions or best practices for improving model performance when the dataset is small or not perfectly balanced?
Thanks in advance for any tips 🙏
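On the "not perfectly balanced" part of the question above, one commonly suggested step is to weight the loss so rare classes count more. Here is a minimal sketch of inverse-frequency class weights, the same formula scikit-learn uses for `class_weight="balanced"`: `n_samples / (n_classes * count)`. The label names and counts are made up for illustration and are not from the dataset described above.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count).

    Rare classes get weights above 1, common classes get weights below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Hypothetical, made-up label distribution for a small crop-disease dataset.
labels = ["healthy"] * 80 + ["rust"] * 15 + ["blight"] * 5
weights = balanced_class_weights(labels)

for cls, w in sorted(weights.items()):
    print(f"{cls}: {w:.3f}")
```

These weights would then be handed to whatever weighted loss the training framework offers. Whether that is enough to move accuracy off 43% is not something this sketch can promise; starting from pretrained weights and adding data augmentation are the other usual suggestions for small datasets.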
read the rules
it's for school
i have to prove that ai is realistic
so please give

no
plss
no
why
bro gt 1030 is lagging the voices
Welcome
hi everyone. i'm training mouse trajectory AI with a simple PyTorch LSTM model and a mousemove events dataset collected from 65000 real visitors for passing akamai & datadome bot checking. the loss stays at ~0.000024 after 12 epochs. but the generated trajectory still looks very awkward. can anyone tell me how to improve my model?
could anyone help me??
the way the mouse event data was collected seems questionable, do the "real visitors" actually consent with it? otherwise it sounds like privacy invasion to them
i think mousemove data itself is not related to privacy without webpage context.
besides, i'm only recording relative positions, and only on my own website.
how about when to start and stop the mousemove recording?
what's more, many legal marketing services like Microsoft Clarity and Sentry records mousemove events and all DOM changes with absolute positions without prompt, and let webmaster replay them exactly for each sessions.
do you mean relative mouse position to the window position (yea exactly means focusing on the window context)?
relative mouse position to the starting point. i only need these to train mouse trajectory model.
I think it doesn't matter, the relative mouse position to the window could be done without concerning "privacy invasion" as long there are no other data that may be sensitive
mousemove event won't be triggered if your cursor not inside the webpage.
well I bet it could be why it doesn't seem to be accurate
again the mouse position to window could be more accurate (without being "invasive"), though you should consider some other factors like window size and the webpage's responsive UI
as i know, behavior analysis system of most antibot service only care about how realistic of mouse trajectory itself, not the click positions. so i think relative positions are enough.
how about the event when it starts recording the trajectory
note that the users don't manually trigger it and the mouse pos when clicking a button could vary
nothing happened. just like the Microsoft Clarity. I put things like I will record and upload anonymized behavior data for security research in the privacy policy of my site.
does it involve a little computer vision or purely mousemove data?
when a click event happen, the trajectory will be marked as end then start record the next trajectory.
no computer vision. purely mousemove data.
what kind of the click event exactly?
actually it is the mouseup event. it will be triggered when you click anywhere inside the webpage.
again, where they clicked varies quite a lot, so that's why it is inaccurate
i don't think the antibot service will render all DOMs on their backend and know positions of all buttons. the intention of my mouse trajectory model is only prove "there is an actual human browsing your website" to the antibot service so my data crawler won't be blocked
re-reading your topic, I dont think people could move the mouse and click while being blind, so that's why it is not feasible without requiring such data that may be invasive
i think no matter what is the intention of people clicking the mouse, it ends up the same mouseup event. there should be no statistical difference.
after reverse engineering the akamai bot manager code, it uploads all mouse and keyboard and window and sensor events to the server. but I think making a mouse trajectory model is enough.
different position results in different mousemove, that's the point you're ignoring
i read many papers about mouse synthesizer and no one mentions screen position will cause statistical difference of mouse trajectory. but i can't confirm that.
and the akamai only collect pageX/pageY (positions relative to 0,0 of the webpage) as well, not screenX and screenY
my point is the (pageX, pageY), sorry not have to be (posXtowindow, posYtowindow), never meant to be screen one
and I thought what you were talking is (dX, dY)
which could certainly vary for different actual positions
I'd expect NVIDIA GeForce GTX 1050 Ti as minimum possible, which has 4GB VRAM, for W-Okada and even RVC. This one is GeForce GT 1030, has 2GB of VRAM, so you can't do much about it.
and you'd still need to read the webpage rendering size and how it positions the responsive UI elements
using (dX, dY) is like navigating the trip path without a compass 
You can go for any anime or fictional character voice model. It doesn't always have to be the "girl" voice model as you'd expect.
when I talk with the model it sounds fine, but when I record or talk with it in a call it sounds super fake and like choppy
For W-Okada the realtime voice changer, go to #✨│ai-help or #1192011222023950368 and explain about your issues there.
okay
See #1175430844685484042.
any of yall got a sora code
A character named Sora is in Blue Archive game. 


pretty sure anyone that did have one has used it already
your point is basically like a macro but it tries to reproduce variation of the users' mouse movement for that purpose. it'd make sense to use (pageX, pageY) instead of (dX, dY).
it'd probably require software-level hook (or OS-level), though I'm not sure if websites could have detection as invasive as some sophisticated anticheat softwares.
For those who live outside North America and keep asking me for OpenAI Sora 2 invites, here's the thing: their blog stated that Sora 2 is available only on Apple iOS in the United States and Canada, through invites.
-rt
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Deiteris Fork, with extra features, but supported only for Nvidia GPUs on Windows. and without cloud options GUIDE
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project. GUIDE
Most suggested WebUI with the best general support for many platforms. GUIDE
For Windows Nvidia, Both Wokada Deiteris fork and Vonovox have similar performance & quality. Users should read the pros and cons for both and choose based on their differences, such as UI and Vonovox's paid effects.
Read Wokada Deiteris Fork Pros&Cons & Vonovox Pros&Cons
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
Has anyone found a way to get bypass certain sora ai filters consistently without having to phrase things like Yoda?
yes
give me
You're asking me as if I know and have heard every voice model in the entire library, what.
Think, think, use your ears to listen to each one of those. It doesn't always have to be someone else picking one for you.
I can help with settings in W-Okada, but picking one good voice model for you is too much, you know.



If you want "Touhou" voice model, search keyword "Touhou" in #1175430844685484042 or use Weights bot in #🔍│find-models
ok help pls
One more time, homie, if you keep asking me in #🧬│ai-chat instead of #✨│ai-help, or keep annoying other people, you won't genuinely get helped.
ok im sorry i will leave 🥺 😢
Hello
There's a type of videos that I really wanna create but I don't know how!
can someone help?
anyone familiar with generative audio?
i have some questions but chatgpt/gemini cannot answer me
was wanting to discuss or ask questions to someone
what kind of "generative audio"? please be specific, that's why neither it nor I can understand what you mean
i had some question regarding latency differences between Text-to-audio and text-to-speech. like architecture wise they both seem to employ similar architectures but TTS is fundamentally usually like 10x faster and I can't understand why. I get TTS is just lower dimension as a whole i guess but idk. sampling rates and stuff tend to be the same so i don't get the difference. I've been arguing with gemini for 2-3hrs but losing it.
what is difference of "audio" and "speech" according to your definition?
so TTS/speech would be:
"I am walking my dog" and that is directly converted to sounds that pronounce that.
TTA/audio would be:
"90s rock music with heavy drums" and we create sound that roughly reflects this label. so this is more similar to image generation. however somehow its much slower atleast on the few SOTA models i tried.
and usually these TTA models cannot do speech, and the TTS model cannot do general audio
so you meant "audio" as music, and it is prompt-based music generation which is basically Suno, etc.
that's completely different from TTS, you're just comparing a dog with a banana

yea music or music + soundeffects like "waterfall sounds". but why is there such a difference in latency? ive tried a TTS diffusion model and TTA diffusion model but why is it so different. and the actual models themselves seem fairly similar too.
so it feels weird to me
what's the point of comparing the generated waterfall sound to the TTS voice saying "waterfall sounds"?
or you mean for classifying the former sound to the text spoken by the latter as the label?
I'm not comparing them they are for different purposes sure. But it feels weird one is much slower even though it's both just audio.
Because I'm trying to make a system that serves both speech and general audio but because of big latency differences there are load balancing/serving issues
yeah it's so weird why dogs can walk but bananas can't?
Yea I get it's different but like how tf it's both just sound
Like image and video makes sense this is just both audio
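This doesn't explain why the two model types differ in speed, but for the serving problem mentioned above (fast speech jobs stuck behind slow general-audio jobs), a common pattern is one queue per workload, each with its own workers. A minimal sketch follows; the class name and the job kinds `"tts"` and `"tta"` are hypothetical labels for this example, not part of any real serving framework.

```python
from collections import deque


class AudioRouter:
    """Keeps one FIFO queue per job kind so slow jobs never sit in front of fast ones."""

    def __init__(self, kinds):
        self.queues = {kind: deque() for kind in kinds}

    def submit(self, kind, prompt):
        """Add a job to the queue that matches its kind."""
        if kind not in self.queues:
            raise ValueError(f"unknown job kind: {kind!r}")
        self.queues[kind].append(prompt)

    def next_job(self, kind):
        """Pop the oldest job of this kind, or None if that queue is empty."""
        queue = self.queues[kind]
        return queue.popleft() if queue else None


router = AudioRouter(["tts", "tta"])
router.submit("tta", "90s rock music with heavy drums")  # slow job arrives first
router.submit("tts", "I am walking my dog")              # fast job arrives second

# A worker dedicated to speech picks up its job without waiting for the music job.
print(router.next_job("tts"))
print(router.next_job("tta"))
```

With a single shared queue, the speech request would have waited for the music request ahead of it. Splitting by kind also lets each pool be sized for its own latency.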
hello can I share links here?
Is there any good lipsyncing website or software that makes these AI interviews?
There's a channel called Mind to Goals
Anyone give me the active invitation code sora 2. I will pay him 20 dollars as a thank you. With much respect.
Yo I have
Somebody have a sora code?
I do
Would you share it to me pls
no it's mine
LOL
Bro i’m just trying to make anime fight videos
Like the tik toks
It’s so cool
It is not
Why
anyone maybe has a sora 2 code pls i need it so bad
I have it! BUT it's mine
does someone have experience with lipsyncing?
😭😭😭😭
Brooo
U cant like give it to more people or sum?
Echte help me plz
Why nobody is helping me here???
im so sorry, what do you need help with?
I'm really looking for someone who is good with lipsyncing videos
to help me out
to make videos like this
ohh, ive never seen anything like that
seems just like it was generated with the audio tho
i've got the ai voiceover and the actual footage of elon musk in this video but lip syncing it is the hardest part tbh
This isn't the help channel
Where in this server i can hire people?
Is there an AI like Suno AI that I can use?
we have no recruitment in this server
okay.
please ask in #✨│ai-help and i can assist you further
my dad has this guy's number too @vague escarp but we don't call him
no promotions, advertisement, and hiring are allowed
we need eboy
we?
im probably gonna end up feeling like that at some point
Hello, how are you? Any recommendations for AI soundtrack creators? One that's free and unlimited?
https://ilovesong.ai/ pretty good and it allows you to create 3 free songs to test
oh the pain im gonna have to endure, maybe not everyday, but most days
please no
Anyone know any good voice changers I use voice mod but it sounds so bunzzz
what a question
ElevenLabs
Ai
Ai
does anyone know any good ai voice model creators
do the ai girl voices actually work or do u have to like try to sound like a girl
Hi all
Hey everyone 👋
We’re building an On-Board Courier (OBC) project — basically time-critical logistics where documents or parts get flown across countries within hours.
We want to integrate AI-driven automation for the entire process using Make (Integromat) and other tools.
The idea is to automate:
-AI-based pricing and route optimization
-Automatic quote and invoice generation
-Factoring and cashflow management via API
-Flight booking and courier assignment
We’re looking for someone experienced with Make, automation, or AI workflows who can help us build and connect these processes.
If anyone here has experience with AI automation, APIs, or Make scenarios, please DM me — I’d love to discuss the project and see if we can collaborate.
Thanks a lot!
thx
QQQQ
I really want an AI for coding but mistral can't do this one
So which AI should I use?
Can anyone guide me please?
@timid musk
hi
What are you planning on doing 🤔
does anyone have premade n8n automation templates that i can get?
i got some il hook u up with them
What even is a n8n automation template
u dont wanna know
catfishers smh
Is Samsung developing an Ai?
I've heard about it before, but I don't know if it's available or not
hello! how can i fine-tune lm studio models?
I have a question, I have been using this activation function:
If x > 0 then
Return 1
Else return 0
I think this is the derivative of ReLU, but it seems to work fine as an activation function and maybe even better and faster, so why not use it?
function like that only works in special cases
it is not learnable and can not be used for gradient back propagation
But I used gradient back propagation with it and it kinda worked.
I tested some xor variants on it only tho (like 3 bit, 2 bit and different patterns), and it worked with that, I even tested giving it some string sequences and it kinda worked
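To make the disagreement above concrete: the function described (1 if x > 0, else 0) is the step function, and its derivative is 0 everywhere except at x = 0, where it is undefined. A quick numerical check, written as a plain-Python sketch with illustrative function names:

```python
def step(x):
    """The activation from the question: 1 if x > 0, else 0."""
    return 1.0 if x > 0 else 0.0


def relu(x):
    """ReLU, for comparison. Its derivative is the step function above."""
    return x if x > 0 else 0.0


def numeric_grad(f, x, eps=1e-6):
    """Central-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


for x in (-2.0, -0.5, 0.5, 2.0):
    print(f"x={x:+.1f}  d(relu)/dx={numeric_grad(relu, x):.1f}  d(step)/dx={numeric_grad(step, x):.1f}")
```

ReLU's slope is 1 for positive inputs, while the step function's slope is 0 on both sides, so an exact backward pass sends no gradient through it to earlier layers. If training still "kinda worked", two possible explanations (both guesses, since the code isn't shown) are that only the weights after the last step layer were actually learning, or that the backward pass used something other than the true derivative, which is roughly what is called a straight-through estimator.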
Free premium training items are no more. I can no longer make models with them. I can only use credit cards now.
credits ≠ credit card
Just train on applio goofball
Unless it's images or smth
I want to use it for future AI Covers for myself, and back in the day, I loved using them for multi-voice covers
I actually have a 679 cover featuring Dean Wendt Barney, Avenger Chuck E. Cheese, and Kermit the Frog
And can voice models be made on Replay?
Nope, but never train voice models using anything related to weights
It's ass compared to actually good training software like applio or mainline
they can
whats the best model for backing vocals/lead vocals
Last update: August 18, 2025
can someone give me a link to mel denoise v2
yup
the doc doesnt seem to mention which exactly
I'd recommend BS roformer frazer, and then mel aufr33 and becruily
becruily karaoke sucks ass
that is voc-inst model
mine doesn’t work
frazer is better if u dont believe
Hey folks 👋 — has anyone here played with voice-first AI tools?
I’ve been testing an open-source one called Ito, and it’s wild how fast ideas turn into actions just by talking.
Do you think voice could actually replace typing someday?
Yo, why do we multiply the learning rate by each weight in the network individually, why not just multiply it by the gradient of each output neuron
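On the learning-rate question above: in plain SGD the learning rate multiplies the gradient of the loss with respect to each weight, not the weight itself. By the chain rule, that per-weight gradient is already the output neuron's gradient times the input feeding that weight. A minimal sketch, assuming a single linear layer `y = W x` with squared-error loss; all names here are illustrative.

```python
def sgd_step(W, x, target, lr):
    """One SGD update for y = W x with loss 0.5 * sum((y - target)^2).

    delta[i] is the gradient at output neuron i (y[i] - target[i]).
    The gradient for weight W[i][j] is delta[i] * x[j], so each weight
    gets its own update even though delta is shared across the row.
    """
    y = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]
    delta = [y_i - t_i for y_i, t_i in zip(y, target)]
    return [
        [w_ij - lr * d_i * x_j for w_ij, x_j in zip(row, x)]
        for row, d_i in zip(W, delta)
    ]


W = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, 2.0]          # the second input is twice as large as the first
target = [1.0, 0.0]     # only output neuron 0 has an error to fix
W_new = sgd_step(W, x, target, lr=0.1)
print(W_new)
```

Both weights in row 0 share the same output gradient, but the one attached to the larger input moves twice as far, because it contributed twice as much to the error. Scaling only by the output neuron's gradient would throw that information away.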
There's a model that says "frazer" in a customized MSST Colab notebook, but I don't remember the name of who made this notebook. 
yo whats an good model im a boy and want to sound like a girl for catfish lol
u have to speak english for any of them to actually work
and u cant laugh
this is if u wanna sound like a girl
any good settings? i kinda got a bad pc
wait
alr
Yeah, you two can help each other, but I won't be going for that.
alr
if u go above there will be more delay but it will sound better
okay
but 256 should be good
like ure just gonna speak? i dont speak english
np
yo can any1 help me with the settings? i just dont get it it sounds weird with my settings
can u come in call rq
its like laggy idk
yo can anyq come in vc with me and tell me if it sounds real?
What area of AI interest you most?
yo can any1 help me? i have to find the perfect voice, i got an deep voice and want to sound like an girl
does https://huggingface.co/becruily/bs-roformer-karaoke/tree/main work with msst?
what repo did you use?
There's no such thing as a perfect girl voice
Does anyone here know the best way to generate pictures locally? I just can't find any new guides on yt, only ones from about a year ago
Help me..
I trained a model on an NVIDIA T4 GPU using a 40-minute audio file that I downloaded from YouTube as an MP3, then converted to WAV. The training was done with a batch size of 8 for 250 epochs, but the output contains noticeable robotic noise. How can I fix this? I’m sharing the result below.
U gotta try to sound like a girl
idk i cant find a good voice
Take woman pills/j
can we go in vc and can u tell me if the voice sounds good?

nuh uh
Hey i wanted to ask whats the best ai tool to create summaries for studying at uni that implements different fonts and a nice formatting and marks important stuff (and exports it to pdf or word)
I didn't make that notebook myself. I got a custom MSST notebook from a #📰│dev-updates message
Nuh uh
WHO NEEDS A SORA CODE
i made a basic server of trust for any one to join
dm me if u want it 🙂
Hlo
this rule isn't a thing anymore
isn't it still illegal to help weirdos who want that stuff
if they blatantly say it though?
basically this
or to "troll" people on games to get stuff
really sad behavior
im not forcing anyone to help someone they don't wanna help, just saying asking why they want to use it is not appropriate to begin with
👍
if i go into a poster store and ask for a big poster and they ask what for, im not answering that question
it's frankly none of their business
fair enough
99% of the time tho people wanting "e-girl/e-boy" voices is for trolling catfishing behavior
only good reason for it is if they dislike their own voice or are trans
the entire point is to change ur voice to whatever you want
you don't need a reason
i would hope people would troll with it
trolling as like Batman or Darth Vader
but i also wouldn't conflate trolling with catfishing in its most literal sense
that's not for us to decide
bad people will do bad things with any tool
that's true, but to keep those bad actors from doing it if they ask for egirl whatever I don't help them
someone else can
ofc, you're not even staff it seems, i wouldn't make you do it even if you were (can't even if i wanted to)
just saying as a general rule of thumb don't ask, if someone is stupid enough to tell you they are straight up going to do it to commit fraud, ofc dont help them, but otherwise it's not your concern.
got it 👍
I'm not staff at all I just like sharing the voice changer since it's cool
for sure
but not for ppl who misuse it yknow
there are multiple! what gpu do u have?
AMD 9060XT which is somewhat equivalent to a Nvidia RTX 5060ti
is that Nvidia, AMD?
I'll just use google
okay lol
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Deiteris Fork, with extra features, but supported only on Nvidia GPUs on Windows and without cloud options. GUIDE
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project. GUIDE
Most suggested WebUI with the best general support for many platforms. GUIDE
For Windows Nvidia, Both Wokada Deiteris fork and Vonovox have similar performance & quality. Users should read the pros and cons for both and choose based on their differences, such as UI and Vonovox's paid effects.
Read Wokada Deiteris Fork Pros&Cons & Vonovox Pros&Cons
These options are not recommended for use.
Not suggested, the older versions shown in YouTube tutorials are even worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
eyah thats wat I was abt to say
I found reviews abt wokada and I gotta say its very realistic
should be a amd version linked in the download section
oh yea! it's really good
ohhh okay thanks!
tho idk if the latency is good for yknow gaming
would be nice if it wouldn't get delayed for more than 1 sec
that's all settings dependent, once you get the application set up I could help with them
okie I probably wont be setting it up rn cuz I dont even know what game imma use this on
but ill mention u if ever
alrighty I'll be here
yoo brother can you help me
Sure!
can i dm you?
I don't mind yea
I ain't helping with that stuff u shoulda read what I said earlier in this conversation
find someone else
alr
does anyone use vovox here?
tf is vovox
vonowox
tf is vonowox
i mean i doubt its the best
every voice changer that uses ai is basically the same
they use the same thing as in RVC
which vocal model should I use
Gabox FV4 or FV6
hi guys :) just wondering if ai voice changers are safe to use in CS2? (specifically w/ okada voice changer or if there are other alternatives)
Hi everyone
I am new to the community
I want to ask is it actually possible to get money selling AI automation services?
Bro i want to train my voice model and then make cover song on it how can I do that
Does anyone know why when I try to run the Http_ file it opens for a second and closes? Even when i re download
It’s like the final step to downloading the Mod
can you provide a link?
?
also is BS Roformer SW available locally?
Yeah it is available in uvr 5 ui
@crisp flame no advertising
Who wants a free month of perplexity pro? Included gpt+ gemini pro etc
Aha
rip RVC 2023-2025 :(
any 1 knows how to fix robotic voice
@errant portal no advertising
im making a medical ai called SanAI if anyone cares
Competition is high. I recommend developing a niche SaaS
looks alive to me :/
What does your MedicalAI do?
right now I am feeding it chest xray images
so nothing yet
Actually reminds me I've been curious to know what the most recent RVC developments have been (if there've been much since like, early 2024) I've had RVC GUI since like 2023 so I've been curious about newest training models and such along with the tech itself
uhh no its not
heres a link to the model
I thought you died
I mean, theres no New things, like rvc 3
rvc v2 is gone?
is it actually you micheal?
yes
Idk (it's I dont know ok Michael?)
oh ok
skibid itoilet RIZZ
67
my model is learning fast and getting better at detecting pneumonia
@lofty hull , since u were askign
How do I make UVR export FLACs in more than 44.1khz/16bit
it'd be nice to have Hi-Fi exports
i can tell u
if the model is trained to produce 44k/16bit, how would you get more?
lol
ridiculous
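To put rough numbers on the 44.1khz/16bit point: a minimal sketch, assuming the usual Nyquist limit (half the sample rate) and the textbook rule of about 6.02 dB per bit for ideal PCM. The function names are made up for illustration and are not part of UVR.

```python
def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def ideal_dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range of ideal PCM at this bit depth."""
    return 6.02 * bit_depth + 1.76


if __name__ == "__main__":
    for sample_rate in (22050, 44100, 96000):
        print(sample_rate, "Hz ->", nyquist_hz(sample_rate), "Hz max content")
    for bits in (16, 24):
        print(bits, "bit ->", round(ideal_dynamic_range_db(bits), 2), "dB")
```

If the model itself outputs 44.1 kHz audio, exporting to a higher-rate FLAC would only resample what is already there; it would not add content above about 22 kHz.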
can i ask a question to someone that knows ai training a little
what context
Coding as in Websites? orrr something like RVC?
huh
no like
can i dm u
wait its fine i cracked it
I'm not educated on sophisticated AI training for coding websites or something complex like a service or whatnot
But I can make voices
btw what model is THE BEST for isolating sfx from a youtube video/scene from tv show or movie (5.1 is out of the question)
hi
Updated uvr ui ?
what is that
Where you checked btw ?
Why does the voice changer cut out and sound bad
I see, I talked to the devs and they said you have to wait for a week. They are going to push updates soon.
should I download all of UVR5 UI models and some other models like MVSEP exclusives?
what model is this
model_mel_band_roformer_ep_3005_sdr_11.4360
i forgot
Please elaborate in #1192011222023950368
Hello! HRU?
"Mel-Roformer-Viperx-1143"
mel 1143 is an old model released 1.5 years ago and before BS roformer 1296 and 1297
judging by the quality, there's no reason to use it anymore
Hello, has anyone ever worked on an agent that gets its responses from a firebase database?
Any documentation about it?
my ai can now detect pneumonia via xray imaging
at a cost of burning a bag of coal at the power station vs a regular radiologist doing it in 0.5 seconds flat
Hello
is this true
does anyone have a Sora 2 invite code?
or does anyone know how to make AI vids w/out the 'policy guidelines'?
what's true? that using AI is a waste of resources for something that a human can do with ease?
its not a waste of resources at all
hi
anyone needs AI related work or projects done? im quite skilled in ML and RAG. and ill do the work with no funds, my payment is going to be the experience
if so just dm!
How many epochs would yall recommend for a 2 hour dataset model
Funny question
Considering we use RVC, which modifies/manipulates your actual voice into the target voice with AI, it has limitations: a deep voice will have problems being shifted to higher pitches like a feminine voice.
Is there any technology that, instead of changing your voice, actually listens to what you said and replicates it in real time with the target voice?
There's no such technology like that sadly.
None, you must figure it out yourself by training the model and testing/listening to the pth's.
rvc doesn't modify or manipulate your voice, it's listening to what you said and replicating it with the model's voice lol
this
Huh if it's the AI doing the voiceover, why does your own pitch matter so much then
this type of ai audio uses an f0 (pitch), unlike TTS which doesn't (so the pitch changes TTS makes are completely randomized)
you can technically train a no f0 rvc model, but it sounds very bad
(used to be a feature in mainline rvc, removed in applio)
so rather than "imagining" pitch, it tries to follow the reference audio pitch instead
rmvpe as an f0 method is nearly perfect, most of the time the model learns voice cracks during training
there are other realtime solutions besides rvc, like ddsp-svc and so-vits-svc
but so-vits is super slow and eats your vram alive, and it actually has more voice cracks/artifacting than rvc
and ddsp-svc sounds like a downgraded version of rvc
lol
plus it needs a big dataset in order to produce good results
just like so-vits
i think there is another realtime solution that i cant remember the name uuhh it was zero-shot
but the quality was also bad and limited to 22050 sample rate only iirc
edit: ok they released a 44k model, nice i guess lol
seed-vc yea lol
tldr; rvc is very powerful actually, the other realtime solutions simply do not compare to what rvc can provide
Hey guys, whats the best software to change voice by ai nowaday? was using w-okada/voice-changer back in the days, but I got a new pc and would like to try again (free and locally ran only please)
throw away 75% of your set, train with the remaining part for max 200e
To me the best one which seems less hardware intensive, simple and has almost no delay is vonovox
there's no recommended epoch
I'd suggest you to tell your hardware in #1192011222023950368
Vonovox is only for modern Nvidia GPU on Windows 10/11
voice manipulation and voice replication/reproduction are two different things
for the former, roughly you can use a VST plugin to do pitch and formant shift
or something like Vocoflex by Dreamtronics
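For the pitch part of that, a minimal sketch of the math behind a plain pitch shift, assuming equal temperament (12 semitones per octave, each octave doubling the frequency). The 110 Hz starting pitch is only an example value, and formant shifting is a separate control this does not cover.

```python
def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of this many semitones (equal temperament)."""
    return 2.0 ** (semitones / 12.0)


def shift_hz(freq_hz: float, semitones: float) -> float:
    """Frequency after shifting freq_hz up (positive) or down (negative)."""
    return freq_hz * semitones_to_ratio(semitones)


if __name__ == "__main__":
    base_hz = 110.0  # example low speaking pitch, not a measured value
    for step in (0, 4, 8, 12):
        print(f"+{step} semitones -> {shift_hz(base_hz, step):.1f} Hz")
```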
what's this do
search in youtube

Hey, curious if there are any tutorials or people who could give me a rundown on how to setup Applio. I’m trying to train models locally on my pc. As you can see I’m a noob who has no idea what I’m doing
does the voice changer actually make u sound like a real girl
I'm scared of what u want from it
That didn’t come out right
It’s ironic, that there are questions like my and then ts
i wanna sound like a woman
please say miku and not an egirl :(
idk the voice changer never works for me anyway
what gpu do u have
amd radeon 6650xt im kinda confused how to use deiteris
all u gotta do is download it and start the mmvcserversio exe
When i do the voice sound bad
dosent sound like a real girl it sounds like ai lol
Hello
bro I havnt been on this thing in like a year does anyone know a realistic girl voice. Has this advanced at all?
Unfortunately yes
wym? explain in simple terms lol, I joined this server 2 years ago so should I use the same download or the newest one? Also which voice model is the best girl voice
me?
all these gooners man
Yes
??? I have a yt I want to start voice trolling again
Please troll with something that isn't sorta illegal
its always the same
Like literally anything else but e-girl/eboy
thats why im asking u guys cause I tryed this when it was first coming out 😭
There's some new stuff but u gotta find someone else to help u in your freaky activities
gng what 💔 What can u do with a voice changer
alr 😭
just asking for help but ok
Don't end up like this ok?
Please
Stay away from the dark side
oh
I can help u out in https://discord.com/channels/1159260121998827560/1159290139609137264
ok thanks
Np
hi
Yo whats up boys, im really new to ai but I do marketing and im trying to learn how to build ai chatbots to add to my premium marketing packages. Do yall have any advice on the best open source software to build em on?
Ok, you didn't have to ping every moderator like that.
If you mass ping other staff one more time, you genuinely won't be getting helped, homie.
There are #✨│ai-help and #1192011222023950368 for help topics, yet you keep asking in #🧬│ai-chat which is not even a help channel.
I see a member of this server has the phrase "no AI art" in their server username. Ironically, "AI Hub by Weights" is a Discord server about AI, which also includes AI generations and AI art. I count "AI-generated music" as a type of AI art, so doesn't that count too? 



Give me some new AIs
I'm running out of Ai...
Why isn't any company unveiling new AI?
Technically music is a form of art it's just not like pictures
just say it #🏙│ai-images
not only realistic graphics but also illustrations in wide range of styles
Some1 has an code for sora ?
how do i make the voice changer work on discord
please ask in #✨│ai-help
It's funny how every person who just blatantly admits they do it to get 'robux' or any type of currency is also automatically admitting fraud. under u.s. law that falls under deceptive misrepresentation and can even qualify as wire fraud under title 18 section 1343 of the united states code
which is also the reason that if they say they are going to do any of that we don't help them
Literally
It's so sad
definitely the right approach. I respect that in a way i can't even type in words. Since on discord people tend to care a lot less when it comes to money. Providing assistance could be seen as facilitating that behaviour and you guys make sure not to.
truly is the right way
Wait why should I delete 75% of my dataset
cant rn
when u got time for vc
few minutes
alr ill wait at support vc
im here
am back i was at kitchen
Jr.
Sexo
because 2 hours will be excessive
but it is fine, your model will be done much sooner than you may expect that way
i can see where you are coming from tbh, but the whole dataset includes various ranges and its all studio quality audio
so i think theres nothing in there that could make the model worse
like the quality remains the same all thru out
training with a huge dataset for many epochs erodes the model's knowledge
30 minutes is more than enough
could you elaborate a bit more? i just wanna make sure i understand so i dont like, make any mistakes and stuff
what else to add? the model is trained with 100+ voices to learn the difference between them, you only need a small sample, like 15 minutes to retrain the default voice characteristics
with 2 hours you dump 8x more audio that is being used by other speakers, and since youre training on top of the pretrain the model slowly loses information about other voices as it tries to realign towards the new data
the trick is to train the model just enough so it realigns the default speaker but does not forget how other voices work
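The numbers in this exchange line up, as a quick sketch shows. It assumes the dataset is counted in minutes of audio; the helper names are hypothetical and only do the arithmetic quoted in the chat (2 hours vs 15 minutes, and throwing away 75% of the set).

```python
def size_ratio(dataset_minutes: float, reference_minutes: float) -> float:
    """How many times larger the dataset is than the reference amount."""
    return dataset_minutes / reference_minutes


def minutes_after_trim(dataset_minutes: float, fraction_dropped: float) -> float:
    """Minutes of audio left after dropping a fraction of the dataset."""
    return dataset_minutes * (1.0 - fraction_dropped)


if __name__ == "__main__":
    two_hours = 120.0
    print(size_ratio(two_hours, 15.0))          # 2 hours vs a 15 minute sample
    print(minutes_after_trim(two_hours, 0.75))  # what "throw away 75%" leaves
```

So dropping 75% of a 2 hour set lands on the 30 minutes mentioned above.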
ahh i see
but what if i want the model to have a very good range (from lows to highs when singing) and ive tried training 30 min model with the dataset but i felt like 30 mins was a small time for the model to learn very well all the ranges
and therefore it was a bit more inconsistent when hitting different ranges
well, the default model has low and high voices known
it should be able to predict how a new voice sounds with different pitches
wouldnt having more dataset make the model more consistent tho? (considering the whole dataset is all great quality, raw vocals and good mic)
like, theoretically shouldnt the model be better and have less glitches and stuff and be better at pronouncing if it has a longer dataset
thats how i think about it, but do correct me if im wrong lmao
that's not how it works
the model learns phonemes and pitches in relation to individual voices and what a spectrogram looks like for each
when you train a regular voice model with speaker 0 it realigns it to match an existing voice from speakers 1-108 and learns how it differs
then when you infer it takes the newly learned voice, phoneme and pitch from the input audio and predicts a spectrogram (how the audio should sound with a new voice)
and based on this prediction it generates the new audio
ah i see
so you are saying that a 30 min model will sound better than a 2 hour one 100% of the times?
(obv considering that the data is all good and the same quality etc...)
I can't say... give it a try
use 15 min, 30 min, 1hr
train, compare
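One way to set up that comparison is to cut the same dataset down to each target length before training. This is only a sketch: it assumes you already have a list of (file name, duration in seconds) pairs, and `pick_subset` plus the file names are made up for illustration, not taken from any RVC tool.

```python
def pick_subset(clips, target_seconds):
    """Walk the clips in order and keep each one that still fits the budget."""
    chosen = []
    total = 0.0
    for name, seconds in clips:
        if total + seconds > target_seconds:
            continue  # this clip would overshoot the target, skip it
        chosen.append(name)
        total += seconds
    return chosen, total


if __name__ == "__main__":
    # hypothetical clip list: (file name, duration in seconds)
    clips = [("a.wav", 600), ("b.wav", 400), ("c.wav", 300), ("d.wav", 500)]
    for minutes in (15, 30):
        names, total = pick_subset(clips, minutes * 60)
        print(minutes, "min target ->", names, total, "seconds")
```

Training one model per subset with the same settings keeps dataset length as the only thing that changes between them.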
ideally you want a consistent voice in the set
I’ve done a 25 min and a 1 hour one and it relatively sounds very similar but the 1 hour one is better with higher ranges and stuff
dont think having singing and speaking will do any good
I also forgot to ask, what specific problems would you think could come with me training the 2 hour model
If you have any in mind atm
I’ll train it anyway just to test it
But I’m curious
Wow
<@&1159293140440723499> some mod pls take a look at that.
thanks for the swift reaction :)
you cant handle the most basic things in 2025 bro
eating 
you cant read in 2025 bro
hate speech isn't "the most basic thing".
A non-embedded image, especially from a newer member, is always suspicious, but I'm kind of not allowed to delete other people's messages here.
what are you, an 1800s peasant?
i mean im trolling in most servers rn
im not but my friend is a guy thatd be willing to buy
read the #📜│rules
whats the best tool for text to audio ?
nah who reads those anyawy
pointless lmao
That wasn't trolling, that was just being a basic lame-ass bigot. :-P
you when i give you a mute to read them
For an even more severe incident, you can use the modmail bot to contact the moderators in this server. 
im probably never gonna interact with this server again
Your choice. 
im back
<@&1159293140440723499> Scam advertising here
(somebody please tell me if you rather have stuff like this reported in a different way, instead of just pinging all of you ;) )
generally nothing but time
at a certain point you're hitting diminishing returns
you can do it it's just not worth it
unless you want to train a model for extra 4 hours for a negligible difference
and a model will be different every time you train it, trial and error
Small tip: Mixing a lot of varied tones of the same voice won't always work well.
With certain singing voices it's better to make models focused only on one distinctive singing/rapping voice of that person. Even more so if the target sounds like a completely different person when rapping/soft singing/aggressive singing/etc
Believe me, i know this thanks to experience.
idk why this server recommends not training big datasets
you replace more things of the pretrain and rely less on it with a big dataset
relying on a voice that's not from the model will never be good
and the losses just get much better, gradients become smaller
you also get less robotic/metallic moments, since these are actually coming from the og pretrain knowledge (the dataset used to train the og pretrain has close to 0 harmonics past 4k)
fm gets much more stable, mel loss is lower
there are no diminishing returns wtf
you get a more stable model
lol
also, the discriminator gets super strong with small datasets, which makes the model even more metallic/robotic :U
obviously if u train a big dataset it has to be good tho
Yup, for sure
I encourage a lot using big datasets only if these are well made.
new here
what are u guys even using AI for? im hearing abt gazzilion of new routine killer AI every day but after looking at them i just cant imagine scenario where that would be useful
Do larger dataset models tend to hit the lowest point earlier than models with less dataset?
bro this "lowest point is the best epoch" is just plain wrong
its just noise
lol
oh
That’s jus what I’ve always heard
What’s the best epoch then?
Do you just have to test every epoch
Or
||👍||
basically the best epoch is the one you like the most
no loss graph can tell you which one is the best
Got you
How did the “lowest point” thing come about then lmao
Cuz that’s what I’ve always heard
same source who told everyone to train 200e ig
i know where this came from tho, back then when so-vits-svc was a thing they trained 500 epochs to 1k
So what’s basically like, the correlation between epochs and how good the model is, like as the epochs increase
understanding epochs is quite simple
an epoch is each time the model has seen its full dataset
when you train from scratch (no pretrain) and you use a big dataset (100 hours) you might wanna train a lot of them, like 500 or even more, thats because the model has no understanding of speech or anything
but when you train with a pretrain (og pretrain, legacy, klm, etc) the model already has a base, you're just meant to do small adjustments so the model forgets the pretrain voice, but doesn't forget everything else
by training 1k epochs with a pretrain you're just telling the model to forget everything it learned, rather than just learning your dataset voice
you don't need that much of them when you use a pretrain, it takes a small amount for the model to forget the pretrain's voice
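To make the epoch definition concrete, here is a small sketch of how epochs turn into weight updates. It assumes the trainer keeps the last partial batch (hence the ceiling); the segment count is a made-up example and the batch size of 8 just reuses the value someone quoted earlier in the channel.

```python
import math


def steps_per_epoch(num_segments: int, batch_size: int) -> int:
    """Updates needed for the model to see every segment once."""
    return math.ceil(num_segments / batch_size)


def total_steps(num_segments: int, batch_size: int, epochs: int) -> int:
    """Total updates over the whole training run."""
    return steps_per_epoch(num_segments, batch_size) * epochs


if __name__ == "__main__":
    segments = 1000  # hypothetical number of training segments
    print(steps_per_epoch(segments, 8))   # updates in one epoch
    print(total_steps(segments, 8, 100))  # updates over 100 epochs
```

A bigger dataset means more updates inside every epoch, so the same epoch count moves the model further from the pretrain.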
Ahh I understand now
So you would think that a 2 hour dataset models should theoretically be good with lower epochs than a large amount
what i do is save every 10 and train for 100epochs, but some want to train up to 200e
Say: 200 vs 600
correct
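The "save every 10 and train for 100 epochs" routine gives a fixed list of checkpoints to listen through. A minimal sketch, with `checkpoint_epochs` as a hypothetical helper rather than an actual trainer setting:

```python
def checkpoint_epochs(total_epochs: int, save_every: int) -> list:
    """Epoch numbers at which a checkpoint would be written."""
    return list(range(save_every, total_epochs + 1, save_every))


if __name__ == "__main__":
    saved = checkpoint_epochs(100, 10)
    print(len(saved), "checkpoints to compare:", saved)
```

Each saved checkpoint can then be tested by ear, which fits the point that no loss graph picks the best epoch for you.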
at every epoch, the model extracts more features from your dataset, but also, forgets the pretrain even more
cuz I left my model training over the weekend to 650 epochs, you think this may be too much?
you want a sweet spot where it sounds like your intended voice but at the same time doesn't fully forget the base
tbh i have tried doing multiple tests with the epochs and every time anything above 100 epochs overtrains
yea regardless of dataset size
Oh wow
for context my pretrain took 110e iirc and that was a 50 hour dataset
Insane
