#🧬│ai-chat
1 messages · Page 346 of 1
What's ur PC GPU
Ask and elaborate in #🔍│help-w-okada
do people here help with making ai image models orr is it just voice
cause i tried training an ai imagae model using flux dev but it doesnt generate me the safetensor file
I'm not sure which trainer used but there should be save every step option, just wait till it saves
im using pinokio idk what do u mean by save every step
theres a sample every step but it doesnt work and the model says it finishes training in like 3 mins
What model is good for deep voice naration?
like this in flux trainer comfyui workflow
idk how ot use comfui :/
is there like a tutorial how to use it
cause i have it installed
Ey
I have got a question
How do i make voice overs for free?
Like those in viral brainrot edits?
I've been in disbelief in people from previous AI Hub for a while. Is there anything I can help?
You mean AI cover? Well, there are options.
If you have GPU that's newer than GTX 10xx series in your PC, you can do RVC locally. If not, you might wanna look for cloud servive like Google Colab and Weights instead.
huh? who told you to train flux locally? 💀
RVC. 
mostly voice rn but u can #🔍│help-ai-art
what's ur pc gpu and explain more detailedly what u want to do, Speech To Speech (STS) or Text To Speech (TTS) ?
HELp
oh im a bit slow i didnt see that channel
yo wsp , im from sa . is available tiago pzk voice ?
search in #🔍│find-models or weights.gg
i find it, what is the best easy option to generate ?
thnx
or if you have capable gpu, you can try local inference (read the docs below)
Not available yet
-rvc
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
what's your pc gpu
NVIDIA GeForce RTX 3050 Laptop GPU
How much vram?
4 GB, surely
yes, it is
Is Weights still giving me hidden ads when I got premium? Because I see numbers for blocked ads on an adblocker extension I installed. 
nvm then it's better to not do it locally, but you got Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Applio Colab: max 4 hours, not granted, of GPU
Easiest manual (you still have separate vocals and instrumentals): Ilaria RVC Zero
Easiest possible and automatic: Weights.gg
it is still fine for inference imo, unless you care of the speed
Do you guys know if there's an open source f0 editor (like melodyne)? With note moving and modulation editing etc
I know not all people who have a very fast PC can spot on where's an ad hidden somewhere. However, my laptop is notably slow, so I launched up Chrome task manager to end task of huge load of ads. 
thnxs
it could be potential for RVC devs to implement, except the polyphonic detection unless they implement a new f0 method which is probably SVS-based: https://arxiv.org/abs/2401.16837v1
A novel model was recently proposed by Schulze-Forster et al. in [1] for unsupervised music source separation. This model allows to tackle some of the major shortcomings of existing source separation frameworks. Specifically, it eliminates the need for isolated sources during training, performs efficiently with limited data, and can handle homog...
So there's not something that exists already? I would have thought there would be because melodyne has been around 20 years. I don't mean polyphonic, I just mean for vocal editing
Interaction has expired, use the command again for a new interaction.
.
I've been thinking on that too and could do some research later
🤘🏼Very interested. Btw did anyone get BigVgGan going on a fork with RVC? I'd be curious cause bigVgGan v2 is out now, supports up to 44k, and is apparently way faster.
Noobies did get it working but it was to memory intensive
You can look in there to find what they are working on
Says #no access
Try grabbing the AI testing role
no, it's not faster
it's a actually much slower than hifigan
@gray rover I'm saying bigvgan v2 vs bigvgan
I mean yeah but why comparing bigv 1 to 2
in any case, we're having mrf hifigan and refinegan being tested atm
so, bigvgan isn't a priority anymore
w-okada
voice AI or kits are things we don't quite support
I had W-okada i think but i forgor which one i used in the past and when looked for W-okada mostly voice ai vids poped up
Thank you very much for letting me know
-rt
Interaction has expired, use the command again for a new interaction.
check the fork one
ofc don't follow yt tuts
^
Majority is wrong or outdated or simply following the trends / hype disregarding performance or cost
Yeah i see the one that showed W-okada to me first is gone now and now most vids about ai voice changers are promoting Voice.ai
By the way why Voice ai is considered the wrong one? Its a scam or smh?
it's closed source
paywalled + uses your voice for training
Wokada is literally free and open source
scam
voices are paid
most of things like kits or voice AI are based upon open source stuff that's been pretty much taken, monetized and then claimed to be 100% own
Ah yeah i remember that when i looked up "If w-okada is safe" everyone were saying it is open source so its most likely safe
- paywalls n scams
fr and the only way to get them for free is by getting coints that is by letting it be a crypto miner basically
u have to keep always the app in background
scam +´cryptomining
Oh damn
Why people keep promoting that?
App that basically slowly destroy your GPU
And I can confirm this bc I used it myself very much years ago, literally once, then deleted it bc i realized i dont even need a realtime voice changer
The same people who would promote Opera GX after it's company has a record of spyware lol
ig money
because most people are idiots, paid or simply don't know any better
👀
for people who don't know about it: https://rentry.co/operagx
wouldn't worry too much tho, you're in right hands here
yea some people think paid = better ig but it doesn't go like that
yuh, fully agreed
only really good paid service of this sort I can trust is eleven labs
their style copying is just flawless imo
I prefer Open Source tho
but true
11labs is good asf
ye, hoping for open source to get closer to El levels ASAP
I remember using W-okada before for streaming (Got voice owner permission) and had to delate when i was cleaning discs
it was pretty fun and good quality
yup, Wokada uses RVC (Retrieval-based-Voice-Conversion) models, which are the top open source Speech To Speech models
best quality since like a year
Can they also be used as Text to speech? I forgot if it was a option
Well they are made for Speech To Speech natively,
Do you want me to send you a paragraph that explains you how to use them as TTS?
Sure please! Il copy and paste it into notebook and do it
There are different Text To Speech (TTS) AIs:
GPT So Vits: RVC isn't as good as GPT So Vits for tts, but gpt so vits (few shot tts, which means needs just a lil training for models) can't use rvc models (and viceversa), and its only limited to: english, chinese & japanese, if you wanna check gpt so vits instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/
Freemium 11labs: A easy way to do TTS is https://elevenlabs.io/, you can't use RVC model on this but its a mostly premium easy way for good quality TTS
FishSpeech: FishSpeech is a 0 shot (no explicit training needed) TTS, if you got a good pc you can use it locally else use their site
With RVC Models:
RVC is natively for Speech To Speech, but forks such as ilaria rvc mainline & applio have built in tts (using Microsoft Edge TTS to make a generated tts audio, which i suggest you to choose a tts model that is the same gender and language of the rvc model you wanna use, and then convert it with rvc)
If you wanna do tts locally with RVC Voice Models (if you got a good pc):
If you don't got a good pc you can do tts with RVC Voice Models on cloud:
-
Ilaria RVC Zero (Running on A100 GPU, free fasted rvc on cloud) and the guide
-
Use Applio UI Colab (with google colab T4 free daily limit gpu)
-
if you don't wanna use edge tts, you could try another tts ai from our tts index and use the output as an input in rvc
applio has tts, or does it not work anymore / there's better solutions?
wondering cause I haven't really explored tts field lately
I sent you different types of TTS
If you want to use RVC models, i explained that too
And in the 'tts index', you can find how to use TTS in realtime too for calls
i mean, 'realtime', you still have to type
Alright, thank you very much!
It does work
Ofc RVC models aren't natively made for TTS tho
True that
An actual TTS program would be better than it technically
Like GPT-SoVITS, F5 TTS or FishSpeech
perhaps I could sometime integrate gpt sovits into my fork
so RVC+GPT-SoVITS ?
yeee, been thinking of it for a while
You're welcome
shouldn't be too hard
Yes, it is Needy streamer overload
pretty good psychological horror/sim
I mean, you want to generate a speech with GPT-SoVITS then use RVC over it?
Would work i guess
ye I heard of it sometimes even on TikTok but never really played it
that's one way, I just thought more of simple integration of two into 1
Maybe I should give it a try
as in, unified interface
mm interesting
Its a pretty specific game, if you like staring at screen clicking stuff and doing different strategy then it is good
Wouldn't it make the project a bit too big tho 😭
But it is not fast or that interesting game if you already dont like slow and monotone games
I mean I usually love story or fighting games
I played even visual novels, as long as the story is interesting
If you love story games then you should like Needy streamer overload, it is story based
For example I played Bad end Theater, the story was good
mm yea worth a try
my favorite game would be undertale
I played it too!!!!!!!!!!
undertale has peak story 
Really good game, i loved Ending song
Oh lmfao never thought i'd find anyone else who played that
when i finished it a year ago, i tried searching for a fandom, but it was almost not existent
agreed, the ending was also unexpected
I don't remember everything tho 
@covert lake can you help me
what's the issue?
what's the matter ?
wtf i got the notification 10 mins later 😭
discord moment
i dont now how i can download it
Could you clarify what you want to download?
can you help me in a call
And also what's your pc gpu?
i would like to download okada
Sorry but I can't really in a call, but I can text and show you guides tho
let's go over #🔍│help-w-okada
Follow the guides
Why do they have mod ping perns
i want to know everything
Last update: Oct 21, 2024
No why can they point ppl
hlo
@hidden grotto
:wave: @dire plume, How can I help?
Available Commands:
• @weights find <query> or /find <query> - Search for RVC Voice Models
• /create - Create an AI Cover
• /image - Generate an Image
finally found
Hmm... @minor blade I wonder, is blase in here under some different name or, totally ditched discord?
I'm sincerely not sure.
hmmm.. then who's the main maintainer after Blaise who's authorized to do any changes to mainline?
Yo guys is there a website for making ai cover?
Ask either Vidal or Deivih.
🐢: Lettuce!!! bites the lettuce
You're welcome buddy..
✨
what's your pc gpu first
Hi,
I’m an Senior AI engineer, and I’m open to work. I’m excited about the opportunity to contribute to innovative projects.
If you have a great idea and need the expertise of a senior engineer, feel free to DM me.
Thanks!

hey if anyone here working w a reference model hmu
Good day, can i ask if you guys have bot that humanize texts or paragraphs effectively?
What is the best AI image generator right now?
@night lake hey man, as I'm on adding features to the ui, aside of configurable warmup and avg loss per gen / disc, do you see any other gimmick like that being added?
or perhaps @tepid basin or @glad nebula
Any propositions will do, as long it's within my capabilities
ill have a good idea at some point

i am bad at thinking
all i can think of rn is being able to change the optim
Oooooo, that sounds good actually
✨
'll see what I can do 'bout it
for now, it lands in the wip category tho. I believe I'd have to prepare some presets or more stuff required to have it configurable
changing the lr in the ui might be neat, in a small dataset a lower lr than 1e-4 helps a bit
added to the list ✅
can maybe also add this: #🔊│ai-development message
The avg loss ?
yeah
Actually
that ( configurable ) + use of warmup ( custom duration ) + toggleable mute files
will be in initial fork's release
( soon, hopefully )
man, ngl, applio's handling it really well, the args n modularity 😌
Glad I made the switch
😔 how do I make rvc not sound like siri when using discord
Siri...
Can you elaborate?
oh yea raz, what would you say on ability to delay discriminator or gen? idk, sometimes can be useful
I suppose, I could also see custom grad norm setting too
Its kinda hard to explain. Was gonna do voice trolling on a friend but the audio sounded super robotic when it came out.
oh, well, then that's due to A) wrong pitch settings B) your inability to mimick the source speaker / aka adaptation or C) the model is just trash
custom grad norms sound interesting but idk about delaying gen maybe disc it just doesnt seem useful in many scenarios
True, but might be there's a scenario one wants to experiment with giving a generator a lil headstart ( in case more than 1 disc is in use for example )
to not overburden a fresh gen
yeah combo of b and c
yet, that's most certainly C if you mention super robotic / rough / metalic and so on
yeah, you could add it and add text that says its only really useful in multi disc scenarios
Are their any current voice models that have been trained in emotion or are all of them dead pan or monotone?
I mean, ask people who make the models
If a model's trained on not diverse set or on a one that's lacking in pitch / expressiveness department..
there's nothing you can do
( man that f delay is triggering smh, completely not fitting my writing style oof )
can I train one with an amd card or no?
Tho ye Tachi, it's a matter of how the model was made and on what set
as far as I know yes
provided it ain't 4 gigs
I have a 7800xt
Not sure on the exact workflow but, this should be somewhat useful?
https://docs.applio.org/applio/getting-started/installation
There's a section for AMD but again, I am not the one who worked it through, ideally you'd want to ask Noobies 5663
I think
alright thanks for the help
Could also maybe add an option to change rmvpe's hop length iirc you said it was doable
oh yea
tho I am still unsure if rmvpe even likes that
afteral, it's still not there ( as in, slider for that )
'll need to test it sometime n see how it behaves
you can definitely change rmvpe's hop len
In that case, lands into wip
whats the wip list looking like?
wip/todo list:
from the ui level;
- Different optimizers choice
- Custom independent learning rate for G and D
- Adjustable hop length for rmvpe
- Custom gradient norm value
- headstart of chosen network's element: G/D
from the training code level;
- more warmup options ( Cosine anneal and so on )
- more customization here n there, more automations that currently's done purely in config etc.
concept stage:
- Configurable discriminators pairing
Currently added:
from the ui level;
- warmup settings
from code level;
- mel similarity metric
wip rn:
- avg loss settings
- refactoring some descs etc.
@night lake pretty much
And beautiful theme ✨ afteral
lol
my lil wayne ai working on with my beat me talking ahj
oo cool thx
mh mh
off topic but, I think I'll ditch mechanic for translations
Am too lazy to dive in and re-type / re-translate all ngl
avg loss is so op
Hey everyone! Been working on the Alignment question for awhile and finally think I have a working framework!
Check it out and share! https://github.com/AlignAGI/Alignment/
how about the log interval option in the ui? (having to edit the config file is quite a hassle)
It's actually automated now in Applio so
No more point in doing it by hand
These however, esp the running loss, I'm pretty sure people will love
Will make picking the right epoch even easier
Unless you mean a custom override?
but then I hardly see any point for that
it's like smoothed loss value?
Nope, something else
Currently the logging is done per last step in an epoch
it's naturally not the best indicator of model's performance
the avg running is averaging the loss per N steps / mini-batches
say, in ur case it's 48 steps per epoch, you'd set the avg for.. idk.. 6
you get the point
better actual epoch performance than just logging of the last which is just biased
Another thing that'd make it better ( and actually correct ) in future is proper evaluation phase with accuracy on unseen / validation set
Cause If I'm wrong ( Noobies might correct me on this ) the smoothed loss is just smoothed running avg of standard logged loss values
whereas this one is per steps within an epoch specifically
If you need graphical representation
( standard behavior ):
i think im having a small issue all the voices and models sound pretty good to me when i hear them by ear but when i use thjem in discord or a game they're choppy
Well ye, that's because of the gpu's overload it seems
either have to upgrade your gpu, tone down with voice changer settings ( sadly, get a higher latency ) or tone down with game's settings
Guide Written by:
Github - Blanc-dot
Discord User - https://discord.com/users/824922747423031359 aka VTArcelia or Arcy for short
Thanks to the following people : lusbert, poopmaster, felt, fazemasta, antasma, shadictl, x_hina, sushi
thanks are for anything added to guide, taken from any talks, s...
Or simply try the fork in case you haven't .. ah ye or that
with discord its not working well either
yea the fork version is more recommended to try for better gpu utilization/less cpu bottleneck (regardless if you even have a 9800X3D)
Can anyone suggest me a bot to do ai covers?
or probably the candlestick graph model lol
I was thinking of each candle bar for each epoch
and then some adopted metrics like moving average, etc.
I mean, there already is moving average
the pic depicted stock behavior of rvc / applio in logging
the running avg would work like so: ( depending on your N value )
It's the most straightforward approach without any gimmicks really
any overcomplication is just diminishing returns really, for next level we need evaluation phase 👀
for example, maybe the fluctuation range, so we could spot some collapses inbetween epoch
that's also no more the case, it's already figured out
Mute files + silences in the dataset
reason is, if some tiny slice ( 36 frames ) happens to be the spotlight one, that " collapse " happens, ye
Solution for that is simple, properly truncate the silence or resign from mutes ( which is yet to be fully tested )
Aside, if you need a direct indicator, it's already a thing actually, My mel similarity % loss
if you see it being any high value such as 70-90~ % range at such spike, then that's that ( but that's still just a diag metric really )
So, don't think there's any need for fluctuation frequency metric, provided users carefully take care of sets ( as it should be, the truncation of silence to reasonable levels )
i have no idea what that is
crypto
Learn then
you're one funny fella arent you.
Skill issue. 
hey
For Virtual Audio Cable, I let a web browser to output its audio to Line 1, from Line 1 to Reaper audio software as microphone to add some effects in real-time. 
With paid version of VAC, I think you can use virtual Line In more than 2 at the same time. I have an idea about this: A program -> Line 1 -> an audio software -> Line 2 -> a voice call program like Discord 
Or something like this: A program that outputs audio -> Line 1 -> W-Okada -> Line 2 -> Discord 
best girl voice realistic
That doesn't exist
There isn't a best one
You can search rvc ai voice models at:
- #1175430844685484042
- In #🔍│find-models , Do /find with @hidden grotto
- https://weights.gg/ (login required)
- https://huggingface.co/models (but watch out cus in hugging face there arent only rvc ai voice models)
- https://voice-models.com/
- https://thevoicemodels.com/ (for Turkish Models, login required with discord and level 2 on their server)
if there isnt one, you can:
- #1159289738314919936
- #1191429836321849435
- make it yourself with our docs guides https://docs.ai-hub.wtf/essentials/how-to-make-voice-models/
:wave: @covert lake, How can I help?
Available Commands:
• @weights find <query> or /find <query> - Search for RVC Voice Models
• /create - Create an AI Cover
• /image - Generate an Image
this won't stop




https://huggingface.co/wok000/vcclient000/discussions its not virus?
no its not
Wokada is the program to use RVC (Retrieval-based-Voice-Conversion, Speech To Speech Models) in realtime for calls
There's the fork (modified version), the deiteris fork which has better performance
that one you sent is the original
@glossy fox what's your pc gpu?
be sure to not follow yt tuts
rtx 4060
Interaction has expired, use the command again for a new interaction.
1st link
its the wokada deiteris fork
meaning it's better
u gotta read it up ofc, there's no updated video
and follow the nividia version as u got an nvidia gpu
@glossy fox for any issues, ask in #🔍│help-w-okada
okay
So this link is safe?
sorry im have bad english I don't understand well
yes but u should use the wokada fork not this one
that one is the original wokada
the wokada fork has better performance
both of those are safe
if you got that link from a yt tut, you shouldn't follow it, bc its old even if it's safe
okay thx
yw
must be 2022 or something lol
which one do i download
covers or real-time?
Man, you gotta love how people ask a question but then have dd status ✨😂
for doing what and what's ur pc gpu?
nahu shouldn't follow that
dd status?
don't disturb
and also playing a game 😭
while asking the most generic question
yup, like, kinda wasting the time of someone who's willing to help
fr
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "
my rvc is not working". - Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
lmao, good one. totally forgot this existed
u gotta elaborate buddy
Don't spam:
This channel isn't really for advertising ~ Perhaps, try #1159290752195633273 ?
hey guys do you know how to create ai art. using reference images? like i send ai these reference images and make it create art? in that style
heres the timestamp of the original song https://youtu.be/qCXtErcqR4Q?si=0ugBz2Oeh0w88X17&t=246
Provided to YouTube by DistroKid
Bahamian Rap City · Joe Hawley
Joe Hawley Joe Hawley
℗ 585-J-HAWLEY
Released on: 2020-11-11
Auto-generated by YouTube.
idk how the fuck it knew those were vocals
the model is MB-Roformer-InstVoc-Duality-v2 with 256 segment size and 50 overlap
the 50 overlap made it take 20 mins to separate
it is a 6 min track though
I mean, because it is vocal afteral
vocoded? yes, chorus, pitch-tune n shit ton of effects? yes, but nevertheless, vocals
AI too chad 😎
its closer to static than vocals
oh, you mean the background elements?
In any case, might still resemble somewhat vocal-type of data in spectrum so
#1159290752195633273 else it will get deleted
@random karma
I have a small question, is it legel to use voice changer reading a novel and post on tiktok, for example using gura's
I suppose this topic's quite complicated, afteral, as far as I know there's currently no universally accepted / correct law in terms of voice clone models
But I suppose, if we speak of pure legality or ethics, it is somewhat " misappropriation of identity " ?
Models on their own are tools afterall, they fall under gray area category ( or so I see it that way at least.), just like cameras and stuff but the results / effects of usage matter the most
Now, hear me out. This is my own personal, thus subjective, opinion.. Using someone's voice for public work can be at times problematic especially if target person deems it as not right for them. if I was to do something similar to your example or really anything of that sort, I'd make sure to not cause any harms or misunderstandings.
( That means, including the label of the material being AI generated and /or a lil disclaimer you're willing to delete / remove it if requested to do so by appropriate authorities )
aHR0cHM6Ly9kaXNjb3JkLmNvbS9iaWxsaW5nL3Byb21vdGlvbnMvZlBEV0VtbUQ2YjdSWHVEVnV1WmZGcEdx (Free nitro) if you can figure out how to get the url
yoink 
already redeemed

I thought it was fake
same lol
it was a epic games nitro gift 
those are still valid????
i thought they weren't after getting abused asf
its a new one 
too slow
oh lol
not like i have billing info 
actually if your not stupid make a bunch of discords and then put cc on it and just use it hella nitro easy for boosting and sh
yea people abuse that alot
bro thats just base64
@stark scarab your ffmpeg installer is based
im trying it on the next unsuspecting ai noob

it's just doucle click xd
doesnt get more simple lol
Ada yg punya model ai anime blue lock?
speak english here
also
You can search rvc ai voice models at:
- #1175430844685484042
- In #🔍│find-models , Do /find with @hidden grotto
- https://weights.gg/ (login required)
- https://huggingface.co/models (but watch out cus in hugging face there arent only rvc ai voice models)
- https://voice-models.com/
- https://thevoicemodels.com/ (for Turkish Models, login required with discord and level 2 on their server)
if there isnt one, you can:
- #1159289738314919936
- #1191429836321849435
- make it yourself with our docs guides https://docs.ai-hub.wtf/essentials/how-to-make-voice-models/
:wave: @covert lake, How can I help?
Available Commands:
• @weights find <query> or /find <query> - Search for RVC Voice Models
• /create - Create an AI Cover
• /image - Generate an Image
@covert lake best de reverb model for dataset and where to run it?
👀
i dont really care about sample rate cutoff cus i already hit it with melband
wait a sec... melband is 40k??
theres no english to speak if they can't speak english to begin with xd
it's fullband 44.1k unless your source is 48k
that is awesome sauce
@gray rover I have a suggestion for applio ui
who threatened blaise with a lawsuit
im gonna find them
huh, that def sounds helpful
You got any reference or a nice tos to re-adapt?
ngl, I suck in such things ( if I am to make one on my own, that is lel
Ctrl c ctrl v applios
oh
I suppose, you'd mean a thing where I nor people associated with the project hold any legal responsibilities for copyright infringement, damages blabla.
Thus, user is fully responsible for the use of the tool
Yea, sounds logical
Yup
Do you poop
Don't ping me for no reason
Any trolling is not something I'll tolerate
There is no simple questions, do not ping me is a do not ping me
This is a warning
No
Too many douches ended up getting banned or kicked lately, if that's what ya want or aim for, eventually, go ahead but immature or annoying people aren't well welcomed here
Mhm good day to you sir
Pov your mom didn't give you enough attention:
lmao
Real
such a ragebait troll
Codename dodging the poop allegations

Hey everyone! Been working on the Alignment question for awhile and finally think I have a working framework!
Check it out and share! https://github.com/AlignAGI/Alignment/
You could actually move it to #1159290752195633273
Thank you, will do!
I think it's UVR-DeEcho-DeReverb
Greetings, can i show the video about my custom bio inspired model of artificial brain right here and ask for a few thoughts about it?
I didn't release it yet so it's nothing to send in promo i guess, i will make that it in a few episodes
isnt it waves clarity vx dereverb xd
If anyone wants premium on Webtoon, here’s a code for one month of premium for free: ZvTWhMzKDaXaqPe (in case someone doesn’t know, Webtoon has comics, videos, etc.).
i need a little help with prompting with chatgpt, if you have any experience please join call
i barely use webtoon so ion need that but u a goat fr for giving codes like that
yo, we finally got rvc v3 (unofficial by codename0)
https://github.com/codename0og/codename-rvc-fork-3
I am looking for X bot developer who also has exp in RAG system.
This is long term project, DM with your fiverr or upwork account and github.
you mean twitter?
I don't have a Fiverr account. But like can you please put your at #1159290752195633273?
can you do this?
No, I'm not a professional.
do you know any professional?
dude just say don't ping me lmfao it's not that hard
I'm not the type to say that. 
your choice ig
real
uh huh
use #1159290752195633273
To promote paid commissions
Hi
Hello
Wake up, everyone! A new RegalHyperus drum model just released!
Fall in Love Alone (Drum model no. 554)
@hidden grotto
:wave: @barren glen, How can I help?
Available Commands:
• @weights find <query> or /find <query> - Search for RVC Voice Models
• /create - Create an AI Cover
• /image - Generate an Image
jak zrobić własny głos w AI?
You should use English
but it's your lucky day.
I'm polish ( mogę pomóc jak chcesz ale to za 15 min gdzieś (( ~ mentioned I can help em in 15 mins ))
My Canserbero AI álbum will release on my channel in 5 hours 😄
Nope, didn't use Suno nor Udio
oooo, lemme know when that happens. send me a dm then 🔥
man I should f reset my phone's kb
SwiftKey been so trash lately

For sure codyyy 🐢❤️
🥬
🐢: bites the lettuce
Good turtle
Try out FUTO keyboard
Local autocorrect model that trains on your device. Very similar experience to gboard. Custom voice to text models too
Yooo now that's more like it
Been waiting for proper solutions like that. Hopefully it's customizable similarily to swift ✨
I haven't used Swiftkey but switching from gboard was easy
I don't do multilingual either
You can export the model to a weight so you can tranfer across devices
tbf, switfkey fucked up the moment microsoft took over
Oh wth
prior to that it was so good damn
DM
I got a question
Anyone know how to make a text to speech bot for discord with voices that it can send through text???
hello I would like to cover my friend's sound for one song but how do I add it to the zip folder?
🤔 For those who know about LLMs, how fast should an RTX 4070 & 32gb ddr5 6400mhz typically take to generate long responses. On Llama 3.2-3B-Instruct
hello everyone
Btw, Can you tell me more about it?
cause it feels like there's something I missed
ai content is used in this
I was asking you to remove the EULA button cus I don't like buttons
But its understandable cus riaa will sue tf out of blaise
I'm not even sure if it was some sort of company (either related to VAs or a record label) that threatened Blaise.
that non-functional option seems just like age check button
real
so im getting into this, i've found beatrice v2, but it only supports toml files, which i havent found any voices in #1175430844685484042 that have a toml file.. am i doing something wrong?
#✨│ai-help ( if you wanna make covers / voice to voice ) #🔍│help-w-okada ( if you want to use real-time voice changer ) and @ me there
Your question didn't specify which AI voice conversation you wanted to use with. There are two RVC programs: Applio the RVC audio conversion, and W-Okada the real-time voice conversation that uses RVC voice model.
CAN ANYONE TWLL ME HOW THEY MADE THESE AI!!!
https://www.facebook.com/reel/1117264942658581?fs=e&s=TIeQ9V&mibextid=0NULKw
Astralabs
I think about 1 to 3 seconds
Or if you want to make your own llm, you would need a big amount of gpus
How long is the prompt and which backend? Q8 gguf model should produce approx 90-100t/s
Which is, well, very fast
An LLM model can be trained as fast as Stable Diffusion image generation with 69 GPUs. 
i have realtime voice changer client and it takes 2 seconds for me
chat my rx 6600 is cooked
Haven't even heard of that
just a gpt wrapper
promote in #1159290752195633273 if you're looking to get hired
@chilly lake do you have any experience prompting
for stabiity AI there are special extensions, also you can ask chatgpt to make a prompt
i am a complete noob, i am just trying to prompt my chatgpt to make scripts in a certain way, and it is going south, if you could hop in the voice channel for 2 minutes i would be grateful, but understandable if you cant
1 <:_:721476118942580777> Dewott
• Lvl. 29 • 52.69%
2 <:_:721476119374594108> Munna
• Lvl. 30 • 21.51%
3 <:_:721474704036069397> Slowpoke
• Lvl. 41 • 54.30%
4 <:_:721476309389148202> Axew
• Lvl. 37 • 42.47%
5 <:_:721476203122524251> Sandile
• Lvl. 20 • 39.25%
6 <:_:721475493525848113> Wailmer
• Lvl. 1 • 66.13%
7 <:_:721474816607256627> Togetic
• Lvl. 39 • 35.48%
8 <:_:721475493710266441> Lunatone
• Lvl. 4 • 45.16%
9 <:_:721476020846329909> Weavile
• Lvl. 9 • 44.62%
10 <:_:721476203592024095> Klink
• Lvl. 23 • 56.45%
11 <:_:721475597095665694> Empoleon
• Lvl. 20 • 56.45%
12 <:_:721476466759434302> Goomy
• Lvl. 34 • 72.58%
13 <:_:721476020649066537> Tepig
• Lvl. 21 • 41.94%
14 <:_:721476119039311884> Lillipup
• Lvl. 9 • 47.85%
15 <:_:721474704220880906> Psyduck
• Lvl. 36 • 46.77%
16 <:_:721474757941395567> Porygon
• Lvl. 22 • 41.40%
17 <:_:721474704354836570> Krabby
• Lvl. 31 • 21.51%
18 <:_:721474816699269181> Hoppip
• Lvl. 31 • 46.24%
19 <:_:721476389156552824> Amaura
• Lvl. 30 • 68.82%
20 <:_:721476203491491920> Gothita
• Lvl. 26 • 47.31%

Any voice model idea for draft training on Weights? I have too many premium voice model training items on Weights. 
just a guide what needs to be done for different model types
you can go to civitai, look for a model, look for generated examples at the bottom, see the prompts used
Like who to make? Maybe Meeple from Brawl stars but idk, I don't really make ai covers anymore
that's for image models tho not LLMs like chatgpt
I wanna do something kanye related
what's your pc gpu? and what are you looking to do (make ai covers or realtime voice changer for calls) ?
For LLMs like chat gpt, you could consider a roleplay technique
Ex: You're a Movie Producer, you are an expert into making scripts for TV Movies. Make me an initial script and brainstorm me ideas for a movie about Batman fighting the Flash
You can also see many prompts only, such as https://github.com/f/awesome-chatgpt-prompts
Link?
astralabs.ai the website doesnt work but they have a working discord bot.
its in this server too
Same as the IG video ?
yoo i appreciate it, i will look more into it!
thanks you!
That's mostly for Image Generation Models like SD tho
you're welcome
ohh fairs, yeah mostly looking for scripting
@covert lake How hard would you say it would be if i have 1000 shortform scripts and i want to "fine tune" or train chatgpt to make similar good scripts
I never really did that tbh
ah fairs, appreciate you for the response
maybe you can check https://platform.openai.com/docs/guides/fine-tuning tho
it's from their guides
thank you!🙏 i will check it out
Modelleri hangi kanlada bulurum
Yes
Speak englishhere
yw
what does this mean:
[VCClient] wait web server...410 http://127.0.0.1:18888/
Traceback (most recent call last):
File "MMVCServerSIO.py", line 260, in <module>
File "subprocess.py", line 1209, in wait
File "subprocess.py", line 1506, in _wait
KeyboardInterrupt
[12972] Failed to execute script 'MMVCServerSIO' due to unhandled exception!
dont do this, obviously
That was fucking stupid and you shouldn't be trolling people like that.
fyi, there's tons of people who don't even realize they have a gpu, having gotten their first pc so like, cmon man
I could see how at least 1 or 2 people in future would search up for similar error and follow it
it ain't funny, it's malicious
and potentially damaging. Refrain from doing such things
Alr, bet
🙂
@covert lake if you may
we've gotten enough of ragebait shitheads goofy ah kiddos
Tone down


nah
tamed
hi
bro i needed some help with ai image generation
panda pfp is so based
#🔍│help-ai-art you'd want to go here
A word used when you agree with something; or when you want to recognize someone for being themselves, i.e. courageous and unique or not caring what others think. Especially common in online political slang.
The opposite of cringe, some times the opposite of biased
^
having a panda pfp is courageous?
wow
in this context i mean your panda pfp is great and i love it, its not cringe and you are a good person for having it
btw i cant upload images here so i might not be able to ask for help for a specific image
You gotta level up first
got it
by talking orrrr, well.. if you need a speedrun I guess
#🤖│bots
And just keep on writing something random I suppose
@covert lake we need cleaning on isle ai-chat
just cleaned it
and wtf was wrong with them
thanks 🙂
thanks for @ me 
What do you need help with exactly?
Hi
Hey, can anyone help me with a video generator?
No sorry
I need to make a music video for an ai song i made
😔
What is the ai
Sono
Who's that
Oh wait no
All I know is perfection
text/image to video AIs:
- Locally (runs on ur pc):
- pyramid flow (Image/Text to Video)
- cogvideox 1.5 5b: Image to Video, Text to Video
- Cloud (remote good pc, running on an online website for example, easier to setup):
- Weights.gg (paid only)
- pyramid flow (Image/Text to Video) (HuggingFace Space)
- OpenAI Sora (paid only, in some countries)
- lumalabs
- Hailoua AI
Suno, its not a voice, its an ai music maker
Oh
I make the lyrics and it generates a beat and the rest
damn it sora isnt available in my country
welp
Welp u can try with others
short question. Is there anywhere a good tutorial or guide for how I can create my own voice model?
need to know this too
allright. I will search here. If u see or know sth please let me know :D
Replied in #✦│chat
What's ur PC GPU
What's ur PC GPU
right now I am on my laptop with 32 gb ram and an ultra 7 and an intel arc. But if it would be better I can get to my pc with an 3060ti and a ryzen 9 with 24 gb ram
I'm guessing with ultra you meant Intel ultra 7
And that laptop doesn't got a GPU, right?
yeah
its just for writing like now or lower stuff
I would do that voice model on my main pc with the rtx 3060ti and my ryzen 9 and 24 gb ram
Would that be enough?
more than enough
tbf, any gpu with 6/8 gigs will do
ah nice thank you
6 gb 1060 can pull through too 
yes, using 4 gig is a misery however
Do you have a guide or know where it could be
uhhh, ye, there's plenty around
on this discord
hmm wait
yeah
- Creating Datasets for RVC using iZotope RX11, by Cauthess
- Gathering and Isolating Audio, by SCRFilms ❄
- Instrumental and vocal & stems separation & mastering guide, by deton24
- Vocal Mixing Tutorial, by Roomie
- https://mvsep.com/
@covert lake What's the current guides source that you typically recommend
just in case
The ultra 7 offers the NPU which is good
But RVC doesn't support it, your desktop would be the best
allright thanks and what of those links should i use
The ai hub docs are https://docs.ai-hub.wtf
Last update: Oct 21, 2024
the first one right
Usually it's recommend to use either mainline or applio, mostly applio
alr thx, gonna keep that in mind
I'd recommend my fork due to the easier training workflow
ok so should i use applio?
I don't really know about your fork but maybe hard for a user without guides
Will ur fork also support NPU btw
Yea
no I meant like, it's still applio but incorporates the avg metrics which
I believe should be easier for newbies than understanding the last logging is the last step from the batch etc etc
ok thank u all guys. Any good vids online btw? And do u know any application for anything for my npu?
I mean if u wanna suggest ur fork u can do that
In that case follow Nick
and once you wanna dive deeper, lemme know
But I rather go with stability
Actually, the fork's more stable as of now and has better readability
no bugs or issues at all
Nope no yt tuts nor vids
Only written guides
All yt tuts are outdated
ok thanks
and what about anything with npu? like not online for voices but anything to use that npu
I think https://github.com/rupeshs/fastsdcpu has NPU support
And well.. that's the only NPU supported program I know
I rarely seen projects support NPU
yeah same :D
allright now I only need 30 Minutes of my voice
And then I am ready to build my own dataset
is the fastsdcpu save to use?
You mean safe to use? Yeah I used it on my laptop and phone, just CPU tho
It's open source, meaning you can even read the code of the project yourself
ah nice thank you
short question. I got the following error: ERROR: Could not find a version that satisfies the requirement mediapipe==0.10.9 (from versions: 0.10.13, 0.10.14, 0.10.18, 0.10.20)
[notice] A new release of pip is available: 24.2 -> 24.3.1
[notice] To update, run: python.exe -m pip install --upgrade pip
ERROR: No matching distribution found for mediapipe==0.10.9. But i updated pip and installed mediapipe 0.10.20
hi guyz im new here. im lloking for a good and self-hostable model (or at least runnable with a python script) which i can use to clone italian voices. do you know any models with a good quality?
Use applio bro so much better
he's not talking about applio man
"mediapipe" isn't used in applio and from the context you can tell it's about 'fastdcpu'
dipende cosa cerchi
Un modello text to voice con capacità di cloning su cui posso fare inferenza direttamente dal mio sistema, no API o altro. Ho le risorse necessarie per farlo
text to voice ti posso consigliare gpt sovits
ah merda in italiano non funziona
mi ero scordata
li manca il pretrain italiano
guarda di open source praticamente non c’è quasi nulla per l’italiano
po p o
Peccato
Proprio zero?
Quindi usate solo servizi qui?
io ho sviluppato la mia robetta ma è più voice to voice
text to voice pure ma usa edge tts alla base
Quindi per fare cover o simili?
yup
Qualità bassa ?
O edge tts intendi un servizio tts esterno
edge tts è il tts di microsoft, usa quello integrato
Temo non faccia al caso mio
Sarebbe carino allora fare un train su qualche modello.
Quali sono i modelli migliori in termini di qualità dai quali partire?
modelli per rvc intendi? la tecnologia di cui ti parlavo
Piú text to voice ma anche in inglese
controlla su #1175430844685484042
Altrimenti rvc non viene un po robotica la voce?
dipende molto dal modello
Understandable
Ora ci do un occhiata e provo qualcosa, grazie mille 😉
di nulla!
I actually have already written it all I just need to make a pr
mh?
to the main applio?
or wut
I've written a guide for your fork and such on the AI hub docs I just need to make a pr and ask ray to merge it
If you want to look at what I've written so far you can look here: https://github.com/Razer1724/docs_updated/blob/updated_docs/assets/RVC/Local/Codename_Fork.md
lemme see
anyone know what ai was used to make these? there's a bunch on the channel
https://www.youtube.com/watch?v=CTdMM2EtEOM
A, actually @night lake
Better to mention the releases
those are always 100% guaranteed by me to be safer than repo
as I always do at least 2-3 checkups and files verif before pushing
Yeah, I haven't been able to update it
I've been busy
yuh all good, just thought it's worth mentioning, just in case
Better be safe then sorry
as for a compiled version.. welp. It's a long story but in a short, I've been with awfully shitty upload speed lately
so, that's a big rip

Ws chat
ws good guy
Rippp
ikr man, it sucks lol
glad I don't really upload anything recently
else I'd be f over
How many Mbps you getting?
.... don't ask.. to put it in a perspective how really bad it is, 3 mb upload takes 20-30 secs
vs 20 dl per sec

mobile data user moment
Yea, long story and you're right
5g my ass, revolution they said 💀 we gonna get our brains melted they said 
For the docs how would i explain how to use the avg loss? Like would i just say 'run a single epoch to find its step count then stop training and then set the avg loss freq to a number where it logs an epoch 3-5 times'?
running 1 epoch in the training loop to completion just to find the number of steps is so RVC... lame
there's len(train_loader), set it automatically
well, so what, you recommend to do a synthetic / dummy steps checker for the training set or wut
if you say "i want to log 5 times every epoch", then you get len(train_loader) /5 and do that
" So, I think it's most common to balance these factors by reporting an averaged loss over N mini-batches, where N is large enough to smooth out the noise of individual batches but not so large that the model performance is not comparable between the first and last batches. "
this should be some good info to start with
put if global step % value == 0, log whatevr
We aren't doing logging X times per epoch
it should be based on known steps per epoch
this is a different methodology than what you have in mind
" 'run a single epoch to find its step count then stop training and then set the avg loss freq to a number where it logs an epoch 3-5 times'? this should not be a thing ever
step count = len(tran_loader)
i was just throwing something at the wall to give an idea 
again
" well, so what, you recommend to do a synthetic / dummy steps checker for the training set or wut " within the ui, before training
I ain't sure if you even get the concept in the first place
You mention that the steps count can be checked by the loader and yeah, that's true
but the idea is for the user to get to know steps, then based on that, deduce and choose what they want
why do you need any step checker?
on average, train len can be estimated beforehand
expecting average users to do this its crazy
user does not even need to decide, you decide how often you wan to calculate what you want to calculate
Then write down the formula and explain it to Razer, then we good
cause if you like to step in my ideas, at least contribute big guy
( smh )
No
Again, it is not the premise to decide how many times you want to log per epoch, epoch is not the main topic here
it is about what every N steps count you want the logging to take the place
it is per steps / per mini-batches logging re-interpreted to match " avg epoch performance " premise
but it depends on the size of the train?
I just can't f get it, why can't you at least once not complain about meaningless bs man
train is 1000 steps, you want to log every 100 steps.. so that's 10/epoch
100 step mini-batch
or whatever
anyway, this is a crazy conversation, you do you
ahead of you, my dude
More like craving attention to prove you're the best engineer around despite not being asked to
Just please for the hell's sake, tone down your ego for f once
May I remind you I am a hobbyist that barely knows shit about coding, I do it for fun, because I want to, I add in what I deem good for me and I give an option to use it, not enforcing it. It doesn't have to be tip-top perfect like it is projected in your head
some rural areas still have 4G LTE tho
dont put your words into my mouth and for fking sake, take a break
You should take a fucking break, tf was you getting the convo in #✨│ai-help scolding down the dude for using pinokio? or whatever stuff, which I already did
If that ain't looking from above at someone goofy, then idk what it is
it was not needed. Just as your ego oversaturation man
also tf was that " applio 'lives' mostly thanks to me " kind of thing?
i think i've seen 4G LTE in my phone when the signal is ass lol😭
Anyway.. you can put in something like this:
"If the total number of steps per epoch is unknown, the user may consider running a short 1-epoch training to determine the corresponding step count. While the choice of the averaging factor 𝑁 ultimately depends on the user's preference, Codename recommends experimenting with an averaging window that accounts for approximately 23% to 32% of the total steps in an epoch."
@night lake
or something, idk
.... ""If the total number of steps per epoch is unknown"... " running a short 1-epoch training "... again I'm looking at this in astonishment
You were all against the averaging in the first place, so please, shut the hell up if you're gonna keep on bsitting around
See? u dunno when to stay quiet man
look, you dont need an exact number... number of samples / batch size gives you an average +/- 2 steps
+/-, that's right
I don't need estimates
all that crappy estimating led to previous crappy auto log syncing, back in the day
len(train_loader) not enough for exact?
Then I ask once again, do you recommend doing an ui based steps calculator / dummy checker prior to user's inputting the n value
either that or running a test run, I only want accurate numbers, not estimates
you're asking the user to do the dumbest thing since 'syng graph'
it wasn't " dumbest " it was what it was at the given time
where were you back then hero
I did not know RVC even existed
ah, I see pinokio is a cloud service like lightning.ai
you dont even need to run 1 epoch to completion
you didn't get the point of what I wrote I think, it was about that I already cleared up the man and noobies input wasn't needed
you just need to start it and stop right after
Sure, but I don't want users to develop bad habits of failure-stopping
ok chill 
that's just facts
was said once, noobies' input wasn't functionally required at all
some people just want to overcomplicate the simplest things
some people have a habit of dismissing certain ideas without having their own input in it
make a decision that does not involve users who dont know and dont care about this thing
thoughts on :
"To use the avg loss you need to know the total number of steps per epochs, you can train one epoch to find the step count. Choosing an averaging factor depends on the user, however Codename recommends experimenting with a window that accounts for around 23% to 32% of total steps in an epoch. If you choose to not use 23-32% of total steps be sure that the logging frequency isn't to small because the losses can vary a ton and it can end up confusing you, and make sure for big loss frequency it isn't to big because it may smoothen the noise to much and not give you accurate results."
Yes, seems good
man, you're really surreal
Honestly, as I said idk, 2-3 weeks ago? I have no time for these nonsenses coming from ya
ight thanks 🙂
so how hard would be put ' size of the averaging window - [ ] 1/5, [ ] 1/4, [ ] 1/3 of the epoch size' instead? 🙂
Np man ✨
23% to 32% seems to be pretty much that?
Not everyone loves math or schematics or whatever
Some people can't even read right with proper attention span
we know you don't give a f about " kindergarten kids " but cmon, don't complicate what does not need to be complicated
@gray rover #✨│ai-help message
I remember SCRFilms used crepe hop length 8 for some reasons
uhhh.. well, I mean, you kinda can, ofc but.. at this 'reso' imagine all the potential error points
imo an overkill
Tbf I personally never go below 64
btw can you share (again?) comparison between rmvpe and crepe hop length (feature extraction for training) in talking, ASMR, singing (from slow genre to metal screaming)?
what comparison?
have I ever made such? ( genuinely
tho, as far as my intuition goes, main reason crepe could be smoother is because it uses gaussian smoothing during training, which can lead to softer pitch contours, I suppose
- adjustable hop ofc plays a role here
I'd really have to ( some other time ) do adjusted rmvp vs crepe, both at 64 and 128 to say something more
but ye, probs the smoothing plays a role here
(means crepe has less tendency of voice cracks than rmvpe?)
one might like it, one might hate it. But that'd explain why rmvpe seems more rough and sharp and crepe is soft ( good for females
the voice cracks are really due to ' losing the track ' of the fundamental / f0
when there's harmonies for instance
rmvpe is without a doubt more robust in that area
you basically gain robustness and accuracy in ' noisy / imperfect ' scenarios over accuracy and smoothness in clean audio scenarios
well, screaming and growling is basically heavier tendency towards " noise " qualities of the audio than harmonic
same as breathing, sibilants etc
if the model's exposed to it, it can do fine
issue's is, vctk which is the base of pretrains naturally has absolutely no idea of what it is and how to interpret it
and when people typically make models targetted at screamos or whispering, they include too little of that data or even mix it in bad ratios
if ratios are more towards the data the gen / disc learned on the og dataset, naturally bias gonna occur ( or so I believe )
so pretty much uhhh, if you get a decent base and / or focus mainly on screamos / provide enough data
it's possible for the model to do just fine
But that's really about anything, even gasping or spitting sounds
whatever you'd want ( but then there's hubert aspects which I won't hide, noobies knows more on it so, would be best if he revised what I said just in case. iirc, if hubert doesn't flag it as something meaningful, it doesn't get through or like, not in the exact same shape or form (( again, might be wrong
Ive found for whispering that you only really need around 10 min of data to get it decent
I'm not sure if KLM includes metal singing
ofc more the better but its a good min
well ye, but then an example
10 mins of whisper, 1.5 hours of normal talk ( a vtuber
model's gonna be biased towards / better in converging to speech
then there's also exposition aspects. More data you feed it, higher the likelihood of the model figuring out the heck it works with
yea, you need a good balance
that's what we mean by proper convergence pretty much
yup
For anyone wondering this was done on a test of a 30 min set 20 of which was speaking
that's kinda a decent balance, provided the 2 subsets of the data type aren't too diverging I suppose ( and the bs isn't too extremely low, then the small bs's induced noise + variability in data induced noise would be a mess )
I mean yea, back in the day I'd recommend a similar ratio, perhaps 10-15 mins of this and 4-6 mins of that
both rmvpe and crepe can be confused on trying to infer harmonies and doubled vocals
I somewhat think there could be a new f0 method to develop that uses something like this: https://arxiv.org/abs/2401.16837v1
(or more precisely, like "singing voice separation")
A novel model was recently proposed by Schulze-Forster et al. in [1] for unsupervised music source separation. This model allows to tackle some of the major shortcomings of existing source separation frameworks. Specifically, it eliminates the need for isolated sources during training, performs efficiently with limited data, and can handle homog...
I think the best idea of mitigating the pitch confusion errors would be to incorporate some kind of context awareness
and past-now comparitive tracking + prediction, kinda lookahead concept
thanks for the paper. Will read it up
Seems like they do have the code but
Apparently there's no inference code anywhere
idk, maybe in some distant future when I feel like it or am done with all the current work, could dive deeper
Anyways, I head to bed and gonna leave training ( mrf test on 3 Okay no, seems to be hours 8 ( oof ) + ranger test and my tweaks eh ye )
in any case, potential updates gonna land in #🔊│ai-development
cheers all. Gnight ~
Huh
I would like to ask how to cold-start an AI product. There has been no user using it.
Are you willing to use my AI product? https://photog.art/
There are different Text To Speech (TTS) AIs:
GPT So Vits: RVC isn't as good as GPT So Vits for tts, but gpt so vits (few shot tts, which means needs just a lil training for models) can't use rvc models (and viceversa), and its only limited to: english, chinese & japanese, if you wanna check gpt so vits instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/
Freemium 11labs: An easy way to do TTS is https://elevenlabs.io/, you can't use RVC model on this but its a mostly premium easy way for good quality TTS
FishSpeech: FishSpeech is a 0 shot (no explicit training needed) TTS, if you got a good pc you can use it locally else use their site
With RVC Models:
RVC is natively for Speech To Speech, but forks such as ilaria rvc mainline & applio have built in tts (using Microsoft Edge TTS to make a generated tts audio, which i suggest you to choose a tts model that is the same gender and language of the rvc model you wanna use, and then convert it with rvc)
If you wanna do tts locally with RVC Voice Models (if you got a good pc):
If you don't got a good pc you can do tts with RVC Voice Models on cloud:
-
Ilaria RVC Zero (Running on A100 GPU, free fasted rvc on cloud) and the guide
-
Use Applio UI Colab (with google colab T4 free daily limit gpu)
-
if you don't wanna use edge tts, you could try another tts ai from our tts index and use the output as an input in rvc
Non c'è quasi nulla in italiano, tranne che Edge TTS, 11labs e XTTS2
Xtts2 é l'unico in locale, potresti provare quello, ma é 0shot e non più aggiornato cmq
Se no dovresti fare tts con I modelli RVC (speech to speech)
Please tell me what's your python version in #🔍│help-ai-art
@night lake Btw why is styletts2, gtts, bark, tortoise and meta voice in the ai hub tts index ?
I removed:
- gtts: Google translate, shitty quality asf
- bark: after talking with @chilly lake (#🧬│ai-chat message) , realized it hallucinated
- tortoise: old, xtts2 is a fork of it and an improved version
- matavoice & styletts2: not that good and no longer updated
https://x.com/metavoiceio/status/1754983953193218193
Meta voice sounds kinda noisy and they literally don't say anything since March 😭
use #1159290752195633273 else it gets deleted
Hey guys. Is there any tutorial on how to setup qwen 2.5 best or any other llm for coding?
Chatgpt is broken for a couple of days already, and I do not how to exactly set params to the LLM and context:(
the 72b model requires A100 like in a huggingface zerogpu space, and I doubt if smaller models could still match GPT-4o
or there's also 32b coder variant
Claude is also another best API-based alternative
a this moment, everything will be better than 4o, as after update they made it literally STUPID asf
is it worth to try buy claude?
What's your PC GPU?


AI HUB Docs


