#🧬│ai-chat

1 messages · Page 346 of 1

polar flax
mellow fog
#

so okada is good

#

right

dapper ginkgo
#

@tepid basin here too

covert lake
weak hare
#

when when i talk it sounds distorted

#

esp when i breath

covert lake
prisma yacht
#

do people here help with making ai image models orr is it just voice

prisma yacht
#

cause i tried training an ai imagae model using flux dev but it doesnt generate me the safetensor file

polar flax
prisma yacht
#

im using pinokio idk what do u mean by save every step

#

theres a sample every step but it doesnt work and the model says it finishes training in like 3 mins

late quarry
#

What model is good for deep voice naration?

polar flax
prisma yacht
#

Oh

#

I dont use comfyui i used pinokio

prisma yacht
#

idk how ot use comfui :/

#

is there like a tutorial how to use it

#

cause i have it installed

solar torrent
#

Blame people who keep telling me to do AI locally, not me.

teal dove
#

Ey

#

I have got a question

#

How do i make voice overs for free?

#

Like those in viral brainrot edits?

solar torrent
#

I've been in disbelief in people from previous AI Hub for a while. Is there anything I can help?

#

You mean AI cover? Well, there are options.

#

If you have GPU that's newer than GTX 10xx series in your PC, you can do RVC locally. If not, you might wanna look for cloud servive like Google Colab and Weights instead.

polar flax
solar torrent
solar torrent
covert lake
prisma yacht
#

HELp

prisma yacht
fair snow
#

yo wsp , im from sa . is available tiago pzk voice ?

fair snow
#

i find it, what is the best easy option to generate ?

fair snow
#

thnx

polar flax
#

or if you have capable gpu, you can try local inference (read the docs below)

rare sorrelBOT
#

Not available yet

polar flax
#

-rvc

rare sorrelBOT
covert lake
fair snow
#

NVIDIA GeForce RTX 3050 Laptop GPU

covert lake
polar flax
#

4 GB, surely

fair snow
#

yes, it is

solar torrent
#

Is Weights still giving me hidden ads when I got premium? Because I see numbers for blocked ads on an adblocker extension I installed. nails

covert lake
# fair snow yes, it is

nvm then it's better to not do it locally, but you got Cloud (remote good pc, easier and faster than ur PC but it's limited):

  • Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
  • Ilaria RVC Zero: fastest and simplest that you can get for free
  • Applio Colab: max 4 hours, not granted, of GPU

Easiest manual (you still have separate vocals and instrumentals): Ilaria RVC Zero

Easiest possible and automatic: Weights.gg

polar flax
#

it is still fine for inference imo, unless you care of the speed

ebon nacelle
#

Do you guys know if there's an open source f0 editor (like melodyne)? With note moving and modulation editing etc

solar torrent
#

I know not all people who have a very fast PC can spot on where's an ad hidden somewhere. However, my laptop is notably slow, so I launched up Chrome task manager to end task of huge load of ads. skullfacedistorted

fair snow
#

thnxs

polar flax
# ebon nacelle Do you guys know if there's an open source f0 editor (like melodyne)? With note ...

it could be potential for RVC devs to implement, except the polyphonic detection unless they implement a new f0 method which is probably SVS-based: https://arxiv.org/abs/2401.16837v1

ebon nacelle
#

So there's not something that exists already? I would have thought there would be because melodyne has been around 20 years. I don't mean polyphonic, I just mean for vocal editing

rare sorrelBOT
thin warren
#

.

gray rover
ebon nacelle
#

🤘🏼Very interested. Btw did anyone get BigVgGan going on a fork with RVC? I'd be curious cause bigVgGan v2 is out now, supports up to 44k, and is apparently way faster.

night lake
ebon nacelle
#

Says #no access

night lake
gray rover
#

it's a actually much slower than hifigan

ebon nacelle
#

@gray rover I'm saying bigvgan v2 vs bigvgan

gray rover
#

I mean yeah but why comparing bigv 1 to 2

#

in any case, we're having mrf hifigan and refinegan being tested atm

#

so, bigvgan isn't a priority anymore

orchid sparrow
#

Hi what is more recommanded? W-okada or Voice.ai?

gray rover
#

voice AI or kits are things we don't quite support

orchid sparrow
#

I had W-okada i think but i forgor which one i used in the past and when looked for W-okada mostly voice ai vids poped up

#

Thank you very much for letting me know

gray rover
#

-rt

rare sorrelBOT
gray rover
#

check the fork one

gray rover
#

^

#

Majority is wrong or outdated or simply following the trends / hype disregarding performance or cost

orchid sparrow
#

Yeah i see the one that showed W-okada to me first is gone now and now most vids about ai voice changers are promoting Voice.ai

#

By the way why Voice ai is considered the wrong one? Its a scam or smh?

gray rover
#

it's closed source

covert lake
gray rover
#

most of things like kits or voice AI are based upon open source stuff that's been pretty much taken, monetized and then claimed to be 100% own

orchid sparrow
#

Ah yeah i remember that when i looked up "If w-okada is safe" everyone were saying it is open source so its most likely safe

gray rover
#
  • paywalls n scams
covert lake
#

u have to keep always the app in background

glad nebula
#

scam +´cryptomining

orchid sparrow
#

Oh damn

#

Why people keep promoting that?

#

App that basically slowly destroy your GPU

covert lake
covert lake
#

ig money

gray rover
#

👀

covert lake
gray rover
#

wouldn't worry too much tho, you're in right hands here

covert lake
gray rover
#

yuh, fully agreed

covert lake
#

not every single paid program is better than free ones

#

it's just 'easier' and paid

gray rover
#

only really good paid service of this sort I can trust is eleven labs

#

their style copying is just flawless imo

covert lake
#

but true

#

11labs is good asf

gray rover
#

ye, hoping for open source to get closer to El levels ASAP

orchid sparrow
#

I remember using W-okada before for streaming (Got voice owner permission) and had to delate when i was cleaning discs

#

it was pretty fun and good quality

covert lake
#

best quality since like a year

orchid sparrow
#

Can they also be used as Text to speech? I forgot if it was a option

covert lake
orchid sparrow
#

Sure please! Il copy and paste it into notebook and do it

covert lake
# orchid sparrow Sure please! Il copy and paste it into notebook and do it

There are different Text To Speech (TTS) AIs:

GPT So Vits: RVC isn't as good as GPT So Vits for tts, but gpt so vits (few shot tts, which means needs just a lil training for models) can't use rvc models (and viceversa), and its only limited to: english, chinese & japanese, if you wanna check gpt so vits instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/

Freemium 11labs: A easy way to do TTS is https://elevenlabs.io/, you can't use RVC model on this but its a mostly premium easy way for good quality TTS

FishSpeech: FishSpeech is a 0 shot (no explicit training needed) TTS, if you got a good pc you can use it locally else use their site

With RVC Models:

RVC is natively for Speech To Speech, but forks such as ilaria rvc mainline & applio have built in tts (using Microsoft Edge TTS to make a generated tts audio, which i suggest you to choose a tts model that is the same gender and language of the rvc model you wanna use, and then convert it with rvc)

If you wanna do tts locally with RVC Voice Models (if you got a good pc):

  • You can get Applio in our docs
  • While Ilaria RVC Mainline here (no guide as of right now)

If you don't got a good pc you can do tts with RVC Voice Models on cloud:

  • Ilaria RVC Zero (Running on A100 GPU, free fasted rvc on cloud) and the guide

  • Use Applio UI Colab (with google colab T4 free daily limit gpu)

  • if you don't wanna use edge tts, you could try another tts ai from our tts index and use the output as an input in rvc

edgy bloomBOT
#
Congratulations Nick088 [ITA/ENG] by Weights!

Your Charizard is now level 56!

gray rover
#

applio has tts, or does it not work anymore / there's better solutions?

wondering cause I haven't really explored tts field lately

covert lake
#

I sent you different types of TTS

#

If you want to use RVC models, i explained that too

And in the 'tts index', you can find how to use TTS in realtime too for calls

#

i mean, 'realtime', you still have to type

orchid sparrow
#

Alright, thank you very much!

covert lake
gray rover
#

True that

covert lake
#

An actual TTS program would be better than it technically

dapper ginkgo
covert lake
#

Like GPT-SoVITS, F5 TTS or FishSpeech

gray rover
#

perhaps I could sometime integrate gpt sovits into my fork

covert lake
gray rover
#

yeee, been thinking of it for a while

covert lake
gray rover
#

shouldn't be too hard

covert lake
#

I remember seeing ur pfp somewhere

#

is it stream overload or smt

gray rover
#

ah, you probs mean akiyama / yakuza dude

#

he used to have ame too

#

:' )

orchid sparrow
#

pretty good psychological horror/sim

covert lake
#

Would work i guess

covert lake
gray rover
#

that's one way, I just thought more of simple integration of two into 1

covert lake
#

Maybe I should give it a try

gray rover
#

as in, unified interface

orchid sparrow
#

Its a pretty specific game, if you like staring at screen clicking stuff and doing different strategy then it is good

covert lake
#

Wouldn't it make the project a bit too big tho 😭

orchid sparrow
#

But it is not fast or that interesting game if you already dont like slow and monotone games

dapper ginkgo
covert lake
#

I played even visual novels, as long as the story is interesting

orchid sparrow
#

If you love story games then you should like Needy streamer overload, it is story based

covert lake
#

For example I played Bad end Theater, the story was good

covert lake
#

my favorite game would be undertale

orchid sparrow
covert lake
#

undertale has peak story lfg

orchid sparrow
#

Really good game, i loved Ending song

covert lake
#

when i finished it a year ago, i tried searching for a fandom, but it was almost not existent

covert lake
#

I don't remember everything tho boohooh

elder willow
#

@covert lake can you help me

gray rover
covert lake
#

wtf i got the notification 10 mins later 😭

#

discord moment

elder willow
covert lake
elder willow
#

can you help me in a call

covert lake
#

And also what's your pc gpu?

elder willow
#

i would like to download okada

covert lake
elder willow
#

i cant send Pictures in the chat

#

can i add you ?

covert lake
sweet cargo
#

hey i need rvc for pc

#

<@&1159293140440723499>

wooden jolt
river adder
wooden jolt
#

i want to know everything

stark scarab
river adder
marsh moss
#

hlo

dire plume
#

@hidden grotto

hidden grottoBOT
# dire plume <@1138318590760718416>

:wave: @dire plume, How can I help?

Available Commands:
@weights find <query> or /find <query> - Search for RVC Voice Models
/create - Create an AI Cover
/image - Generate an Image

dire plume
#

finally found

gray rover
#

Hmm... @minor blade I wonder, is blase in here under some different name or, totally ditched discord?

gray rover
#

hmmm.. then who's the main maintainer after Blaise who's authorized to do any changes to mainline?

stray violet
#

Yo guys is there a website for making ai cover?

gray rover
#

A, ye, Vidal then

#

Thanks turt 🥬

minor blade
minor blade
gray rover
#

covert lake
random karma
#

Hi,

I’m an Senior AI engineer, and I’m open to work. I’m excited about the opportunity to contribute to innovative projects.

If you have a great idea and need the expertise of a senior engineer, feel free to DM me.

Thanks!

dapper ginkgo
ruby tapir
#

hey if anyone here working w a reference model hmu

sage fulcrum
#

Good day, can i ask if you guys have bot that humanize texts or paragraphs effectively?

thin tiger
#

What is the best AI image generator right now?

gray rover
#

@night lake hey man, as I'm on adding features to the ui, aside of configurable warmup and avg loss per gen / disc, do you see any other gimmick like that being added?

#

or perhaps @tepid basin or @glad nebula

#

Any propositions will do, as long it's within my capabilities

tepid basin
#

ill have a good idea at some point

gray rover
tepid basin
#

i am bad at thinking

edgy bloomBOT
#
Congratulations UnitedShoes (by Weights)!

Your Ivysaur is now level 25!

New move!

Your Ivysaur can now learn Take Down!

night lake
gray rover
#

Oooooo, that sounds good actually

#

#

'll see what I can do 'bout it
for now, it lands in the wip category tho. I believe I'd have to prepare some presets or more stuff required to have it configurable

glad nebula
gray rover
#

added to the list ✅

night lake
gray rover
#

The avg loss ?

night lake
#

yeah

gray rover
#

Actually

#

that ( configurable ) + use of warmup ( custom duration ) + toggleable mute files
will be in initial fork's release

#

( soon, hopefully )

#

man, ngl, applio's handling it really well, the args n modularity 😌
Glad I made the switch

elder willow
#

😔 how do I make rvc not sound like siri when using discord

gray rover
#

nails Siri...

#

Can you elaborate?

#

oh yea raz, what would you say on ability to delay discriminator or gen? idk, sometimes can be useful
I suppose, I could also see custom grad norm setting too

elder willow
#

Its kinda hard to explain. Was gonna do voice trolling on a friend but the audio sounded super robotic when it came out.

gray rover
#

oh, well, then that's due to A) wrong pitch settings B) your inability to mimick the source speaker / aka adaptation or C) the model is just trash

night lake
gray rover
gray rover
#

yet, that's most certainly C if you mention super robotic / rough / metalic and so on

night lake
elder willow
gray rover
#

If a model's trained on not diverse set or on a one that's lacking in pitch / expressiveness department..
there's nothing you can do

#

( man that f delay is triggering smh, completely not fitting my writing style oof )

elder willow
#

can I train one with an amd card or no?

gray rover
#

Tho ye Tachi, it's a matter of how the model was made and on what set

gray rover
#

provided it ain't 4 gigs

elder willow
#

I have a 7800xt

gray rover
#

There's a section for AMD but again, I am not the one who worked it through, ideally you'd want to ask Noobies 5663

#

I think

elder willow
#

alright thanks for the help

night lake
gray rover
#

oh yea

#

tho I am still unsure if rmvpe even likes that

#

afteral, it's still not there ( as in, slider for that )

#

'll need to test it sometime n see how it behaves

chilly lake
#

you can definitely change rmvpe's hop len

gray rover
#

In that case, lands into wip

night lake
gray rover
#

wip/todo list:
from the ui level;

  • Different optimizers choice
  • Custom independent learning rate for G and D
  • Adjustable hop length for rmvpe
  • Custom gradient norm value
  • headstart of chosen network's element: G/D

from the training code level;

  • more warmup options ( Cosine anneal and so on )
  • more customization here n there, more automations that currently's done purely in config etc.

concept stage:

  • Configurable discriminators pairing
#

Currently added:
from the ui level;

  • warmup settings

from code level;

  • mel similarity metric

wip rn:

  • avg loss settings
  • refactoring some descs etc.
#

@night lake pretty much

#

And beautiful theme ✨ afteral
lol

glass junco
#

my lil wayne ai working on with my beat me talking ahj

night lake
gray rover
#

mh mh

#

off topic but, I think I'll ditch mechanic for translations

#

Am too lazy to dive in and re-type / re-translate all ngl

glad nebula
#

avg loss is so op

gray rover
#

😌

#

yes, wish people had realized it sooner :L

analog current
tacit needle
#

guys how i get a relistic girl voice

#

i want troll my firends

polar flax
gray rover
#

No more point in doing it by hand

#

These however, esp the running loss, I'm pretty sure people will love

#

Will make picking the right epoch even easier

gray rover
polar flax
#

it's like smoothed loss value?

gray rover
#

Nope, something else

#

Currently the logging is done per last step in an epoch

#

it's naturally not the best indicator of model's performance

#

the avg running is averaging the loss per N steps / mini-batches

#

say, in ur case it's 48 steps per epoch, you'd set the avg for.. idk.. 6

#

you get the point

#

better actual epoch performance than just logging of the last which is just biased

#

Another thing that'd make it better ( and actually correct ) in future is proper evaluation phase with accuracy on unseen / validation set

gray rover
#

whereas this one is per steps within an epoch specifically

#

If you need graphical representation
( standard behavior ):

cedar mist
#

i think im having a small issue all the voices and models sound pretty good to me when i hear them by ear but when i use thjem in discord or a game they're choppy

gray rover
#

either have to upgrade your gpu, tone down with voice changer settings ( sadly, get a higher latency ) or tone down with game's settings

polar flax
# cedar mist i think im having a small issue all the voices and models sound pretty good to m...
gray rover
#

Or simply try the fork in case you haven't .. ah ye or that

cedar mist
#

with discord its not working well either

polar flax
#

yea the fork version is more recommended to try for better gpu utilization/less cpu bottleneck (regardless if you even have a 9800X3D)

elder willow
#

Can anyone suggest me a bot to do ai covers?

gray rover
#

boohooh Don't question my paint skills ™️

polar flax
#

I was thinking of each candle bar for each epoch

#

and then some adopted metrics like moving average, etc.

gray rover
#

I mean, there already is moving average

#

the pic depicted stock behavior of rvc / applio in logging

#

the running avg would work like so: ( depending on your N value )

#

It's the most straightforward approach without any gimmicks really

#

any overcomplication is just diminishing returns really, for next level we need evaluation phase 👀

polar flax
#

for example, maybe the fluctuation range, so we could spot some collapses inbetween epoch

gray rover
#

Mute files + silences in the dataset

#

reason is, if some tiny slice ( 36 frames ) happens to be the spotlight one, that " collapse " happens, ye

#

Solution for that is simple, properly truncate the silence or resign from mutes ( which is yet to be fully tested )
Aside, if you need a direct indicator, it's already a thing actually, My mel similarity % loss

#

if you see it being any high value such as 70-90~ % range at such spike, then that's that ( but that's still just a diag metric really )

#

So, don't think there's any need for fluctuation frequency metric, provided users carefully take care of sets ( as it should be, the truncation of silence to reasonable levels )

tacit needle
#

crypto

gray rover
ionic pumice
#

you're one funny fella arent you.

tiny blaze
#

W forlorn

solar torrent
#

Skill issue. yummy

barren mauve
#

hey

solar torrent
#

For Virtual Audio Cable, I let a web browser to output its audio to Line 1, from Line 1 to Reaper audio software as microphone to add some effects in real-time. voidblep2

#

With paid version of VAC, I think you can use virtual Line In more than 2 at the same time. I have an idea about this: A program -> Line 1 -> an audio software -> Line 2 -> a voice call program like Discord trolley

#

Or something like this: A program that outputs audio -> Line 1 -> W-Okada -> Line 2 -> Discord nails

lament knoll
#

best girl voice realistic

covert lake
#

There isn't a best one

#

You can search rvc ai voice models at:

if there isnt one, you can:

hidden grottoBOT
robust chasm
#

hello ai hub by weights

worthy coyote
#

this won't stop

dense heart
solar torrent
minor blade
#

🐢

glossy fox
covert lake
#

Wokada is the program to use RVC (Retrieval-based-Voice-Conversion, Speech To Speech Models) in realtime for calls

There's the fork (modified version), the deiteris fork which has better performance

#

that one you sent is the original

#

@glossy fox what's your pc gpu?

#

be sure to not follow yt tuts

glossy fox
covert lake
#

-rt

rare sorrelBOT
covert lake
#

1st link

#

its the wokada deiteris fork

#

meaning it's better

#

u gotta read it up ofc, there's no updated video

#

and follow the nividia version as u got an nvidia gpu

glossy fox
#

okay

glossy fox
#

sorry im have bad english I don't understand well

covert lake
#

that one is the original wokada

#

the wokada fork has better performance

#

both of those are safe

covert lake
glossy fox
#

okay thx

covert lake
#

yw

glossy fox
#

yeah from yt

#

this video upload in 15 october

polar flax
#

must be 2022 or something lol

untold arrow
#

which one do i download

gray rover
#

Man, you gotta love how people ask a question but then have dd status ✨😂

covert lake
covert lake
gray rover
#

don't disturb

covert lake
#

while asking the most generic question

gray rover
#

yup, like, kinda wasting the time of someone who's willing to help

covert lake
#

fr

covert lake
pine acornBOT
# covert lake !howtoask

How To Troubleshoot AIHC_WaitWhat

__**GIVE CONTEXT.**__ 📝
  • Don't simply mention your issue, like "my rvc is not working".
  • Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
  • The more context, the better.
__**BE POLITE.**__ <:matsuripray:1159685390156967936>
  • Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
  • It's okay if you're frustrated, but don't take it into this server.
  • Don't DM without prior consent.
__**BE PRODUCTIVE.**__ 🤝
  • Don't ask for every little instruction. Put your own effort & test things by yourself.
  • Don't ask to ask.
  • Check if your answer is a Google search away/on our guides website.
gray rover
#

lmao, good one. totally forgot this existed

covert lake
#

u gotta elaborate buddy

gray rover
#

Don't spam:

half charm
#

hey guys do you know how to create ai art. using reference images? like i send ai these reference images and make it create art? in that style

steady surge
#

its perfect

gray rover
#

wtf is this nails

#

what kind of power is this..

steady surge
#

idk how the fuck it knew those were vocals

#

the model is MB-Roformer-InstVoc-Duality-v2 with 256 segment size and 50 overlap

#

the 50 overlap made it take 20 mins to separate

#

it is a 6 min track though

gray rover
#

vocoded? yes, chorus, pitch-tune n shit ton of effects? yes, but nevertheless, vocals

#

AI too chad 😎

steady surge
gray rover
#

oh, you mean the background elements?

#

In any case, might still resemble somewhat vocal-type of data in spectrum so

covert lake
edgy bloomBOT
#
Congratulations Nick088 [ITA/ENG] by Weights!

Your Charizard is now level 58!

covert lake
#

@random karma

vagrant plank
#

I have a small question, is it legel to use voice changer reading a novel and post on tiktok, for example using gura's

gray rover
#

Models on their own are tools afterall, they fall under gray area category ( or so I see it that way at least.), just like cameras and stuff but the results / effects of usage matter the most

#

Now, hear me out. This is my own personal, thus subjective, opinion.. Using someone's voice for public work can be at times problematic especially if target person deems it as not right for them. if I was to do something similar to your example or really anything of that sort, I'd make sure to not cause any harms or misunderstandings.

( That means, including the label of the material being AI generated and /or a lil disclaimer you're willing to delete / remove it if requested to do so by appropriate authorities )

elder willow
#

aHR0cHM6Ly9kaXNjb3JkLmNvbS9iaWxsaW5nL3Byb21vdGlvbnMvZlBEV0VtbUQ2YjdSWHVEVnV1WmZGcEdx (Free nitro) if you can figure out how to get the url

analog cosmos
#

yoink trolley

covert lake
#

lmfao u actually got it

analog cosmos
covert lake
#

I thought it was fake

analog cosmos
#

same lol

analog cosmos
covert lake
#

i thought they weren't after getting abused asf

analog cosmos
elder willow
covert lake
#

not like i have billing info trolley

elder willow
#

actually if your not stupid make a bunch of discords and then put cc on it and just use it hella nitro easy for boosting and sh

tepid basin
#

@stark scarab your ffmpeg installer is based

#

im trying it on the next unsuspecting ai noob

stark scarab
#

it's just doucle click xd

tepid basin
#

doesnt get more simple lol

astral ravine
#

Ada yg punya model ai anime blue lock?

covert lake
#

also

#

You can search rvc ai voice models at:

if there isnt one, you can:

hidden grottoBOT
tepid basin
#

@covert lake best de reverb model for dataset and where to run it?

#

i dont really care about sample rate cutoff cus i already hit it with melband

#

wait a sec... melband is 40k??

ionic pumice
polar flax
tepid basin
#

that is awesome sauce

#

@gray rover I have a suggestion for applio ui

#

who threatened blaise with a lawsuit

#

im gonna find them

gray rover
#

You got any reference or a nice tos to re-adapt?

#

ngl, I suck in such things ( if I am to make one on my own, that is lel

night lake
gray rover
#

oh

#

I suppose, you'd mean a thing where I nor people associated with the project hold any legal responsibilities for copyright infringement, damages blabla.
Thus, user is fully responsible for the use of the tool

#

Yea, sounds logical

night lake
#

Yup

gray rover
#

we've got time, dw

gray rover
#

Any trolling is not something I'll tolerate

nimble iron
#

Yes or no

gray rover
#

This is a warning

nimble iron
gray rover
#

Too many douches ended up getting banned or kicked lately, if that's what ya want or aim for, eventually, go ahead but immature or annoying people aren't well welcomed here

nimble iron
#

Mhm good day to you sir

night lake
gray rover
#

lmao

polar flax
tepid basin
#

Codename dodging the poop allegations

gray rover
#

ikr, smooth chacha

glad nebula
analog current
analog current
covert lake
rigid spire
#

Greetings, can i show the video about my custom bio inspired model of artificial brain right here and ask for a few thoughts about it?

#

I didn't release it yet so it's nothing to send in promo i guess, i will make that it in a few episodes

ionic pumice
edgy bloomBOT
#
Congratulations kar@those who are jolly : 🎄!

Your Dewott is now level 28!

covert lake
amber crag
#

If anyone wants premium on Webtoon, here’s a code for one month of premium for free: ZvTWhMzKDaXaqPe (in case someone doesn’t know, Webtoon has comics, videos, etc.).

meager remnant
#

i need a little help with prompting with chatgpt, if you have any experience please join call

ionic pumice
ebon coyote
wanton cloak
#

I am looking for X bot developer who also has exp in RAG system.
This is long term project, DM with your fiverr or upwork account and github.

polar flax
#

you mean twitter?

solar torrent
solar torrent
wanton cloak
solar torrent
analog cosmos
solar torrent
ionic pumice
solar torrent
ionic pumice
#

your choice ig

solar torrent
#

No nut November, but I nutted.

analog cosmos
#

real

ionic pumice
#

uh huh

covert lake
river verge
#

Wake up, everyone! A new RegalHyperus drum model just released!
Fall in Love Alone (Drum model no. 554)

barren glen
#

@hidden grotto

hidden grottoBOT
# barren glen <@1138318590760718416>

:wave: @barren glen, How can I help?

Available Commands:
@weights find <query> or /find <query> - Search for RVC Voice Models
/create - Create an AI Cover
/image - Generate an Image

fervent nova
#

jak zrobić własny głos w AI?

gray rover
#

but it's your lucky day.
I'm polish ( mogę pomóc jak chcesz ale to za 15 min gdzieś (( ~ mentioned I can help em in 15 mins ))

minor blade
#

Nope, didn't use Suno nor Udio

gray rover
#

oooo, lemme know when that happens. send me a dm then 🔥

#

man I should f reset my phone's kb

#

SwiftKey been so trash lately

minor blade
gray rover
#

🥬

minor blade
gray rover
#

Good turtle

tepid basin
#

Local autocorrect model that trains on your device. Very similar experience to gboard. Custom voice to text models too

gray rover
#

Been waiting for proper solutions like that. Hopefully it's customizable similarily to swift ✨

tepid basin
#

I haven't used Swiftkey but switching from gboard was easy

#

I don't do multilingual either

gray rover
#

Oh yea, I do

#

soooo we'll see how it goes

tepid basin
#

You can export the model to a weight so you can tranfer across devices

gray rover
#

tbf, switfkey fucked up the moment microsoft took over

tepid basin
#

Oh wth

gray rover
#

prior to that it was so good damn

gray rover
stark wind
#

I got a question

#

Anyone know how to make a text to speech bot for discord with voices that it can send through text???

pearl condor
#

hello I would like to cover my friend's sound for one song but how do I add it to the zip folder?

echo kraken
#

Hello

#

Do you think it's worth subscribing to Perplexity?

tacit widget
#

🤔 For those who know about LLMs, how fast should an RTX 4070 & 32gb ddr5 6400mhz typically take to generate long responses. On Llama 3.2-3B-Instruct

lime patrol
#

hello everyone

gray rover
#

cause it feels like there's something I missed

slow swan
#

ai content is used in this

tepid basin
#

But its understandable cus riaa will sue tf out of blaise

gray rover
#

it ain't eula tho

#

it's simple terms of use

tepid basin
#

Terms of service, ya know

#

I just like complaining hehe

minor blade
polar flax
gray rover
#

man, it's just one checkbox, chille

#

it ain't some death sentence

polar flax
#

real

simple cliff
#

so im getting into this, i've found beatrice v2, but it only supports toml files, which i havent found any voices in #1175430844685484042 that have a toml file.. am i doing something wrong?

severe yacht
#

excuse me, where i can use the ai voice models?

#

😦

gray rover
solar torrent
glass junco
full bridge
#

Astralabs

stuck frigate
#

Or if you want to make your own llm, you would need a big amount of gpus

snow sedge
#

Which is, well, very fast

solar torrent
#

An LLM model can be trained as fast as Stable Diffusion image generation with 69 GPUs. Baffled

hallow gorge
#

i have realtime voice changer client and it takes 2 seconds for meskullsob

#

chat my rx 6600 is cooked

covert lake
#

just a gpt wrapper

meager remnant
#

@chilly lake do you have any experience prompting

chilly lake
#

for stabiity AI there are special extensions, also you can ask chatgpt to make a prompt

meager remnant
#

i am a complete noob, i am just trying to prompt my chatgpt to make scripts in a certain way, and it is going south, if you could hop in the voice channel for 2 minutes i would be grateful, but understandable if you cant

edgy bloomBOT
#
Congratulations MW!

Your Squirtle is now level 15!

New move!

Your Squirtle can now learn Water Pulse!

ionic pumice
#

bro has a squirtle

#

@edgy bloom p

edgy bloomBOT
# ionic pumice <@716390085896962058> p
Your pokémon

1 <:_:721476118942580777> Dewottmale • Lvl. 29 • 52.69%
2 <:_:721476119374594108> Munnamale • Lvl. 30 • 21.51%
3 <:_:721474704036069397> Slowpokefemale • Lvl. 41 • 54.30%
4 <:_:721476309389148202> Axewmale • Lvl. 37 • 42.47%
5 <:_:721476203122524251> Sandilefemale • Lvl. 20 • 39.25%
6 <:_:721475493525848113> Wailmermale • Lvl. 1 • 66.13%
7 <:_:721474816607256627> Togeticmale • Lvl. 39 • 35.48%
8 <:_:721475493710266441> Lunatoneunknown • Lvl. 4 • 45.16%
9 <:_:721476020846329909> Weavilemale • Lvl. 9 • 44.62%
10 <:_:721476203592024095> Klinkunknown • Lvl. 23 • 56.45%
11 <:_:721475597095665694> Empoleonmale • Lvl. 20 • 56.45%
12 <:_:721476466759434302> Goomymale • Lvl. 34 • 72.58%
13 <:_:721476020649066537> Tepigmale • Lvl. 21 • 41.94%
14 <:_:721476119039311884> Lillipupfemale • Lvl. 9 • 47.85%
15 <:_:721474704220880906> Psyduckmale • Lvl. 36 • 46.77%
16 <:_:721474757941395567> Porygonunknown • Lvl. 22 • 41.40%
17 <:_:721474704354836570> Krabbyfemale • Lvl. 31 • 21.51%
18 <:_:721474816699269181> Hoppipfemale • Lvl. 31 • 46.24%
19 <:_:721476389156552824> Amauramale • Lvl. 30 • 68.82%
20 <:_:721476203491491920> Gothitafemale • Lvl. 26 • 47.31%

ionic pumice
#

my dewott could probably beat yours 😎

#

/lh

meager remnant
#

dont make me bring my lugia

meager remnant
solar torrent
#

Any voice model idea for draft training on Weights? I have too many premium voice model training items on Weights. nails

chilly lake
#

just a guide what needs to be done for different model types

#

you can go to civitai, look for a model, look for generated examples at the bottom, see the prompts used

covert lake
safe wigeon
#

Yo guys

#

What program do i use for ai voices

covert lake
safe wigeon
#

I wanna do something kanye related

covert lake
covert lake
# meager remnant i am a complete noob, i am just trying to prompt my chatgpt to make scripts in a...

For LLMs like chat gpt, you could consider a roleplay technique
Ex: You're a Movie Producer, you are an expert into making scripts for TV Movies. Make me an initial script and brainstorm me ideas for a movie about Batman fighting the Flash

You can also see many prompts only, such as https://github.com/f/awesome-chatgpt-prompts

GitHub

This repo includes ChatGPT prompt curation to use ChatGPT better. - f/awesome-chatgpt-prompts

glass junco
dapper ginkgo
#

its in this server too

dapper ginkgo
#

astralabs is AI covers only

#

they only do audio

meager remnant
meager remnant
covert lake
covert lake
meager remnant
#

@covert lake How hard would you say it would be if i have 1000 shortform scripts and i want to "fine tune" or train chatgpt to make similar good scripts

meager remnant
#

ah fairs, appreciate you for the response

covert lake
#

it's from their guides

meager remnant
#

thank you!🙏 i will check it out

long ibex
#

Modelleri hangi kanlada bulurum

full bridge
covert lake
edgy bloomBOT
#
Congratulations Nick088 [ITA/ENG] by Weights!

Your Charizard is now level 60!

covert lake
mossy wyvern
#

what does this mean:

chilly lake
#

dont do this, obviously

mossy wyvern
#

i figured it out

#

i needed to delete this file:

stored_setting.json

#

and it loaded

gray rover
#

That was fucking stupid and you shouldn't be trolling people like that.
fyi, there's tons of people who don't even realize they have a gpu, having gotten their first pc so like, cmon man

#

I could see how at least 1 or 2 people in future would search up for similar error and follow it

#

it ain't funny, it's malicious

#

and potentially damaging. Refrain from doing such things

#

Alr, bet

#

🙂

#

@covert lake if you may

#

we've gotten enough of ragebait shitheads goofy ah kiddos

#

Tone down

night lake
#

it is fr

#

brochacho? skullsob

gray rover
#

dude's believing in friendship power

night lake
#

you have to be a llm for that

#

ikr

#

the temptation to say your mom

covert lake
gray rover
covert lake
#

nah

gray rover
#

tamed

warped pagoda
#

hi

gray rover
#

sup

warped pagoda
night lake
gray rover
warped pagoda
#

wdym by based?

night lake
# warped pagoda based?

A word used when you agree with something; or when you want to recognize someone for being themselves, i.e. courageous and unique or not caring what others think. Especially common in online political slang.

The opposite of cringe, some times the opposite of biased

gray rover
#

^

warped pagoda
#

wow

night lake
warped pagoda
#

btw i cant upload images here so i might not be able to ask for help for a specific image

warped pagoda
#

got it

gray rover
#

by talking orrrr, well.. if you need a speedrun I guess
#🤖│bots
And just keep on writing something random I suppose

night lake
#

@covert lake we need cleaning on isle ai-chat

covert lake
#

and wtf was wrong with them

night lake
covert lake
dapper ginkgo
urban ether
#

Hi

red monolith
#

Hey, can anyone help me with a video generator?

urban ether
#

No sorry

red monolith
#

I need to make a music video for an ai song i made

red monolith
urban ether
#

What is the ai

red monolith
#

Sono

urban ether
#

Who's that

red monolith
#

Oh wait no

urban ether
#

All I know is perfection

covert lake
red monolith
#

Suno, its not a voice, its an ai music maker

urban ether
#

Oh

red monolith
#

I make the lyrics and it generates a beat and the rest

red monolith
#

welp

covert lake
torn peak
#

anyone know any swamp izzo models

#

pls @ me in ur reply btw

rustic sphinx
#

short question. Is there anywhere a good tutorial or guide for how I can create my own voice model?

rustic sphinx
covert lake
edgy bloomBOT
#
Congratulations Nick088 [ITA/ENG] by Weights!

Your Charizard is now level 61!

covert lake
rustic sphinx
# covert lake What's ur PC GPU

right now I am on my laptop with 32 gb ram and an ultra 7 and an intel arc. But if it would be better I can get to my pc with an 3060ti and a ryzen 9 with 24 gb ram

covert lake
rustic sphinx
#

yeah

#

its just for writing like now or lower stuff

#

I would do that voice model on my main pc with the rtx 3060ti and my ryzen 9 and 24 gb ram

#

Would that be enough?

gray rover
#

tbf, any gpu with 6/8 gigs will do

rustic sphinx
gray rover
#

I work with 3060 and can assure you it's enough yuh

#

np man

tepid basin
#

6 gb 1060 can pull through too trolley

edgy bloomBOT
#
Congratulations UnitedShoes (by Weights)!

Your Ivysaur is now level 27!

gray rover
#

yes, using 4 gig is a misery however

rustic sphinx
gray rover
rustic sphinx
#

on this discord

gray rover
#

hmm wait

rustic sphinx
#

yeah

rare sorrelBOT
gray rover
#

@covert lake What's the current guides source that you typically recommend

#

just in case

covert lake
# rustic sphinx yeah

The ultra 7 offers the NPU which is good

But RVC doesn't support it, your desktop would be the best

rustic sphinx
#

allright thanks and what of those links should i use

covert lake
rustic sphinx
#

the first one right

covert lake
#

Usually it's recommend to use either mainline or applio, mostly applio

gray rover
#

I'd recommend my fork due to the easier training workflow

rustic sphinx
#

ok so should i use applio?

covert lake
#

Will ur fork also support NPU btw

covert lake
gray rover
#

no I meant like, it's still applio but incorporates the avg metrics which

#

I believe should be easier for newbies than understanding the last logging is the last step from the batch etc etc

covert lake
#

I'd honestly suggest applio

#

It got surely more community support

rustic sphinx
#

ok thank u all guys. Any good vids online btw? And do u know any application for anything for my npu?

covert lake
#

I mean if u wanna suggest ur fork u can do that

gray rover
#

In that case follow Nick
and once you wanna dive deeper, lemme know

covert lake
#

But I rather go with stability

gray rover
#

Actually, the fork's more stable as of now and has better readability
no bugs or issues at all

covert lake
#

Only written guides

#

All yt tuts are outdated

rustic sphinx
#

ok thanks

#

and what about anything with npu? like not online for voices but anything to use that npu

covert lake
#

And well.. that's the only NPU supported program I know

#

I rarely seen projects support NPU

rustic sphinx
#

yeah same :D

#

allright now I only need 30 Minutes of my voice

#

And then I am ready to build my own dataset

rustic sphinx
covert lake
#

It's open source, meaning you can even read the code of the project yourself

rustic sphinx
#

ah nice thank you

#

short question. I got the following error: ERROR: Could not find a version that satisfies the requirement mediapipe==0.10.9 (from versions: 0.10.13, 0.10.14, 0.10.18, 0.10.20)

[notice] A new release of pip is available: 24.2 -> 24.3.1
[notice] To update, run: python.exe -m pip install --upgrade pip
ERROR: No matching distribution found for mediapipe==0.10.9. But i updated pip and installed mediapipe 0.10.20

regal chasm
#

hi guyz im new here. im lloking for a good and self-hostable model (or at least runnable with a python script) which i can use to clone italian voices. do you know any models with a good quality?

gray rover
#

"mediapipe" isn't used in applio and from the context you can tell it's about 'fastdcpu'

regal chasm
# hollow tangle dipende cosa cerchi

Un modello text to voice con capacità di cloning su cui posso fare inferenza direttamente dal mio sistema, no API o altro. Ho le risorse necessarie per farlo

hollow tangle
regal chasm
#

Ci do un occhiata

#

In italiano come se la cava?

hollow tangle
#

ah merda in italiano non funziona

#

mi ero scordata

#

li manca il pretrain italiano

#

guarda di open source praticamente non c’è quasi nulla per l’italiano

strong ivy
#

po p o

regal chasm
#

Proprio zero?

#

Quindi usate solo servizi qui?

hollow tangle
#

text to voice pure ma usa edge tts alla base

regal chasm
hollow tangle
regal chasm
#

O edge tts intendi un servizio tts esterno

hollow tangle
#

edge tts è il tts di microsoft, usa quello integrato

regal chasm
#

Temo non faccia al caso mio

#

Sarebbe carino allora fare un train su qualche modello.

#

Quali sono i modelli migliori in termini di qualità dai quali partire?

hollow tangle
#

modelli per rvc intendi? la tecnologia di cui ti parlavo

regal chasm
#

Piú text to voice ma anche in inglese

hollow tangle
regal chasm
#

Altrimenti rvc non viene un po robotica la voce?

hollow tangle
#

dipende molto dal modello

regal chasm
#

Ora ci do un occhiata e provo qualcosa, grazie mille 😉

hollow tangle
#

di nulla!

night lake
gray rover
#

to the main applio?

#

or wut

night lake
# gray rover mh?

I've written a guide for your fork and such on the AI hub docs I just need to make a pr and ask ray to merge it

gray rover
#

oh, in that sense

#

yea, sounds like a good idea

#

Thanks man

night lake
gray rover
#

lemme see

grizzled sapphire
gray rover
#

A, actually @night lake

#

Better to mention the releases

#

those are always 100% guaranteed by me to be safer than repo

#

as I always do at least 2-3 checkups and files verif before pushing

night lake
#

I've been busy

gray rover
#

yuh all good, just thought it's worth mentioning, just in case

night lake
gray rover
#

as for a compiled version.. welp. It's a long story but in a short, I've been with awfully shitty upload speed lately

#

so, that's a big rip

tight canyon
#

Ws chat

gray rover
#

ws good guy

gray rover
#

ikr man, it sucks lol

#

glad I don't really upload anything recently

#

else I'd be f over

night lake
gray rover
#

.... don't ask.. to put it in a perspective how really bad it is, 3 mb upload takes 20-30 secs

#

vs 20 dl per sec

edgy bloomBOT
#
Congratulations Razer by Weights!

Your Grotle is now level 27!

New move!

Your Grotle can now learn Mega Drain!

polar flax
#

mobile data user moment

gray rover
#

Yea, long story and you're right

#

5g my ass, revolution they said 💀 we gonna get our brains melted they said skullsob

night lake
# gray rover lemme see

For the docs how would i explain how to use the avg loss? Like would i just say 'run a single epoch to find its step count then stop training and then set the avg loss freq to a number where it logs an epoch 3-5 times'?

chilly lake
#

running 1 epoch in the training loop to completion just to find the number of steps is so RVC... lame

#

there's len(train_loader), set it automatically

gray rover
chilly lake
#

if you say "i want to log 5 times every epoch", then you get len(train_loader) /5 and do that

gray rover
#

this should be some good info to start with

chilly lake
#

put if global step % value == 0, log whatevr

gray rover
#

it should be based on known steps per epoch

#

this is a different methodology than what you have in mind

chilly lake
#

" 'run a single epoch to find its step count then stop training and then set the avg loss freq to a number where it logs an epoch 3-5 times'? this should not be a thing ever

gray rover
#

I never said " where it logs an epoch 3-5 times "

#

let's start with that

chilly lake
#

step count = len(tran_loader)

night lake
#

i was just throwing something at the wall to give an idea boohooh

gray rover
#

" well, so what, you recommend to do a synthetic / dummy steps checker for the training set or wut " within the ui, before training

#

I ain't sure if you even get the concept in the first place
You mention that the steps count can be checked by the loader and yeah, that's true
but the idea is for the user to get to know steps, then based on that, deduce and choose what they want

chilly lake
#

why do you need any step checker?

gray rover
#

Alr then big guy, then write a consise strategy for an average user / noob

chilly lake
#

on average, train len can be estimated beforehand

glad nebula
chilly lake
#

user does not even need to decide, you decide how often you wan to calculate what you want to calculate

gray rover
#

Then write down the formula and explain it to Razer, then we good

#

cause if you like to step in my ideas, at least contribute big guy
( smh )

gray rover
#

it is per steps / per mini-batches logging re-interpreted to match " avg epoch performance " premise

chilly lake
#

but it depends on the size of the train?

gray rover
#

I just can't f get it, why can't you at least once not complain about meaningless bs man

chilly lake
#

train is 1000 steps, you want to log every 100 steps.. so that's 10/epoch

#

100 step mini-batch

#

or whatever

#

anyway, this is a crazy conversation, you do you

gray rover
#

You stepping in for no reason is crazy

#

talking about efficiency ey
fix that up first

chilly lake
#

ahead of you, my dude

gray rover
#

More like craving attention to prove you're the best engineer around despite not being asked to

#

Just please for the hell's sake, tone down your ego for f once

#

May I remind you I am a hobbyist that barely knows shit about coding, I do it for fun, because I want to, I add in what I deem good for me and I give an option to use it, not enforcing it. It doesn't have to be tip-top perfect like it is projected in your head

polar flax
chilly lake
#

dont put your words into my mouth and for fking sake, take a break

gray rover
#

You should take a fucking break, tf was you getting the convo in #✨│ai-help scolding down the dude for using pinokio? or whatever stuff, which I already did

#

If that ain't looking from above at someone goofy, then idk what it is

#

it was not needed. Just as your ego oversaturation man

#

also tf was that " applio 'lives' mostly thanks to me " kind of thing?

glad nebula
gray rover
#

Anyway.. you can put in something like this:
"If the total number of steps per epoch is unknown, the user may consider running a short 1-epoch training to determine the corresponding step count. While the choice of the averaging factor 𝑁 ultimately depends on the user's preference, Codename recommends experimenting with an averaging window that accounts for approximately 23% to 32% of the total steps in an epoch."

#

@night lake

#

or something, idk

chilly lake
#

.... ""If the total number of steps per epoch is unknown"... " running a short 1-epoch training "... again I'm looking at this in astonishment

gray rover
#

You were all against the averaging in the first place, so please, shut the hell up if you're gonna keep on bsitting around

#

See? u dunno when to stay quiet man

chilly lake
#

look, you dont need an exact number... number of samples / batch size gives you an average +/- 2 steps

gray rover
#

+/-, that's right

#

I don't need estimates

#

all that crappy estimating led to previous crappy auto log syncing, back in the day

chilly lake
#

len(train_loader) not enough for exact?

gray rover
#

Then I ask once again, do you recommend doing an ui based steps calculator / dummy checker prior to user's inputting the n value

#

either that or running a test run, I only want accurate numbers, not estimates

chilly lake
#

you're asking the user to do the dumbest thing since 'syng graph'

gray rover
#

it wasn't " dumbest " it was what it was at the given time

#

where were you back then hero

chilly lake
#

I did not know RVC even existed

gray rover
#

too bad then, you'd do way more good back then

#

¯_(ツ)_/¯

polar flax
chilly lake
#

you dont even need to run 1 epoch to completion

gray rover
chilly lake
#

you just need to start it and stop right after

gray rover
gray rover
#

was said once, noobies' input wasn't functionally required at all

chilly lake
#

some people just want to overcomplicate the simplest things

gray rover
#

some people have a habit of dismissing certain ideas without having their own input in it

chilly lake
#

make a decision that does not involve users who dont know and dont care about this thing

night lake
# gray rover or something, idk

thoughts on :

"To use the avg loss you need to know the total number of steps per epochs, you can train one epoch to find the step count. Choosing an averaging factor depends on the user, however Codename recommends experimenting with a window that accounts for around 23% to 32% of total steps in an epoch. If you choose to not use 23-32% of total steps be sure that the logging frequency isn't to small because the losses can vary a ton and it can end up confusing you, and make sure for big loss frequency it isn't to big because it may smoothen the noise to much and not give you accurate results."

gray rover
#

Honestly, as I said idk, 2-3 weeks ago? I have no time for these nonsenses coming from ya

night lake
chilly lake
#

so how hard would be put ' size of the averaging window - [ ] 1/5, [ ] 1/4, [ ] 1/3 of the epoch size' instead? 🙂

gray rover
chilly lake
#

23% to 32% seems to be pretty much that?

gray rover
#

Some people can't even read right with proper attention span

#

we know you don't give a f about " kindergarten kids " but cmon, don't complicate what does not need to be complicated

polar flax
gray rover
#

imo an overkill

#

Tbf I personally never go below 64

polar flax
gray rover
#

have I ever made such? ( genuinely

#

tho, as far as my intuition goes, main reason crepe could be smoother is because it uses gaussian smoothing during training, which can lead to softer pitch contours​, I suppose

#
  • adjustable hop ofc plays a role here
#

I'd really have to ( some other time ) do adjusted rmvp vs crepe, both at 64 and 128 to say something more

#

but ye, probs the smoothing plays a role here

polar flax
#

(means crepe has less tendency of voice cracks than rmvpe?)

gray rover
#

one might like it, one might hate it. But that'd explain why rmvpe seems more rough and sharp and crepe is soft ( good for females

#

the voice cracks are really due to ' losing the track ' of the fundamental / f0

#

when there's harmonies for instance

#

rmvpe is without a doubt more robust in that area

#

you basically gain robustness and accuracy in ' noisy / imperfect ' scenarios over accuracy and smoothness in clean audio scenarios

polar flax
#

I saw some ppl have made a choir model

#

and how about in metal screams?

gray rover
#

well, screaming and growling is basically heavier tendency towards " noise " qualities of the audio than harmonic

#

same as breathing, sibilants etc

#

if the model's exposed to it, it can do fine

#

issue's is, vctk which is the base of pretrains naturally has absolutely no idea of what it is and how to interpret it

#

and when people typically make models targetted at screamos or whispering, they include too little of that data or even mix it in bad ratios

#

if ratios are more towards the data the gen / disc learned on the og dataset, naturally bias gonna occur ( or so I believe )

gray rover
#

it's possible for the model to do just fine
But that's really about anything, even gasping or spitting sounds

#

whatever you'd want ( but then there's hubert aspects which I won't hide, noobies knows more on it so, would be best if he revised what I said just in case. iirc, if hubert doesn't flag it as something meaningful, it doesn't get through or like, not in the exact same shape or form (( again, might be wrong

night lake
polar flax
#

I'm not sure if KLM includes metal singing

night lake
#

ofc more the better but its a good min

gray rover
#

10 mins of whisper, 1.5 hours of normal talk ( a vtuber

#

model's gonna be biased towards / better in converging to speech

gray rover
night lake
gray rover
#

that's what we mean by proper convergence pretty much

gray rover
night lake
gray rover
#

that's kinda a decent balance, provided the 2 subsets of the data type aren't too diverging I suppose ( and the bs isn't too extremely low, then the small bs's induced noise + variability in data induced noise would be a mess )

#

I mean yea, back in the day I'd recommend a similar ratio, perhaps 10-15 mins of this and 4-6 mins of that

polar flax
# gray rover when there's harmonies for instance

both rmvpe and crepe can be confused on trying to infer harmonies and doubled vocals
I somewhat think there could be a new f0 method to develop that uses something like this: https://arxiv.org/abs/2401.16837v1
(or more precisely, like "singing voice separation")

gray rover
#

I think the best idea of mitigating the pitch confusion errors would be to incorporate some kind of context awareness

#

and past-now comparitive tracking + prediction, kinda lookahead concept

#

thanks for the paper. Will read it up

#

Seems like they do have the code but

#

Apparently there's no inference code anywhere

#

idk, maybe in some distant future when I feel like it or am done with all the current work, could dive deeper

#

Anyways, I head to bed and gonna leave training ( mrf test on 3 Okay no, seems to be hours 8 ( oof ) + ranger test and my tweaks eh ye )
in any case, potential updates gonna land in #🔊│ai-development
cheers all. Gnight ~

pseudo pier
#

Huh

merry spruce
#

I would like to ask how to cold-start an AI product. There has been no user using it.

covert lake
# regal chasm hi guyz im new here. im lloking for a good and self-hostable model (or at least ...

There are different Text To Speech (TTS) AIs:

GPT So Vits: RVC isn't as good as GPT So Vits for tts, but gpt so vits (few shot tts, which means needs just a lil training for models) can't use rvc models (and viceversa), and its only limited to: english, chinese & japanese, if you wanna check gpt so vits instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/

Freemium 11labs: An easy way to do TTS is https://elevenlabs.io/, you can't use RVC model on this but its a mostly premium easy way for good quality TTS

FishSpeech: FishSpeech is a 0 shot (no explicit training needed) TTS, if you got a good pc you can use it locally else use their site

With RVC Models:

RVC is natively for Speech To Speech, but forks such as ilaria rvc mainline & applio have built in tts (using Microsoft Edge TTS to make a generated tts audio, which i suggest you to choose a tts model that is the same gender and language of the rvc model you wanna use, and then convert it with rvc)

If you wanna do tts locally with RVC Voice Models (if you got a good pc):

  • You can get Applio in our docs
  • While Ilaria RVC Mainline here (no guide as of right now)

If you don't got a good pc you can do tts with RVC Voice Models on cloud:

  • Ilaria RVC Zero (Running on A100 GPU, free fasted rvc on cloud) and the guide

  • Use Applio UI Colab (with google colab T4 free daily limit gpu)

  • if you don't wanna use edge tts, you could try another tts ai from our tts index and use the output as an input in rvc

#

Non c'è quasi nulla in italiano, tranne che Edge TTS, 11labs e XTTS2

#

Xtts2 é l'unico in locale, potresti provare quello, ma é 0shot e non più aggiornato cmq

#

Se no dovresti fare tts con I modelli RVC (speech to speech)

covert lake
#

@night lake Btw why is styletts2, gtts, bark, tortoise and meta voice in the ai hub tts index ?

#

I removed:

  • gtts: Google translate, shitty quality asf
  • bark: after talking with @chilly lake (#🧬│ai-chat message) , realized it hallucinated
  • tortoise: old, xtts2 is a fork of it and an improved version
  • matavoice & styletts2: not that good and no longer updated
covert lake
drowsy hawk
#

Hey guys. Is there any tutorial on how to setup qwen 2.5 best or any other llm for coding?
Chatgpt is broken for a couple of days already, and I do not how to exactly set params to the LLM and context:(

polar flax
#

or there's also 32b coder variant

#

Claude is also another best API-based alternative

drowsy hawk
#

is it worth to try buy claude?

solar torrent