#Rigel pretrained Model (Base and FT)


stray orchid
#

Sharing my personal pretrained model with everyone, now in public beta ||English or Spanish?||

Dataset:

  • Size: 1921 hrs of speech & vocals
  • Languages:
    • Arabic (~70 hrs)
    • Chinese (Mandarin) (~70 hrs)
    • English (~800 hrs)
    • French (~42 hrs)
    • German (~35 hrs)
    • Hindi (~30 hrs)
    • Indonesian (~53 hrs)
    • Japanese (~140 hrs)
    • Korean (~80 hrs)
    • Portuguese (~40 hrs)
    • Russian (~188 hrs)
    • Spanish (~200 hrs)
    • Tagalog (~30 hrs)
    • Singing (All) (~190 hrs)
    • Common (Unknown)
  • Sampling Rate: 32kHz (done) / 40kHz (retraining)
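
As a quick sanity check on the figures above, here is a small sketch that adds up the listed per-category hours and prints each category's share. The numbers are copied from the list; everything else (the names `HOURS`, `STATED_TOTAL`, `summarize`) is made up for illustration. The listed values are approximate ("~") and "Common (Unknown)" has no figure, so the sum is not expected to match the stated 1921 hours exactly. One possible, unconfirmed explanation for the gap is that the singing hours overlap with the language categories.

```python
# Approximate hours per category, copied from the dataset list in the post.
# "Common (Unknown)" is left out because no figure is given for it.
HOURS = {
    "Arabic": 70,
    "Chinese (Mandarin)": 70,
    "English": 800,
    "French": 42,
    "German": 35,
    "Hindi": 30,
    "Indonesian": 53,
    "Japanese": 140,
    "Korean": 80,
    "Portuguese": 40,
    "Russian": 188,
    "Spanish": 200,
    "Tagalog": 30,
    "Singing (All)": 190,
}
STATED_TOTAL = 1921  # "Size" figure from the post


def summarize(hours, stated_total):
    """Return (sum of listed hours, gap vs. stated total, share per category)."""
    listed = sum(hours.values())
    shares = {name: h / listed for name, h in hours.items()}
    return listed, listed - stated_total, shares


listed_total, gap, shares = summarize(HOURS, STATED_TOTAL)
print(f"listed: ~{listed_total} h, stated: {STATED_TOTAL} h, gap: {gap:+d} h")
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {share:6.1%}")
```

The listed categories add up to roughly 1968 hours, about 47 more than the stated total, with English making up around 40% of the listed hours.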

Models:

Base Model: for fine tuning

  • Data: 1921 hours (low-mid quality)
  • Steps: 3,890,220
  • Batch: 40
  • Precision: FP32
  • Sampling Rate: 32k
  • RMVPE

Fine-Tuned Model: for regular models

  • Data: 102 hours (high quality)
  • Steps: 2,854,856
  • Batch: 20
  • Precision: FP32
  • Sampling Rate: 32k
  • RMVPE
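
To put the step counts above in perspective, here is a rough sketch of how many training items each run would have processed. It assumes that "Steps" counts optimizer steps and "Batch" is the total batch size per step (no gradient accumulation, not a per-GPU figure). The post does not confirm either, so treat the results as ballpark numbers only. The names `items_seen` and `RUNS` are illustrative.

```python
def items_seen(steps: int, batch_size: int) -> int:
    """Items processed over a run, assuming one batch of `batch_size` per step."""
    return steps * batch_size


# Step and batch figures copied from the post.
RUNS = {
    "base": {"steps": 3_890_220, "batch": 40},
    "fine_tuned": {"steps": 2_854_856, "batch": 20},
}

for name, run in RUNS.items():
    total = items_seen(run["steps"], run["batch"])
    print(f"{name:10s} {total:,} items")
```

Under those assumptions the base run comes to about 155.6 million items and the fine-tuned run to about 57.1 million.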

Hardware:

  • CPU: AMD EPYC 9754
  • RAM: 256GB
  • GPUs: 1x H100, 4x L40s, 1x RTX 4080, 1x RTX 4070 Ti

Links
https://huggingface.co/MUSTAR/Rigel-rvc-base-pretrained-model

Rigel Base model (32k) - https://huggingface.co/MUSTAR/Rigel-rvc-base-pretrained-model/tree/main/Rigel_32k_Base_and_FineTuned/Base-model_32k_fp32
Rigel Fine Tuned (32k) - https://huggingface.co/MUSTAR/Rigel-rvc-base-pretrained-model/tree/main/Rigel_32k_Base_and_FineTuned/FineTuned-model_32k_fp32
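
For anyone who prefers to script the download, here is a minimal sketch using `snapshot_download` from the `huggingface_hub` package. The repo ID and folder names are taken from the links above; the file names inside each folder are not listed in the post, so the sketch grabs the whole folder with a wildcard. The function names, the `purpose` labels, and the `pretrains` target directory are hypothetical.

```python
REPO_ID = "MUSTAR/Rigel-rvc-base-pretrained-model"
ROOT = "Rigel_32k_Base_and_FineTuned"

# Which folder to use for which purpose, following the post:
# Base is for fine tuning, Fine-Tuned is for regular models.
SUBFOLDERS = {
    "fine_tuning": "Base-model_32k_fp32",
    "regular": "FineTuned-model_32k_fp32",
}


def folder_pattern(purpose: str) -> str:
    """Build the glob pattern that matches every file in the chosen folder."""
    if purpose not in SUBFOLDERS:
        raise ValueError(f"purpose must be one of {sorted(SUBFOLDERS)}")
    return f"{ROOT}/{SUBFOLDERS[purpose]}/*"


def download(purpose: str, local_dir: str = "pretrains") -> str:
    """Download one folder from the repo and return the local snapshot path."""
    # Imported here so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[folder_pattern(purpose)],
        local_dir=local_dir,
    )


# Example (needs network access):
# path = download("regular")
```

Where the downloaded files need to go afterwards depends on the RVC fork being used, so that step is left out here.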

Nanashi ft on Rigel base #1254252587973083187 message

(Small note: do not use the 40k version until it has been retrained.)

Credits

  • 0x2E
  • Aleks don Pedro
  • Blaise
  • Eugene Starky
  • Leo_Frixi
  • Litsa_the_dancer_UwU
  • Mikhail
  • Player1444
  • Prosto Dead Artem
  • RomanKrukovsky
  • SCRFilmsE
  • Shirou
  • Сергей Electrik
  • Warlock700
  • 서울스트리밍스테이션
    (If I forgot to mention someone: thank you, and I apologize in advance for leaving you out of the credits.)
    (No tests for now, sorry; I'm currently running them.)
calm jay
#

NICEEEEEE

#

Will Rigel be able to handle datasets of any quality, even if they have some noise?

stray orchid
#

yes, FT version is robust

calm jay
stray orchid
#

exactly

calm jay
#

I'll test the pretrain

calm jay
stray orchid
calm jay
stray orchid
#

yep

#

Important notice: do not use the Base model for regular models. Base is for fine tuning.

rocky vault
#

for example Italian

calm jay
#

But it has french.

#

42 hours of french.

stray orchid
#

any language that's phonetically close to the ones in the dataset will work

calm jay
sweet drift
calm jay
#

Kinda smart tbh.

#

So, 1921 hours + ~100 hours of HQ audio for fine-tuning = ~2021 hours of data.

#

XD

calm jay
#

Is it mandatory to enable fp32 in RVC to use Rigel?

stray orchid
#

No

#

fp16 and fp32 are both OK

calm jay
#

Nice.

#

Sorry, I've got too many questions haha

stray orchid
calm jay
stray orchid
calm jay
stray orchid
amber hemlock
#

I'm gonna add it to the RVC pretrain guide

amber hemlock
abstract fable
#

Finally, great job mustar 🙂

stray orchid
dense furnace
#

lfg!

modern sleet
#

40k version?

#

Oh.

#

What about polish?

swift orchid
calm jay
#

Rigel still needs further training.

calm jay
swift orchid
#

doesn't really matter. if your audio is bad, your model won't sound any good with or without the data

rugged nexus
#

@sharp palm would you add this pretrain to RVC V2 Disconnected?

amber hemlock
amber hemlock
coral falcon
#

Which is better, Rigel FT 32k or Rigel FT 40k alpha?

calm jay
#

Mustar is gonna keep training the Base for some time to see if it gets any improvement.

spark current
#

oh hell yes

#

imma use this shit

#

got a good dataset

#

and i wanna see how it sounds

#

since it clearly looks like it's got way more data than TITAN itself

spark current
#

it has more data

#

way more data

#

than titan

#

shouldn't it be better?

calm jay
#

It's got some issues, so use OG instead.

spark current
#

or is titan better?

#

bc i am doing singing models

#

not talking models

#

and this pretrain here, rigel, has singing in it

#

which makes me think my model could perform better

#

honestly

calm jay
bronze finch
#

Will there be specific pretrains for training voice models in Hebrew, Persian, Turkish?

spark current
#

As stated, as long as they're phonetically close, gang

#

If they're spoken almost like Spanish or French,

#

it could work

#

and since there’s a ton of data

#

it could work

spark current
#

And it actually performed really well

#

The graph was a little weird

#

but every 100 epochs with a 6-8 min dataset came to 4k steps

#

Which is actually insane
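
For reference, the figures in the messages above (about 4k steps per 100 epochs on a 6-8 minute dataset) work out to roughly 40 steps per epoch. The sketch below shows that arithmetic and what it would imply for the number of training segments. The batch size used for that run is not given in the chat, so the values in `HYPOTHETICAL_BATCH_SIZES` are guesses for illustration only, and the segment estimate assumes every segment is seen exactly once per epoch.

```python
def steps_per_epoch(total_steps: int, epochs: int) -> float:
    """Average number of optimizer steps in one epoch."""
    return total_steps / epochs


def implied_segments(steps_in_epoch: float, batch_size: int) -> float:
    """Segments per epoch, assuming each step consumes one full batch."""
    return steps_in_epoch * batch_size


spe = steps_per_epoch(4_000, 100)  # figures from the chat
print(f"steps per epoch: {spe:.0f}")

# The batch size for this run is not stated; these values are hypothetical.
HYPOTHETICAL_BATCH_SIZES = (4, 8, 16)
for bs in HYPOTHETICAL_BATCH_SIZES:
    segments = implied_segments(spe, bs)
    # Average segment length if the dataset is 7 minutes (middle of 6-8 min).
    avg_len_s = 7 * 60 / segments
    print(f"batch {bs:2d}: ~{segments:.0f} segments, ~{avg_len_s:.1f} s each")
```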

analog hornet
#

The results from this pretrain could be better, and I hope the creator fixes it. It has a strange sound. The body of the voice seems better, but low notes are missing, and it needs more low voices to even it out.

calm jay
#

Mustar is currently training the Base further.

analog hornet
#

What do you recommend to improve the sound of the model? The one I had the honor of testing was missing low notes and needed more voice body; it sounded like falsetto instead of a normal voice.

#

So far Titan is the best, but in a few days that may no longer be the case.

calm jay
calm jay
analog hornet
#

Where is the OG?

calm jay
#

😐

viral flicker
#

Original

spark current
#

wow

#

so none of the pretrains made from scratch, even the ones with better quality data in them, do anything?

#

and if i'm correct, the original was trained on some noise

calm jay
stray orchid
nova spindle
#

Is it better than Titan?

#

Nvm, it's kinda undertrained so i'll use titan just in case.

viral flicker
#

We can wait for Rigel 40k / 48k

dense furnace
nova spindle
#

it works on English too, but honestly Titan and KLM are the best in my opinion

sage moss
#

I used the same dataset btw

sage moss
#

@nova spindle have you tested this pretrain before?

nova spindle
#

if it beats og pretrain, i'll use it

nova spindle
#

i don't recommend using rigel for now

sage moss
#

Alright

nova spindle
#

Keep using OG pretrains, they're 100% better tbh