#✨│ai-help


low shard
#

Is RefineGAN not stable enough yet?

simple ore
#

well, it was great in small scale tests, but large tests not so much

simple ore
#

i'm testing some changes

#

but that means yet another retrain

#

at this point it is no longer "refine" gan

jaunty iris
#

so thats why i couldnt get it to work 😭

#

i was going insane earlier i reinstalled like 7 times

analog obsidian
old sigil
#

hey all

simple ore
jaunty iris
#

Oh ok

simple ore
#

change this back to True

old sigil
#

i am new to rvc and I'd like help understanding how to set it up, either locally or on colab
can somebody guide me please

simple ore
jaunty iris
simple ore
#

but it no longer does what the original paper did

#

my current version uses interpolation and a parallel resblock, so it is more like a hifigan that has solved the problem with horizontal lines at 4, 8 and 12 kHz

old sigil
#

?

simple ore
#

???

old sigil
#

I want to setup rvc and train a model with my voice

#

can you give me a rough idea on how to do this ?

simple ore
old sigil
#

thanks!

kindred eagle
#

The original one made by rvc boss

low shard
#

Your Nvidia GPU is good enough to run inference (use models) locally (on your PC). It's not the best for training (making models), though that's still possible.

You can run RVC:

  • Locally (runs on your PC, so the speed depends on it; you will have to set it up with the guides):
    • Applio: a fork of RVC with some extra features like Applio TTS; kinda faster and simpler, but the same quality
    • Mainline: the original RVC
  • Cloud (a good remote PC; easier and faster than your PC, but usage is limited):

Easiest possible (automatically separates vocals & instrumentals): weights.gg
Easiest cloud: Ilaria rvc zero
Easiest local: Applio

#

I would suggest you applio for more updates

fluid topaz
#

I know rx but which one is vx? Kinda lost

low shard
glacial pollen
#

waves' Clarity vx - dereverb pro mono

#

( Reminder; it is for mono audio. So you take one channel and work on it (( as you should anyway )) )

fluid topaz
glacial pollen
#

old sigil
low shard
#

RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models

Wokada = uses RVC models for realtime inference

#

always elaborate on your requests when asking, people can't know your pc or what u want to do

old sigil
low shard
#

those are 2 different program names

#

rvc does not mean "realtime voice changer"

old sigil
low shard
#

there is a go realtime for the original/mainline rvc, but it's way less optimized than the original wokada, and way worse than the wokada deiteris fork

low shard
old sigil
#

but Yeah!
thanks for your help

I'll have a look on wokada for my usecase

low shard
brittle wing
#

Yo guys,
I’m tweaking the Batch Size setting and not sure what to pick. 4 = better accuracy but slower, 8 = faster but "standard".
Does 4 actually improve sound quality, or is 8 just as good?
Would appreciate a simple explanation!

oak sandal
#

i used to do up to 20 batch size

#

i think theres barely any difference between 4 and 8 so if you have the power to spare just do 8

oak sandal
low shard
low shard
#

because that's been outdated since 2023

oak sandal
#

once you get to cloud training you never go back

oak sandal
#

also yea i'd rather use the old mangio fork even in 2025

#

Still crepe and still rmvpe, i see we dont have a new extraction method yet after all this time

#

the "embedder" is new to me tho

#

iirc it was by default on chinese hubert

analog obsidian
oak sandal
#

now we got.. contentvec?

low shard
oak sandal
analog obsidian
oak sandal
#

but when your dataset is 2 hours long

analog obsidian
#

rvc always used contentvec

#

hubert = contentvec

#

same thing

low shard
oak sandal
#

not google colab

#

actual cloud servers with RTX server gpu's

#

lmao

low shard
oak sandal
#

i used colab only like 3 or 4 times when RVC wasnt a thing and SVC is all we had

analog obsidian
analog obsidian
#

even the logging is bugged

oak sandal
#

i made great models even with the outdated code and everything being wrong + bugged logging in 2023

#

i seriously doubt there was much improvement since then

analog obsidian
#

because it wasnt outdated in 2023

#

mangio stopped receiving updates around that time

low shard
analog obsidian
#

it was on par with mainline back then

#

now its even behind mainline

#

and mainline is also extremely outdated

oak sandal
#

i would love hearing audio difference between 2023 mangio-crepe and nowadays RMVPE on applio or whatever yall prefer now 👀

analog obsidian
#

rmvpe is just an f0 estimator

#

it's not a quality thing

oak sandal
#

it had pitch issues

low shard
analog obsidian
#

rmvpe has always been the same

#

🥹

oak sandal
#

not RMVPE itself

#

the implementation

#

it was a mess

oak sandal
unkempt sapphire
#

does sup1 even work

#

only sup2 is good

#

and it sounds so ass with sup2 on

low shard
unkempt sapphire
#

oops

oak sandal
#

bro i legit checked on Weights.gg, my old model trained on mangio's fork is still magnitudes better than the newer models

low shard
# unkempt sapphire oops

RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models

Wokada = uses RVC models for realtime inference

oak sandal
#

😭

#

most of the work is separating the vocals, not using the right version of RVC

#

im p sure i wouldn't be able to replicate that level of clean vocals again, kim vocal 1 was goated back then

low shard
low shard
oak sandal
oak sandal
#

i used multiple models on FLACs files, then manually fixed issues in FL Studio + izotope rx

#

most people just download random mp3 128kbps acapellas from youtube and be done with it

#

everyone else who tried to make an "updated" version of my model failed miserably; the audio that gets generated is a mess, both in sound quality and voice similarity

oak sandal
#

this one

crude flame
#

oh a singer

no wonder people mess it up

oak sandal
#

then how come i didn't mess it up?

#

🤔

crude flame
#

you prob used studio sessions

oak sandal
#

no

crude flame
#

you tried cleaning the dataset

oak sandal
#

no studio session, all manual labour and AI

#

from the song themselves

#

i just downloaded the whole album in FLAC quality

#

then one by one, minute by minute, cleaned the dataset

analog obsidian
crude flame
#

oh

#

wow

analog obsidian
low shard
oak sandal
#

ok so, after 2 years the advancements were... better GUI and slight code optimization?

#

what happened to that one dude who was developing RVC "v3"

low shard
low shard
#

idk why ur so attached to mangio fork

oak sandal
#

i think last time they were trying vocos

#

iirc that was the name

oak sandal
#

atleast for RVC

low shard
oak sandal
#

which is probably best for speaking only

low shard
#

of course it's not an entirely new type of architecture, but i would rather use code that gets updates and improvements than code that doesn't get any at all

oak sandal
#

i dont think we are gonna see any decent or sizable improvements in the next year

#

maybe 2027

glacial pollen
#

As for f0 extractors.. yeah, there's nothing really better as of now, afaik

#

As for vocos, it is not worth it.
It's quite tricky to get working properly and, tbf, the potential phase reconstruction issues aren't worth it; the same goes for stft and, I believe, istft vocoders

simple ore
#

most generators are only useful for mel to wav reconstruction, not encoded latents to wav

#

hifigan uses NN filters to predict the waveform from the encoded latents

#

vocos does not have enough capacity to predict

tough fiber
#

-kaggle

karmic oliveBOT
# tough fiber -kaggle

Suggestions for @distant turtle

📘 Kaggle Notebooks

Note: Kaggle limits GPU usage to 30 hours per week.

unique rock
#

Can someone tell me why I get this in Applio Voice Blend?

model_blender(model_name, pth_path_1, pth_path_2, ratio)
ValueError: too many values to unpack (expected 2)

I tried to blend a 200-epoch, 48k sample rate model with a 270-epoch model at the same sample rate

simple ore
#

there may be an issue if the models came from different sources
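
For context, that ValueError is Python's generic complaint when a call hands back more items than there are names on the left of the assignment. A minimal sketch of the mechanism, using a made-up stand-in (`fake_blender` and its return values are hypothetical and are not Applio's `model_blender`):

```python
def fake_blender():
    """Hypothetical stand-in that returns three values instead of two."""
    return "Model blended.", "logs/blend.pth", "unexpected extra value"


def try_unpack_two():
    """Unpack into two names and report the error text, if any."""
    try:
        message, model_path = fake_blender()
    except ValueError as err:
        return str(err)
    return None


error_text = try_unpack_two()
print(error_text)
```

So the error by itself only says that something returned a different shape than the caller expected; it does not say which of the two model files is at fault.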

icy vessel
#

what pretrain does the applio colab use by default ?

low shard
icy vessel
deft flare
#

Hi everyone, I hope I'm posting in the right chat. I've had to cancel kits.ai because it's ridiculously expensive. Does anyone know a good, complete walkthrough tutorial on how to make a model from scratch? kits would sort everything out for me, so I feel kinda lost

low shard
#

First of all what's your PC GPU

deft flare
#

macbook pro m1

low shard
deft flare
#

oh?

#

i'll bite the bullet i guess

low shard
# deft flare oh?

Train (make) RVC models on the cloud:

  1. Prepare the dataset
  2. Set up RVC. Choose a cloud option:
    • Google Colab (about 4 hours of free GPU per day; not many hours, but easy to use)
    • Kaggle (a bit harder to use and needs a phone number, but gives 30 hours per week of better GPUs)
    • Lightning.ai (kinda hard, needs a login, no issues with web UIs or anything, but only 15 free credits per month)
  3. Be sure to learn about Tensorboard

Google Colab = easier, but with a risk of getting disconnected
Kaggle = harder, but way more GPU time
If you are looking for the easiest free way, use https://weights.com/, which ofc uses RVC

RVC Inference (use models) on pre-recorded audio on Cloud

You can use either:

deft flare
#

so I want to train on this musician.. the whole discography probably doesn't come to more than 30-40 minutes total

low shard
deft flare
#

I actually downloaded this earlier this evening but it's all so overwhelming

#

i have uvr5 as well just don't know whether there's an ubiquitous setting to extract clearest lead vocals

#

if you could guide me step by step mate, I'd paypal you or something if that suits you

low shard
deft flare
#

uvr5 seems to work when I open it haha

low shard
#

Hopefully it runs on MPS, using the M1 chip's integrated GPU rather than the CPU

#

Else it's gonna be even slower

deft flare
#

right

#

(pretends to understand)

low shard
#

@viscid moss hey sorry to disturb, do you remember if UVR supported MPS for Macs, or does your version of it support it?

low shard
# deft flare ur so kind thank u ❤️

I was asking the other staffer whether he remembers if that program supports it, or whether his own version does, since he made his own separate version of UVR by the way

deft flare
#

yeah I gotchu, I was just moved how there's actually someone being understanding towards complete n00b

#

lost kind of kindness on the internet for most part

low shard
#

Unfortunately unless you spend like 4k dollars, macs are pretty shit for AI

#

And even if you spend that much and get the most powerful Mac made for user consumption, the issue is not many AI programs support Mac at all

deft flare
#

when I was experimenting with google colab stuff it seemed to do decently?

low shard
#

Nvidia GPUs are the best in terms of performance and support in AI field

deft flare
#

Ahh

low shard
#

Did you install applio locally, or did you just use a Google colab? And this for UVR too?

deft flare
#

both locally

low shard
#

Tbh, I'm not sure how much I'd recommend using them locally. At this point cloud would be faster than your Mac; the only issue is limited GPU time

deft flare
#

I can leave it overnight or w/e not to worry about that really ❤️ guidance is pivotal for me

low shard
#

I would personally suggest you to use cloud for faster processing, but your choice

#

I don't know how much time it could take to be honest, not sure if it's going to be overnight or more or less, so I can't guarantee you much on that

viscid moss
#

UVR5 UI probably works with MPS but I haven't tested it because I don't have a way to test it.

#

So I recommend just using UVR5 for mac

low shard
viscid moss
rapid spade
#

is there a guide on how to download the software needed for ai stuff

dire juniper
#

anyone know why i cant upload a rvc model to my voice changer ?

karmic flax
#

i get a lot of errors when installing RVC with the TroubleChute one-line command, and i'm too stupid to install it manually. i used RVC on my old pc (win10) before, and now on win11 nothing works. it opens the web page but it never finishes converting

hallow thistle
karmic flax
#

Retrieval-based-Voice-Conversion-WebUI

#

if that wasnt the correct answer im sorry lol im not rlly deep into software stuff..

hallow thistle
#

It would be better to use Applio, an RVC fork, instead.

#

What is your PC GPU?

karmic flax
#

RTX 5080

hallow thistle
#

Damn. Unfortunately, there's no known stable version of Applio for this specific GPU. But I think you can use Applio with CPU instead.

brittle wing
#

i want a model that sounds realistic please, i dont mind the size of the file

karmic flax
brittle wing
karmic flax
#

no im wondering about stuff cuz i dont have a clue lol

brittle wing
#

that's like one of the most powerful if not the most powerful cpu rn

#

ik cuz i have one myself

karmic flax
#

yeah for gaming but for AI stuff idk?

brittle wing
#

youre thinking its like nvidia vs amd but its different from gpus

karmic flax
#

i just heard for productivity stuff core count is more important but ig not

hallow thistle
brittle wing
#

what do you want to do with it

hallow thistle
brittle wing
#

i flex my 7900xtx sometimes

#

flexing is ok as long as it does not include lying imo

hallow thistle
#

Imagine thinking "my laptop GPU is Intel HD Graphics 3000" is a lie. misc_skull_distorted

brittle wing
#

yea imagine

brittle wing
#

imagine imagining stuff that isn't true and just assuming it over the conversation

simple ore
#

but you can install something for applio

karmic flax
#

so the issue is my gpu? dang lol

hallow thistle
simple ore
#

you can download applio compiled version, then update torch to cu128

#

manually

karmic flax
#

would the best bet be to use Applio? i have it installed but i dont rlly get it haha

simple ore
#

actually not compiled, clone the repo

#

then run the installer

#

hopefully it does not error out

#

then update torch

karmic flax
#

skull_sob "just do it" already confused lol

hallow thistle
#

Applio is the only way to get audio converted very fast with a proper GPU. With an RTX 50 GPU, you'll have to do a bit of manual setup.

karmic flax
#

saddies. so there is no way for me lol

hallow thistle
#

You can use any other RVC program, but all of them will only use your PC's CPU because none of them have been compiled for RTX 50.

karmic flax
#

i mean i wouldnt care what it uses as long as it just works

hallow thistle
#

That's all good.

karmic flax
#

i wont be training my own models and stuff. i just need stuff to be converted into voices via pretrained models

simple ore
karmic flax
#

but hooow

#

u linked me a file i cant even do anything with lol

#

cuz tf is a .whl file

simple ore
#

clone the repository

#

run installer

karmic flax
simple ore
#

once that is done run the command I provided from command prompt

#

download zip

karmic flax
#

step one completed successfully lol

#

then ig run-install.bat

#

then running ur command in a terminal with admin perms

#

module "env" couldnt be loaded

karmic flax
#

but how do i even get there

simple ore
#

open command prompt

karmic flax
#

yeah

simple ore
#

you need to download the wheel into the same folder

karmic flax
#

okay did that

#

dang i did it. im a hacker. lets hope it works lol

#

greatly appreciate the help tho 🫂

simple ore
#

FCPE wont work until there's an updated torchaudio

#

but hopefully the rest would

karmic flax
#

hm now it doesnt start up anymore lol

simple ore
#

give it a bit

karmic flax
#

it throws an error that it doesnt find smt

karmic flax
#

ig "env/lib/site-packages/torchaudio/lib/libtorchaudio.pyd" wasnt found lol

#

ig i need to do the same with libtorchaudio but idk what version

carmine shuttle
#

Guys, is it normal for a zipped voice model to weigh 259 MB?

karmic flax
#

id still appreciate help with getting any voice conversion to work on my gpu. cirnoblush

brittle wing
#

Anyone know the python version requirement if I want to run this version of UVR5 locally?

https://huggingface.co/spaces/TheStinger/UVR5_UI/tree/main

I cloned the repo and tried to install the requirements with the latest python 3.13.2 but it failed:

ERROR: Could not find a version that satisfies the requirement torch<2.5,>=2.3 (from audio-separator) (from versions: 2.5.0, 2.5.1, 2.6.0)
ERROR: No matching distribution found for torch<2.5,>=2.3
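
Reading the pip output above: on Python 3.13 only torch 2.5.0 and newer are offered, so the `torch<2.5,>=2.3` requirement cannot be satisfied there, and an older interpreter is the likely fix. A minimal sketch of a pre-flight check; the upper bound comes from that log, while the lower bound of 3.8 is an assumption and the function name is made up for illustration:

```python
import sys


def python_ok_for_torch_below_2_5(version_info=sys.version_info):
    """Return True if this interpreter can plausibly install torch>=2.3,<2.5.

    Upper bound: the pip log only offers torch 2.5.0+ on Python 3.13.
    Lower bound (3.8) is an assumption, not something stated in the chat.
    """
    major_minor = (version_info[0], version_info[1])
    return (3, 8) <= major_minor < (3, 13)


print(python_ok_for_torch_below_2_5())
```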
hallow thistle
simple ore
simple ore
knotty moth
knotty moth
glass igloo
#

Hello. I've encountered an issue while creating an index file. I have approximately 7 hours of audio for voice training, and the training process went smoothly. However, when trying to create the index file, the process stops at around 1,300 files out of 48,000, and then an index file is generated, which is only 30 megabytes in size. When using this index file, the voice often converts with artifacts. In other models I've trained on 10-20 minutes of data, the index files weigh 120+ megabytes. What should I do in this situation?

simple ore
#

you can always run inference without an index and check if that comes okay

glass igloo
simple ore
#

the voice comes from the model

#

index is just an accent

#

7 hours of audio is too much for a finetune and not enough for training from scratch

flint glade
#

guys

#

can anyone help me

#

on the voice changer

glass igloo
brittle wing
flint glade
#

i downloaded it but when i try to open it, it doesnt open

brittle wing
flint glade
brittle wing
flint glade
#

i have downloaded the voice changer and the guy on the video says open it

#

i open it but it doesnt open

low shard
#

First of all, you want a realtime voice changer for calls?

flint glade
#

can you tell me a good voice changer? i want to make a girl voice

flint glade
low shard
flint glade
#

okay

#

for

#

discord

#

i will

low shard
#

RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models

Wokada = uses RVC models for realtime inference

flint glade
#

ok

simple ore
#

new model, obviously

knotty moth
empty parrot
# simple ore index is just an accent

Yo, i didn’t know index is for accent. I always thought .pth always needs it to work. Is there a good documentation to learn some of this? Id like to understand general idea how it works but not the too deep explanation for like engineers.

hallow thistle
#

A pth file is always required for it to work, of course.

#

An index file stores the voice's accent and is used with a specific voice model. It is produced during the voice training process.
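
Roughly speaking, the index lets RVC look up training-set features that are close to the features of the input audio and mix them in, which is why it shapes accent-like detail rather than deciding whether the model works at all. A toy sketch of that idea under heavy assumptions: a single nearest neighbour, tiny 2-D vectors, and made-up function names. As far as I understand, the real pipeline searches a FAISS index over much larger feature vectors and uses several neighbours.

```python
def nearest(query, stored):
    """Return the stored vector closest to `query` (squared Euclidean distance)."""
    return min(stored, key=lambda vec: sum((a - b) ** 2 for a, b in zip(query, vec)))


def apply_index(query, stored, index_rate):
    """Blend the retrieved vector into the query; index_rate is in [0, 1]."""
    match = nearest(query, stored)
    return [index_rate * m + (1.0 - index_rate) * q for m, q in zip(match, query)]


# Toy 2-D "features"; real feature vectors are much larger.
stored_features = [[0.0, 0.0], [1.0, 1.0]]
blended = apply_index([0.8, 0.6], stored_features, index_rate=0.5)
print(blended)
```

With `index_rate=0` the input features pass through untouched, which matches the advice to try inference without an index when the index looks broken.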

brittle wing
#

Yo guys,
I still don’t get how Batch Size really works.
Does 4 actually improve sound quality, or is it just a performance thing?

odd shale
#

(I'm talking about quality)

#

It will mostly be useful if you have a short dataset, below 4-5 mins.

#

If your dataset is around 10+ mins, go for 8 on batch

analog obsidian
#

too high a batch size on a small dataset might hurt the model's ability to generate new audio

odd shale
#

Is it a good idea to use batch 4 with any dataset under 30 minutes?

#

Wait, I misread XD

analog obsidian
#

if the dataset is shorter than 30 mins, use batch size 4
if the dataset is longer than 30 mins, use batch size 8

brittle wing
#

Got it! So for less than 30 mins, Batch 4 is best, and for more than 30 mins, Batch 8 is better. Just to be sure, using Batch 4 on a longer dataset wouldn’t really improve quality, right?

analog obsidian
#

so the model would sound weird

brittle wing
odd shale
analog obsidian
#

there are times where bs 4 gives better results than 8 and vice versa
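
The rule of thumb from this exchange can be written down as a tiny helper. This is only a sketch of the chat's advice, not a guarantee of quality: the function name is made up, and treating exactly 30 minutes as "long" is an assumption, since the chat does not say.

```python
def suggest_batch_size(dataset_minutes):
    """Rule of thumb from the chat: under 30 min -> 4, otherwise -> 8."""
    if dataset_minutes < 0:
        raise ValueError("dataset length cannot be negative")
    return 4 if dataset_minutes < 30 else 8


for minutes in (5, 15, 45):
    print(minutes, "min ->", suggest_batch_size(minutes))
```

Since results can go either way, the suggestion is a starting point to compare against, not a fixed setting.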

odd shale
brittle wing
#

Thanks a lot!

earnest stone
#

how do you make rvc?

low shard
karmic flax
low shard
#

People need to understand that helpers might be busy sometimes and they can't reply in 2 mins

#

Do you need any help?

karmic flax
#

yeah..

karmic flax
# low shard Do you need any help?

well. i got help before but im sure he was also busy. i couldnt solve my issue but im also veeery nooby when it comes to software so yeah

#

from what i understood, torch/torchaudio dont have working versions for the 50-series nvidia cards? and id need to update manually to a nightly version of both

#

buutt yeah

low shard
karmic flax
low shard
karmic flax
#

if u have the patience to handle me haha

low shard
karmic flax
low shard
# karmic flax Yes

try going back to where you had that libtorchaudio missing issue, open CMD, run env\python -m pip install https://download.pytorch.org/whl/nightly/cu128/torchaudio-2.6.0.dev20250308%2Bcu128-cp310-cp310-win_amd64.whl

#

then, try running applio

#

@karmic flax btw be aware of the missing ROPs and melted connectors on the 50 series, many 50-series gpus are having those issues

#

you should prob check out if it's happening to you too

karmic flax
low shard
karmic flax
#

hmm i have errors but i cant dm u and i dont have perms to upload a pic

low shard
karmic flax
low shard
#

can you try uploading the pic now?

#

weird, @karmic flax can you try running the same command from #✨│ai-help message, but add --force-reinstall right after install and leave everything else as it was

karmic flax
#

already confused cirnoblush

low shard
karmic flax
#

so i needa get that file first

low shard
# karmic flax

oh I thought you downloaded #✨│ai-help message

well, you can run env\python -m pip install https://huggingface.co/w-e-w/torch-2.6.0-cu128.nv/resolve/main/torch-2.6.0+cu128.nv-cp310-cp310-win_amd64.whl

#

seems like you don't need to force-reinstall, because you didn't install it in the first place

karmic flax
#

i was looking for it but i couldnt find the correct version

low shard
#

could you try the command I just told you?

karmic flax
#

yeah its downloading rn

#

its doing something at least lol

low shard
karmic flax
#

i mean red is always bad right. but ill try running it lol

low shard
low shard
# karmic flax

shit

what if you try
env\python -m pip install https://download.pytorch.org/whl/nightly/cu128/torchaudio-2.6.0.dev20250306%2Bcu128-cp310-cp310-win_amd64.whl

#

this build is earlier than the one I sent you before

karmic flax
low shard
# karmic flax

my last guess would be running: env\python -m pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128

#

see if that works

karmic flax
#

well dang. it worked

#

uchuClap thank uuu aloooot 🫂
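
After swapping torch builds like this, a quick way to confirm that the new install actually sees the GPU is to ask PyTorch directly. A minimal sketch, assuming it is run with the same interpreter Applio uses (for example `env\python check_gpu.py`, where `check_gpu.py` is a hypothetical file name); the function name is also made up:

```python
def torch_gpu_report():
    """Summarise whether PyTorch is importable and whether it can see a CUDA GPU."""
    report = {"torch_installed": False, "cuda_available": False, "device": None}
    try:
        import torch
    except ImportError:
        return report  # torch is not installed in this environment
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    if torch.cuda.is_available():
        report["cuda_available"] = True
        report["device"] = torch.cuda.get_device_name(0)
        # Architectures this build was compiled for, e.g. entries like "sm_86".
        report["compiled_archs"] = torch.cuda.get_arch_list()
    return report


print(torch_gpu_report())
```

If `cuda_available` comes back False on a machine with an Nvidia card, the installed build most likely does not match the GPU or driver, which is the situation the nightly cu128 install above was meant to fix.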

low shard
low shard
simple ore
violet badger
low shard
#

what's your pc gpu

violet badger
#

3070

low shard
# violet badger 3070

yeah you wouldn't even need to use Ilaria RVC, it's a cloud (remote good pc) service, but your pc is good enough

#

As you got a good PC, you can use RVC locally, you can choose between:

  • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
  • Mainline: The original RVC
violet badger
#

oh

low shard
violet badger
low shard
violet badger
#

or is that fixed

low shard
#

it's all completely fixed and safe

violet badger
low shard
violet badger
low shard
#

yw

brittle wing
#

Yo quick question about my RVC training. I’m at 27k steps now. Loss kept dropping but has been super slow since 10k-12k.

analog obsidian
#

if its overtrained the model is going to sound robotic

#

while in epochs that arent overtrained they're going to sound just fine

brittle wing
analog obsidian
#

if you use spek its easier to spot overtraining, there'll be missing frequencies in the result

brittle wing
low shard
knotty moth
analog obsidian
#

overtrained models generate noise instead of harmonics at some point

#

🦈

knotty moth
small vortex
#

Best anime latina egirl mommy wifey voice model?

hallow thistle
#

Y'all be popping out and asking for E-girl voice model just to troll and cat the damn fish someone.

small vortex
#

Like everything you do is just answering people who asks for e-girl models

#

lol

#

just actually searched your messages lol

tame mica
#

mf predicted what he would say

#

lowkey ts funny i fw u doe

hallow thistle
#

Like if asking anything would make you more money though.

#

Shit. You acting like if I do this everyday huh.

turbid root
#

Hi! How do you know if the model is overtrained?

knotty moth
strong shadow
#

Hey, does anyone know how i can combine multiple (5) .VOB files into 1 whole video? No one seems to. I've tried clipchamp but it leaves a messy 1-second gap/pause between the clips.

long gazelle
#

@low shard

low shard
long gazelle
#

ah okay

#

also could you explain the license thing

brittle wing
#

Anyone know the required python version to run https://huggingface.co/spaces/TheStinger/Ilaria_RVC/tree/main

I use python 3.10.16 and run into errors while installing requirements:

Building wheels for collected packages: omegaconf, samplerate, srt, antlr4-python3-runtime
  Building wheel for omegaconf (pyproject.toml) ... done
  Created wheel for omegaconf: filename=omegaconf-2.0.6-py3-none-any.whl size=36882 sha256=0b988ea25770e060c1ad0bde20dfbd7da84924e620f08626d4507bff6e337ece
  Stored in directory: /tmp/pip-ephem-wheel-cache-mgtyolse/wheels/ee/67/d9/a68a521e487bb78d6599d3a157f5bb01d0760c689a9c2ac78f
  Building wheel for samplerate (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for samplerate (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [4 lines of output]
      running bdist_wheel
      running build
      running build_ext
      error: [Errno 2] No such file or directory: 'cmake'
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for samplerate
  Building wheel for srt (setup.py) ... done
  Created wheel for srt: filename=srt-3.5.3-py3-none-any.whl size=22483 sha256=2ca8125c77c760943695358d90630f7dbf1a8fc14d0c479a94cc8bbaa9d08d93
  Stored in directory: /home/yui/.cache/pip/wheels/d7/31/a1/18e1e7e8bfdafd19e6803d7eb919b563dd11de380e4304e332
  Building wheel for antlr4-python3-runtime (setup.py) ... done
  Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.8-py3-none-any.whl size=141246 sha256=5175ca3614fd8bc7ba3b90590c4920a09065e395cd7a667f3d53289ec8ed5974
  Stored in directory: /home/yui/.cache/pip/wheels/a7/20/bd/e1477d664f22d99989fd28ee1a43d6633dddb5cb9e801350d5
Successfully built omegaconf srt antlr4-python3-runtime
Failed to build samplerate
ERROR: Failed to build installable wheels for some pyproject.toml based projects (samplerate)
#

Also if there's a better alternative please let me know
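
Going by the log above, the failure is not about the Python version: the `samplerate` wheel build stops at `No such file or directory: 'cmake'`, so the build tool is missing from PATH. A small sketch for checking that before retrying pip; the helper name is made up for illustration:

```python
import shutil


def missing_build_tools(tools=("cmake",)):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_build_tools()
if missing:
    print("Install these before retrying pip:", ", ".join(missing))
else:
    print("cmake found on PATH")
```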

low quail
#

so a question, haven't touched ai voices in a while...
how do you make a cover of a song again? colab doesn't work for me anymore, whenever I try to load a model it gives me not found

hallow thistle
brittle wing
low shard
long gazelle
#

ah okay

low shard
low shard
#

Also what's your PC GPU and what do you want to do

#

AMD moment

#

What's your PC GPU

tulip cloak
#

@low shard hey is mainline back, the colab one

low shard
#

Maybe @simple ore knows, he used to have AMD until he finally switched to Nvidia recently

low shard
tulip cloak
low shard
#

Well the batch size does depend also on the dataset length, but 8 shouldn't make your PC crash at all

#

It's fine

outer isle
#

I’m going to create (Mostly) Blackiana model with Apollo, is there a voice file?

low shard
#

Tensorboard to check how the model goes

low shard
hallow thistle
#

You use Tensorboard to track the process, to make sure the model won't be too overtrained or undertrained.

#

Oh, so that's what Tensorboard looked like if run locally. cat_stare

low shard
brittle wing
low shard
brittle wing
low shard
brittle wing
low shard
#

Are you sure it's using your GPU

hallow thistle
#

If your GPU percent goes high in Task Manager, it's definitely working.

analog obsidian
#

gpu and batch size?

#

also dataset size matters

#

its normal
Amd is just very slow

#

150 epochs if every slice is 3s

#

if the slices are different lengths
around 200 epochs

#

ok
yes, around 200 epochs

#

as long as the cmd is open, its fine

#

500 epochs is too much for small models, even when using the automated slice

#

ai is random so u cant just predict when its going to sound fine

#

it's always good practice to compare the epochs
at some point the model will naturally overtrain, which makes the final epochs sound very robotic

#

in that case your final model would be any epoch before overtraining

simple ore
#

read the note at the bottom of the AMD installation instructions

low shard
#

AMD moment

simple ore
#

45min set on 6700xt was taking ~4min/epoch

#

yeah, it was an overnight training to 200e
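
Those numbers work out as simple arithmetic: 200 epochs at roughly 4 minutes each. A small sketch (the function name is made up) that can be reused with your own per-epoch time:

```python
def training_hours(epochs, minutes_per_epoch):
    """Total wall-clock training time in hours."""
    return epochs * minutes_per_epoch / 60.0


hours = training_hours(epochs=200, minutes_per_epoch=4)
print(f"{hours:.1f} hours")  # prints "13.3 hours"
```

That is a long night, but still well inside a 30-hour weekly GPU quota if the same run were done on a cloud service at a similar speed.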

low shard
simple ore
#

it is faster than colab

#

not as fast as kaggle

low shard
#

yeah 30 hours of GPU weekly

#

way better than google colab that has random daily gpu with a max of 4 hours daily

simple ore
#

use scalars tab

upbeat chasm
#

Does anyone know why, when I import rvc models into the voice.ai app, every voice sounds almost the same

low shard
#

I mean wokada ones are fine, they are open source

upbeat chasm
#

Then what free service should I use

low shard
#

you can also use it on google colab and kaggle, yes

#

rvc realtime from mainline/original rvc is pretty old

upbeat chasm
#

I want a voice changer just for myself, not to use on calls

simple ore
#

expand losses, or avg_50 if you have that

low shard
#

still, tell your pc gpu

#

@upbeat chasm You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU

simple ore
#

does not look good at all

#

depending on the model

simple ore
#

flat mel and g loss means the model is already close to the dataset in the kind of data it has

#

generally both have to go down

low shard
#

@upbeat chasm #🔍│help-w-okada message since you got a 3060 laptop, and don't need to use the models in realtime for games/calls

Your Nvidia GPU is good enough to run inference (use models) locally (on your PC). It's not the best for training (making models), though that's still possible.

You can run RVC:

  • Locally (runs on your PC, so the speed depends on it; you will have to set it up with the guides):
    • Applio: a fork of RVC with some extra features like Applio TTS; kinda faster and simpler, but the same quality
    • Mainline: the original RVC
  • Cloud (a good remote PC; easier and faster than your PC, but usage is limited):

Easiest possible (automatically separates vocals & instrumentals): weights.gg
Easiest cloud: Ilaria rvc zero
Easiest local: Applio

simple ore
#

in at least the first 5k steps both mel loss and g loss have to go down

#

if they dont there's something wrong with the dataset or something else

#

too big of a batch size or something
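
The "both losses should go down in the first 5k steps" check can be done by eye in the Scalars tab, or roughly automated. A minimal sketch, assuming you have copied a few (step, loss) pairs out of Tensorboard by hand; the function name, the halves-comparison heuristic, and the sample numbers are all made up for illustration:

```python
def loss_is_decreasing(points, max_step=5000):
    """Compare the mean loss of the first and second half of the early steps.

    `points` is a list of (step, loss) pairs, e.g. copied from the Scalars tab.
    """
    early = sorted(p for p in points if p[0] <= max_step)
    if len(early) < 4:
        raise ValueError("need at least 4 points within max_step")
    half = len(early) // 2
    first = sum(loss for _, loss in early[:half]) / half
    second = sum(loss for _, loss in early[half:]) / (len(early) - half)
    return second < first


# Made-up numbers for illustration only.
mel_points = [(500, 28.0), (1500, 25.5), (3000, 23.0), (4500, 21.5), (9000, 21.0)]
print(loss_is_decreasing(mel_points))
```

Run it once for the mel loss and once for the generator loss; if either comes back False, that is the cue to look at the dataset or the batch size.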

#

g total

#

depends on the dataset size/batch size

#

What's your batch size?

#

too much for 15min

#

use 4

blazing solar
#

-colab

karmic oliveBOT
# blazing solar -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

simple ore
#

and now your model exploded

#

because you're training with FP16

#

you can stop now, it is dead
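
One common reason FP16 training blows up is the small numeric range of half precision: the largest finite value is 65504, and anything beyond that overflows, after which losses tend to turn into inf or NaN. Whether that is exactly what happened in this run is an assumption. The range limit itself can be shown with the standard library alone (the helper name is made up):

```python
import struct


def fits_in_fp16(value):
    """Return True if `value` can be stored as an IEEE 754 half-precision float."""
    try:
        struct.pack("<e", value)  # "e" is the 16-bit float format
    except OverflowError:
        return False
    return True


print(fits_in_fp16(65504.0))  # largest finite FP16 value
print(fits_in_fp16(70000.0))  # too large for FP16
```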

rough parrot
#

! C:\Users\user\Downloads\vcclient_win_cuda_2.0.76-beta.zip: C:\Users\user\AppData\Local\Temp\Rar$EXa23844.2909.rartemp\dist\main\web_front\assets\
yall know the reason for this?

rough parrot
#

alr

dim nacelle
#

Sorry, I want to know how to control the speed of the voice, and can I produce the voice and srt together? Thank you!

low shard
#

the file name seems to be the latest version of Original Wokada (Wokada is used for using RVC models in realtime for calls/games)
But the wokada deiteris fork is way better than the original wokada in performance and quality

#

@rough parrot tell your pc gpu in #🔍│help-w-okada , this is also the wrong channel since RVC and Wokada aren't the same program

coral viper
#

What are the steps to use voice models in the voice.ai app?

quasi condor
coral viper
#

Then what should I use?

quasi condor
#

or if u wanna make a cover use applio

coral viper
#

Where can I download?

quasi condor
coral viper
#

Where can I see my GPU?

quasi condor
coral viper
quasi condor
#

there should be gpu 0

#

click that and tell me what's ur gpu

coral viper
#

Can you just send me the pic of the step?

quasi condor
#

@low shard u can take it from here

coral viper
#

I'm not smart

#

Now you speak in my language

#

RTX 3050, I guess

dim nacelle
#

Could Applio 3.1.1 control the speed of the ai voice? And can I produce the voice and srt file together?

coral viper
#

That's what you mean right?

#

Or the processor?

#

And then what? How to install Applio

#

Ok

low shard
coral viper
#

But I'll try another fun I guess

low shard
# coral viper No, for song cover

Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible

You can:

  • Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
    • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
    • Mainline: The original RVC
  • Cloud (remote good pc, easier and faster than ur PC but it's limited):

Easiest possible (automatically separates vocals & instrumentals) : weights.gg & rvc-ai-cover-maker-ui
easiest cloud: Ilaria rvc zero
easiest local: Applio

coral viper
#

Download compiled version right?

#

Kinda sus

#

Seriously please

#

On it

#

Why is that?

#

ApplioV3.2.8-bugfix.zip right?

#

Oh

#

Can we request a model? Is there a payment?

#

After download Applio zip, what else?

#

And then?

#

Why does it take so long to extract?

analog fossil
#

Where can I get rvc v2 voice models?

coral viper
#

@primal barn I didn't find the bat file

low shard
# analog fossil Where can I get rvc v2 voice models?

You can search rvc ai voice models at:

if there isnt one, you can:

earnest muskBOT
coral viper
#

Oh

#

@primal barn The screen is black

#

Then what?

#

It's just dark with "Applio" in the top-left corner

#

Yes I guess

#

Wait, does it require internet?

#

its like this

#

Oh yes, now it open in browser

Now what?

#

In browser?

#

Yes, and then what?

#

Why not in the program itself instead?

#

Yeah

#

So it's online then?

#

Oh cool

#

@primal barn Now how to put the model?

stray raven
low shard
stray raven
#

my bad

coral viper
#

How to put the model?

#

@primal barn So I must move the model file into that folder?

#

Just the pth? Not the index too?

#

Put the index and pth alongside the reference and the mute?

#

folders

#

Aight aight. So after I put the model ...

Then I put the song or audio I want there

And then just convert?

#

How to do that?

#

Dang, it sounds goofy

#

@primal barn It sounds goofy. And some part there's like a glitchy sound

#

Can I fix it to sound more smoothly?

lime wadi
#

hi i tried to download RVC but when i start it it gives me an error saying that the 'thinker' module is missing does anyone know how to fix this?

coral viper
#

Both

glacial pollen
# coral viper

This is quite likely due to harmonies / vocal layering / and echo / harsh reverb smearing the f0 traces too much

coral viper
#

The song. Cuz I don't know how to separate

quasi condor
#

u could've just separated it on uvr5 or mvsep

coral viper
quasi condor
#

using melband

quasi condor
coral viper
quasi condor
coral viper
#

How?

#

I hate watching youtube tutorial

quasi condor
coral viper
#

You said that I should pay for it

glacial pollen
#

Imho, currently the best one

#

Also, don't use the link they gave you, it won't have support for the newest models

#

first this one

glacial pollen
#

oh

#

there

#

Once you've done all that, you'll wanna get a model I'll link

coral viper
#

I don't understand how to patch

glacial pollen
#

those are installs my man

#

install one, install the 2nd one

#

No manual work involved in that part

#

and download this:

#

It's the fv4 model for isolation I mentioned

coral viper
glacial pollen
#

first is the base ish, 2nd is the patch

glacial pollen
#

It is a model for separation

#

which you'll use in uvr

#

the fv4 model I mentioned

#

Now, focus.
Once you install the uvr, you'll have to find its folder and then put the model ( fv4 ) in here:

#

so, uvr's folder / models / mdx_net_models

#

( yes, you'll just cut n paste the model file in there )
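
If you'd rather script that step than drag files around, here is a minimal sketch. The function name, the example file name `fv4.ckpt` and the example install path are assumptions for illustration; only the `models / mdx_net_models` location comes from the message above.

```python
import shutil
from pathlib import Path


def install_uvr_model(model_file, uvr_root):
    """Copy a separation model into <uvr_root>/models/mdx_net_models."""
    target_dir = Path(uvr_root) / "models" / "mdx_net_models"
    if not target_dir.is_dir():
        # A missing folder usually means uvr_root points at the wrong place.
        raise FileNotFoundError(f"no such folder: {target_dir}")
    # copy2 keeps the file's timestamps and returns the new file's path.
    return Path(shutil.copy2(model_file, target_dir))


# Hypothetical usage, adjust both paths to your own machine:
# install_uvr_model(r"C:\Users\user\Downloads\fv4.ckpt",
#                   r"C:\path\to\Ultimate Vocal Remover")
```

This copies instead of moving, so the download stays where it was in case something goes wrong.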

glacial pollen
coral viper
glacial pollen
#

.pth is just a format of models used in pytorch

#

pth = pytorch

#

.ckpt is another format meaning a checkpoint

#

don't worry about that part

coral viper
#

Oh god, this is too much for my 12 y.o brain

glacial pollen
#

well.. If you wanna hop into vc, I might help you if you screenshare

#

Other than that, you gotta follow the text instructions

coral viper
#

That'd just make things worse

I'll just follow your tutorial from what you sent above

If I get stuck, I'll just ping you

glacial pollen
#

sounds good

glacial pollen
#

like this

#

In the model type you'll have to select " mel band roformer " ( just not the v2 variant )

coral viper
glacial pollen
#

that part is for the first box

#

the " select model param "

#

there'll be an option to open / use the yaml config

#

and you'll use the one I sent you

#

Lastly, you'll be clicking ok or was it apply for all windows

#

And that's that from adding custom models

#

Now as for configuring the uvr for usage

#

( do it after properly setting up fv4 model

#

you could also do this:

#

so, setting the wav type to 32 bit float

#

( in case you'll be using uvr for making datasets / samples for model training, else you can keep it as 16 bit )

#

Oh yea, in here you can use 11 or 16, I'd recommend 16 tho

#

Now... if you need an authentic guide for uvr, models n stuff..
https://docs.google.com/document/d/17fjNvJzj8ZGSer7c7OFe_CNfUKbAxEh_OBv94ZdRG5c/edit?tab=t.0
But I warn you, it's a messy spaghetti 👀

glacial pollen
#

Open the website up and you'll understand what I mean : )

#

as for what it is, well, exactly what it says

#

Documentation, just docs on uvr and models

coral viper
#

Aren't the steps you sent above before enough?

glacial pollen
#

I mean ye, it's enough, but if you wanna know more about everything, that's the place

coral viper
#

Oh ok

glacial pollen
#

mh mh

coral viper
glacial pollen
#

I mean

#

I already wrote there which ones

#

Did you actually read my msgs

coral viper
#

I did, but it's confusing, it looks out of order

glacial pollen
#

👀 because someone cut into my msgs making it more confusing than it should be.

Model type: mel-band roformer

model param box: from yaml config / config file or something along that

coral viper
#

In Select Model Param, which one I should click?

Or is that where I put yaml?

#

After I download yaml file, where I should put it?

In what folder?

glacial pollen
#

send ss

coral viper
glacial pollen
#

the model param box, open it and show an ss of it

#

pro tip, shift+win+s lets you select the area to make a screenshot out of / crop

#

then ctrl+v in here

#

well, taking you long

#

Don't want to sound rude but, can't be staying here for a whole day 🙄 got stuff to do

coral viper
coral viper
glacial pollen
#

install new yaml

coral viper
glacial pollen
#

this is what you always use when first adding new model

#

so, any custom models

coral viper
glacial pollen
#

then ye, set the type as mel-band roformer ( again, not v2 ) and confirm all

#

and that's all

coral viper
glacial pollen
#

One I provided

coral viper
#

Oh ok

coral viper
glacial pollen
#

Save config

#

then confirm

coral viper
#

Confirm where?

#

There is no confirm button

glacial pollen
#

Unless the window disappeared then that's fine

#

Either way, once you close all the windows ( if any ) you'll have the main ui

coral viper
#

Save config and close?

glacial pollen
#

ye

coral viper
#

Then how about the model type?

glacial pollen
#

......

coral viper
glacial pollen
#

Why so many people lately have such attention issues

coral viper
#

Wait wait

#

It's 3 AM here

coral viper
glacial pollen
#

got the main ui?

coral viper
glacial pollen
#

yes

#

now

#

mdx-net, and the box below, fv4 model

coral viper
#

On it

glacial pollen
#

as for 1 and 2

#

you can drag n drop your music / song onto 1 area

#

and you can also drag some empty folder ( for instance, if you wanted it for outputs ) into 2 area

#

much quicker than doing it the other way

coral viper
glacial pollen
#

well no

coral viper
#

Good

#

@glacial pollen Oh, anyway. What version is it again?

5.6.1 right?

glacial pollen
#

of the uvr?

#

it's 5.6 but beta variant

#

just name it 5.6_beta if you want

coral viper
#

I installed the patch you mentioned the same way as the full version

Is it right?

glacial pollen
#

something doesn't work?

#

both uvr and the patch are installed the same way as anything else that is an installer

coral viper
glacial pollen
#

you'll know if you did all right once u run an isolation

#

If you get no errors, means you did all right

coral viper
glacial pollen
#

ye, then it works

coral viper
glacial pollen
#

Aside, I never said it'll be fast

#

it takes time, esp at overlap 11 or 16

coral viper
glacial pollen
#

1-3 mins is pretty normal

#

for most songs

#

for instance, 3~ ish min song for me, takes 1.2 to 1.5 mins or something ( rtx 3060

#

so there's nothing wrong about it

coral viper
#

Aight, I take it maybe because I want it in flac

#

@glacial pollen So, the result will be vocal and instrument?

knotty moth
glacial pollen
#

No, vocal

#

fv4 is a vocal model, haven't tested it in instru mode so can't promise anything

#

and if you need a model that yeets the backing vocals, try this:

#

it is downloaded from this section

#

you download it, then hit refresh list and done. It'll appear in the model list

flint solar
glacial pollen
#

I don't maintain kaggle or colabs so, can't help sadly

analog obsidian
knotty moth
glacial pollen
coral viper
glacial pollen
#

karaoke 2 is super nice and, tbf not sure how it compares to classic bve ( been ages

#

but kara 2 has an awful noise

#

so, maybe the one u pointed out is a lil better

glacial pollen
#

check the docs

coral viper
glacial pollen
#

ctrl+f and type in instru or instrumental

glacial pollen
#

because fv4

#

is a voc type model

#

it was made to handle vocals, mainly ( or rather, is mostly good at vocals and I can't promise anything on instrumentals' quality. Haven't tried it for that

coral viper
glacial pollen
coral viper
knotty moth
glacial pollen
# coral viper Voice model I mean

prev-gen models were rather universal ish, mostly
Nowadays tho, more and more models are being made with a specific specialization in mind

#

Instrumentals, vocals, stems, sfx, backing vocals / harmonies

#

and some diverge into having specific properties, such as fullness of vocals, or noise, bleedless etc

#

In this case, fv4 is primarily a voice / vocal model

#

If you need more details and explanations / overviews, the doc I sent you has all of that.
All you need to do is search for a keyword ( ctrl + f ) and have some read

broken urchin
#

can someone help me

#

inference doesnt work on Applio

#

i put in my audio and convert it and it just gives out a blank voice recording

glacial pollen
#

Also, which applio, newest one? or precompiled package

broken urchin
#

newest applio

broken urchin
glacial pollen
#

firstly, try to change the mp3s name

#

no spaces

#

use _s as alternative
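
The renaming rule above as a tiny sketch. The helper name is made up for illustration; it only builds the new name and does not touch any file on disk.

```python
import re
from pathlib import Path


def safe_audio_name(filename):
    """Replace every run of whitespace in the file name with one underscore."""
    path = Path(filename)
    stem = re.sub(r"\s+", "_", path.stem.strip())
    return stem + path.suffix


print(safe_audio_name("my voice take 1.mp3"))  # my_voice_take_1.mp3
```

To actually rename the file you could pass the result to `Path(old).rename(...)`, but check the new name first.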

knotty moth
# broken urchin

check if the file really exist in that directory, or if not sure convert it to wav first

broken urchin
#

yeah the file is in the folder

#

its there

knotty moth
glacial pollen
#

well, supposedly ye
yet most times issues of a " file doesn't exist " sort happen, it's either the naming, corrupted file, path or lack of ffmpeg

simple ore
#

likely it is not .mp3

#

it may be just .wav renamed to mp3
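
One way to check that guess is to look at the first bytes of the file instead of trusting the extension. This is a heuristic sketch with made-up function names, not how Applio itself checks files: WAV files start with `RIFF....WAVE`, MP3 files usually start with an `ID3` tag or a frame sync (some other formats share that sync pattern, so "mp3" here only means "looks like mp3").

```python
def sniff_audio_format(header):
    """Guess the container from the first bytes of a file (heuristic)."""
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:3] == b"ID3":
        return "mp3"  # ID3v2 tag at the start of the file
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"  # 11 set bits: MPEG audio frame sync
    if header[:4] == b"fLaC":
        return "flac"
    return "unknown"


def sniff_file(path):
    """Read the first 12 bytes of `path` and guess its format."""
    with open(path, "rb") as fh:
        return sniff_audio_format(fh.read(12))


print(sniff_audio_format(b"RIFF\x24\x00\x00\x00WAVEfmt "))  # wav
```

If `sniff_file("test.mp3")` says "wav", the file is a renamed WAV and giving it the right extension is worth a try.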

glacial pollen
#

gotta rule out the file itself being the issue

broken urchin
coral viper
broken urchin
#

still the same thing

#

just gives out a blank recording

#

also same error in the console

#

i also tried on different format

glacial pollen
#

it is downloaded from within the uvr

glacial pollen
#

these

broken urchin
#

i have both of those

glacial pollen
#

Also, again, which applio you running

broken urchin
#

the one from the server

#

from ai hub website

coral viper
broken urchin
glacial pollen
broken urchin
#

this one

glacial pollen
#

lemme rephrase. Did you download a zip

#

aka, unpack n run

broken urchin
#

yeah it was a zip file

glacial pollen
#

( doesn't show in my case cause I already have it downloaded, the karaoke 2

glacial pollen
#

Normally people don't encounter such issues

#

nowadays at least

broken urchin
#

so what do i do lmao

glacial pollen
#

well, I can propose checking my fork maybe
if you're up for it

#

but that'll result in some downloading 🤔

#

cause like, having ffmpeg, checking the path / naming, validating on other files
that's pretty much all there is to diagnosing the issue

broken urchin
#

is that the only way?

glacial pollen
#

Those audio related problems are pretty obscure

#

and unclear

glacial pollen
#

as it has to work 100%, my fork, no other way

#

that'd indicate an issue with something else, other than applio itself at least

#

Well, it'd end up on downloading stuff anyways

cause you'd have to get normal applio from repo ( not the package

#

but while we're at it, imma recommend my fork cause why not ¯_(ツ)_/¯

#

Else I'm out of ideas

broken urchin
#

so i cant really fix this

broken urchin
#

its something in my pc

#

thats causing my issue

#

its not applio

coral viper
#

I did scroll, but it reached limit and no more showing

glacial pollen
#

What is written there 🙂

glacial pollen
# coral viper

Because you show me the main ui, not the settings section

glacial pollen
#

Applio works within its own environment

#

it's independent from ur pc

#

Unless you're not using it that way 🤔 ( for whatever reason

#

you have env folder in applio?

broken urchin
#

yes i have env

glacial pollen
#

for infer ^

#

put it in assets/audios

broken urchin
#

alright hold up

#

what language is that i dont understand anything

glacial pollen
#

That is surprising 🤔

#

considering ur anime pfp

#

It's japanese, the language

#

Anyways, did it work?

#

or nah

broken urchin
#

yeah it worked

glacial pollen
#

Then that's the case

broken urchin
glacial pollen
#

It's either the path u screwed up

#

or the naming

broken urchin
#

it gave out a recording actually

glacial pollen
#

as I said before

#

oh

#

recording?

#

wdym

#

It didn't do the inference? Did it output the input file?

broken urchin
#

yeah it did the inference

glacial pollen
#

well then, it works

#

so again

coral viper
glacial pollen
low shard
glacial pollen
#

Show me how you previously would input the path to your audio

#

In here

broken urchin
#

i would just record a voice message of my voice in voice recorder and just put it in the audios folder and thats it

glacial pollen
#

well

knotty moth
glacial pollen
#

then name it differently

#

try "test"

#

for the name

broken urchin
#

ok

glacial pollen
#

if that fails, then those mp3s are being screwed up in some way

#

or applio from precompiled had issues with mp3s? ( doubt it but Nothing surprises me anymore

broken urchin
#

ill try to rename it to test then

broken urchin
#

yeah i just tried them out on my voice recordings again but doesnt work lol

glacial pollen
#

karaoke 2 is for backing vocals

#

fv4 is for vocals

#

hq4 for instrus or really any other the docs recommend or mention

broken urchin
#

also the files are different for some reason

glacial pollen
#

bruh

#

I mean, it's just metadata so

#

you can always use ffmpeg to convert it, maybe could help

#

you got ffmpeg installed and added to path, on ur pc?

broken urchin
#

nah

#

i dont even know what that is honestly hahaha

glacial pollen
#

or idk, take the file to the same location you have the ffmpeg in

broken urchin
#

i just have it in the applio folder

glacial pollen
#

then open up the cmd in there

#

then enter and you'll get cmd

#

now, having both ffmpeg and ur mp3 there

#

type in the cmd:
ffmpeg.exe -i test.mp3 test.wav

#

the " -i " means input, that's your file right after it

#

and then, there's output, in this case test but as wave
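
The same conversion can be driven from Python if you have several files. A minimal sketch, assuming ffmpeg can be found (on PATH or in the current folder); the function names are made up for illustration and the command is the same `-i input output` form as above.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, ffmpeg="ffmpeg"):
    """Build the 'ffmpeg -i <input> <output>.wav' argument list."""
    dst = str(Path(src).with_suffix(".wav"))
    return [ffmpeg, "-i", src, dst]


def convert_to_wav(src):
    """Run ffmpeg on `src` and return the path of the .wav it writes."""
    if shutil.which("ffmpeg") is None:
        raise FileNotFoundError("ffmpeg was not found")
    cmd = build_ffmpeg_command(src)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd[-1]


print(build_ffmpeg_command("test.mp3"))  # ['ffmpeg', '-i', 'test.mp3', 'test.wav']
```

Note that ffmpeg asks before overwriting an existing output file, so delete an old `test.wav` first or the call will wait for an answer.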