#✨│ai-help


vapid mantle
#

I'm using Applio NoUI, and I don't know which pretrain type is selected. What is the model currently training with? @low shard

#

Original, TITAN, Ov2 Super?

low shard
#

perhaps it's better to ask @nocturne mural since he made that notebook

vapid mantle
#

Oh good then. It is better to use the original

nocturne mural
vapid mantle
#

I tried 2 of them and didn't like them

#

TITAN, Ov2 Super

#

The model says some letters robotically

low shard
nocturne mural
#

fixed

low shard
hallow thistle
#

Dev announcement about Google Colab just dropped.

tame mica
#

like, wasn't that literally the message nick sent right before yours? 😭

hallow thistle
#

This is the highest quality image I could find.

#

I don't know what MMVC stands for. But for VC, I think it stands for voice changer.

#

I can think of the name of it: W-Okada.

warm crescent
#

Where can I download the newest rvc voice changer?

#

Or is w Okada better?

low shard
low shard
# warm crescent Or is w Okada better?

RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models

Wokada = uses RVC for realtime inference

edgy tangle
#

I think this happens when you resume training

simple ore
river trout
#

Is applio main better or codename fork better?

simple ore
analog obsidian
#

so you can resume from that epoch avoiding that problem

#

but its just a visual bug

#

the model itself is fine

tawdry stirrup
#

how to overcome this

Welcome to ColabMod
Timer: 00:00:23
DEPRECATION: omegaconf 2.0.6 has a non-standard dependency specifier PyYAML>=5.1.*. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of omegaconf or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Timer: 00:00:24
warning: The --system flag has no effect, a system Python interpreter is always used in uv venv
Using CPython 3.10.12 interpreter at: /usr/bin/python3.10
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate
Timer: 00:00:45
DEPRECATION: omegaconf 2.0.6 has a non-standard dependency specifier PyYAML>=5.1.*. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of omegaconf or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063

Cloning the repository...

edgy tangle
edgy tangle
analog obsidian
#

as long as you resumed the training using the same batch size, yeah, nothing to worry about

edgy tangle
#

Well, I leave it at 4, since my RTX 3050 only has 4GB of VRAM

river trout
analog obsidian
#

but only for adamw if i remember well

#

currently both applio and the fork use radam

#

and that already does warmup by itself (don't enable warmup epochs in the UI atm, it's meant to be used with adamw)

#

besides that uhh

#

its just applio

#

ah and the fork has the mel spectrogram similarity metric, i forgot about that 🦈

jaunty shale
#

RVC URL doesn't work apparently.

#

any ideas why?

#

oh I'm not the only one then..

distant hamlet
jaunty shale
#

probably colabs are having issues

#

i guess we have to wait for a bit then

dusk sphinx
#

Hina's rvc on colab errors like this also
Timer: 00:03:28
/content/voice-changer/server/HVoice.py:3: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
  from distutils.util import strtobool
Traceback (most recent call last):
  File "/content/voice-changer/server/HVoice.py", line 10, in <module>
    from downloader.SampleDownloader import downloadInitialSamples
  File "/content/voice-changer/server/downloader/SampleDownloader.py", line 12, in <module>
    from voice_changer.RVC.RVCModelSlotGenerator import RVCModelSlotGenerator
  File "/content/voice-changer/server/voice_changer/RVC/RVCModelSlotGenerator.py", line 4, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'
WARNING:pyngrok.process.ngrok:t=2025-02-27T21:20:22+0000 lvl=warn msg="Stopping forwarder" name=http-46499-3f12fb39-2175-40bc-8140-83858962dbee acceptErr="failed to accept connection: Listener closed"
--------- SERVER STOPPED! ---------

simple ore
#

uv is broken on colab, so it doesn't install anything
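
One way to confirm that a traceback like the one above comes from a failed install, rather than from the notebook code itself, is to check which packages are importable before starting the server. A minimal sketch; the module list is an assumption for illustration, not the notebook's real requirements:

```python
import importlib.util


def find_missing_modules(names):
    """Return the top-level module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        # find_spec returns None when no importable module with this name exists
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical list: adjust it to whatever your notebook actually needs.
    required = ["torch", "numpy", "faiss"]
    missing = find_missing_modules(required)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If `torch` shows up as missing, the install step did not do its job, which matches the "uv does not install anything" diagnosis.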

fervent rover
#

So y’all are saying that the mainline Colab is not working?

#

Okay, so I guess I'll just wait until you guys have fixed the mainline Colab

brittle wing
#

-colab

karmic oliveBOT
# brittle wing -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

ashen badger
#

Is there an up-to-date guide to making AI covers and stuff? And is there anything I can use for text to speech? (I think this is the correct channel, correct me if wrong)

weary peak
#

Fixed the old issue. On Colab, if you run out of time but it trained enough epochs, how do you download the files? None of them are visible, even to the download script

fervent rover
#

Okay guys, I'm going to test out RVC Mainline, since I saw people saying that the RVC Mainline Colab is not working. I'm going to try it out for myself, and I'll keep you guys posted on whether it works or not

#

Okay NeverMind

quaint radish
jaunty mason
#

-rvc

karmic oliveBOT
fervent rover
quaint radish
fervent rover
#

It hasn't been working, according to one of them on Discord

fervent rover
rotund zephyr
#

guys is the colab version working?

#

this says it for me

low shard
#

Also Wokada isn't rvc

#

RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models

Wokada = uses RVC for realtime inference

cerulean hedge
#

gt 1030

#

i want to convert an audio file into someone else's voice

#

i dont need training

low shard
# cerulean hedge i want to convert an audio file into someone else voice

Your Nvidia GPU is good enough to do inference (use models) locally (on your PC). It's not the best for training (making models), even if that's still possible

You can:

  • Locally (runs on your PC, so the speed depends on that; you will have to set it up with the guides):
    • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
    • Mainline: The original RVC
  • Cloud (a remote good PC, easier and faster than your PC but it's limited):

Easiest possible (automatically separates vocals & instrumentals): weights.gg
Easiest cloud: Ilaria RVC Zero
Easiest local: Applio

weak cipher
simple ore
#

sample is a bit short though

weak cipher
#

bro

simple ore
#

fish speech, f5-tts, xtts v2 from coqui

#

depends on the language though, most tts are just english and chinese

jade rose
#

how do i make the ai voice more expressive in Applio? Cuz now it only reads the text like a robot. Should I use something else?

low shard
#

the way applio does TTS is that they first generate an audio with the Microsoft Edge TTS API, then use that audio as the input to rvc

#

edge tts is multilingual and good quality, but not emotional

jade rose
#

then what should i use

#

I used TTS with a voice model on Applio

low shard
# jade rose then what should i use

There are different Text To Speech (TTS) AIs:

GPT-SoVITS: RVC isn't as good as GPT-SoVITS for TTS, but GPT-SoVITS (few-shot TTS, which means models need just a little training) can't use RVC models (and vice versa), and it's limited to English, Chinese & Japanese. If you wanna check out GPT-SoVITS instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/

Freemium 11labs: An easy way to do TTS is https://elevenlabs.io/. You can't use RVC models on it, but it's an easy, mostly premium way to get good quality TTS

FishSpeech: FishSpeech is a zero-shot (no explicit training needed) TTS. If you've got a good PC you can use it locally, otherwise use their site

You can check TTS in our tts index

With RVC Models:

RVC is natively speech-to-speech, but forks such as Ilaria RVC Mainline & Applio have built-in TTS. They use Microsoft Edge TTS to generate a TTS audio and then convert it with RVC. I suggest choosing a TTS voice with the same gender and language as the RVC model you wanna use

If you wanna do TTS locally with RVC voice models (if you've got a good PC):

  • You can get Applio in our docs
  • While Ilaria RVC Mainline here (no guide as of right now)

If you don't have a good PC, you can do TTS with RVC voice models on cloud:

#

The best way would be using 11labs tbh, but it's paid

#

else you could give gpt so vits, f5 tts, fish speech, a try

jade rose
#

Ok, thanks

weak cipher
low shard
#

I explained it above and also sent a message about tts

#

check it out

simple ore
lost lagoon
#

RVC S2S

noble jay
#

hello guys am i at the right place to ask a question ?

dusk sphinx
#

Anybody got restricted by Colab ?

low shard
dusk sphinx
#

This account has been blocked from accessing Colab runtimes due to suspected abusive activity. This does not impact access to other Google products. If you believe this action was taken in error, review the usage limits and appeal . @low shard

low shard
low shard
#

You could try Kaggle, or do it locally if you've got a good PC GPU

viral flame
#

how to fix

```
/content/voice-changer/server/HVoice.py:3: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
  from distutils.util import strtobool
Traceback (most recent call last):
  File "/content/voice-changer/server/HVoice.py", line 10, in <module>
    from downloader.SampleDownloader import downloadInitialSamples
  File "/content/voice-changer/server/downloader/SampleDownloader.py", line 12, in <module>
    from voice_changer.RVC.RVCModelSlotGenerator import RVCModelSlotGenerator
  File "/content/voice-changer/server/voice_changer/RVC/RVCModelSlotGenerator.py", line 4, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'
WARNING:pyngrok.process.ngrok:t=2025-03-01T01:16:00+0000 lvl=warn msg="Stopping forwarder" name=http-40611-85bb3119-0fa0-4dba-a7f3-4e73487e3dc0 acceptErr="failed to accept connection: Listener closed"
--------- SERVER STOPPED! ---------
```
simple ore
#

colab is fked

flint seal
#

does anyone know how to get one speaker from an audio file? the speakers aren't overlapping but i just want a file with one person talking and don't want to do it manually

simple ore
#

who's gonna sort the speakers out if you dont want to do it manually? some magic?

fervent rover
#

Just Asking

rain urchin
fervent rover
paper pasture
#

guys idk which voice changer im using but im assuming its this one

#

start_http is taking too long to load

#

anyone know why

rain urchin
simple ore
#

uv is messed up

#

also there's no compatible version of faiss-cpu for python 3.11

#

you either need to downgrade the environment to 3.10 or change the version to install to the one supporting 3.11

#

1.7.4 supports 3.11
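
The two options above (stay on 3.10, or move to a faiss-cpu release that supports 3.11) can be written down as a small check. This is only a sketch: the 1.7.4 figure comes from the message above, and the 3.10 branch assumes the notebook's pinned version installs there, as the chat implies.

```python
import sys


def faiss_advice(version_info=sys.version_info):
    """Suggest what to do about faiss-cpu for a given Python version."""
    major, minor = version_info[0], version_info[1]
    if (major, minor) <= (3, 10):
        # Assumption: the version pinned by the notebook installs fine here.
        return "keep the pinned faiss-cpu version"
    if (major, minor) == (3, 11):
        # Per the message above, 1.7.4 is a release that supports 3.11.
        return "pip install faiss-cpu==1.7.4"
    return "check PyPI for a faiss-cpu release matching this Python"


if __name__ == "__main__":
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]}: {faiss_advice()}")
```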

#

@rain urchin

fading vault
#

how to start in kaggle

tame mica
#

ctrl + k

hallow thistle
low shard
low shard
hallow thistle
#

A lot of people keep mistaking RVC for a realtime voice changer.

low shard
hallow thistle
#

I've told people many times to go to #🔍│help-w-okada if they wanna talk about anything W-Okada. Sure, I get it, not everyone knows what RVC and W-Okada (the realtime voice changer) even are. But if they read more than just one line, they should be able to figure it out by themselves.

weak cipher
#

Have any of you guys tried Livekit?

halcyon barn
#

Hi, I'm getting this error in the Chrome browser: "ModuleNotFoundError: No module named 'gradio'". How do I fix this?

low shard
#

You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
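
For NVIDIA cards, the same information can also be read from a script through the `nvidia-smi` tool that ships with the driver. A minimal sketch, assuming `nvidia-smi` is on the PATH; it returns an empty list on machines without an NVIDIA driver:

```python
import shutil
import subprocess


def query_nvidia_gpus():
    """Return one 'name, total memory' string per NVIDIA GPU, or [] if none is found."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # no NVIDIA driver tools on this machine
    result = subprocess.run(
        [exe, "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = query_nvidia_gpus()
    print("\n".join(gpus) if gpus else "No NVIDIA GPU detected via nvidia-smi")
```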

halcyon barn
#

I am using Hina_Mod_AICoverGen_colab on chrome browser windows 10

low shard
#

also I asked for the PC GPU, not the operating system or the browser

halcyon barn
#

any replacement of that?

low shard
#

if you got a good pc gpu you could run RVC locally (runs on your pc) instead of cloud (runs on remote good pc) services like colab

low shard
#

read it up

halcyon barn
#

Cool

brave swallow
#

guys no matter what I keep getting that noise at the end!
I'm using applio rvc inference

brave swallow
#

what could I be doing wrong

simple ore
#

did you clean it up?

#

did you clean the inference audio?

brave swallow
#

you can notice here that this part got fixed but the other one broke after changing the pitch extraction algorithm

simple ore
#

post the source audio for inference

brave swallow
#

1 sec

simple ore
brave swallow
#

it seems like audacity

simple ore
#

there's some blip at the end

#

so it gets inferred into something weird

brave swallow
#

is it fixable or nope

simple ore
#

is it a separated vocal or something?

brave swallow
simple ore
#

well, bad separation then

#

use mvsep

#

and if the source before separation was mp3 then it is even worse

brave swallow
simple ore
#

holes say it was lossy compression

brave swallow
brave swallow
simple ore
#

audacity spectrogram view

brave swallow
simple ore
#

why are the files not matching?

brave swallow
#

before inference

simple ore
#

well, whatever method you used for cleaning did mess it up and added that blip at the end

brave swallow
#

gotcha

#

then I need to be more careful

knotty moth
simple ore
#

because audio_1 is shorter

#

silences removed

brave swallow
#

what should I look for while cleaning

simple ore
#

but why would you remove silences for inference?

brave swallow
#

or what is the best example for good cleaning

brave swallow
simple ore
#

yeah, use the proper cleaning method

#

not audacity

knotty moth
brave swallow
brave swallow
#

mvsep ?

#

UVR5 ?

simple ore
#

denoise_mel_band_roformer_aufr33_aggr_sdr_27.9768.ckpt

simple ore
#

just place it into models

#

and then select on UI

swift patio
#

how to make it compatible

brave swallow
swift patio
#

like mp4

#

or whatever

#

for voice changer to work

low shard
#

also, this isn't the right channel, I literally explained everything to you 2 mins ago in #🔍│help-w-okada

swift patio
#

ohh

#

i realized when u pinged me

knotty moth
brave swallow
brave swallow
#

congrats 😄

brave swallow
#

almost 2H

brave swallow
#

idk if there's way to make it use gpu too

#

do I need to run this clean up on both , training dataset and inference audio ?

#

so I have to train my model in RVC all over again?

brave swallow
#

for some reason it keeps running over cpu

simple ore
#

may need to replace it with cuda version

#

requirements doesn't have a cuda index, so it most likely got the 2.0.1 cpu build installed
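
Whether an environment ended up with a CPU-only PyTorch can be read from two attributes: `torch.__version__` and `torch.version.cuda` (the latter is `None` when the build has no CUDA support). A minimal sketch; the helper name is made up for illustration:

```python
def describe_torch_build(version, cuda_version):
    """Classify a PyTorch install from torch.__version__ and torch.version.cuda."""
    if cuda_version is None:
        # No CUDA support compiled in: a CPU-only (or other non-CUDA) build.
        return f"torch {version}: no CUDA support (CPU-only or non-CUDA build)"
    return f"torch {version}: built for CUDA {cuda_version}"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print(describe_torch_build(torch.__version__, torch.version.cuda))
        print("CUDA usable right now:", torch.cuda.is_available())
```

Run it inside the activated environment that the tool itself uses, otherwise it reports on a different install.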

weak cipher
#

Guys I want anime tts to have a free api, is that possible?

brave swallow
simple ore
brave swallow
#

import torch
torch.cuda.is_available()      # True
torch.cuda.device_count()      # 1
torch.cuda.current_device()    # 0
torch.cuda.get_device_name(0)

simple ore
#

activate the environment

#

then pip install the cuda build of torch

brave swallow
simple ore
#

whatever that is

brave swallow
#

ohhhh

#

one sec

simple ore
#

pip install torch==2.3.1 torchaudio==2.3.1 --upgrade --index-url https://download.pytorch.org/whl/cu121

brave swallow
#

the env must be activated by default

#

it seems I have newer version !?

simple ore
#

torchvision is higher, but it does not matter

#

it is only used for images

brave swallow
#

oh cool then

simple ore
#

but now you have cuda torch and torchaudio

brave swallow
#

yep it seems it works

#

cuda:0

#

thanks a lot for helping

#

I hope it works and fixes my issues

#

I've been trying for a long time

#

does this look better?

#

meh same issue still

crude bolt
#

where can i find pretrains

crude bolt
#

hm we are looking for an arabic pretrain

#

esp singing

#

found : Rigel

#

trying ;3

analog obsidian
crude bolt
#

you got a better 1 ?

crude bolt
#

hm ... not sure if refinegan in its current state is actually usable. No experience with it

analog obsidian
#

refine has better singing range than hifi

crude bolt
#

yea i understand.

#

Thanks for elaborating 🙂

analog obsidian
#

buuuut i don't personally use it because of the electric/metallic sound it gives to models

crude bolt
#

maybe in the future those kinks will be ironed out.

brave swallow
analog obsidian
simple ore
#

i'm running another model test right now

#

one without adding noise to the generator

brave swallow
#

interesting

analog obsidian
iron cobalt
#

How can I prevent ngrok from exceeding the monthly Data Transfer Out limit when using Applio on Kaggle? I'm having nothing but problems training my voice models this way
idk how to decrease inbound connection volume without having to upgrade my ngrok account plan for additional capacity

low shard
iron cobalt
#

For reasons unknown to me, more than 300 MB of data was transferred out.

#

And this is just in early March 2025

reef pier
#

i'm planning on getting an rtx 4070 pc

#

will it work for training and inference?

pastel oak
reef pier
#

btw for those that have a gtx 1650 (like i do): inference will work. very slowly, though.

simple ore
reef pier
knotty bridge
#

for example here

#

how can i use her voice for my phrase? @low shard

low shard
#

You can just click create

summer mirage
#

hi, i'm training some models but my results are bad. can someone help me? these are my settings. i also tried editing my pitch but that doesn't make sense.

compact tapir
#

which is the best voice changer?

summer mirage
#

no

low shard
#

Mangio Fork is a fork (modified version) of Mainline RVC (the original project) which has been discontinued since 2023

summer mirage
#

oh

low shard
#

absolutely delete that, and never watch youtube tuts for RVC/Wokada

#

what's your pc gpu?

summer mirage
#

what should i use then?

low shard
low shard
summer mirage
#

a nvidia rtx 3060

low shard
# summer mirage a nvidia rtx 3060

As you got a good PC, you can use RVC locally, you can choose between:

  • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
  • Mainline: The original RVC
#

I would personally suggest Applio

summer mirage
#

what is the difference?

low shard
summer mirage
#

yes

low shard
#

Mainline is the original project of RVC

Applio is a fork, with an easier user interface that gets more updates

#

basically Applio is more maintained

summer mirage
#

Okey! i will try that! thank you

analog obsidian
# summer mirage what is the difrence?

mainline hasn't gotten a real update in 2 years (recent ones have been dependency fixes, not training-related stuff)
applio on the other hand has new, updated training code which can give faster and better results than mainline, and it's constantly getting new updates

summer mirage
#

a post says that i need to install it on my ssd, why is that?

analog obsidian
low shard
#

SSDs 🙏

analog obsidian
#

rvc has to write two big files during training. if you train on an HDD it's going to take a couple of seconds to write them (around 4 seconds)

#

but on an ssd it's almost instant

#

it doesn't slow the training speed itself but slows down the process a bit (basically the training will pause every time the two big files are overwritten)

#

bc it has to wait until the files are written
besides that, it works just fine on a hdd

low shard
#

explained poorly misc_trolley

summer mirage
#

i got this error after installing successfully

tight beacon
#

Hi there, so I have trained some voices for TTS in the past, but I was thinking of trying to train the same voices for use in w-okada to use in DnD games. However the guide seems to say you need hours of clean voice samples for it to work... is this still the case?
(Most of my recent decent TTS ones have been done with about a minute of audio or less, mostly cos there isn't more than that available haha)

unique rock
#

how do i use kaggle?

analog obsidian
low shard
low shard
tight beacon
unique rock
low shard
crisp lynx
#

my output doesnt sound like the models. i have a 4070 super and i7 14700k. idk why its doing this

#

are my specs good enough to run this

knotty moth
knotty moth
low shard
#

Just in phones misc_trolley

#

-# more correctly, 128gb on phones

knotty moth
low shard
#

I was just talking about storage capacity lol

#

Anyways, it's just a reddit meme I googled 😭

knotty moth
#

it sounds like mid 10's

unique rock
#

Well, I'm using the Kaggle interface, haha, to use Applio within it, this GPU T4x2 thing.

low shard
#

-# yes I use reddit not only to search random shi

knotty moth
low shard
knotty moth
hallow thistle
#

At a similar price, you'll get either an SSD with lower capacity but faster speed, or a hard drive with larger capacity but lower speed. yt_nails

brittle wing
#

-colab

karmic oliveBOT
# brittle wing -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

brittle wing
#

-rvc

karmic oliveBOT
carmine siren
#

-kagle

#

-kaggle

karmic oliveBOT
# carmine siren -kaggle
📘 Kaggle Notebooks

Note: Kaggle limits GPU usage to 30 hours per week.

worthy quartz
hallow thistle
cold cave
#

I've been waiting for 6 minutes and it didn't even start training

worthy quartz
spice siren
#

If I cancel my annual membership, will I get my money back?

knotty moth
cold cave
worthy quartz
#

Does anyone know how to use applio? If so, can I pm you? I'm a bit confused lol

cold cave
#

It's not that, it's the custom_pretrained button, and I had to turn off pretrained

hallow thistle
knotty moth
knotty moth
worthy quartz
#

Does anyone know how to use applio? If so, can I pm you? I'm a bit confused lol

hallow thistle
worthy quartz
#

Thought no one saw sorry 😬

hallow thistle
spice siren
#

Want to talk to the website owner

worthy quartz
#

Okay so I'm trying to upload an existing voice model that I already made to applio so I can make an ai cover of a song, but I do not know where to put everything.

hallow thistle
lethal shale
#

Heloooo! I finally made my own female model and it sounds natural and amazing. But then I made a singer's voice and it is metallic. I researched and found out it is because of the sample rate.

#

There was a website to check the sample rate of an audio file. Can someone send me the link?
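
For WAV files the sample rate can also be read locally with Python's standard library, without a website. A minimal sketch; it assumes an uncompressed PCM WAV, since the `wave` module does not read MP3, FLAC, or every WAV variant:

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()
```

Calling `wav_sample_rate("vocals.wav")` (the file name is just a placeholder) gives a number such as 44100 or 48000, which can then be compared against the sample rate the model was trained at.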

mint yew
#

what is good chunk and extra settings

lusty sun
#

Can anyone guide me how to install the voice model once I have downloaded it from voice-models .com ? I am using tortoise tts btw

low shard
#

@brittle wing @carmine siren please use #🤖│bots for using multiple commands

low shard
#

The only thing you could do is make an audio with tortoise tts, then use that as the input in an RVC like Mainline or Applio

#

But tortoise is pretty old

#

There are different Text To Speech (TTS) AIs:

GPT-SoVITS: RVC isn't as good as GPT-SoVITS for TTS, but GPT-SoVITS (few-shot TTS, which means models need just a little training) can't use RVC models (and vice versa), and it's limited to English, Chinese & Japanese. If you wanna check out GPT-SoVITS instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/

Freemium 11labs: An easy way to do TTS is https://elevenlabs.io/. You can't use RVC models on it, but it's an easy, mostly premium way to get good quality TTS

FishSpeech: FishSpeech is a zero-shot (no explicit training needed) TTS. If you've got a good PC you can use it locally, otherwise use their site

You can check TTS in our tts index

With RVC Models:

RVC is natively speech-to-speech, but forks such as Ilaria RVC Mainline & Applio have built-in TTS. They use Microsoft Edge TTS to generate a TTS audio and then convert it with RVC. I suggest choosing a TTS voice with the same gender and language as the RVC model you wanna use

If you wanna do TTS locally with RVC voice models (if you've got a good PC):

  • You can get Applio in our docs
  • While Ilaria RVC Mainline here (no guide as of right now)

If you don't have a good PC, you can do TTS with RVC voice models on cloud:

#
  • You could try another tts from our tts index and use the output as an input in rvc
#

What's your PC GPU btw

low shard
# mint yew what is good chunk and extra settings

Wrong channel

RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models

Wokada = uses RVC for realtime inference

Show a screenshot of your WOKADA in #🔍│help-w-okada and be sure to not follow yt tuts

low shard
low shard
low shard
#

Also I just noticed you got a TCOAAL pfp, I played it too lol

lusty sun
#

Thanks man

lusty sun
low shard
#

Rtx 4060? Yeah you're good then

lusty sun
#

Yeah

lusty sun
low shard
maiden idol
#

Hi Help RVC Members,

I hope you’re all doing well. My name is Vikha, and I’ve been exploring voice conversion using Retrieval-Based Voice Conversion (RVC). I encountered an issue while trying to merge two PTH models—one trained for 200 epochs and the other for 150 epochs—into a 50-50 balanced blend. However, the resulting audio quality didn’t meet my expectations, and I’m unsure why the quality degraded despite the models being quite close in training epochs.

I’ve been experimenting with various fusion approaches, but I haven’t been able to achieve the desired results. I’m reaching out to you because I came across your profile and noticed your work in this field. I believe your insights could help me understand the potential issues that might be causing the problem in my fusion process.

If you have experience working with similar models or have any suggestions on improving the process, I would be extremely grateful for your guidance. Additionally, any resources, tutorials, or techniques you could share would be invaluable as I continue troubleshooting.

Thank you so much for taking the time to read this message. I hope we can connect!
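
For reference, a 50-50 merge of this kind is usually a per-parameter weighted average of the two models' weights. The sketch below shows only that arithmetic. It assumes both models share exactly the same parameter names and shapes, and it does not show how an RVC .pth file nests its weights, which would need checking against the actual checkpoint. Averaging generally behaves best when both models were fine-tuned from the same pretrain at the same sample rate; that is a general property of weight averaging, not something confirmed in this thread.

```python
def blend_state_dicts(weights_a, weights_b, ratio=0.5):
    """Linearly interpolate two weight dicts: ratio * a + (1 - ratio) * b."""
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models have different parameter names and cannot be blended")
    return {
        name: ratio * weights_a[name] + (1.0 - ratio) * weights_b[name]
        for name in weights_a
    }


if __name__ == "__main__":
    # Plain floats stand in for tensors here; the same expression works on
    # any value type that supports scalar multiplication and addition.
    model_a = {"layer.weight": 2.0, "layer.bias": 0.0}
    model_b = {"layer.weight": 4.0, "layer.bias": 1.0}
    print(blend_state_dicts(model_a, model_b))
```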

stuck crypt
#

Hey bro, I need help with this issue.
When I input VB-Cable into Discord or other apps, my voice becomes choppy and sounds weird.
Does anyone know how to fix it? 😭

low shard
formal wind
#

I've seen so many tips on how to detect overtraining but I have no idea what is most effective. I've read the tutorials but I just want to be sure yk yk?

analog obsidian
#

compare them

#

overtraining is very easy to hear

#

the model starts to sound robotic

formal wind
#

That's what I used to do, but it never really turned out well

formal wind
#

I just don't have a good ear when it comes to listening for overtraining

crude mist
#

hi, why does like every rvc model sound so weird when laughing and whats the solution?

crude mist
#

so thats normal and theres no fix?

analog obsidian
crude mist
#

damn cuh thats unfortunate

crude flame
crude mist
#

I guess I'll rp as a mentally unstable egirl with missing laugh muscles

analog obsidian
#

lmao

crude mist
#

lemme ask chatgpt what the condition is that makes u not able to laugh

#

Akinetic Mutism yup I got that

#

thats me

#

been had that

marble forge
#

anyone know why rvc won't launch?:

low shard
#

mangio rvc fork has been discontinued since 2023

#

what's your pc gpu?

marble forge
#

rtx 3060

#

which one should i use instead

simple ore
marble forge
#

i have the 12gb one

simple ore
#

can train stuff

marble forge
simple ore
#

do you need a realtime voice changer or trainer/voice changer for files?

marble forge
#

wdym

#

i dont even know the new one what is the new one

simple ore
#

you tried running

#

that's realtime voice changer

marble forge
#

yea ik im saying what is the new one

simple ore
#

that changes voices for calls/discord

marble forge
#

the one i have is hella old

simple ore
marble forge
#

is that the official one?

simple ore
#

no, it is an optimized fork

#

official one is old and crappy

marble forge
#

oh

#

best settings for it @simple ore

marble forge
#

wait it isnt real time

#

i want a realtime one

#

@simple ore

#

you got one or no?

#

also the one you sent doesnt allow pitch change

low shard
low shard
#

wokada is a program which is better than the mainline rvc realtime, which is better than the mangio fork rvc realtime

#

and the deiteris fork is better than the original wokada

marble forge
#

well idk why but it sounds really bad

#

how do i fix it

low shard
cold cave
#

Everytime I use ApplioNoUI, my storage keeps getting full instantly.

#

That's because all these G and D files keep duplicating
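
If the storage is filling up with accumulated generator/discriminator checkpoints, one workaround is to list everything except the newest of each and delete those. A minimal sketch, assuming the files are named `G_<step>.pth` and `D_<step>.pth` (a naming assumption, check your logs folder first); the function only lists candidates and deletes nothing by itself:

```python
import re
from pathlib import Path

# Assumed naming scheme: G_<step>.pth and D_<step>.pth
CHECKPOINT_PATTERN = re.compile(r"^([GD])_(\d+)\.pth$")


def stale_checkpoints(folder, keep=1):
    """Return G/D checkpoint paths older than the newest `keep` of each kind."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    by_prefix = {"G": [], "D": []}
    for path in Path(folder).iterdir():
        match = CHECKPOINT_PATTERN.match(path.name)
        if match:
            by_prefix[match.group(1)].append((int(match.group(2)), path))
    stale = []
    for entries in by_prefix.values():
        entries.sort(key=lambda entry: entry[0])  # oldest step first
        stale.extend(path for _, path in entries[:-keep])
    return sorted(stale)
```

After reviewing the returned list, each path can be removed with `path.unlink()`.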

fervent rover
#

Is The RVC Mainline Colab Working?

#

Just Asking

low shard
#

tbh just use Applio in the meantime

fervent rover
#

I just play the waiting game, I guess

low shard
#

I don't really understand what's wrong with it, but your choice

fervent rover
#

Okay

#

There's nothing wrong with it, but I prefer using the RVC Mainline Colab

low shard
#

alright

outer isle
#

If I train a singing voice model will it contains another dataset the same as RVC2 Disconnected on Applio?

simple ore
outer isle
simple ore
#

nobody in their right mind would do that

#

pretrain provides a base for your voice model

polar sage
#

Hello, can anyone tell me which colab they currently use to make AI COVERS?

weary peak
#

sorry i keep asking for help. i'm trying to retrain on colab and getting this error:
NameError Traceback (most recent call last)
<ipython-input-2-5294ebea29b0> in <cell line: 0>()
26 print('Paste model link and try again!')
27
---> 28 if not os.path.exists(f'/content/sample_data/{Model_Name}.tar.gz'):
29 print("File not found.")
30 else:

NameError: name 'os' is not defined

mild sleet
#

Imagine there was a way you could train accents with a model

#

like for different artists

#

like an ai to train not just voice but another to train accents

knotty moth
simple ore
hallow thistle
knotty moth
# low shard <#1159380240271953940>

that doesn't necessarily mean it's related to that problem. the colab author might have forgotten to include import os in the cell, unless a previously run cell that contains import os failed (in that case, yeah, it might be related to the problem you mentioned)

#

still, I suggest he add import os to see if it works or not
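
The suggested fix amounts to putting the import at the top of the failing cell. A sketch based only on the lines visible in the traceback; the `Model_Name` value and the `else` branch are placeholders, not the notebook's real code:

```python
import os  # the missing import behind "NameError: name 'os' is not defined"

# Placeholder value; in the notebook this is set elsewhere in the cell.
Model_Name = "my_model"

archive_path = f"/content/sample_data/{Model_Name}.tar.gz"
if not os.path.exists(archive_path):
    print("File not found.")
else:
    # Placeholder: the notebook's real handling of the archive goes here.
    print("Found:", archive_path)
```

If the cell then runs past line 28, the earlier cell that normally imports `os` was probably skipped or failed.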

low shard
white bough
#

Would you guys recommend to keep whispering in the model or would it mess with the training? I am afraid that if I do that, the whispering tone/voice will come out when it is not supposed to...

cerulean cedar
#

Does anyone know what the "assets/hubert/hubert_base.pt" file is? When I run the command "python gui_v1.py" through cmd, it tells me that this file was not found, and that's true: it's not there. Where can I find it, and what is it?

low shard
#

you're using rvc realtime from the mainline/original project, which is worse than wokada, which is worse than the wokada deiteris fork

outer isle
#

I don’t wanna do the pre train, I want to train fresh

#

-Colab

karmic oliveBOT
# outer isle -Colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

simple ore
#

with multiple speakers and some amazing variety of the content?

outer isle
#

Yes and no

brave swallow
#

does 48 kHz have more noise than 32 kHz?

mild sleet
mild sleet
#

the index stores the accent of the model you trained on RVC.

hallow thistle
grand solstice
#

Can I use a zip file with .index/.npy/.pth files (I downloaded it somewhere... it's an RVC model zip) to convert my voice? Also, on weights.com, if a model's effect isn't as strong as I expected, should I run it multiple times to strengthen the voice change? Or is there a better way to do that within the website?

#

somehow this channel doesn't allow me to upload an image 😂

patent pasture
brave swallow
low shard
knotty moth
#

the .npy file is an intermediate file produced during index training in mainline rvc, not the final result, so better remove it to reduce the file size
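
A small sketch of that cleanup, assuming a model folder only needs its `.pth` weights and `.index` file for inference, as described above. The helper name `files_to_share` and the example file names are illustrative, not taken from any particular model zip:

```python
# Suffixes kept when packaging a model; .npy is treated as an intermediate file.
KEEP_SUFFIXES = (".pth", ".index")


def files_to_share(filenames):
    """Keep weights and index files, drop everything else (e.g. .npy)."""
    return [name for name in filenames if name.lower().endswith(KEEP_SUFFIXES)]


# Example file names are hypothetical.
print(files_to_share(["voice.pth", "added_voice.index", "total_fea.npy"]))
```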

grand solstice
#

this is where i downloaded it from @low shard

wheat lion
#

what up lads, is anyone else running RVC on arch using rocm? I'm facing a few strange issues, this is my full log, from start up of the web UI to trying to process a vocals file:

https://termbin.com/r0p9

wheat lion
#

i'll give it a go

#

ty

#

yeah that's a problem, I think that version of torch is incompatible with the latest hip runtime (ImportError: libamdhip64.so: cannot enable executable stack as shared object requires: Invalid argument)

i'll try and see if I can get a docker container going

raven condor
#

In Applio where do i select refinegan as the architecture to train a model with? i've downloaded some pre-trained refinegan models, matched the sample rate to the dataset but when attempting to train i get: The parameters of the pretrain model such as the sample rate or architecture do not match the selected model.

simple ore
grand solstice
#

and i just figured out how to use weights.com/... today. earlier it did not change my voice after I uploaded a wav, that's why i kept wondering if there is another alternative for rvc

wheat lion
#

I think I discovered the issue, my GPU is supported by rocm but only windows, wtf ayymd

low shard
wheat lion
#

I am aware, some things never change

low shard
wheat lion
#

yeah i tried mainline, i'm trying applio on windows now

low shard
wheat lion
#

it does but I assume the same problems would be there, the issue seems to reside in the hip runtime just not supporting the 7800 XT properly on linux

low shard
#

Nvidia is better than AMD, the only issue is their prices

low shard
simple ore
#

7800xt has options - HIP SDK + Zluda + patched cuda torch on Windows, WSL2, ROCM on Linux

wheat lion
#

I've tried ROCM on linux, but I do get a random segfault when doing the actual conversion, I've found other people who make their own ML projects having random segfaults as well, so I reckon it's an issue with their stuff

simple ore
#

yeah, more or less AMD fully supports only their top of the line GPU

#

can use Zluda on windows, should be fine

#

I think for linux you gotta use ROCM5.7, then set environment variable HSA_OVERRIDE_GFX_VERSION=11.0.0
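
A rough sketch of how that override could be passed to a launched process, assuming a Linux setup with ROCm already installed. The variable name and value are the ones from the message above; the helper `rocm_env` and the script name in the comment are hypothetical, and whether the override actually helps on a given card is not something this snippet can confirm:

```python
import os


def rocm_env(gfx_version="11.0.0"):
    """Copy the current environment and add the HSA override on top."""
    env = dict(os.environ)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_version
    return env


env = rocm_env()
print(env["HSA_OVERRIDE_GFX_VERSION"])

# The variable has to be in place before the ROCm runtime loads, so it is
# passed at launch time, for example (script name is an assumption):
#   subprocess.run([sys.executable, "your_webui_script.py"], env=env)
```

Exporting the variable in the shell before starting the web UI has the same effect; the point is only that it is set before torch is imported.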

simple ore
brave swallow
#

which is better for rvc training?

#

red one or the other one

fathom raft
#

Guys

#

Why is mine not working?

#

The audio

brave swallow
modest vector
#

Yo,
I’m looking for a good way to create a realistic AI voice, but I don’t know what to use or how to set it up to sound natural. Any tips?

frank olive
#

hello link rvc?

low shard
modest vector
low shard
# modest vector Rtx 3070

As you got a good PC, you can use RVC locally, you can choose between:

  • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
  • Mainline: The original RVC
modest vector
low shard
slender hearth
#

Please, how can I earn iq points in this channel?

slender hearth
low shard
slender hearth
low shard
slender hearth
cold cave
#

How do I fix this?

#

I'm using Mainline Colab

simple ore
#

mainline has not been fixed yet

oblique venture
simple ore
#

depends on the implementation

#

but it either raises f0 values by some amount or also nudges them to match regular note frequencies
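
To make the two behaviours concrete, here is a minimal sketch, assuming f0 is given in Hz, notes follow 12-tone equal temperament with A4 = 440 Hz, and unvoiced frames are marked as 0. The function names and the `strength` parameter are illustrative and not taken from any specific RVC implementation:

```python
import math

A4_HZ = 440.0  # reference pitch; an assumption of this sketch


def shift_semitones(f0_hz, semitones):
    """Raise (or lower) an f0 value by a fixed number of semitones."""
    return f0_hz * 2 ** (semitones / 12)


def snap_to_note(f0_hz, strength=1.0):
    """Nudge an f0 value toward the nearest equal-tempered note.

    strength=1.0 lands exactly on the note, 0.0 leaves f0 unchanged.
    """
    if f0_hz <= 0:
        return f0_hz  # unvoiced frame, nothing to correct
    distance = 12 * math.log2(f0_hz / A4_HZ)          # semitones away from A4
    target_hz = A4_HZ * 2 ** (round(distance) / 12)   # nearest note in Hz
    return f0_hz + strength * (target_hz - f0_hz)


print(shift_semitones(220.0, 12))  # one octave up
print(snap_to_note(445.0))         # pulled toward A4
```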

merry eagle
#

@acoustic scarab can you give me google colab web rvc links?

low shard
#

he can’t help much on cloud

#

first, what’s your pc gpu and what do you want to do? To check if you got a good enough pc

merry eagle
low shard
merry eagle
low shard
merry eagle
low shard
# merry eagle both btw i have the realtime one i need only pre recorded

Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible

You can:

  • Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
    • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
    • Mainline: The original RVC
  • Cloud (remote good pc, easier and faster than ur PC but it's limited):

Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio

#

gave you both local and cloud ways

merry eagle
low shard
# merry eagle how can i make models do you have any guide?

Train (make) RVC Models on cloud:

  1. Prepare the Dataset
  2. Setup RVC:
    Choose a cloud way to use RVC,
  • Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
  • Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
  • Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
  3. Be sure to know about the tensorboard

Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest free way, use https://weights.com, which ofc uses RVC

RVC Inference (use models) on pre-recorded audio on Cloud

You can use either:

merry eagle
low shard
low shard
merry eagle
low shard
merry eagle
hallow thistle
knotty moth
low shard
jaunty shale
#

I tried to use mainline kaggle, but it gives me the same issue like in colab.

#

been trying to figure this out for days now

#

nvm i got it

pseudo steppe
#

i need best rvc

low shard
pseudo steppe
#

ram 16gb

low shard
pseudo steppe
low shard
# pseudo steppe yeah

Wrong channel then
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models

Wokada = uses RVC for realtime inference

knotty moth
low shard
pseudo steppe
#

ohh i thought it was a real time voice changer

pseudo steppe
low shard
low shard
#

The great majority of AI programs use web uis

#

Use that channel

pseudo steppe
#

earlier there was one

low shard
pseudo steppe
#

can i get that?

low shard
#

The only difference is the User Interface

pseudo steppe
low shard
#

Only the old original wokada, which is way worse, uses the web user interface in its own window

#

Don't use this channel

knotty moth
hallow thistle
hallow thistle
#

That's what I can say if you want the app version of it.

pseudo steppe
#

nah earlier there was an app

#

that's why i asked

merry eagle
sudden cave
#

mine neither

low shard
hallow thistle
hallow thistle
#

No, thanks.

#

I leave all the time.

stable remnant
#

i cannot start the server using ngrok, it always says "server stopped"

low shard
hallow thistle
stable remnant
#

is there any guide how to run kaggle?

broken urchin
#

which tool should i use to make a high quality voice model?

low shard
broken urchin
low shard
low shard
broken urchin
low shard
# broken urchin no no, desktop

Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible

You can:

  • Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
    • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
    • Mainline: The original RVC
  • Cloud (remote good pc, easier and faster than ur PC but it's limited):

Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio

#

Train (make) RVC Models on cloud:

  1. Prepare the Dataset
  2. Setup RVC:
    Choose a cloud way to use RVC,
  • Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
  • Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
  • Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
  3. Be sure to know about the tensorboard

Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest free way, use https://weights.com/, which ofc uses RVC

RVC Inference (use models) on pre-recorded audio on Cloud

You can use either:

broken urchin
#

i have that downloaded

low shard
#

Don't follow YouTube tutorials

#

That program hasn't been maintained since 2023 and is one of the oldest RVCs you could use

broken urchin
#

so its bad

low shard
#

We even removed it from our docs, the creator doesn't maintain it or fix any bugs, don't use it at all

#

@broken urchin just read what I told you, I gave you all the options

broken urchin
#

yeah i read it thanks

low shard
#

Yw

knotty moth
broken urchin
carmine hearth
#

I'm using machine translation. I apologize if the sentences are awkward.
Dear Ai Hub intelligentsia, do you have any guesses as to why people who were once using RVC mainline or Mangio fork feel that Applio sucks after using it? It's hard for me to understand what exactly is wrong with them, because most of the people who make this claim usually treat me as a brainless worshipper of applio, or are so inexperienced with Applio (they're new to it) that they blame the problem on Applio as a whole, rather than on some feature of Applio.
One thing I can be sure of is that they are having an experience that makes them feel that the output from Applio is clearly inferior to the RVC mainline. Does anyone know why? I feel bad for them that they are giving up the conveniences that Applio's developers have worked so hard to create and either going back to the mainline or giving up on using RVC altogether.

civic kelp
#

How do I train a voice with TITAN? I have RVC training software already installed but the models I have only go up to rmvpe

analog obsidian
#

training is better tho

low shard
#

what rvc did you download and what's your pc gpu?

verbal widget
#

-Colab

karmic oliveBOT
# verbal widget -Colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

low shard
#

@crude flame is there really no newer precompiled for rvc mainline?

#

atp shouldn't the docs explain how to do it via source?

#

@tough fiber if you want you could try applio, which is a more updated fork of mainline rvc

tough fiber
#

on local

crude flame
civic kelp
low shard
#

As you got a good PC, you can use RVC locally, you can choose between:

  • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
  • Mainline: The original RVC
#

It's better you get Applio

#

also, in the docs it will be explained how to use pretrains

fringe summit
#

yoo i need help

low shard
patent trellisBOT
# low shard !howtoask

How To Troubleshoot

__**GIVE CONTEXT.**__ 📝
  • Don't simply mention your issue, like "my rvc is not working".
  • Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
  • The more context, the better.
__**BE POLITE.**__
  • Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
  • It's okay if you're frustrated, but don't take it into this server.
  • Don't DM without prior consent.
__**BE PRODUCTIVE.**__ 🤝
  • Don't ask for every little instruction. Put your own effort & test things by yourself.
  • Don't ask to ask.
  • Check if your answer is a Google search away/on our guides website.
low shard
civic kelp
low shard
#

you should delete it and get something newer and supported like Applio

civic kelp
#

Noted. Thank you so much for the info again.

valid spruce
#

Could someone help me make the models' breathing more natural without that robotic sound?

crude flame
valid spruce
#

Should I do these one after the other?

crude flame
valid spruce
#

Okay, I always remove it because they say it adds noise to the model.

knotty moth
formal wind
#

How do I get the model master role?

formal wind
#

I'll check it out thanks!

sonic prawn
#

Can I use RVC in python code? I want to automate something using python, I generate text using LLM then TTS using RVC

dull plume
#

why does the perf thing not appear on my voice changer client

low shard
#

@dull plume
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models

Wokada = uses RVC for realtime inference

simple ore
lean hornet
low shard
# lean hornet https://www.instagram.com/reel/DG0cTHIyd78/?igsh=d2t3N3c1YjB4Zm9r Is there a voi...

You can search rvc ai voice models at:

if there isnt one, you can:

earnest muskBOT
twilit forge
#

i was afk while my model trained and the runtime ended after it finished, where can i find the pth file

low shard
#

You might have lost it, check if there's any files on Google drive
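
A quick way to run that check from a new Colab session, as a sketch: it assumes Drive is mounted at `/content/drive` (the usual Colab mount point) and simply searches for `.pth` files. The helper name `find_checkpoints` is made up for this example, and whether the notebook saved anything to Drive at all depends on the notebook:

```python
from pathlib import Path


def find_checkpoints(root="/content/drive/MyDrive"):
    """Return every .pth file under root, newest first; [] if root is missing."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(
        root_path.rglob("*.pth"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )


for checkpoint in find_checkpoints():
    print(checkpoint)
```

An empty result means nothing was saved under that folder, which matches the "you might have lost it" case.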

twilit forge
night rune
low shard
low shard
# night rune Help?

Elaborate:

  • ur PC GPU
  • what guide are u using
  • what u did step by step
  • what do u want to do
night rune
#

To train a new voice model

unique rock
low shard
night rune
low shard
# unique rock

I also just got a GitHub issue with the same issue for my facefusion online ports....

#

Welp, new cloud issue, I gotta check this

nocturne mural
# night rune Help?

I will check the error although it is most likely an internal gradio problem.

low shard
#

@night rune btw I asked ur PC GPU because if it's good enough you can do it locally without relying on cloud

night rune
low shard
knotty moth
# night rune

try uploading the dataset to applio/assets/datasets through the (imjoy) file manager

night rune
night rune
brittle wing
#

Can anyone tell me which one I download from git? I don't have a video card.

low shard
night rune
#

Why does my browser say the files are infected?

pastel oak
#

So it gives a warning in case

low shard
heavy arrow
#

ik hina's not working rn, is there a webui thats currently working without many problems? my pc is probably not good enough to do anything on my computer, i use a gtx 1660 ti.

low shard
#

Are you going to use wokada for games? And if so, which?

heavy arrow
#

im creating AI content trying to make a model for transforming instrumentals into beatboxes

#

i already got the model created now though, i just need to run it on my computer, its trained on applio rvc
looking at the documentation its actually remarkable thinking that i could use this for real-time voice changing. the only thing ive seen like this is that really shitty voice.ai app

low shard
#

You always need to specify what you are using, hina created many things

#

nvm you mean realtime, so you were talking about the hina mod original wokada

#

then yeah, you got also the wrong channel

#

RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models

Wokada = uses RVC for realtime inference

heavy arrow
#

after searching up just 'hina' i realized that a little too late haha ^^;; i was talking about the rvc one, just mentioned realtime because i saw it come up in what i was reading 🙏

#

assuming i cant use wokada for ai covers or changing instrumentals to beatboxes (realtime), would any rvc fork still work with my GTX 1660 Ti? @low shard

analog obsidian
brittle wing
#

-colab

karmic oliveBOT
# brittle wing -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

low shard
#

please elaborate next time

#

I just tried to guess the most probable one

heavy arrow
#

of course 🙏 my fault original gangster. i got you confused twice in a row because i wasnt explaining it right @_@

low shard
# heavy arrow assuming i cant use wokada for ai covers or changing instrumentals to beatboxes ...

Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible

You can:

  • Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
    • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
    • Mainline: The original RVC
  • Cloud (remote good pc, easier and faster than ur PC but it's limited):

Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio

low shard
heavy arrow
brittle wing
low shard
low shard
brittle wing
#

I just needed applio no UI to see the code of one cell

oblique venture
#

Is it ok if most of the dataset is normal speech (20-30 mins) instead of singing?

night rune
#

Is that good or bad?

#

Batch Size 6

#

Dataset 2 min

analog obsidian
heavy arrow
#

damnn applio with my gtx 1660 ti is actually faster than it was in a colab

night rune
heavy arrow
#

with applio i gotta separate all the audio first, right? its been a cpl years since ive used a normal fork that doesnt split the instrumental for you

analog obsidian
heavy arrow
#

oh fr?? thats sick
does it keep the instrumental file?

analog obsidian
#

yes but

#

are you aware this is for speech to speech

heavy arrow
#

yes. im using it to turn instrumentals into beatbox

#

it sounds cooler than you think trust me

#

i got a herbert the pervert model merged for beatboxing and it goes CRAZY

analog obsidian
#

yes im aware this thing can clone instruments but still lol

woeful wave
#

guys i wanna start making songs using ai what platform or what should i download to do this and also do i put raw vocals or mixed and instrumentals or not ???

broken crane
#

anyone know why my voice changer doesn't work??

knotty moth
broken crane
low shard
#

RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models

Wokada = uses RVC for realtime inference

#

@broken crane elaborate:

  • ur pc gpu
  • what guide did u follow (I hope not those old ass youtube tuts)
  • the issue

in #🔍│help-w-okada

heavy arrow
heavy arrow
#

i ended up just getting what i know works from a while ago, de-echo and de-reverb by foxjoy and kim vocal 2

#

its a LOT faster on my computer than on a colab, i thought itd be slow since i got a gtx 1660 ti ngl but it takes 12 seconds for a 3 minute long audio file

#

also, is there a local version of the AICoverGen? id like to get that one as well for when im not doing the beatbox thingymabobber

brave swallow
#

what could be the reason for the letter changing

dusk rock
#

hello, is it just me or is aicovergen giving an error on google colab?

heavy arrow
#

i couldnt even get it working locally

dusk rock
heavy arrow
#

i tried another one and this one works locally and has a colab available incase you wanted to try it instead:

#

it has youtube links working again too ^^ woot woot

heavy arrow
#

of course, gl gl

glass igloo
#

Hi, can you tell me what files I need to upload to share a model I have trained with other users?

low shard
broken crane
#

can someone help me, when I'm in a call and my voice changer is on, whoever is talking can hear themselves through their voice

#

@low shard ??

low shard
#

You can't expect me to help without any type of info

frozen ledge
#

-colab

karmic oliveBOT
# frozen ledge -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

formal wind
#

Is it hard to get model master? Because i feel like if I don't get it, it's because of my dataset cleaning process.
Oh also, how will I know how I did? Like, will someone message me?

crude flame
formal wind
#

Yeah last time I tried it was through a discord bot lol

kindred eagle
#

i wanna know what is the best pretrain model to train a voice on.

glacial pollen
#

Original pretrains, klm hifigan or those experimental refinegan ones is what I personally can recommend.
Best for you to just try and see

#

But as always, it's a good habit to start with original ones and only try customs if the results aren't good enough (and after you've made sure you've exhausted your options; aka, it's not user-error)

kindred eagle
#

like i've tried original, ov2super and rigel, since my data has like mixed languages. i tried it on different voices and like it worked but it had tearing or artifacts in it.

#

i dunno what to do to make my dataset sound better

#

so other than those 3 pretrains, are there any which have like a low artifact rate and can produce better voice model quality

#

and also do I use the rvc disconnected colab (as it says it's outdated) or the mainline colab?

knotty moth
kindred eagle
#

DAMN

#

so ov2super it is

knotty moth
glacial pollen
#

🤔

kindred eagle
#

nope nope had tried the original as well

#

will try ov2 then compare

#

and then do the rest............

glacial pollen
#

Well, you can start right away with ov2 and/or klm if you want
but you see, people at times go for what seems "the best" or is the most recommended

#

without testing stuff, and AI is, well, it ain't deterministic in that way

#

However, I wouldn't bother with other models than those I wrote about

#

Most if not all customs at the time were trained on fp16 with exploded gradients (simply put, aren't that stable)

kindred eagle
#

hmmmmm, alright then i will actually try the 3 models you have mentioned, compare them and proceed with what i think is the best among them.

glacial pollen
#

Yup, the right approach

kindred eagle
#

ty for the suggestions @glacial pollen

glacial pollen
#

quality needs some decent work, that's just how it is

#

Yea np man

#

best of luck and take your time

glacial pollen
#

Just to encourage you to not give up, my best model took me a few months ~lmao

#

Ofc let's not go that drastic, just saying

#

cause some people train 1 or 2 models and give up, quite sad seeing it happen

kindred eagle
#

but like some turned out crazy good but some had them annoying artifacts

glacial pollen
#

Yea like, I know it can be exhausting to be going through various batch_sizes and pretrains, but when it works, it's worth it

#

lots of things contribute into artifacts to be honest

kindred eagle
#

it is worth it tbh

glacial pollen
#

yup

kindred eagle
glacial pollen
#

oh, well

#

for de-reverbing I can only really recommend vx's dereverb

#

tho, yea, it ain't free and is a vst (ai powered however)

kindred eagle
#

ahh i use the uvr dereverb and denoise it works 75% of the time but yeah that remaining reverb in the audio............aaaaaggghhhhh

glacial pollen
#

Yea, the models aren't the best at certain reverb types, esp those minimal room ones

#

If you are skilled enough, you can manually yeet them, or at least tame the trails / leftovers

#

goes like this

kindred eagle
#

yup

glacial pollen
#

And becomes ^

kindred eagle
#

and yeah is spectralayers like good or meh

glacial pollen
#

That's rx

knotty moth
glacial pollen
#

Spectra layers is decent but, if I had to choose the winner, it's rx

glacial pollen
#

Dialogue isolate can damage the audio so, better to be careful

#

It's far from what I'd call reliable

kindred eagle
glacial pollen
#

well not really, if you're handy in it, you can use that with no issues

#

but I just prefer rx

kindred eagle
glacial pollen
#

specifically, I really like working on spectrograms in rx

#

change the scale, zoom in, feather if needed, work repeat

kindred eagle
glacial pollen
#

Aaaa

knotty moth
glacial pollen
#

I've never had any luck with it if it comes to anime type reverb

#

at best, it'd always castrate the audio

#

decrease the fullness or screw up the respiratory range

kindred eagle
#

anime reverb? anime has reverb in it? since when or am i dumb to not notice it?

glacial pollen
#

A lot of them do yes

#

it is a room-reverb type

#

mostly deflections

kindred eagle
#

ahh i c the toji model i trained.......... hmm

knotty moth
kindred eagle
#

thats why it was kinda F'ed up but usable

glacial pollen
#

The issue with stereo vs my workflow, is the fact stereo has 2 channels and they are never uniform

#

I extract stereo, operate on 1 channel, then de-reverb it in mono

#

100% predictable
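
A bare-bones sketch of the "operate on 1 channel" step, assuming the audio is already decoded into interleaved PCM samples (L, R, L, R, ...). It keeps one channel as-is instead of averaging both, which is what the workflow above describes; the helper name `pick_channel` and the sample values are made up for the example:

```python
def pick_channel(interleaved, channel=0, num_channels=2):
    """Return a single channel from interleaved PCM samples."""
    if not 0 <= channel < num_channels:
        raise ValueError("channel index out of range")
    return interleaved[channel::num_channels]


stereo = [10, -10, 20, -20, 30, -30]  # L, R, L, R, L, R
left = pick_channel(stereo, channel=0)
right = pick_channel(stereo, channel=1)
print(left, right)
```

The resulting mono track is what then goes through the de-reverb step.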

kindred eagle
#

yup i started doing that recently

glacial pollen
#

Another thing is, vx lets you finetune the de-reverb to your needs

kindred eagle
#

better quality and prediction tbh

glacial pollen
#
plus you tame the rest in rx
#

and the results are perfect
much more perfect than any automation / models can give you

kindred eagle
#

ooooooo

#

imma use it

glacial pollen
#

Why is that? because it doesn't get 100% of it, it expects the user to handle a bit of it

knotty moth
#

the mono dereverb one is quite tougher for me, esp when there are some breaths

glacial pollen
#

Yea the breaths can get damaged sometimes, but you can always just layer the tracks

#

and manually de-reverb the breath

#

it's just some feathered selection yeeting / enveloping

#

But ye, I get that. Everyone has their own workflow they recommend.
That's why I recommend mine, which is vx's de-reverb pro mono + manual polishing in rx

kindred eagle
jaunty iris
#

sorry im really late to all this but if someone could point me how to use refinegan pretrains in applio im just stuck here

low shard
#

there’s no stable version for it yet

#

the only way is via using the main branch source code

jaunty iris
hallow thistle
jaunty iris
#

hmm ok

low shard
#

code > download zip > extract > run install

#

@jaunty iris what’s ur pc gpu tho

jaunty iris
#

oh ok so not the precompiled 3.2.8 got it

#

i just upgraded everything im on 4070 super 12gb 😎

low shard
#

its experimental

jaunty iris
#

understood

#

but yea been wanting to try training again and now i can do local 😁

#

interested in how this goes i was gonna test w KLM5

jaunty iris
low shard
jaunty iris
#

bro im stupid wtf 😭

#

im sorry bro

low shard
jaunty iris
#

ohhhh ok i see now

#

yea dude thats completely my bad i should get it now

#

thank u

low shard
kindred eagle
#

I’m currently using RVC for voice cloning, but I’m curious if there are any better apps out there that might do a better job. Sometimes, I feel like the slider values in RVC don’t work as well compared to the online RVC forks.

Please don’t judge me, but my GPU is a 1650 mobile. Any suggestions or experiences you can share would be greatly appreciated!

#

I used to use Kits.ai until they transitioned to a fully paid service. The inferences had minimal artifacting back then, but now the RVC that I run locally for inferencing has noticeably more artifacting compared to what Kits.ai produced.

simple ore
#

right now even the main branch has it disabled in the repo

low shard
low shard
knotty moth
low shard
#

^

simple ore
distant turtle
#

-colab

karmic oliveBOT
# distant turtle -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.