# Is RVC the best model type for voice models?


magic umbra
#

so i know there are other model types, but idk which one is generally considered the best (just for general talking). i'm pretty sure it's RVC, but it doesn't hurt to ask

daring agate
#

well, for speech, most TTS systems are already far superior to rvc in terms of quality and realism
for realtime inference? there's only rvc and ddsp-svc, but the latter is harder to use and the developer only provides help if you speak chinese
(rvc has actually been abandoned since 2023, but it still has an active community to this day; the original devs of the project moved on)

magic umbra
daring agate
#

rvc's quality was intentionally downgraded by the original main dev of the project; he did this so people could train models with weak gpus

#

ah, and about the model sounding slightly different in realtime: that's normal, the realtime inference is weird

#

but it shouldn't be a massive difference

#

if you're getting such a huge difference, there are three options:

  1. model sucks
  2. your volume settings are bad
  3. low extra chunk value (below 2.7)
#

99% of the time it's option 1

daring agate
#

originally it was made for funny ai covers

#

the realtime thing is a hack someone else made (basically converting the local inference to realtime)

magic umbra
#

so it's just up to me to find a good model then =w=

#

questions regarding mental sanity will not be answered today

alpine plank
# magic umbra

explain what exactly is wrong with the thousand models you've tried

#

maybe with some sample output audio

magic umbra
alpine plank
#

can't handle laughing, for example?

#

maybe you should lower your expectations and acknowledge RVC's limitations