#Best local open source Text-To-Speech and Speech-To-Text?


cold bolt

What are the best local open-source Text-To-Speech and Speech-To-Text solutions available?

hollow tusk
# cold bolt At the moment, inference, but I would like it for training in the near future as...

Your Nvidia GPU is good enough to run inference (use models) locally (on your PC). It's not the best for training (making models), though that's still possible.

You can use:

  • Local (runs on your PC, so the speed depends on your hardware; you'll have to set it up using the guides):
    • Applio: a fork of RVC with some extra features like Applio TTS; somewhat faster and simpler, but the same quality
    • Mainline: the original RVC
  • Cloud (a good remote machine; easier and faster than your PC, but limited):
    • Ilaria RVC Zero: the fastest and simplest option you can get for free
    • Weights.gg: partnered with AI Hub; easy to use, but you may be put in a queue
    • Applio Colab: up to 4 hours of GPU time daily, not guaranteed

Easiest overall (automatically separates vocals & instrumentals): Weights.gg
Easiest cloud: Ilaria RVC Zero
Easiest local: Applio


RVC is STS (Speech To Speech).


There are different Text To Speech (TTS) AIs:

GPT-SoVITS: RVC isn't as good as GPT-SoVITS for TTS, but GPT-SoVITS (few-shot TTS, meaning models need only a little training) can't use RVC models (and vice versa), and it's limited to English, Chinese & Japanese. If you want to check out GPT-SoVITS instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/

ElevenLabs (freemium): an easy way to do TTS is https://elevenlabs.io/. You can't use RVC models with it, and it's mostly paid, but it's an easy way to get good-quality TTS.

FishSpeech: FishSpeech is a zero-shot (no explicit training needed) TTS. If you have a good PC you can run it locally; otherwise use their site.

With RVC Models:

RVC is natively for Speech To Speech, but forks such as Ilaria RVC Mainline & Applio have built-in TTS. They use Microsoft Edge TTS to generate the TTS audio and then convert it with RVC. I suggest choosing a TTS voice that has the same gender and language as the RVC model you want to use.
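If you'd rather do the Edge TTS step by hand, here is a rough sketch of the "pick a voice with the same gender and language, then generate audio" part. It assumes the third-party `edge-tts` Python package (`pip install edge-tts`). The names `SAMPLE_VOICES`, `pick_voice` and `synthesize` are made up for illustration, and the sample list is only a tiny hand-written subset of voices. The RVC conversion itself is not shown; you would load the saved file into Applio or another RVC UI afterwards.

```python
# Hand-written sample of voice metadata (illustrative subset, not the full list).
# The field names mirror what edge-tts reports for each voice.
SAMPLE_VOICES = [
    {"ShortName": "en-US-GuyNeural", "Gender": "Male", "Locale": "en-US"},
    {"ShortName": "en-US-JennyNeural", "Gender": "Female", "Locale": "en-US"},
    {"ShortName": "ja-JP-NanamiNeural", "Gender": "Female", "Locale": "ja-JP"},
]


def pick_voice(voices, gender, language):
    """Return the ShortName of the first voice matching gender and language.

    `language` is matched as a prefix of the locale ("en" matches "en-US").
    Returns None when nothing matches.
    """
    for voice in voices:
        same_gender = voice["Gender"].lower() == gender.lower()
        same_language = voice["Locale"].lower().startswith(language.lower())
        if same_gender and same_language:
            return voice["ShortName"]
    return None


async def synthesize(text, voice, out_path):
    """Generate speech with Edge TTS and save it to out_path (needs internet)."""
    import edge_tts  # third-party package, imported here so pick_voice works without it

    await edge_tts.Communicate(text, voice).save(out_path)
    return out_path
```

To try it, something like `asyncio.run(synthesize("Hello there", pick_voice(SAMPLE_VOICES, "female", "en"), "tts.mp3"))` should write an audio file that you can then use as the input for RVC.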

If you want to do TTS locally with RVC voice models (if you have a good PC):

  • You can get Applio in our docs
  • Ilaria RVC Mainline is available here (no guide as of right now)

If you don't have a good PC, you can do TTS with RVC voice models in the cloud:

  • Ilaria RVC Zero (running on an A100 GPU, the fastest free RVC in the cloud) and the guide

  • Use Applio UI Colab (Google Colab's free T4 GPU, with a daily limit)

  • If you don't want to use Edge TTS, you could try another TTS AI from our TTS index and use its output as the input to RVC