#how do i make text to speech

vernal schooner
#

I need it in C#, C++, Python, C- and I also need to clone a unique voice

hazy wren
#

Do you need it in unity though? Is it related to unity in any way?

hazy wren
#

Well, anyway, you'll need to use some kind of cloud service, like Google's or OpenAI's TTS models. You might be able to run something locally too, but it's probably gonna be resource-demanding and low quality. As for the language, if it's a cloud service, you can make the requests in C#. Local solutions might require using other languages as well.

barren yew
#

If you do find a library to use locally, you want one that's C# or uses C/C++ so you can still interface with it (if it's C++ you may need to write your own helper functions that are C-compatible)
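For what it's worth, here is a rough sketch of what those C-compatible helper functions could look like. `TtsEngine` and the `tts_*` names are made up for illustration and are not from any real TTS library; the "engine" just returns silence so the wrapper pattern is the only thing on show.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a C++ TTS class. A real engine would generate
// speech; this one returns 0.1 s of silence per character of input.
class TtsEngine {
public:
    explicit TtsEngine(int sampleRate) : sampleRate_(sampleRate) {}

    std::vector<float> Synthesize(const std::string& text) const {
        const std::size_t samplesPerChar = static_cast<std::size_t>(sampleRate_) / 10;
        return std::vector<float>(text.size() * samplesPerChar, 0.0f);
    }

private:
    int sampleRate_;
};

// C-compatible wrapper: plain functions, an opaque handle, no C++ types in
// the signatures, and no exceptions allowed to cross the boundary.
extern "C" {

void* tts_create(int sample_rate) {
    try {
        return new TtsEngine(sample_rate);
    } catch (...) {
        return nullptr;
    }
}

void tts_destroy(void* handle) {
    delete static_cast<TtsEngine*>(handle);
}

// Copies at most `capacity` samples into `out` (if non-null) and returns the
// number of samples the full result needs, so the caller can size a buffer.
std::size_t tts_synthesize(void* handle, const char* text, float* out, std::size_t capacity) {
    if (handle == nullptr || text == nullptr) return 0;
    try {
        const std::vector<float> samples =
            static_cast<const TtsEngine*>(handle)->Synthesize(text);
        if (out != nullptr) {
            std::copy_n(samples.begin(), std::min(samples.size(), capacity), out);
        }
        return samples.size();
    } catch (...) {
        return 0;
    }
}

}  // extern "C"
```

Built as a shared library, functions shaped like this can be declared on the C# side with `[DllImport]`. On Windows the exports would also need `__declspec(dllexport)`, which is left out here to keep the sketch short.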

vernal schooner
#

Not robotic voice

vernal schooner
#

with 200 GB of high-quality data

#

how am I supposed to get proper voice data?

topaz bronze
#

This is getting pretty far outside the Unity world, but you can look up XTTS on Hugging Face

vernal schooner
#

and not by training it

native osprey
#

Yes, using existing text-to-speech text banks that you've decided aren't good enough for you

vernal schooner
#

and they don't use a proper language

#

like C++ or C#

native osprey
#

It usually involves training a large AI model on a bunch of clips

#

which you've also refused

#

So, you need to either do that, or lower your standards

vernal schooner
#

nobody uses it

#

except for Microsoft

native osprey
#

Not really. Text to speech either uses recorded voice banks or AI

#

Both of which you've refused

native osprey
#

Lots of recorded sounds, either entire words or phonemes, spliced together to make the sentence you've typed in

#

A.k.a. that thing you said you didn't want to do

#

Because you want to clone a voice

hazy wren
#

You don't have to train a model from scratch. You can fine-tune one on a smaller set of data. That should be more achievable with limited hardware and will probably give better results than existing voice cloning capabilities (although I've seen some voice cloning/sampling examples that sound quite impressive).

As for the programming languages, I don't see how that's a problem. You can set up interop with virtually any language. At least if you're targeting PC/Mac platforms.

oblique vine
#

I wonder if it's even possible in the first place to clone your voice exactly without training an AI.
Prior to a lot of the recent AI TTS examples, the voices often sounded robotic and clearly synthetic.

topaz bronze
#

Without AI, you can only reproduce the phonemes you have recorded. At best you get the Star Trek computer voice. Or Siri, I guess.
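To make that limitation concrete, here is a minimal sketch of the splicing approach described earlier in the thread, assuming a voice bank that is just a map from unit name to recorded samples. `Clip`, `SpliceUnits`, and the unit names are hypothetical, and a real system would also smooth the joins and handle pitch and timing, which this does not.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// One recorded unit (a phoneme or a whole word) as raw audio samples.
using Clip = std::vector<float>;

// Builds an utterance by appending the recorded clip for each requested unit.
// Throws if a unit was never recorded: the bank cannot produce sounds it
// does not contain.
Clip SpliceUnits(const std::map<std::string, Clip>& bank,
                 const std::vector<std::string>& units) {
    Clip out;
    for (const std::string& unit : units) {
        const auto it = bank.find(unit);
        if (it == bank.end()) {
            throw std::out_of_range("no recording for unit: " + unit);
        }
        out.insert(out.end(), it->second.begin(), it->second.end());
    }
    return out;
}
```

Plain concatenation like this is a big part of why spliced voices sound choppy, and nothing in it can imitate a speaker who isn't already in the bank.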

vapid skiff
#

What is your big-picture goal?