#Text-to-Speech API for Piper


teal epoch
#

Is there a text-to-speech API for wyoming-piper similar to that of rhasspy's hermes?

#

Since I am running HA Supervised (docker addons), I am not sure of the setup for this.
Does this get installed on the host and run a separate instance of Piper? Or do I try to install it in the existing addon_core_piper container?

#

I built a hacky system with netcat:

Inside the addon_core_piper docker container, run a script for a persistent netcat listener that executes piper.
while true; do netcat -lvp PORT -c '/usr/share/piper/piper --model /data/en_GB-alan-medium.onnx --output-file -'; done

In Home Assistant, I use the "shell_command" integration to run a script. The script takes the text as its first argument and runs echo "$1" | nc ADDON_IP PORT > FILE

I have some additional script checks for input validation and stuff. Also, instead of creating a file, I can pipe the output to another netcat to a remote speaker.
I don't need HA satellites or media players that require anything heavy. My use case is speakers driven by RPi Kodi boxes.

bold dock
teal epoch
#

I was curious about that service listening on 10200?
python3 -m wyoming_piper --piper /usr/share/piper/piper --uri tcp://0.0.0.0:10200

#

I think I see how it works.
echo '{"type": "synthesize", "data": {"text": "Hello World"} }' | nc ADDON_IP 10200
This does work. I just need to figure out how to pipe the output and close the connection.

#

I think I need the netcat to recognize the audio-stop response payload. I'm guessing that is how the other components do it
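For anyone following along, here is a rough Python sketch of a client that stops on audio-stop instead of relying on a timeout. It assumes the Wyoming framing is one JSON header line per event, optionally followed by `data_length` bytes of extra JSON and `payload_length` bytes of raw audio; check that against your wyoming version. `read_event` and `synthesize` are made-up helper names, not part of any library.

```python
import json
import socket


def read_event(stream):
    """Read one event from a binary file-like object.

    Returns (header_dict, payload_bytes), or None at end of stream.
    Assumed framing: JSON header line, then optional extra JSON
    (data_length bytes), then optional binary payload (payload_length bytes).
    """
    line = stream.readline()
    if not line:
        return None
    header = json.loads(line)
    data_length = header.get("data_length") or 0
    if data_length:
        extra = json.loads(stream.read(data_length))
        header["data"] = {**(header.get("data") or {}), **extra}
    payload_length = header.get("payload_length") or 0
    payload = stream.read(payload_length) if payload_length else b""
    return header, payload


def synthesize(host, port, text):
    """Send a synthesize event and collect raw PCM until audio-stop."""
    request = json.dumps({"type": "synthesize", "data": {"text": text}}) + "\n"
    pcm = bytearray()
    audio_format = {}
    with socket.create_connection((host, port), timeout=30) as sock:
        sock.sendall(request.encode("utf-8"))
        with sock.makefile("rb") as stream:
            while True:
                event = read_event(stream)
                if event is None:
                    break  # server closed the connection
                header, payload = event
                if header["type"] == "audio-start":
                    audio_format = header.get("data") or {}
                elif header["type"] == "audio-chunk":
                    pcm.extend(payload)
                elif header["type"] == "audio-stop":
                    break  # done, close the connection ourselves
    return audio_format, bytes(pcm)
```

Something like `fmt, pcm = synthesize("ADDON_IP", 10200, "Hello World")` should then return the format info and the bare PCM, with the connection closed as soon as audio-stop arrives.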

#

nc -w1 (a one-second idle timeout) seems to work

#

Now to figure out how to remove the JSON data and keep only the audio as a usable WAV file

bold dock
#

Sox can convert to WAV. Wyoming doesn't support changing the audio format. That's a client's responsibility.

teal epoch
bold dock
#

It's pretty easy to turn the raw PCM into WAV too. You just need to know the number of samples and the sample rate/width/channels to add the header.
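As a minimal sketch of that header step, Python's standard `wave` module can write the header for you once you have the bare PCM. The function name `pcm_to_wav` is just for illustration, and the 22050 Hz / 16-bit / mono values in the usage note are an assumption; use whatever rate, width, and channels the server actually reports.

```python
import io
import wave


def pcm_to_wav(pcm, rate, width, channels):
    """Wrap raw PCM bytes in a WAV container and return the WAV bytes.

    width is the sample width in bytes (2 for 16-bit audio).
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(width)
        wav_file.setframerate(rate)
        # The frame count in the header is derived from the data length.
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

For example, `open("out.wav", "wb").write(pcm_to_wav(pcm, 22050, 2, 1))` would write a playable file, assuming those format values match the stream.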

teal epoch
#

The PCM data has the JSON messages mixed in too, so I would need to strip those out first and then turn the remaining PCM into WAV.
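If the output was already dumped to a file with nc, the JSON can be separated from the audio by walking the bytes and trusting the lengths in each header rather than searching for braces, since PCM can contain any byte value, newlines included. This is only a sketch under the assumption that each event is a JSON header line, then optional `data_length` bytes of JSON, then `payload_length` bytes of audio; `extract_pcm` is a hypothetical helper name.

```python
import json


def extract_pcm(raw):
    """Split a captured byte dump into (audio_format, pcm_bytes).

    Only audio-chunk payloads are kept; all JSON is discarded.
    """
    pos = 0
    pcm = bytearray()
    audio_format = {}
    while pos < len(raw):
        newline = raw.find(b"\n", pos)
        if newline == -1:
            break  # truncated header line, stop here
        header = json.loads(raw[pos:newline])
        pos = newline + 1
        data = dict(header.get("data") or {})
        data_length = header.get("data_length") or 0
        if data_length:
            data.update(json.loads(raw[pos:pos + data_length]))
            pos += data_length
        payload_length = header.get("payload_length") or 0
        payload = raw[pos:pos + payload_length]
        pos += payload_length
        if header.get("type") == "audio-chunk":
            pcm.extend(payload)
            # Remember rate/width/channels for the WAV header later.
            for key in ("rate", "width", "channels"):
                if key in data:
                    audio_format[key] = data[key]
        elif header.get("type") == "audio-stop":
            break
    return audio_format, bytes(pcm)
```

Usage would be along the lines of `fmt, pcm = extract_pcm(open("FILE", "rb").read())`, after which the PCM can be given a WAV header or handed to SoX as raw audio.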