#Voice PE + Home Assistant Cloud voice = Wildly off pronunciation

balmy heart
#

Enjoying the Voice PE setup. I use Google AI as the conversation agent. I'm using Home Assistant Cloud for text-to-speech with British English, ThomasNeural, as the voice.

It sends back “witty” things like: “It’s -1 outside I mean brrrrr (burr)” with burr being the sound for shivering. But, the Home Assistant voice says it as “BEEEEEEEE R” Like a honey bee and then think pirate-ish with R.

I can't find a place to drop in anything to help Home Assistant Cloud with proper pronunciation, like <phoneme alphabet="ipa" ph="/bɝ/">brr</phoneme> or similar. Also, I can't find a place to contribute to this side of the house vs. contributing intents.

Help appreciated to get ole Jarvis proper.

Cross posting from the Community as I just got on the Discord chat and it seems much more active. (https://community.home-assistant.io/t/ha-cloud-voice-brr-shivering-sound-becomes-bee-rrrr/843648)

burnt venture
#

Maybe tell the LLM that the TTS model doesn’t pronounce things phonetically and give it some examples. It could output “burrrr” or something like that instead, and the TTS model may do better.

balmy heart
#

Sure, so I do pass an instruction to the Conversation Agent which includes something for this, but it doesn't work: "Pronunciation of br, brr, or brrr is /bɝ/ with a homophone burr." How can I write that better? Add burr, burrr, burrrr? Any other language?

burnt venture
#

The LLM doesn’t need to know how to pronounce it. It needs to know how to represent it to your TTS model. Try to play with TTS Say or something to see what works. Try out “burrrr” or something. If that works, use that as an example and explain that the TTS model may have issues pronouncing words that represent sounds without vowels, like “brrr” which is pronounced as “burrr”.

balmy heart
#

Copy. I tested brrr vs. burrr on HA Cloud's voice, confirmed it sounds decent with burrr, and wrote an instruction for vowel-less words. Time will tell!

balmy heart
#

Well, unfortunately this didn't work, and the instruction is being ignored. Today we have heard a few "bee R" (brr) and "asterisk asterisk asterisk" being read out when my kiddo asks stuff. Jarvis would be so cool if my HA Cloud voice (British Thomas) would not be awkward. Google AI is following the "be like Jarvis from the Iron Man movies" instruction pretty well lol

burnt venture
#

Yea. LLMs seem to not always listen to the instruction. I have a time sensor for the next bus at a nearby stop and it was giving me the time but in GMT. So if the bus was 5 minutes away it would say 8 hours and 5 minutes. I figured that it would be no problem to tell it to always give me times in local, Pacific time. Nope! It says something like “Ok. I’ll use Pacific time” then proceeds to not do that…

#

I even had the LLM help engineer my prompt, but nah.
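
For the bus case, one option is to do the time math in code and hand the LLM a finished string, so it never has to convert time zones itself. A minimal sketch, assuming the sensor gives a timezone-aware UTC timestamp; `describe_arrival` and the output wording are made up for illustration and are not anything Home Assistant provides:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: "Pacific time" in the thread means the America/Los_Angeles zone.
PACIFIC = ZoneInfo("America/Los_Angeles")


def describe_arrival(arrival_utc: datetime, now_utc: datetime) -> str:
    """Return a ready-to-speak summary so the LLM does no time zone math."""
    if arrival_utc.tzinfo is None or now_utc.tzinfo is None:
        raise ValueError("both datetimes must be timezone-aware")
    # The difference of two aware datetimes does not depend on the zone.
    minutes = round((arrival_utc - now_utc).total_seconds() / 60)
    # Convert only for display, after the duration has been computed.
    local = arrival_utc.astimezone(PACIFIC)
    return f"{minutes} min (arrives {local:%H:%M} Pacific)"


now = datetime(2025, 2, 10, 16, 0, tzinfo=timezone.utc)
arrival = datetime(2025, 2, 10, 16, 5, tzinfo=timezone.utc)
print(describe_arrival(arrival, now))
```

Because the duration is computed from aware datetimes, an 8-hour offset error can't sneak in, whatever the model decides to do with the clock time.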

balmy heart
#

Yeah if I just use Gemini on the phone all is good. Somewhere in the world of HA Assistant it gets jacked up and comes out wonky.

burnt venture
#

I’ve heard LLMs perform better when you tell them to think about their responses. Maybe tell it that it should re read its response and make sure any onomatopoeia are spelled phonetically before replying.

Might never get it right though. Hah
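
If the prompt keeps getting ignored, another angle is to clean the text in code before it reaches the TTS voice instead of relying on the LLM. A rough sketch, assuming you have some way to intercept the response text before it is spoken (the thread doesn't confirm that Home Assistant exposes one); `clean_for_tts` is a hypothetical helper, and only the brrr to burrr respelling comes from the testing described above:

```python
import re

# Matches whole words like "brr", "Brrr", "brrrrr": a "b" followed only by r's.
_BRR = re.compile(r"\b([Bb])([Rr]{2,})\b")


def clean_for_tts(text: str) -> str:
    """Strip markdown asterisks and respell vowel-less 'brr' sounds."""
    # Remove emphasis markers so "***" is not read as "asterisk asterisk asterisk".
    text = re.sub(r"\*+", "", text)
    # Insert a "u" so the voice reads a word rather than spelling out letters.
    text = _BRR.sub(lambda m: f"{m.group(1)}u{m.group(2).lower()}", text)
    # Collapse any double spaces left behind by the removals.
    return re.sub(r"\s{2,}", " ", text).strip()


print(clean_for_tts("It's -1 outside, I mean ***brrrrr***!"))
```

Words that merely start with "br", like "bridge", are left alone because the pattern only matches a whole word made of "b" plus two or more r's. Other vowel-less sounds would need their own entries.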