Slow Response when Voice assistant is configured for OpenAI as backend | Home Assistant

exotic stump May 1, 2025, 7:56 PM

#

i configured my raspi4 (4GB + SSD) HA Supervisor version. Installed OpenAI as STT, Conversation and TTS engine....so a fully OpenAI-based voice pipeline. for conversations i am using HA voice assist PE. Everything works quite fine....the activation word works quite reliably, i can ask whatever question i have in mind....i also got it performing actions like telling me weather information that my weather station provides....BUT

All that takes lots of time....when asking about the temperature it takes at least 7s until i get an answer through the HA voice assist PE...

Since i am quite new to HA i wonder how to perform an end to end profiling to find out what part of the overall pipeline takes so much time?

What i already found out is that when chatting through the openai api with the HA chat box i get a nearly immediate response from the conversations api....but as soon as STT or TTS is involved the overall turnaround times seem to explode.

Is this specific to openai apis? or is there something additional i can tune on my side?

short ginkgo May 1, 2025, 8:08 PM

#

exotic stump i configured my raspi4 (4GB + SSD) HA Supervisor version. Installed OpenAI as ST...

settings - voice assistants
click the 3 dots next to your pipeline and select debug from the dropdown.
this allows you to see the calls to the pipeline and how long each step takes
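
For anyone who wants numbers rather than eyeballing the debug view, the per-step timings can also be worked out from the start and end events of a run. A minimal sketch in Python, assuming the events have been copied out as (type, timestamp) pairs named like `stt-start` / `stt-end`. The sample list and its timestamps are invented for illustration, and `stage_durations` / `slowest_stage` are hypothetical helpers, not part of Home Assistant.

```python
from datetime import datetime

# Hypothetical events as (event_type, ISO timestamp) pairs.
# The timestamps are made up for illustration, not measured.
SAMPLE_EVENTS = [
    ("run-start", "2025-05-01T20:00:00.000+00:00"),
    ("stt-start", "2025-05-01T20:00:00.050+00:00"),
    ("stt-end", "2025-05-01T20:00:03.250+00:00"),
    ("intent-start", "2025-05-01T20:00:03.260+00:00"),
    ("intent-end", "2025-05-01T20:00:05.760+00:00"),
    ("tts-start", "2025-05-01T20:00:05.770+00:00"),
    ("tts-end", "2025-05-01T20:00:07.270+00:00"),
    ("run-end", "2025-05-01T20:00:07.280+00:00"),
]


def stage_durations(events):
    """Return {stage: seconds} for every stage that has a start and an end."""
    starts = {}
    durations = {}
    for event_type, timestamp in events:
        # "stt-start" -> stage "stt", phase "start"
        stage, _, phase = event_type.rpartition("-")
        moment = datetime.fromisoformat(timestamp)
        if phase == "start":
            starts[stage] = moment
        elif phase == "end" and stage in starts:
            durations[stage] = (moment - starts[stage]).total_seconds()
    return durations


def slowest_stage(durations):
    """Name the slowest stage, ignoring the overall 'run' total."""
    stages = {name: secs for name, secs in durations.items() if name != "run"}
    return max(stages, key=stages.get)


if __name__ == "__main__":
    result = stage_durations(SAMPLE_EVENTS)
    for name, seconds in result.items():
        print(f"{name}: {seconds:.2f}s")
    print("slowest:", slowest_stage(result))
```

Comparing the `stt`, `intent` and `tts` entries against the `run` total shows which step to look at first.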

#

example

#

exotic stump May 1, 2025, 9:09 PM

#

i see, seems STT and NLProcessing are the key factors ...

#

i thought that hosted components are more cost and performance effective than local processing....

#

i guess faster_whisper_2 is not suitable for raspberry pi hardware?...at least if i want to have runtime numbers like you have ?

i guess you have some faster dedicated hardware for that?

short ginkgo May 1, 2025, 9:25 PM

#

exotic stump i guess faster_whisper_2 is not suitable for raspberry pi hardware?...at least i...

my STT and LLM are offloaded to another server in my rack running on a 5060ti 🙂

#

realistically about 7 or 8 seconds is not that bad all in

exotic stump May 1, 2025, 9:45 PM

#

short ginkgo my STT and LLM are offloaded to another server in my rack running on a 5060ti 🙂

😅 ok...i just wanted to replace my amazon echo setup...which is quite responsive....maybe due to simpler NL processing and faster stt...thought i could get the same locally...but...somehow i am facing a reality check that this is not possible without significant investment into hw and electricity cost 😅

short ginkgo May 1, 2025, 9:47 PM

#

stuff will get better with time. but unlike amazon, in this ecosystem people dont have billions to throw at it

jagged oracle May 1, 2025, 9:47 PM

#

I tested OpenAI STT and it's considerably slower than Nabu Casa. I'd recommend Nabu, even for the reply TTS.

exotic stump May 1, 2025, 9:56 PM

#

well, i am just an HA starter...need to explore the overall architecture and design idea behind it...quite time consuming...to set stuff up...it is not a 10min setup where everything just works....e.g. my final target is to control everything with the smarter LLM as NL processor....but i cant get everything to run.....

e.g. telling HA to play some specific track from spotify or just a simple rip radio stream on a specific chromecast (spotify connect capable device)...which is attached to a speaker.....

HA tells me that the player seems offline...but if i directly use ha media playback through the web Interface i can play music with no problems.....chromecast device is exposed to the voice assistant...dont know what the problem is ... i find it quite difficult to find solutions for such problems...

short ginkgo May 1, 2025, 9:57 PM

#

exotic stump well, i am just an HA starter...need to explore the overall architecture and des...

you have music assistant installed?

exotic stump May 1, 2025, 9:57 PM

#

i did in a former ha installation

short ginkgo May 1, 2025, 9:59 PM

#

music assistant is the best way to do any music stuff. add your spotify as a source and add your players and it magics stuff

#

some info on voice control methods including llm stuff can be found here

exotic stump May 1, 2025, 10:01 PM

#

ok, thx a lot will try things out throughout the weekend then..will report 👍

#

will be an interesting journey. i have devices from lots of different manufacturers at home...and want to bring them all together....like philips hue, wiz, chromecast, bravia tv, shelly sensors, ... 😅

short ginkgo May 1, 2025, 10:05 PM

#

take things 1 step at a time and enjoy each small win. dont burn out trying to do too much at once

exotic stump May 1, 2025, 10:05 PM

#

hopefully ha is stable and capable of doing the job well....had a docker installation of ha before... which had limited capabilities and so switched to ha supervisor installation

short ginkgo May 1, 2025, 10:05 PM

#

exotic stump hopefully ha is stable and capable of doing the job well....had a docker install...

oh no...

#

not supervised

#

dont do that

#

run HAOS

exotic stump May 1, 2025, 10:06 PM

#

i do now

#

didnt want to reserve my raspberry for just one purpose...so chose docker...not realizing the drawbacks 😳

short ginkgo May 1, 2025, 10:08 PM

#

yeah HAOS is the best way to run HA

exotic stump May 1, 2025, 10:09 PM

#

yes....ok thx again..lets see how things are progressing...first step will be to speed up the stt step...my kids will kill me if that takes too long 😅

#

i also have some other problem with the llm stuff....if i tell it to gather some infos from the internet it sometimes takes much longer of course....but finally gives no answer....played around with the response token parameter for the NL processing part ...but in most tries...it performs some action...and finally no answer in spoken text...a little disappointing

short ginkgo May 1, 2025, 10:13 PM

#

depends how things are set up, not a lot of llm's have access to live internet data

#

MCP may help this in the future though to be fair but not quite out the box yet

exotic stump May 1, 2025, 10:14 PM

#

yes i am using a chatgpt integration that offers search...works some times....maybe a timeout issue or token size limit problem...

#

thats the reason i asked for how to debug the overall pipeline

short ginkgo May 1, 2025, 10:15 PM

#

yeah, the TTS might reject the result if its too big. you might have to limit its response in your prompt
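
Besides asking for short answers in the prompt, the reply text could also be trimmed before it reaches TTS. A rough sketch in Python, assuming size really is the problem (not confirmed in this thread). `shorten_for_speech` is a hypothetical helper and the 300-character default is an arbitrary illustrative value, not a documented limit of any TTS engine.

```python
import re


def shorten_for_speech(text, max_chars=300):
    """Keep whole sentences from the start of text until max_chars is reached.

    max_chars is an arbitrary illustrative value, not a known TTS limit.
    """
    # Collapse runs of whitespace so lengths are predictable.
    text = " ".join(text.split())
    if len(text) <= max_chars:
        return text

    # Split after sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    kept = []
    length = 0
    for sentence in sentences:
        extra = len(sentence) + (1 if kept else 0)  # +1 for the joining space
        if length + extra > max_chars:
            break
        kept.append(sentence)
        length += extra

    if kept:
        return " ".join(kept)
    # The first sentence alone is too long: cut at the last word boundary.
    return text[:max_chars].rsplit(" ", 1)[0]


if __name__ == "__main__":
    long_reply = "It is 21 degrees outside. " * 30
    print(shorten_for_speech(long_reply, max_chars=80))
```

Cutting on sentence boundaries keeps the spoken answer from ending mid-word, at the price of sometimes dropping the tail of a long answer.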

#

pipeline debug will help narrow that down though as you say

#

good luck with getting it improved anyhow 🙂

exotic stump May 1, 2025, 10:20 PM

#

this is an error.log i filtered out

#

when the ha voice device did not play back

short ginkgo May 1, 2025, 10:22 PM

#

esp_fail is odd. but if it works with shorter stuff then i guess its just unhappy with stuff thats too big

exotic stump May 1, 2025, 10:23 PM

#

yes...maybe i should play back on the chromecast device...dont know....

#

dont know how sound is pushed to the esp device...which data formats are supported etc....

#

since i was able to even stream music to the ha device with music assistant...it should be capable of playing back tracks as streams in general...

short ginkgo May 1, 2025, 10:25 PM

#

the TTS generates a file locally and hosts it. it then sends a url for it to the satellite via the api connection, which then makes a new connection to stream it
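
The flow described above (write the file, host it, hand over only the url, let the device fetch it on its own connection) can be imitated on one machine. This is a toy sketch in Python using only the standard library, not Home Assistant's or the satellite's actual code. The file name `reply.wav`, the fake audio bytes and the helpers `serve_directory`, `fetch` and `demo` are all made up for illustration.

```python
import http.server
import tempfile
import threading
import urllib.request
from functools import partial
from pathlib import Path


class QuietHandler(http.server.SimpleHTTPRequestHandler):
    """Static file handler that does not log every request to stderr."""

    def log_message(self, format, *args):
        pass


def serve_directory(directory):
    """Host `directory` on a free loopback port; return (server, base_url)."""
    handler = partial(QuietHandler, directory=str(directory))
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    return server, f"http://{host}:{port}"


def fetch(url):
    """Stand-in for the playback device: open a new connection, read the audio."""
    # An empty ProxyHandler keeps any system proxy out of the loopback request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.read()


def demo():
    """Run the whole handoff once; True if the fetched bytes match the file."""
    with tempfile.TemporaryDirectory() as tmp:
        # 1. The "TTS engine" writes an audio file locally (fake bytes here).
        audio = b"RIFF-fake-audio-bytes"
        (Path(tmp) / "reply.wav").write_bytes(audio)
        # 2. The file is hosted over HTTP.
        server, base_url = serve_directory(tmp)
        try:
            # 3. Only the url is handed to the device ...
            url = f"{base_url}/reply.wav"
            # 4. ... which makes its own connection to stream the file.
            return fetch(url) == audio
        finally:
            server.shutdown()
            server.server_close()


if __name__ == "__main__":
    print("device received the same bytes:", demo())
```

The point of the pattern is that the audio never travels over the control connection, so a playback failure can come from the second connection (reachability, format, size) even when the control connection is healthy.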

#

check the firmware version of your voice-pe?

exotic stump May 1, 2025, 10:26 PM

#

just updated it today to the newest available

short ginkgo May 1, 2025, 10:26 PM

#

something like that

exotic stump May 1, 2025, 10:28 PM

#

i am ahead of you 😅

short ginkgo May 1, 2025, 10:28 PM

#

you're on beta firmware

#

i am on stable 😛

exotic stump May 1, 2025, 10:29 PM

#

could be that i toggled the related switch at some point ... yes 😅 ...i love risky steps 😅

#

but good point

short ginkgo May 1, 2025, 10:29 PM

#

i keep pretty cutting edge but not bleeding edge in production

exotic stump May 1, 2025, 10:29 PM

#

i will switch back to stable release
