#Conversation Response Process Times Out Waiting for Piper


grand fable

Good day all. Hopefully I can explain this well enough to get some insight and help.

I have an automation, triggered by a sentence, that sends some information off to an LLM via the 'Conversation Process' action. I then use the 'Conversation Response' action in the automation to play back what the LLM returned.

My TTS is Piper, running on a separate local server with a bit more beef to it than my Home Assistant server.

This all works fine when the response is short, which I have found means fewer than 120 words. However, if the response from the LLM is a bit longer, Piper has to work a little harder and therefore takes a little longer.

What I am finding is that the conversation process or keep-alive (I'm not sure what to call it, tbh) seems to end before Piper can finish processing the response from the LLM, which is often somewhere between 800 and 1200 words.

The end result is that nothing is played back on the media_player through the Conversation Response action.

I've tested this with simple asks of the LLM, such as "Tell me a story in 200 words". This helped determine how many words I could successfully get to work, which seems to be 120 or less.

I am using the ReSpeaker Lite as the media player, and it also initiates the automation via the wake word and then the sentence. I have no issue with this automation or the Assist pipeline if the response from the LLM is small enough, only when the response is a bit more verbose.

So my question is: what causes the Conversation Process to time out, can this value be adjusted, or is this just a limitation at the moment? Any workarounds?

grand fable

A bit more information on this topic as I continue to test different things. It appears the same thing happens when just plainly using TTS (Piper), whether it's on the beefier remote server or hosted locally on the HA server as an add-on: if Piper does not respond to the TTS action within 5 seconds, nothing gets played. Since the TTS is cached on the HA server, running the action again right afterwards works perfectly.

So I guess the question is: is there a setting inside HA to make the TTS action wait longer?


Some example YAML of the TTS action. (Stories courtesy of Llama3.3)


This plays without issue. It's 100 words, and the text-to-speech conversion completes in about 4 seconds.

action: tts.speak
data:
  cache: false
  media_player_entity_id: media_player.respeaker_satellite_media_player
  message: >-
    "Sir, I've accessed a rather...amusing scenario for your consideration.
    Imagine Winter, personified as a mischievous force, smashing pumpkins off a
    train bridge after Halloween. The gourds, once jack-o'-lanterns, now lay
    shattered on the frosty ground below. As the last train of the evening
    rumbles by, Winter's chill breath extinguishes the final flickers of
    candlelight, plunging the pumpkin remnants into darkness. The sound of
    crunching vines and splintering rind fills the air, a seasonal farewell from
    Winter, as the bridge stands sentinel, a frosty guardian of the autumnal
    aftermath."
  options:
    voice: jarvis-high
target:
  entity_id: tts.piper_docker

This won't play on the first run. It's 150 words, and the text-to-speech conversion completes in about 7 seconds.

action: tts.speak
data:
  cache: false
  media_player_entity_id: media_player.respeaker_satellite_media_player
  message: >-
    As the snowflakes gently fell, the neighborhood transformed into a winter
    wonderland. The smell of hot chocolate and cookies wafted through the air,
    mingling with the sound of laughter and carols. The Christmas lights, a
    kaleidoscope of colors, twinkled like stars on the houses and trees. A
    frozen pond sparkled like a diamond, reflecting the vibrant hues of the
    lights. Children bundled up in scarves and mittens, their eyes wide with
    wonder, gazed at the magical display. The soft glow of the lights seemed to
    bring the community together, filling hearts with joy and warmth. As the
    night fell, the winter Christmas lights shone brighter, a symbol of hope,
    love, and the magic of the season. The scene was a perfect blend of winter's
    chill and Christmas' warmth, creating an unforgettable holiday atmosphere.
    The lights danced, a festive spectacle, on this merry winter's night.
  options:
    voice: jarvis-high
target:
  entity_id: tts.piper_docker

But it will play fine if I run the action again right afterwards, even though the cache option is set to false. It doesn't seem to have any effect whether it's true or false.
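For anyone who wants to reproduce the "second run works" behaviour in one go, here is a rough sketch of it as a script sequence. It is not a fix, and it rests on assumptions: that the first call still triggers synthesis even when nothing plays (which is only what I observed), that a 10 second delay is long enough (a guess based on the roughly 7 second conversion above), and that the `story` variable and its placeholder text stand in for the real LLM response.

```yaml
# Hypothetical sketch: speak once to warm things up, wait, then speak again.
# Assumption: the first call still generates the audio even if nothing plays.
sequence:
  - variables:
      # Placeholder text; both calls must use the identical message and voice
      story: "Replace this with the long LLM response."
  - action: tts.speak
    target:
      entity_id: tts.piper_docker
    data:
      cache: true
      media_player_entity_id: media_player.respeaker_satellite_media_player
      message: "{{ story }}"
      options:
        voice: jarvis-high
  # Guessed wait; the 150-word story took about 7 seconds to convert
  - delay:
      seconds: 10
  - action: tts.speak
    target:
      entity_id: tts.piper_docker
    data:
      cache: true
      media_player_entity_id: media_player.respeaker_satellite_media_player
      message: "{{ story }}"
      options:
        voice: jarvis-high
```

One obvious downside: if the text is short enough that the first call does play, it will be spoken twice, so this only makes sense for the long responses.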