#Voice PE - Start conversation - not working with LLM pipeline


peak fable
#

Hi, I just received a Voice PE unit, and apart from the device going to sleep and not responding to the wake word every now and then, it seems OK.

What I'm mostly trying to test and evaluate is the start conversation option. I'm using my local Ollama pipeline for the conversation agent; all other aspects of speech and text use HA Cloud.

When it runs, the response never works because it keeps outputting the result ("I'll turn off the light") and the JSON payload to do it. From what I can understand, because that's free text and not a JSON document, it's not registered as the right event to be captured, so the action never occurs.

I've tried adjusting the system prompt to return only the JSON payload, but the unit is determined to only ever return the textual response from the pipeline.

Has anyone done this yet? In my mind this should be simple to do, and maybe it is and I'm just missing a step. Can someone guide me on what I may be doing wrong here?

Thanks

robust fog
peak fable
#

I'm a dot release from latest (2025.4), and the voice device / satellite was updated to the latest firmware yesterday.

robust fog
#

Can you post the YAML of the action you're using?

maiden hornet
#

I don't think the goal is to return a JSON payload. The goal is for the LLM to run a tool and return a text response.

#

Also which model do you use? Does it support tools? Do you have "control" turned on for it? And what is exposed?

peak fable
#

Good point. I've tried with gemma3 and llama 3.2. I have the agent set up to not have control and have the option "prefer to handle commands locally" selected. For the devices I'm working with (just lights for now), I have them all exposed. With the same model, "okay nabu, turn on/off devices" works with no problem. When responding to a start conversation test to turn off a light I've just turned on, the same pipeline fails by responding with the output instead. No action occurs.

peak fable
#

Hey @robust fog, you've made me wonder if I'm missing parts of the process.
For my testing, I was just invoking a start conversation in the dev tools, asking if I wanted to turn off the light. Maybe naively, I thought I could use this trigger as the input into the voice pipeline ("light x is on, do you want to turn it off") and I was expecting the rest of the pipeline (LLM conversation agent back to text to speech) to continue from there: the conversation agent, not having control and with the option to handle intents locally, would just turn the light off and then respond with the result of the action (the light is off). Do I need to create an automation to handle a conversation return event to perform the action?

As an aside, @maiden hornet, the reason I thought I needed to return a JSON payload in the dev tools was that I expected it would need to be returned for the 'conversation_process_finished' event to occur before any action could happen.

Are there examples to look at where this works, so I can see what I am missing? (Or are my local models, llama and gemma, not up to the task here?)

maiden hornet
#

llama 3.2 is 3B. It won't do.
IDK how many params there are in your Gemma3, but I haven't heard of it being good.
Use qwen 2.5 or llama 3.1 with at least 7B (better 14B) parameters. Then check how to call questions correctly (through an automation) and try it.
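
For reference, here is a minimal sketch of what starting a conversation programmatically could look like. It assumes the `assist_satellite.start_conversation` action with `start_message` and `extra_system_prompt` fields, called through Home Assistant's REST API. The base URL, token, entity ID, and the helper name are all placeholders made up for illustration. It only builds the request; whether the light actually gets switched still depends on the conversation agent being able to run tools, as mentioned above.

```python
import json
import urllib.request


def build_start_conversation_request(base_url, token, satellite_entity_id,
                                     start_message, extra_system_prompt=None):
    """Build (but do not send) a REST call to assist_satellite.start_conversation.

    All argument values are supplied by the caller; nothing here is specific
    to a particular installation.
    """
    # Service data: which satellite to use and what it should say first.
    payload = {
        "entity_id": satellite_entity_id,
        "start_message": start_message,
    }
    # Optional extra context for the LLM about why it is asking the question.
    if extra_system_prompt:
        payload["extra_system_prompt"] = extra_system_prompt

    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/services/assist_satellite/start_conversation",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # long-lived access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example usage (placeholder values; uncomment the last line to actually send):
# req = build_start_conversation_request(
#     "http://homeassistant.local:8123", "YOUR_TOKEN",
#     "assist_satellite.voice_pe", "The light is on. Do you want it off?",
#     extra_system_prompt="The user was asked whether to turn off light x.",
# )
# urllib.request.urlopen(req)
```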

peak fable
#

My testing today has been on gemma3 12b, which works well for the standard pipeline of just asking Assist to control devices. I guess it's just the question pipeline process I'm misunderstanding.

maiden hornet
#

OK, 12B should probably be enough.
Then yeah, check the release notes for setup examples.

peak fable
#

My setup seems to have regressed. I'm finding the lack of documentation here frustrating. Trying to replicate what I have found online shown as working, I have created a new Assist agent; everything uses HA Cloud for text/speech, with the new Google Generative AI as the conversation agent. Whenever I attempt to start a conversation using an input such as turning off lights in an area, it still fails. Has anyone come across up-to-date information or walkthroughs on how to set this up?

robust fog
peak fable
#

Hi, yes it did. I ended up setting up a key for OpenAI and set up the OpenAI conversation agent. Once I had added credit for that, it worked! I'm not sure what's going on, as I had some success with my local Ollama instance yesterday and then the assist.startconversation just stopped working against it. 🤷‍♂️ So I'm testing with GPT for now.

Although for some reason my prompt seems to be sending a large number of tokens! When I look in the integration debug log, it seems to be filled with all of my general HA error logs, which I hope are not being sent as part of the prompt. (Strange for them to appear in my OpenAI GPT integration log, though, I thought.)

robust fog