Let me describe my setup: a Raspberry Pi 4 with HA, and a Mac Mini M4 Pro running Ollama, Piper and Whisper. Assist works in text mode, i.e. I can ask it to turn the light on and it does, so that's nice. I can't test the microphone or speaker because I'm not using HTTPS for HA.
I bought and installed the little preview box and connected it. It sometimes works, but it doesn't talk back properly: it seems to speak English using a Dutch voice.
I ran a network trace on port 10200 (Piper), and it seems to be emitting JSON intermixed with audio. Here's a piece of the piper.err file:
DEBUG:wyoming_piper.handler:Synthesize(text='Since the user made a typo, but the context suggests they meant lights, the response should be straightforward.\n</think>', voice=SynthesizeVoice(name='nl_NL-mls-medium', language=None, speaker=None), context=None)
DEBUG:wyoming_piper.handler:synthesize: raw_text=Since the user made a typo, but the context suggests they meant lights, the response should be straightforward.
</think>, text='Since the user made a typo, but the context suggests they meant lights, the response should be straightforward. </think>.'
DEBUG:wyoming_piper.handler:Synthesize(text='Verlichting uitgezet', voice=SynthesizeVoice(name='nl_NL-mls-medium', language=None, speaker=None), context=None)
DEBUG:wyoming_piper.handler:synthesize: raw_text=Verlichting uitgezet, text='Verlichting uitgezet.'
DEBUG:wyoming_piper.handler:Text stream stopped
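Reading the log myself, it looks like the model's reasoning (the English text ending in `</think>`) is being handed to Piper as text to speak, right before the actual Dutch answer. That would explain English being read with the nl_NL voice. I assume something between Ollama and Piper is supposed to drop that text. A rough Python sketch of the kind of filtering I mean; the function name `strip_reasoning` is my own invention and not part of Home Assistant, Ollama or Wyoming:

```python
import re

# Matches a complete <think>...</think> block, including newlines inside it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Remove model reasoning wrapped in <think> tags (hypothetical helper)."""
    # Drop any complete reasoning blocks.
    text = _THINK_BLOCK.sub("", text)
    # A closing tag with no opening tag means the block started in an
    # earlier chunk: keep only what follows the last closing tag.
    if "</think>" in text:
        text = text.rsplit("</think>", 1)[1]
    # An opening tag with no closing tag means the block is still
    # running: keep only what precedes it.
    if "<think>" in text:
        text = text.split("<think>", 1)[0]
    return text.strip()


if __name__ == "__main__":
    print(repr(strip_reasoning("<think>typo, they meant lights</think>Verlichting uitgezet")))
    print(repr(strip_reasoning("the response should be straightforward.\n</think>")))
```

With something like this in place, the first Synthesize call in my log would presumably come out empty and only 'Verlichting uitgezet' would be spoken. I don't know where in the pipeline this is meant to happen, though.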
Any pointers?