(moving this post over from wrong location under #1257019582112334014 sorry for double post)
This is a quick demo of my current assist build.
The screen shown is powered by a small mini PC running Linux, with a Wyoming satellite set up and HA open to a dashboard in full screen.
Currently using a fork of https://github.com/jimrushPersonal/ConversationForwarder to connect HA to a Pydantic AI agent I built. The agent has various tools, MCPs, and a graph memory system. It connects to HA via the built-in MCP server. Right now the agent is using Grok-4-fast, since in my testing it had the best combo of speed and intelligence, but I switch models regularly.
Wake word is a small custom-trained Cortana wake word model.
STT is standard, currently flipping between faster-whisper and Nabu Casa's cloud-based STT.
For TTS I'm using the new Chatterbox-Turbo model (https://huggingface.co/ResembleAI/chatterbox-turbo) running on the laptop in the video (unfortunately the only hardware I own with an RTX card).
It's exposed through a fork of https://github.com/travisvn/chatterbox-tts-api that I made to support the new turbo model. HA is connected to this endpoint via https://github.com/sfortis/openai_tts. This lets me provide a short sample of the voice I want to clone, and it does a very good job of outputting expressive speech with no training required.
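For anyone curious what talking to an OpenAI-style TTS endpoint looks like, here's a rough sketch that only builds the request. It assumes the server follows the OpenAI `/v1/audio/speech` shape; the base URL, the voice name, and the model name below are placeholders I made up for illustration, not values from my fork.

```python
import json
import urllib.request


def build_speech_request(base_url, text, voice="my-voice-sample", model="chatterbox-turbo"):
    """Build (but do not send) an OpenAI-style text-to-speech request.

    Assumptions: the server accepts the OpenAI /v1/audio/speech JSON body.
    The voice and model defaults are hypothetical placeholders.
    """
    payload = {
        "model": model,            # placeholder model name
        "input": text,             # the text to be spoken
        "voice": voice,            # placeholder voice name
        "response_format": "wav",  # audio container requested from the server
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and port; adjust to wherever the TTS server is listening.
req = build_speech_request("http://localhost:4123/", "The lights are now off.")
```

Passing `req` to `urllib.request.urlopen` would then return the audio bytes, assuming the server really does follow that shape.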
The dashboard is just a normal HA dashboard with an iframe in the center that loads an HTML file from localhost on the mini PC. A small Python script runs in the background and monitors sound output. It processes the sound frequencies and serves them over a WebSocket, which the HTML page uses to drive changes to the Three.js particle cloud. This is my attempt to have a visual of the agent 'speaking'.
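Here's a stripped-down sketch of the frequency step, not my actual script. It takes one frame of audio samples, computes a few normalized band levels with a plain DFT (stdlib only, so it's slow but fine for small frames), and packs them as JSON. Audio capture and the WebSocket server itself are left out; the function names, band count, and message format are all assumptions for illustration.

```python
import cmath
import json
import math


def band_levels(samples, n_bands=8):
    """Return n_bands magnitudes scaled to 0..1 for one frame of samples."""
    n = len(samples)
    if n < 4:
        return [0.0] * n_bands
    # Hann window to reduce spectral leakage at the frame edges.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    # Naive DFT magnitudes for bins 1..n/2-1 (DC bin skipped).
    mags = []
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        mags.append(abs(acc) / n)
    # Group bins into equal-width bands and average each band.
    per_band = max(1, len(mags) // n_bands)
    bands = [sum(mags[b * per_band:(b + 1) * per_band]) / per_band
             for b in range(n_bands)]
    # Normalize so the loudest band in this frame is 1.0.
    peak = max(bands)
    return [b / peak if peak > 0 else 0.0 for b in bands]


def frame_message(levels):
    """Serialize band levels as the JSON text a WebSocket server could send."""
    return json.dumps({"bands": [round(level, 3) for level in levels]})


# Example: a 440 Hz tone sampled at 8 kHz lands in the lowest band.
tone = [math.sin(2 * math.pi * 440 * i / 8000) for i in range(256)]
message = frame_message(band_levels(tone))
```

On the page side, the idea is that each message gets parsed and the band values are mapped onto something like particle displacement or size.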