#Building a "Google Assistant" but with ChatGPT and Home Assistant??
1 messages · Page 1 of 1 (latest)
what do you have set up so far?
I manually recorded a m4a sound file, which I managed to get transcribed by Whisper/OpenAI and the answer spoken thru my sonos speakers. I am now trying to get a wake word working for my Atom Echo. When that is done I need it to somehow record or relay what I am saying to Whisper.
Oh, and I should say it was a python script that did the transcribing and calling openai API's
okay so you seem to be trying to reinvent the wheel a bit here
what type of home assistant install do you have?
from what i understand you want to use the atom echo as a voice sattelite to talk to home assistant with openai's chatgpt being the "brain" of the conversation agent. and also want to use openAI for your speach-to-text component too
I can definitely help you get the ball rolling but need to know where you are at currently so we can go from there.
Ok, I am not aiming to reinvent the wheel, just following ChatGPT's instructions. Your understanding of my goal is correct. I basically want to be able to say 'Hey Nabu, when did world war 2 end?", just like I used to do with Google Home devices. After that is working, I want to be able to have continous conversations, which involves some kind of memory.
I have configured the Atom Echo with HA, speech to text, text to speech. How I need to figure out how to send my voice input to OpenAI for processing, and then getting the result back to play in my Sonos speakers.
ok so you are at the part where you want to add openai as the the conversation agent
thats tackle that first however you should know - redirecting the output of the satelite is currently not posssible (you can edit firmware and kind of make it semi work but its not great specially for continued conversations.
do you have your OpenAI API key?
and to be sure, are you running the latest version of the voice assistant firmware on the m5ae?
I should also point out that voice and AI stuff is evolving extremely quickly at the moment. so blindly following ai instructions may not work out that well as their source matieral may be out of date
I have my own API key and I have just flashed the atom echo with the latest yaml from github. I have noticed that ChatGPT does not have the latest (or even old information), for example it was not aware of the OpenAI Conversation integration. It recommends me building a script in Flask or an automation i N8N to be able to accomplish what I want. Dont know if that is necessary though. So I should add OpenAI as a conversation agent as the next step? But even if Iget that to work, I can not get the output to play anywhere else than in my Atom Echo?
yeah you add the conversation agent integreation and then you will be able to set this as the conversation agent on your voice pipeline
which yaml did you flash?
I managed to get the OpenAI conversation agent to work. It gives me answers in the very bad speaker in the atom
ok, i would have recomended using the firmware tool to put factory VA firmware on it but flashing that does work too
is your target speaker (sonos?) already integrated with home assistant?
Yes
ok so i have a have a custom firmware config that adds redirection
so i have this guide here: https://gist.github.com/MichaelMKKelly/79b6f5fcb85f424cb510dc4e3f841aff
Thanks, I give that a go. It is not possible to use an automation that triggers on "conversation_processed" event and use trigger.event.data.response to play at any media device? (ChatGPT's suggestion)
so that approach you have to modify the firmware to get it to raise that event
but that will struggle even worse that with continued conversation
my setup works, although it may be delayed a bit in switching to "listening mode" so you have to watch the led a bit until you get used to the timing
the m5AE only has 1 i2s bus so it has to switch between microphone and speaker usage so its never going to be as smooth as something better
if you get something like the Voice-PE then it will be much more of a smooth setup. it has a better speaker by default too (and has a 3.5mm jack if you want to add something bigger)
the fact that the m5ae works as well as it does with the current stuff is amazing really. but its mostly a cheap unit for testing and dev. for longer term use you usually want something better
Thanks, Ill look into the Voice-PE. You have been most helpful!
no worries, as I mentioned before please do be careful blindly following AI with this kind of stuff, even with the newer models by the time the model is released a lot of it is out of date
voice and ai is in super active development too. its improving constantly
heres an example of a setup i have with a VPE with an extra speaker - https://gist.github.com/MichaelMKKelly/5033dec56c5ab6ee6b7db52f690b84e0
the VPE internal speaker is mostly fine for voice but if its a bigger room or you wanted to play music through it then a extra speaker is handy