#Building a "Google Assistant" but with ChatGPT and Home Assistant??

1 messages · Page 1 of 1 (latest)

fiery cape
#

Hi all. Is it possible to build a google assistant like experience with Home Assistant and ChatGPT "reasonably" easy? I have a M5Stack Atom Echo device for wake word detection. Is anyone aware of a guide out there? I have asked ChatGPT but it gets complicated really fast without any real progress towards the goal...

tired sorrel
fiery cape
#

I manually recorded a m4a sound file, which I managed to get transcribed by Whisper/OpenAI and the answer spoken thru my sonos speakers. I am now trying to get a wake word working for my Atom Echo. When that is done I need it to somehow record or relay what I am saying to Whisper.

#

Oh, and I should say it was a python script that did the transcribing and calling openai API's

tired sorrel
#

okay so you seem to be trying to reinvent the wheel a bit here

#

what type of home assistant install do you have?

#

from what i understand you want to use the atom echo as a voice sattelite to talk to home assistant with openai's chatgpt being the "brain" of the conversation agent. and also want to use openAI for your speach-to-text component too

#

I can definitely help you get the ball rolling but need to know where you are at currently so we can go from there.

fiery cape
#

Ok, I am not aiming to reinvent the wheel, just following ChatGPT's instructions. Your understanding of my goal is correct. I basically want to be able to say 'Hey Nabu, when did world war 2 end?", just like I used to do with Google Home devices. After that is working, I want to be able to have continous conversations, which involves some kind of memory.
I have configured the Atom Echo with HA, speech to text, text to speech. How I need to figure out how to send my voice input to OpenAI for processing, and then getting the result back to play in my Sonos speakers.

tired sorrel
#

ok so you are at the part where you want to add openai as the the conversation agent

#

thats tackle that first however you should know - redirecting the output of the satelite is currently not posssible (you can edit firmware and kind of make it semi work but its not great specially for continued conversations.

#

do you have your OpenAI API key?

#

and to be sure, are you running the latest version of the voice assistant firmware on the m5ae?

#

I should also point out that voice and AI stuff is evolving extremely quickly at the moment. so blindly following ai instructions may not work out that well as their source matieral may be out of date

fiery cape
#

I have my own API key and I have just flashed the atom echo with the latest yaml from github. I have noticed that ChatGPT does not have the latest (or even old information), for example it was not aware of the OpenAI Conversation integration. It recommends me building a script in Flask or an automation i N8N to be able to accomplish what I want. Dont know if that is necessary though. So I should add OpenAI as a conversation agent as the next step? But even if Iget that to work, I can not get the output to play anywhere else than in my Atom Echo?

tired sorrel
#

yeah you add the conversation agent integreation and then you will be able to set this as the conversation agent on your voice pipeline

fiery cape
#

I managed to get the OpenAI conversation agent to work. It gives me answers in the very bad speaker in the atom

tired sorrel
#

ok, i would have recomended using the firmware tool to put factory VA firmware on it but flashing that does work too

#

is your target speaker (sonos?) already integrated with home assistant?

fiery cape
#

Yes

tired sorrel
#

ok so i have a have a custom firmware config that adds redirection

fiery cape
#

Thanks, I give that a go. It is not possible to use an automation that triggers on "conversation_processed" event and use trigger.event.data.response to play at any media device? (ChatGPT's suggestion)

tired sorrel
#

so that approach you have to modify the firmware to get it to raise that event

#

but that will struggle even worse that with continued conversation

#

my setup works, although it may be delayed a bit in switching to "listening mode" so you have to watch the led a bit until you get used to the timing

#

the m5AE only has 1 i2s bus so it has to switch between microphone and speaker usage so its never going to be as smooth as something better

#

if you get something like the Voice-PE then it will be much more of a smooth setup. it has a better speaker by default too (and has a 3.5mm jack if you want to add something bigger)

#

the fact that the m5ae works as well as it does with the current stuff is amazing really. but its mostly a cheap unit for testing and dev. for longer term use you usually want something better

fiery cape
#

Thanks, Ill look into the Voice-PE. You have been most helpful!

tired sorrel
#

no worries, as I mentioned before please do be careful blindly following AI with this kind of stuff, even with the newer models by the time the model is released a lot of it is out of date

#

voice and ai is in super active development too. its improving constantly