#chatgpt through alexa
Nope.
Would I need to use home assistant voice if I wanted to do something in that aspect to control my house and devices?
Well there's a lot of options for HA satellite now - but mostly they're based on same technology that VPE is using.
ESP32S3 + multi-mic array.
Would this be a more ideal powerful route to go vs the home assistant voice?
I mean, "home assistant voice" is also using that.
You may choose. VPE, Satellite1, Respeaker Lite. Now there's also Respeaker XVF-3800, but that thing is still in the works firmware-wise.
Depends on your commitment.
I'm still pretty new, only getting my server up 3 weeks ago, so I'll take a look at those options and see which ones are more on the "user friendly" side so I don't get lost in the setup lol thank you!
Well, HA Voice PE is the easiest, I guess.
One last question, since you 100% know more than I do. If I go with HA voice vs the other options you mentioned, am I getting a noticeably inferior setup than, let's say, going with the ESP32S3 route, or the other ones you mentioned? Yes, I want it simple, but I also have a semi-beefy PC to run my Home Assistant through Unraid as well as a Plex server, so I don't want to sell my system short, if that makes sense.
It doesn't matter. You may use your system with whatever satellites you choose.
Thanks so much! Massive help in steering me in the right direction. Would've been in a mess trying to make Alexa do something it can't.
VPE works great and can be customized. I prefer it over my HomePods as it can actually do stuff.
I'm not sure that "customized" is the right word though...
I have a full glados theme going on with a hey glados wakeword
Whereas Siri is like, sorry, I can’t help you with that right now
Alexa decided to shout out an ad while turning off the lights
Am I understanding this right, from what I've seen recently, that I can use VPE and give commands more loosely with the ChatGPT link and it'll understand more of what I want it to do?
It should yes
I have mine using a free Gemini API key
Set it up in the voice settings and test the voice commands via browser
Still need nabu to create that link between the two in your case?
Nope
Interesting. Good to know thank you. I'm going to assume that'd be the same if I used a chatgpt API key?
It should be the same
You can have it set up so it tries to process the command locally first
So have it try and see if Gemini knows the answer, but if it doesn't have chatgpt come in for the assist?
It tries to have home assistant process through the intents like turn on and off the lights etc first
Or have it first try to answer through ollama locally and if it's not sure, then it kicks off to either Gemini or chatgpt
Then it uses the llm
This local thing can run on a pi4 in under a second for basic home controls
It’s a way to reduce API key use
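The local-first idea described above can be sketched roughly like this. It is only an illustration of the control flow, not how Home Assistant implements it: `local_intent` is a made-up stand-in for the built-in sentence matching, and `llm` is any callable you supply. In HA itself this behavior is a setting on the voice assistant, not code you write.

```python
import re
from typing import Callable, Optional, Tuple


def local_intent(text: str) -> Optional[str]:
    """Hypothetical stand-in for local intent matching.

    Returns a reply when the sentence matches a known pattern, else None.
    """
    cleaned = text.lower().strip().rstrip(".!?")
    match = re.match(r"^turn (on|off) the (.+)$", cleaned)
    if match is None:
        return None
    state, name = match.groups()
    return f"Turned {state} the {name}"


def handle(text: str, llm: Callable[[str], str]) -> Tuple[str, str]:
    """Try the cheap local path first; only call the LLM on a miss."""
    reply = local_intent(text)
    if reply is not None:
        return ("local", reply)
    # No local match: this is the only place an API call is spent.
    return ("llm", llm(text))
```

Basic on/off commands never reach the LLM here, which is where the saving on API calls comes from.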
It's not on VPE, it's HA voice pipeline. The device itself isn't that customizable - beyond LED color and Grove port there's not much you can change.
I currently have ollama set up on my local server, which has a 3060 graphics card, to process commands using llama3.1. If it can't understand, I can have my Home Assistant say "ok, let's ask Gemini", correct? And I guess I should ask, as I'm still looking at which is better: can you run both Gemini and ChatGPT, or does it just become redundant?
I flashed a custom wake word
I can customize the sound files etc
You can't do it so easily. Gemini and Llama have to use different pipelines. Then you switch pipelines.
Yes, and all of this you can do on Satellite1, Koala, Respeaker Lite etc. too.
Ahhh, not easy
Ah now I get it. It's not like HA knows to ask Gemini if my local LLM doesn't know the answer. I'd have to ask my secondary pipeline to do the task. So something simple like hey turn the lights off, that can be through my ollama locally. But something like, turn off all the lights on the main floor, if I don't have that grouping created, I'd have to ask Gemini so it can make sense of what I'm asking
Assuming the LLM supports the new tooling, it can do that
Try it through the browser first
I could make an MCP tool server to ask big brother when a small LLM does not know, but the small LLM needs to be able to use the MCP tooling
That all could be done with local LLM.
60% of that - without LLM at all. There's a lot of things that HA can process internally, with the Hassil intent recognition engine.
You don't need MCP for it, simple script with conversation.process, exposed to Assist as a tool.
I’m a few months behind on the tooling and formatting
It's not that new. The conversation.process service has been there for a year or so, I guess.. 🙂
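For anyone who wants to poke at this from outside HA first: the suggestion above is an HA script calling conversation.process, but the same thing is reachable over the REST API, which is easier to try from a terminal. A minimal sketch, assuming a long-lived access token and that your HA version accepts `agent_id` on this endpoint; the base URL, token and agent id in the usage line are placeholders, not real values.

```python
import json
import urllib.request
from typing import Optional


def build_process_request(base_url: str, token: str, text: str,
                          agent_id: Optional[str] = None,
                          language: str = "en") -> urllib.request.Request:
    """Build a POST to Home Assistant's /api/conversation/process endpoint."""
    payload = {"text": text, "language": language}
    if agent_id is not None:
        # Picks a specific conversation agent instead of the default one.
        payload["agent_id"] = agent_id
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/conversation/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(base_url: str, token: str, text: str,
        agent_id: Optional[str] = None) -> str:
    """Send the sentence and return the spoken reply text."""
    req = build_process_request(base_url, token, text, agent_id)
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return body["response"]["speech"]["plain"]["speech"]
```

Something like `ask("http://homeassistant.local:8123", "YOUR_TOKEN", "turn off the lights on the main floor", agent_id="conversation.example_llm")` would send the sentence to the bigger agent. Inside HA, the equivalent is a script that calls conversation.process with an agent and is exposed to Assist.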
It was all proof of concept to me till proper ollama support was added
Then it became something my wife won’t bonk me on the head for
has anyone actually gotten an xvf 3800 to work on like, a linux box? I am trying to use it with a pi and linux voice assistant and this thing - does not work at all - there seems to be no way to make it function with this.
I have no idea. I wrote an ESPHome component for it, and a PE-like ESPHome software config. It works fairly well. But I never used it via USB, as just a mic array.