#Opinion on LLMs for HA


quartz flume
#

After reading a lot about the assist pipeline and how it’s being used by different users, I’m wondering how much an LLM actually improves Home Assistant, or if it’s just a small extra feature to play around with. I’ve read that LLMs sometimes tend to hallucinate when calling tools or performing other tasks. Also, isn’t the response time much longer than simply using speech-to-text, for example?

I’d love to know if you use an LLM for your Home Assistant. If you do, why, and what benefits have you found? If you don’t use an LLM, why not?

jagged vale
#

I set up my pipeline to work with the built-in HA conversation agent. I have an LLM integration, but I use it basically as a fallback for general questions and to refine some generic requests like music playback. Everything else that replaced Alexa functionality for my family, I have done with custom intent scripts.
Until I can run something 100% reliable at home, I won't give an LLM access to my devices.

slim tinsel
#

Yeah, a local LLM is just not smart enough. Even a 70b model wouldn’t be reliable. I am running local TTS and STT though. I have mine fall back to gpt-4o-mini. It’s the most reliable. Really, we need better custom hardware and models trained just for this purpose.

quartz flume
sour grove
#

If you use something like gpt-4o-mini it works well, but of course it isn't local. There's more you can use than just an LLM, though. I am using vision LLMs for camera analysis, for instance. 🙂

#

For now, unless you've got the hardware to run something like a 400b model, I don't think anything local is gonna work as well, except for basic commands 🫤

nova bobcat
#

I personally love it. As I've always been a fan of Iron Man, I got my own Jarvis going :p

Though, I need to get a 4060 Ti 12GB to handle a larger model than llama3.2 on a 1050 Ti, and for faster loading x)

lofty fulcrum
#

I always prefer speaking sentences that match built-in commands over casual phrasing for control. This is because I don't want to stand there for a few seconds waiting for the LLM to finish when I'm leaving an area. If that were the case, I might as well hit the switch on my wall. If there were a model that could consistently convert user sentences into ones that HA's built-in intents understand, then we would only need to manage our built-in intents.
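
For the casual phrasings you already know you'll say, one LLM-free option is to teach the built-in agent those sentences directly. Below is a minimal sketch, assuming HA's custom sentences feature: the file name and the example phrasings are made up for illustration, while `HassTurnOff` and the `{area}` list are built in.

```yaml
# config/custom_sentences/en/leaving.yaml  (hypothetical file name)
language: "en"
intents:
  # Extend the built-in HassTurnOff intent with extra phrasings,
  # so they are matched locally without waiting for an LLM.
  HassTurnOff:
    data:
      - sentences:
          # [ ] marks optional words, {area} matches any known area name
          - "[I am] leaving [the] {area}"
          - "nobody is in [the] {area}"
        slots:
          # Only act on lights in the matched area
          domain: "light"
```

This only covers phrasings you list up front, so it doesn't replace the "model that rewrites any sentence" idea, but matched sentences skip the LLM wait entirely.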

sour grove
#

Mine is GLaDOS 😁

nova bobcat
# sour grove

Hah xD

Sadly, with the MET.no integration my assistant can't read more than the weather "right now", even though the integration shows the forecast for the entire week.

sour grove
#

Yeah, I wrote a script to get deeper data and expose that to the LLM as a tool.

slim tinsel
sour grove
#

Too well actually, the unit on my upper floor kept hearing me. I had to mute its mics at the end there lol.

#

The acoustics of my workshop make my voice resonate up the stairwell 😁

trail fossil
#

I built an AI box to see how far I could push "all local" home assistant. (Piper, Whisper, and Ollama) Sometimes it's shockingly responsive. Other times it's a 4-7 second wait. No major hallucinations yet. The worst thing it's done has been when it tells me it did something and it didn't do it. One model insisted on broadcasting everything to all my PEs.

It's been fun to give it a personality. We actually have a few different personalities that we choose via drop-down helpers because the more extreme personalities get old really fast.

slim tinsel
trail fossil
#

I started with Llama3.x 8b but I recently moved to Qwen2.5:14b.

slim tinsel
#

Yeah, I’ve tried both and wasn’t happy with either, to be honest. They can’t seem to handle multiple requests at once.

jagged vale
#

I don't think HA itself is ready for multiple context-aware conversations...

slim tinsel
#

Works fine with ChatGPT

sour grove
#

Yeah, I have found 4o-mini to do just fine for day-to-day things, including multi-request combo commands like "Turn on the lights and play some music by Billy Talent" or "Tell me the weather forecast for today and list my upcoming calendar events".

jagged vale
#

Ah, sorry, you meant several commands in one sentence. I thought you meant context segregation when several people are talking to different satellites simultaneously.

sour grove
#

Ooh, yeah, in theory that should also work with something like 4o-mini, but something local is probably gonna explode because you'd need a lot of VRAM to hold all those session contexts simultaneously.

jagged vale
sour grove
#

I think when a voice satellite starts a conversation, it sets its own conversation ID, doesn't it?

jagged vale
hushed crag
hushed crag
sour grove
#

Piper

pulsar thorn
sour grove
spice ibex
hushed crag
# spice ibex that's probably the LLM being dumb

Seems like Llama3.2 3B is worse at handling HA commands but is way better at understanding the prompt and responding accordingly. Qwen2.5 3B is the opposite. Do you have any suggestions on a middle ground?

spice ibex
#

I'm afraid not, I've found 3b models to be too dumb for this. I personally use llama3.1 8b fp16, but that also has brain farts sometimes. I plan to play around with more models, including quantized versions.

#

If you don't have a good GPU, my advice is to just go for cloud. It's better and costs pennies.

hushed crag
slim tinsel
#

I’ve tried a lot of different models and they are all useless compared to ChatGPT. Even a 32b model at q8 was stupid

hushed crag
sour grove
#

I've seen Qwen do fairly well, but you need to bump the context size up, especially with a lot of exposed entities. The default of 8192 doesn't work too well if you have more than a handful of entities exposed.

hushed crag
nova bobcat
sour grove
#

I don't have it shared, but it's a pretty simple script, though I wrote it as an intent script.

#
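# Body of a single intent_script entry (the intent name it sits under is not shown)
description: "Gets the daily weather forecasts. Use this when asked about the weather."
action:
  # Ask the NWS integration for the twice-daily forecast
  - action: nws.get_forecasts_extra
    target:
      entity_id: weather.kpvd
    data:
      type: twice_daily
    response_variable: result
  # End the sequence and hand the forecast back as the action response
  - stop: ""
    response_variable: result
speech:
  # action_response holds the value returned by the stop step above
  text: "{{ action_response }}"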
#

Could adapt it into a regular script though.
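
A rough sketch of what the regular-script version might look like, in case it helps. This is not the script I actually run: the script ID `get_weekly_forecast` and the entity `weather.home` are placeholders, and it uses the generic `weather.get_forecasts` action instead of the NWS-specific one so it should also work with other weather integrations that provide daily forecasts. As far as I know, once the script is exposed to Assist, the LLM can call it as a tool and uses the description to decide when.

```yaml
# scripts.yaml
get_weekly_forecast:            # placeholder script ID
  alias: "Get weekly forecast"
  description: "Gets the daily weather forecast for the coming days. Use this when asked about upcoming weather."
  mode: single
  sequence:
    # Fetch the daily forecast from the weather entity
    - action: weather.get_forecasts
      target:
        entity_id: weather.home   # placeholder, use your own weather entity
      data:
        type: daily
      response_variable: result
    # Return the forecast data to whoever called the script
    - stop: "Return the forecast"
      response_variable: result
```

The `stop` step with `response_variable` is what makes the forecast come back as the script's response rather than being discarded.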

vapid nacelle
#

After the release of 'HassBroadcast', my local Llama 3.1 8B has been spamming the 'HassBroadcast' action when granted 'control', which causes huge slowdowns because it doesn't show the response until it has spoken the entire question/response on the speakers in the house.

#

Adjusting the prompt to not (ab)use 'HassBroadcast' hasn't been entirely successful so far
