# Problems with my LLM-enhanced conversation agent
Too many entities exposed
I removed all exposed entities and it’s still not running with “Assist” selected. The assist pipeline debug gets stuck.
Where do you run your Qwen? It seems to be lacking horsepower...
Even without exposed entities, HA is using context pretty extensively.
It runs on my i5 13500 CPU. I'm currently running the 7B. I'll try and run a smaller model.
Don't bother. I don't think running an LLM without VRAM will work. Well, maybe with a 2B model...
I run my LLM on CPU, and it works without Assist control. With it - it just isn't fast enough.
Maybe try posting a (censored?) view of the 'Raw' output?
I just figured out from the Raw output that my LLM is spamming 'HassBroadcast' when granted 'control', which is causing huge slowdowns: it doesn't show the response until it has spoken the entire question and response on some speaker in the house.
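If anyone wants to check their own Raw output for the same thing, a quick way is to paste it into a string and count how often the tool name shows up. This is only a rough sketch: `count_tool_mentions` and the sample text are made up for illustration, and it does a plain substring count rather than parsing whatever structure the Raw output actually has.

```python
def count_tool_mentions(raw_text: str, tool_name: str = "HassBroadcast") -> int:
    """Count plain-text occurrences of a tool name in pasted debug output."""
    return raw_text.count(tool_name)


# Hypothetical pasted text, not real Home Assistant output.
sample = "call HassBroadcast ... call HassBroadcast ... call HassTurnOn"
print(count_tool_mentions(sample))
```

More than one mention per request is a hint that the model is calling the tool when it shouldn't.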
Token generation will run just fine on a CPU, but you need a GPU for prompt processing. It'll be painfully slow otherwise.
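To put the prompt processing point in numbers, here's a back-of-the-envelope estimate: total wait is roughly prompt tokens divided by prompt speed, plus output tokens divided by generation speed. The function name and every number below are placeholders I picked for illustration, not measurements from an i5 13500 or any real setup, so plug in your own token counts and tokens-per-second figures.

```python
def estimate_response_seconds(
    prompt_tokens: int,
    output_tokens: int,
    prompt_tokens_per_sec: float,
    gen_tokens_per_sec: float,
) -> float:
    """Rough wait time: prompt processing time plus token generation time."""
    prompt_time = prompt_tokens / prompt_tokens_per_sec
    gen_time = output_tokens / gen_tokens_per_sec
    return prompt_time + gen_time


# Placeholder values only: a large Assist prompt and a short reply.
seconds = estimate_response_seconds(
    prompt_tokens=4000,
    output_tokens=50,
    prompt_tokens_per_sec=20.0,
    gen_tokens_per_sec=5.0,
)
print(f"Estimated wait: {seconds:.0f} s")
```

With a big prompt and a short answer, the first term dominates, which is why a long Assist context hurts so much more than the reply itself.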