# Connect HA to vLLM
It all depends on the hardware your LLM is running on. Literally, that's everything.
While I don't disagree it's partly my hardware, vLLM allows for concurrent requests. Ollama doesn't, and it unloads the model
if it's the same model and you adjust your settings so that HA and OWUI are calling the same model with the same settings, then Ollama doesn't have to reload the model. I remember solving this a while ago to prevent the constant reloading. or are you using different models?
It's the same model, but I have tweaked the OWUI settings. Home Assistant has no way to set temperature or chunk size, so it unloads it
@cedar barn
to be fair, I think I switched my OWUI settings to match HA to do this on my setup.
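A rough sketch of the "match the settings" idea: compare the options each client sends and list the ones that differ. The option names `num_ctx`, `num_batch` and `num_gpu` are Ollama options, but which of them actually force a reload is an assumption here, and the values below are made up for illustration, not taken from anyone's setup.

```python
# Assumption: these are the options whose mismatch may make Ollama reload the model.
LOAD_KEYS = ("num_ctx", "num_batch", "num_gpu")


def mismatched_options(a: dict, b: dict, keys=LOAD_KEYS) -> dict:
    """Return {key: (a_value, b_value)} for every checked key whose values differ."""
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Hypothetical option sets for the two clients (illustrative values only).
ha_options = {"num_ctx": 8192}
owui_options = {"num_ctx": 16384, "num_batch": 512}

# Any key listed here is a candidate to align between HA and OWUI.
diff = mismatched_options(ha_options, owui_options)
print(diff)
```

An empty result would mean both clients agree on every checked option; anything else points at a setting to align.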
Which would make sense for most models, but GPT-OSS:20b needs a bit of tweaking to get it where I want it for other users
yeah that's fair enough, my setup was with qwen3
I need to look back into the Qwen models. I just couldn't get all 280 entities to it. It would never find sunset and stuff like that. So I've been using 20b
yeah 280 is quite a lot of entities
I don't really know how to limit it too much more. I haven't made any scripts or anything yet, but I have roughly 60 smart switches alone in my house lol
are all of them required to have independent control? could you group some?
or maybe if the switch interaction is basic on/off you could potentially make a script for "switch control" with an on/off variable and a select variable. then see if the AI can call the switch control script with the variables that you specify in the command.
That last idea seems like the best option actually. I'm sure there are some switches I could get rid of, I just haven't figured out which ones yet.
So the script would be the best of both worlds I think
yeah, it might require some tinkering to get it to work but in concept I think it should be fine.
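A minimal sketch of the "switch control" idea in Python, just to show the shape of it: one entry point that takes a select value and an on/off value and turns them into a single `switch.turn_on` / `switch.turn_off` call. The switch names and entity IDs are hypothetical, and in Home Assistant itself this would be a script with fields rather than Python.

```python
# Hypothetical friendly names mapped to hypothetical entity IDs.
ALLOWED_SWITCHES = {
    "porch light": "switch.porch_light",
    "garage fan": "switch.garage_fan",
}


def switch_control(name: str, state: str) -> dict:
    """Build one service-call description from a select value and an on/off value."""
    key = name.strip().lower()
    if key not in ALLOWED_SWITCHES:
        raise ValueError(f"unknown switch: {name!r}")
    if state not in ("on", "off"):
        raise ValueError(f"state must be 'on' or 'off', got {state!r}")
    return {
        "domain": "switch",
        "service": f"turn_{state}",  # switch.turn_on or switch.turn_off
        "target": {"entity_id": ALLOWED_SWITCHES[key]},
    }


# The model only needs to supply two variables instead of knowing every entity.
call = switch_control("Porch Light", "off")
print(call)
```

The point of the design is that the model sees one script with two variables, so the individual switches may not all need to be exposed as separate entities.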
Awesome thank you so much for your help
no worries, good luck with it 🙂
There's a TinyLlama refined for HA, only 1.0b
Nice