#ollama remote

1 messages · Page 1 of 1 (latest)

wind slate
#

I run HA in a VM and I have a second VM next to it in which I have some containers running with ollama, whisper, piper and Open-Web UI. When I ask ollama a question via Open-Web UI, all 4 CPUs of the VM hit the roof and a few seconds later I get an answer and CPU usage drops to 0.

In HA I have the assistant pointed to that remote VM via the assistant settings, the configuration, conversation agent, stt and tts. When I now ask the same question via the assistant, I see all 4 CPUs go to the roof, but I never get an answer and CPU remains high until I kick the ollama container or wait a few minutes for "timeout running pipeline" message in HA assistant.

Could it be ollama doesn't know how to find the way back to home assistant?

dusk pumice
#

Turn off the assistant's control device and try to have a conversation.

wind slate
#

Changed STT and TTS to none, but that doesn't help.

dusk pumice
#

This can turn off assistant control

wind slate
#

Yes, that worked. Thanks.... but...... how can I now have it control my home aswell ?

dusk pumice
#

Publicize a small number of devices or upgrade hardware.

wind slate
#

Is it sending all the home assistant devices to ollama with every request?

#

Or why should I limit the number of devices?

dusk pumice
#

yes,<10

wind slate
#

Ah I see..... I'll try giving it more CPUs since I don't have a GPU for it currently. The hypervisor it is running on is 13th Gen Intel(R) Core(TM) i5-13500H with 8 cores.

Thank you for your help

dusk pumice
#

Try starting with 1 and then+1. Until Ollama died.

wind slate
#

LOL, I have 199 entities exposed 🙂

#

Now with 8 vCPU and 16 entities, it works, though very slow

deft gulch
#

Running Ollama with Assist on CPU won't work. HA is sending a lot of context to Ollama, apart from entities. It's all exposed scripts as well, and information on how to use scripts (tools). Context size is too big even with no entities exposed. Get GPU. Or use ChatGPT.
I have same setup as you. And I use Assist, with diverting to Ollama for general questions when Assist doesn't understand. But no control. All the control stuff is in custom intents. 🙂

wind slate
#

My hypervisor has a "Raptor Lake-P [Iris Xe Graphics]" GPU, but I doubt that will perform.

#

Maybe get a Jetson Nano developer kit.

solar sigil
#

I have 100 exposed entities with context size set to 8192 and the LLM with home assistant control works fine, but I have and older Nvidia GPU with enough VRAM. Since you are hardware limited, increasing the context size is probably not an option My personal recommendation is a refurbished/used Nvidia GPU upgrade. IMO, more bang for the buck.

deft gulch
deft gulch
# solar sigil I have 100 exposed entities with context size set to 8192 and the LLM with hom...

That is in case you already have decent PC to put it in, with good PCIE slot. 🙂
My "server" is Lenovo Thinkcenter, and while it hax PCIE3x8, it's proprietary connector, and with external cable, adapter and external power brick i didn't succeed putting my 3080 in that. IDK if it's my hands or some hardware, but Promox couldn't see the card. And even if it could - that Frankenstein monster wasn't looking or feeling good.