#Assist is stopping and launching models for every request and sometimes within the same request.


proven kestrel
#

Hello, I am just getting into this. I have Ollama running as a service on a Linux machine, serving llama3.2:3b.
Everything is connected in Home Assistant and it works, but responses take a long time. This seems to be caused by something (Home Assistant?) stopping the currently running model and then restarting it. I see ollama runner processes starting and then running on ports other than 11434. The machine eventually starts to run out of RAM.

Any ideas why HA is not reusing the running model and leaving it loaded, even with keep alive set to forever in the config?

proven kestrel
#

Anyone? Anyone?

The service is literally starting and stopping the model for every request. Surely this can't be the expected behavior?

cerulean trout
proven kestrel
#

Keep alive is set to -1. It might be on my end, but I am struggling to find where. The model is running via ollama serve, and when I query Home Assistant I can see the model stop and then start again.
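
For anyone trying to narrow this down, here is a rough Python sketch of how the reload could be observed from outside Home Assistant. It assumes Ollama is listening on the default http://localhost:11434 and uses the `keep_alive` field of `/api/generate` and the `/api/ps` endpoint, which lists loaded models. The helper names (`build_generate_payload`, `summarize_loaded`, `fetch_loaded`) are made up for illustration and are not part of Ollama or Home Assistant.

```python
import json
import urllib.request

# Assumption: Ollama's default address; change it if the service binds elsewhere.
OLLAMA_URL = "http://localhost:11434"


def build_generate_payload(model, prompt, keep_alive=-1):
    """Build a request body for POST /api/generate.

    keep_alive=-1 asks Ollama to keep the model loaded indefinitely.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "keep_alive": keep_alive,
    }


def summarize_loaded(ps_response):
    """Return (name, expires_at) pairs from a parsed /api/ps response."""
    return [
        (m.get("name"), m.get("expires_at"))
        for m in ps_response.get("models", [])
    ]


def fetch_loaded(base_url=OLLAMA_URL):
    """Ask a running Ollama server which models are currently loaded."""
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=5) as resp:
        return summarize_loaded(json.load(resp))
```

Calling `fetch_loaded()` before and after an Assist query should show whether llama3.2:3b drops out of the list or its `expires_at` changes. If the model stays loaded when the same payload is sent by hand but reloads when Home Assistant sends the request, the difference is probably in what the integration sends rather than in the Ollama service config, though that is only a guess from the symptoms described here.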