Huge ticket count | Home Assistant | Page 1

craggy flume Jul 10, 2025, 4:16 PM


I have read the documentation and can't see anything relating to this, but when I query through 'Assist' using my local Ollama agent or OpenAI, Home Assistant uses an astronomical number of tokens. For example, asking "What is the date?" used 7,000 tokens. I have tried to access the logs for this on OpenAI, but they aren't available. Likewise, when I use Ollama it practically shuts down. Does Home Assistant open 'threads' relating to every single device/entity or something for each request?

warped quail Jul 10, 2025, 4:19 PM


How many entities are exposed to Assist? This can be found on the Voice Assistants page at the bottom of the Assist card.

craggy flume Jul 10, 2025, 4:28 PM


237

trim trench Jul 10, 2025, 6:45 PM


HA recommends a much smaller number of exposed entities for local models (definitely fewer than 100) unless you have some really beefy GPUs. For each exposed entity, HA sends a lot more info than just the name and the state. Also note that this context takes up memory in addition to the model itself. So if you have a 10 GB model and a 12 GB card, the context could add another 1 to 5 GB, which pushes past the card's VRAM and causes slower memory to be used.
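
To put rough numbers on this, here is a back-of-the-envelope sketch in Python using only the figures from this thread (7,000 tokens, 237 exposed entities, a 10 GB model on a 12 GB card). The helper names `tokens_per_entity` and `fits_in_vram` are made up for illustration and are not part of Home Assistant or Ollama. The per-entity figure is an upper-bound average, since the 7,000 tokens also include the system prompt, the question, and the answer.

```python
def tokens_per_entity(total_tokens: int, exposed_entities: int) -> float:
    """Average tokens per exposed entity (hypothetical helper).

    Overestimates the true per-entity cost because total_tokens also
    covers the system prompt, the question, and the model's reply.
    """
    if exposed_entities <= 0:
        raise ValueError("exposed_entities must be positive")
    return total_tokens / exposed_entities


def fits_in_vram(model_gb: float, context_gb: float, vram_gb: float) -> bool:
    """True if model weights plus context fit on the card (hypothetical helper)."""
    return model_gb + context_gb <= vram_gb


# Figures reported in this thread.
per_entity = tokens_per_entity(7000, 237)
print(f"about {per_entity:.1f} tokens per exposed entity")

# Naive linear projection if only 100 entities were exposed.
print(f"about {per_entity * 100:.0f} tokens with 100 entities")

# 10 GB model on a 12 GB card, with context at both ends of the 1 to 5 GB range.
for context_gb in (1, 5):
    print(f"{context_gb} GB context fits: {fits_in_vram(10, context_gb, 12)}")
```

With 1 GB of context the example still fits on the card; at 5 GB it does not, which is the case where slower memory gets used. The linear projection is only an assumption, as the real prompt has a fixed part that does not shrink when fewer entities are exposed.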
