#Voice Assistant with Ollama failing to turn light off with HassLightSet brightness 0

indigo wing
#

I have been doing some testing with my Ollama server, so I disabled the local fallback and noticed that the AI model fails to turn a light off. Here are some relevant logs. It does look like it is calling HassLightSet with a brightness of 0, but rather than turning the light off, that turns it on.

2025-07-25 14:59:58.735 DEBUG (MainThread) [homeassistant.components.ollama.entity] Received response: model='PetrosStav/gemma3-tools:4b' created_at='2025-07-25T19:59:58.719698736Z' done=False done_reason=None total_duration=None load_duration=None prompt_eval_count=None prompt_eval_duration=None eval_count=None eval_duration=None message=Message(role='assistant', content='', thinking=None, images=None, tool_calls=[ToolCall(function=Function(name='HassLightSet', arguments={'area': 'Office', 'brightness': 0, 'domain': ['light'], 'floor': 'Downstairs', 'name': 'Office Ceiling Fan Light'}))])
2025-07-25 14:59:58.800 DEBUG (MainThread) [homeassistant.components.ollama.entity] Received response: model='PetrosStav/gemma3-tools:4b' created_at='2025-07-25T19:59:58.784994921Z' done=True done_reason='stop' total_duration=1733147293 load_duration=33422487 prompt_eval_count=4775 prompt_eval_duration=1019589520 eval_count=51 eval_duration=562631877 message=Message(role='assistant', content='', thinking=None, images=None, tool_calls=None)
2025-07-25 14:59:58.801 INFO (MainThread) [homeassistant.helpers.intent] Triggering intent handler <ServiceIntentHandler - HassLightSet>

This is using the Ollama core integration with the https://ollama.com/PetrosStav/gemma3-tools:4b model. It looks like the AI is calling the correct tool with the correct parameters, but for some reason the light turns on rather than off. No idea what is going on here.

#

I wonder if something is interpreting brightness:0 as brightness being unset, and thus removing it from the object, resulting in the call to light.turn_on being a bare call without a brightness parameter, which would turn the light on...
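
To illustrate that suspicion, here is a minimal Python sketch of how a truthiness check can silently drop a brightness of 0. This is not Home Assistant's actual code: `build_service_data_buggy`, `build_service_data_fixed` and the entity ID are hypothetical and made up purely for illustration.

```python
from typing import Any, Optional


def build_service_data_buggy(entity_id: str, brightness: Optional[int]) -> dict[str, Any]:
    """Hypothetical helper: drops brightness whenever it is falsy, including 0."""
    data: dict[str, Any] = {"entity_id": entity_id}
    if brightness:  # 0 is falsy, so it is treated the same as "not provided"
        data["brightness_pct"] = brightness
    return data


def build_service_data_fixed(entity_id: str, brightness: Optional[int]) -> dict[str, Any]:
    """Hypothetical helper: only omits brightness when it is actually missing."""
    data: dict[str, Any] = {"entity_id": entity_id}
    if brightness is not None:  # an explicit 0 is kept
        data["brightness_pct"] = brightness
    return data


entity = "light.office_ceiling_fan_light"  # assumed entity ID
print(build_service_data_buggy(entity, 0))  # brightness key is missing
print(build_service_data_fixed(entity, 0))  # brightness key is present with value 0
```

If something in the chain behaves like the first helper, the resulting call would be a bare turn-on with no brightness, which matches what I'm seeing.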

indigo wing
#

looks like it might be the model just wanting to use HassLightSet for everything with lights. Llama3.1:8b properly uses HassTurnOn/Off, but Gemma3 appears to be using HassLightSet for all lighting control. I still argue that HassLightSet with a brightness value of 0 should turn the light off, not on at full brightness though.
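
The behaviour argued for here could be sketched roughly like this: an explicit brightness of 0 is treated as a request to turn the light off. `resolve_light_action` is a hypothetical function, not part of Home Assistant, and the 0-100 range is an assumption about how the brightness argument is scaled.

```python
from typing import Any, Optional


def resolve_light_action(entity_id: str, brightness: Optional[int]) -> tuple[str, dict[str, Any]]:
    """Hypothetical mapping from a HassLightSet-style request to a light service call."""
    data: dict[str, Any] = {"entity_id": entity_id}
    if brightness is None:
        # No brightness given: plain turn-on.
        return "light.turn_on", data
    if not 0 <= brightness <= 100:
        raise ValueError(f"brightness must be between 0 and 100, got {brightness}")
    if brightness == 0:
        # Explicit zero: turn the light off instead of on.
        return "light.turn_off", data
    data["brightness_pct"] = brightness
    return "light.turn_on", data


print(resolve_light_action("light.office_ceiling_fan_light", 0))
```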

worn lantern
#

Try Qwen3. It’s been great for me.

meager jungle
#

The issues I had were similar: either the wrong function or the wrong formatting of the function call was being produced

indigo wing
#

Yeah, but I was hoping to find something multimodal so that I could use it with the llm vision integration as well

#

I was running llama 3.1 previously, and it worked well... I was trying out Gemma 3 hoping to find one model that could do my tool calling and the vision stuff

#

I'll definitely give Qwen a shot though!

#

Will just have to run a separate model for llm vision

meager jungle
#

I get my Qwen 3 to call Qwen 2.5 through an LLM Vision script and then it tells me what's happening

#

The script looks like this to return the answer:

sequence:
  - action: llmvision.image_analyzer
    data:
      include_filename: false
      target_width: 1024
      max_tokens: 100
      temperature: 0.1
      provider: 01JKZVKXSB5ZD4KDXQ1JG81CXH
      model: qwen2.5vl:3b-q8_0
      image_entity:
        - camera.back_yard_camera
      message: >-
        Describe in one sentence the human activities in this image, if there is
        any. 
    response_variable: answer
  - stop: Finished
    response_variable: answer
alias: Generate back yard camera description
description: >-
  This script can be used by the LLM to describe the current situation at the
  back yard

worn lantern
#

Yea. I use two different models too. Unloading and reloading takes like 2s on my VRAM.

meager jungle
#

| 0 N/A N/A 1559511 C /usr/local/bin/ollama 5472MiB |
| 0 N/A N/A 1559545 C /usr/local/bin/ollama 6842MiB |

#

qwen3:4b-q4_K_M
qwen2.5vl:3b-q8_0

#

They should take less than 16gb of vram

#

You could also only take the q4_k_m of qwen2.5vl to save some more ram, maybe squeeze it under 12gb

#

(also, my qwen3_4b has a 10,000 context)

#

I expose like 130 entities

indigo wing
#

I have 24G of vram available, but I'm also running whisper and frigate with that

#

But I do have the room for 2 separate models, was just hoping to consolidate a little

#

On a 3090

worn lantern
#

Yea, I don’t think I’ve seen any good multi modal models that also support tool use well.

meager jungle
#

I got a 3090 and run the three following models:

qwen3:4b-q4_K_M
qwen2.5vl:3b-q8_0
Whisper-large-v3

Total consumption is:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.57.08              Driver Version: 575.57.08      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        On  |   00000000:01:00.0 Off |                  N/A |
|  0%   40C    P8             20W /  370W |   15744MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            1670      C   python3                                3410MiB |
|    0   N/A  N/A         1559511      C   /usr/local/bin/ollama                  5472MiB |
|    0   N/A  N/A         1559545      C   /usr/local/bin/ollama                  6842MiB |
+-----------------------------------------------------------------------------------------+

#

So you should still be fine with frigate
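
As a quick sanity check on those numbers, a small Python sketch that adds up the per-process figures from the nvidia-smi output above. Labelling the `python3` process as Whisper is an assumption; the output does not say which process is which.

```python
# Per-process GPU memory in MiB, copied from the nvidia-smi process table.
process_mib = {
    "python3 (assumed to be Whisper)": 3410,
    "ollama (PID 1559511)": 5472,
    "ollama (PID 1559545)": 6842,
}

TOTAL_MIB = 24576          # RTX 3090 total, from the Memory-Usage column
REPORTED_USED_MIB = 15744  # used memory reported by nvidia-smi

# Sum of the two Ollama-hosted models only.
ollama_mib = sum(mib for name, mib in process_mib.items() if name.startswith("ollama"))
# Sum of all three listed processes.
listed_mib = sum(process_mib.values())
# Headroom left for anything else on the card, e.g. Frigate.
free_mib = TOTAL_MIB - REPORTED_USED_MIB

print(f"Two Ollama models: {ollama_mib} MiB")
print(f"All listed processes: {listed_mib} MiB")
print(f"Free on the card: {free_mib} MiB")
```

The listed processes add up to slightly less than the reported total, so a little memory is used outside of them.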

indigo wing
#

I have a lot of cameras 😋

meager jungle
#

Maybe time to get an Intel CPU then 😛

#

I had 5 cameras on a GPU (2080S) and switching to an N100 worked wonders

#

YMMV tho

fluid rock
#

With Qwen3, how do you remove the <think> tags? They are passed to my voice assistants regardless of what settings I choose.

meager jungle
#

last HA version fixed that for me

#

.7

fluid rock
#

I'll have to try it again. Can't remember if I've tried it since going to .7. IIRC I was still getting the tags even if the thinking was empty. I'm using the Ollama integration.

#

Yep, just tested it again with conversation.process. I get think tags regardless of whether the think option is toggled on or off (using Qwen3 30b)

meager jungle
#

Try recreating the integration maybe?

split forge
#

Is your ollama up to date? Only the latest versions properly support handling think tags

#

May also need to purge and redownload the models as well, they may have added new metadata to support it
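
If the tags still come through after updating, one stopgap is to strip them from the reply text yourself. A minimal sketch, assuming the reply arrives as a plain string with the reasoning wrapped in `<think>` and `</think>`; `strip_think_tags` is a hypothetical helper, not part of Home Assistant or Ollama, and where you would hook it in depends on your setup.

```python
import re

# Matches a <think>...</think> block (including an empty one) and any whitespace after it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_think_tags(text: str) -> str:
    """Remove <think>...</think> blocks from a model reply (hypothetical helper)."""
    return _THINK_BLOCK.sub("", text).strip()


print(strip_think_tags("<think>\n\n</think>\n\nThe office light is off."))
```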

fluid rock
#

I'll give it a shot later today. Thanks!