#Has anyone successfully experimented with local thinking models?
Can just bump up context in config? Might need some work on the integration itself to potentially filter out think tags 🫤
Chinese propaganda trained models.. are you sure? Popping up all over YouTube and other socials pushed by "influencers".
As opposed to capitalist trained models?
Local is local and as long as you know the limits it’s fine.
Yeah that’s where it’s getting bogged down. I do believe new / coming versions of the model are meant to have the ability to filter that or maybe it can be filtered from the output in lmstudio
Let's wait for Huggingface to train their own version of R1. 90% that it will have more options.
But I guess you could add something to the prompt like "completely omit the thinking block from the output".
However, I doubt it supports tools. Does it?
Is 1.5B enough??
Even your local model is censored by its makers for fear of the regime. Open source and China is a contradiction in terms. Almost as hilarious as democratic elections in Russia.
There should be a set of weights you can use to tune the model, but testing showed that on questions like Taiwan it doesn't even "think", it just spits out China's official position.
I just stumbled on this because I set up ollama with qwen3 and it responds with thinking too. I tried adding /no_think to the system prompt, as suggested in their documentation, and it seems to prevent reasoning but still returns <think></think> as empty blocks.
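For anyone who wants to strip the tags themselves, here is a minimal Python sketch of that kind of filter. It assumes the model wraps its reasoning in literal `<think>` and `</think>` tags as described above; the function name `strip_think` is made up for illustration and is not part of ollama or HA.

```python
import re

# Matches one <think>...</think> block, including an empty one,
# plus any whitespace that follows it. DOTALL lets "." span newlines.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove every <think>...</think> block and trim the result."""
    return _THINK_BLOCK.sub("", text).strip()


if __name__ == "__main__":
    # Empty block, as returned when /no_think is set.
    print(strip_think("<think></think>\n\nThe light is on."))
```

The non-greedy `.*?` keeps the match from swallowing the answer if the reply happens to contain more than one block.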
I suppose a UI toggle to hide or show thinking would be useful for this?
I've run into issues using deepseek where it flat out doesn't respond; qwen3 or llama3.x versions seem to be more reliable at 8b or higher. HA mentioned that models with tool support should be used.
Yea, I'm using qwen3-8b and it has tool support. It seems to work well, the only issue is the output containing <think>
Relevant github issue: https://github.com/home-assistant/core/issues/140003
This doesn't help when using Assist, but I use the conversation.process action a lot to build notifications. With that it's fairly easy to filter the <think> portion out before sending the notification. It wouldn't be too hard to code Assist conversations to also filter out the <think> portion.
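A rough Python sketch of that idea, for illustration only. It assumes the conversation.process result is a nested dict with the reply text under `response.speech.plain.speech`, which is how I understand the response to be shaped; the helper name `speech_without_thinking` is hypothetical. In an actual automation you would more likely do the same thing in a template.

```python
def speech_without_thinking(result: dict) -> str:
    """Return the reply text with any leading reasoning removed."""
    # Assumed shape: result["response"]["speech"]["plain"]["speech"].
    speech = result["response"]["speech"]["plain"]["speech"]
    # The reasoning comes before the answer, so keep only what follows
    # the last closing tag. If the tag is absent, rpartition returns
    # the whole string as the final element, so nothing is lost.
    _, _, answer = speech.rpartition("</think>")
    return answer.strip()


if __name__ == "__main__":
    result = {
        "response": {
            "speech": {
                "plain": {
                    "speech": "<think>Check the sensor.</think>\nIt is 70 degrees outside."
                }
            }
        }
    }
    print(speech_without_thinking(result))
```

Splitting on the closing tag instead of matching the whole block also covers replies where the opening tag is missing.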
Yes - but we still don't have that... I remember people complaining about it since Deepseek R1.
i suspect this will become a popular discussion point over the next few days as people experiment with qwen3
Agreed - LLMs are currently trending toward "thinking" models.
So adding /no_think removes the thinking text but leaves empty tags. It seems this is actually a thing with ollama: the devs over there say it's intended and that it should be down to the user to filter it out, but everyone else thinks the tags should not be there and that it's an ollama bug. So it seems there might be a bit of a standoff.
It would be so easy for the devs of ollama or even HA to make an input boolean along the lines of "Show Thinking" and then apply a filter (or not). Let the user decide. So far qwen3 is pretty awesome for a small model, so I hope ollama or HA implements the option to hide the tagged information.
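Something like the following is what I have in mind, as a minimal sketch rather than actual integration code. The `show_thinking` flag stands in for the proposed "Show Thinking" option, and `split_reply` and `render` are hypothetical names, not existing HA or ollama functions.

```python
import re
from typing import NamedTuple

# Captures the text inside each <think>...</think> block.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


class Reply(NamedTuple):
    thinking: str  # reasoning text, empty if the model produced none
    answer: str    # what should be shown or spoken


def split_reply(text: str) -> Reply:
    """Separate the tagged reasoning from the final answer."""
    parts = [p.strip() for p in _THINK_BLOCK.findall(text) if p.strip()]
    answer = _THINK_BLOCK.sub("", text).strip()
    return Reply("\n".join(parts), answer)


def render(text: str, show_thinking: bool) -> str:
    """Apply the filter or not, depending on the user's toggle."""
    reply = split_reply(text)
    if show_thinking and reply.thinking:
        return f"[thinking] {reply.thinking}\n{reply.answer}"
    return reply.answer


if __name__ == "__main__":
    raw = "<think>Sensor reads 70.</think>\nThe temperature outside is 70 degrees."
    print(render(raw, show_thinking=False))
    print(render(raw, show_thinking=True))
```

Keeping the reasoning as a separate field means a UI could show it in a collapsed section while voice output only ever gets the answer.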
If ollama themselves don't want to move on it, then I think adding an option to the ollama integration would be the way to go. But better minds than mine make those decisions.
Probably best in the integration, that seems to be the trend. Think OpenWebUI already has a mechanism to filter out think tags, LMStudio might as well
I don't think it's a thing with ollama per se. The model includes that in its response; Ollama doesn't add it.
Their stance is that it's for the frontend to determine what to hide and show.
And that they should be faithfully returning what the model says.
(From my reading of the GitHub issue at least)
does seem that there are fairly strong opinions on both sides
I did some tinkering with HA's Ollama integration code and it looks fairly easy to add the input boolean and filter. I think the bigger question is whether anyone at HA wants the option there. I do think we'll be seeing more thinking models so this issue will be ongoing when everyone wants the latest/greatest LLM but it's too wordy for Assist.
Might be worth at least opening a PR or feature request?
I agree, I am using Qwen3 now and it is substantially better than anything I have used, I bet others are gonna start migrating towards that and wanting those think tags gone
You're right, it's probably not difficult, but there is a bit of a disagreement over where the issue is. It's reasonable to question why the empty tags are there at all when /no_think is passed. The model is clearly receiving it, because it turns off the actual thinking text, just not the tags.
hearing my assist go "Think! Think! The temperature outside is 70 degrees" is a bit goofy 😄