model questions | Friends of the Crustacean 🦞🤝 | Page 1

blissful rapids Apr 20, 2026, 8:26 PM

#

Is it normal for the phi4-3.8b model to be unable to execute tool calls? It can send messages when integrated with openclaw, but it can't seem to tool call, although it knows how to?

solemn timberBOT Apr 20, 2026, 8:26 PM

#

sour roseBOT Apr 20, 2026, 8:26 PM

#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

solemn timberBOT Apr 20, 2026, 8:26 PM

#

sage hornetBOT Apr 20, 2026, 8:27 PM

#

Yeah, pretty normal unfortunately.

A 3.8B-class model can often talk about tools, but still be bad at actually emitting valid tool calls and using tool results correctly. In OpenClaw, small/local models often:

know a tool exists
sometimes manage very simple calls
fail on JSON/schema formatting, args, or multi-step tool use

So “it can message, but not really tool call” is a very believable failure mode.

If you want reliable tool use, I’d use a stronger model for the agent and keep the smaller one for plain chat. The docs also explicitly recommend avoiding older/weaker tiers for tool-enabled agents: https://docs.openclaw.ai/concepts/models

And for fresh real-world model reports, check the community notes here: https://docs.openclaw.ai/concepts/models#model-discussions-community-notes

If you want, I can help you figure out whether this is just model weakness vs a specific OpenClaw/provider config issue.

blissful rapids Apr 20, 2026, 8:58 PM

#

I like the phi4 3.8b decision making and speed, it sometimes gives answers like `Tool to be used: write

JSON object to execute: {
"name": "write",
"arguments": {...`

so i can see it's sort of trying to make a tool call, but when i look at the actions the model takes, it only does "messaging"

I've tried altering the modelfile, which doesn't really change much. this model is the only one out of all the local one's that have this behavior. Would it be better to try temporarily training this model to make tool calls or

sage hornetBOT Apr 20, 2026, 8:59 PM

#

That’s the classic failure mode, yeah: it’s describing a tool call instead of emitting one.

If OpenClaw only shows messaging, then from OpenClaw’s point of view the model returned plain assistant text, not a real structured tool call event. It won’t regex that back into a tool invocation, for obvious safety reasons.

So I would not start by training/fine-tuning it. That’s a lot of work, and for a small model it’s still usually brittle in an agent loop.

What I’d try first:

Check the serving stack/API mode
- LM Studio: tool calling depends on API mode
- local docs recommend Responses API for the better path: https://docs.openclaw.ai/gateway/local-models
Assume the model is the limiting factor
- small local models often know the format, but fail the exact schema/protocol
- OpenClaw docs are pretty blunt about avoiding weaker tiers for tool-enabled agents: https://docs.openclaw.ai/concepts/models
Use it as a fast chat/planning model, not the tool-execution model
- honestly this is probably the best fit if you like its speed/decision style

#

So short version: yes, normal, and the fact that it’s printing fake tool JSON is actually strong evidence that the issue is model/tool-protocol reliability, not just your Modelfile.

If you want, send:

openclaw --version
whether this is Ollama or LM Studio
and the provider/API mode you’re using

That’d make it easier to tell whether there’s also a stack-specific issue on top.

blissful rapids Apr 20, 2026, 9:00 PM

#

this is ollama

sage hornetBOT Apr 20, 2026, 9:01 PM

#

Ah, then first thing I’d check is this exact gotcha:

if your Ollama base URL ends in /v1, that’s probably the problem.

OpenClaw’s docs call this out pretty directly, because with Ollama’s OpenAI-compatible mode models can output raw tool JSON as plain text instead of making real tool calls:

Ollama docs: https://docs.openclaw.ai/providers/ollama

What you want is the native Ollama API, not OpenAI-compatible mode:

{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://127.0.0.1:11434", // no /v1

#

        api: "ollama"
      }
    }
  }
}

If you’re already using native Ollama mode, then I’d blame the model, not OpenClaw. In that case, I would not try training it first. Better order of operations:

Verify native Ollama mode (no /v1, api: "ollama")
Try a leaner tool surface for local models
OpenClaw has an experimental flag for smaller local backends:
agents.defaults.experimental.localModelLean: true
If it still prints fake tool JSON, treat it as a model limitation

So my take is:

#

Ollama /v1 → likely config/protocol problem
native Ollama API and still happening → likely phi4-3.8b just isn’t reliable enough for tool calling

If you paste your Ollama provider block, I can sanity-check it quickly.

blissful rapids Apr 20, 2026, 9:03 PM

#

its more like, every other ollama model works fine, qwen, mistral, oss, etc, just phi4 3.8b. I just want to know if its possible to train or somehow get the model to tool call like the others

sage hornetBOT Apr 20, 2026, 9:04 PM

#

Yeah, possible in principle, but I wouldn’t expect an easy win.

What you’d need is not really a “temporary training” tweak, but an actual SFT/LoRA-style fine-tune on tool-calling traces, using the exact Ollama/native tool-call format you want it to emit. A Modelfile prompt alone usually won’t bridge that gap.

My honest take:

Yes, you can try to train it
No, it’s probably not worth it for a 3.8B model unless you really want this as an experiment
best case, you get some improvement
worst case, you lose some of the decision-making/style you currently like

The key thing is that it must learn to produce real structured tool calls, not just text like:

Tool to be used: write
JSON object to execute: ...

That means training on actual assistant → tool_call → tool_result → assistant traces, not just instruction text.

#

So if your goal is practicality, I’d recommend:

keep phi4-3.8b for fast reasoning/chat
use a different model for actual tool execution

If your goal is experimentation, then yes, a LoRA/fine-tune on function-calling data is the right direction, but I’d treat it as a research project, not a quick fix.

blissful rapids Apr 20, 2026, 9:04 PM

#

aight thx

#model questions