#Long wait time using ollama local model


full jolt
#

Hi, I'm trying to use OpenClaw with Ollama qwen3.5:4b, but the response time is very long. I waited 5 hours and still got nothing.
But when I use the model without OpenClaw, it responds in 1 min. I use an M1 Mac mini for OpenClaw and a Windows 11 PC with 32 GB RAM, an RTX 3050 Ti, and an 11th-gen Intel Core i7 for Ollama.
Is there any way to fix this?

thorny totemBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

full jolt
#

@empty geyser

empty geyserBOT
#

Yeah — 5 hours is a hang, not normal latency.

Given your setup (OpenClaw on Mac, Ollama on Windows), the two most common causes are:

  1. Wrong Ollama endpoint mode
    If your OpenClaw config points to ...:11434/v1 (OpenAI-compatible mode), tool calling can stall/hang.
    Use native Ollama API instead: http://<windows-host>:11434 (no /v1) and api: "ollama".

  2. Model too small for agentic tool use
    qwen3.5:4b can be okay for plain chat, but OpenClaw tool orchestration is harder than direct prompt chat. Small local models often loop or stall.
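A quick way to tell the two endpoint modes apart is to look at how the base URL ends. The sketch below is only illustrative: check_ollama_base_url is a hypothetical helper (not an OpenClaw command), and the host name is a placeholder.

```shell
# Hypothetical helper, not part of OpenClaw: classify a base URL by its suffix.
check_ollama_base_url() {
  # Drop one trailing slash so ".../v1/" and ".../v1" are treated the same.
  url="${1%/}"
  case "$url" in
    */v1) echo "openai-compatible (ends in /v1)" ;;
    *)    echo "native" ;;
  esac
}

check_ollama_base_url "http://windows-host:11434/v1"   # placeholder host
check_ollama_base_url "http://windows-host:11434"
```

If the first form shows up in your config, that is the one to change.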


Please run these on the Mac and paste output:

openclaw --version
openclaw status
openclaw models status

Then run logs while sending one short test message:

openclaw logs --follow

If you have explicit Ollama provider config, it should look like this shape:

{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://<windows-host>:11434",
        api: "ollama",
        apiKey: "ollama-local"
      }
    }
  }
}

Then:

full jolt
#

version: OpenClaw 2026.4.2 (d74a122)

status is too big to send on discord.

3: emma@Mac-mini-van-Sam ~ % openclaw models status

🦞 OpenClaw 2026.4.2 (d74a122) — If you can describe it, I can probably automate it—or at least make it funnier.

Config : ~/.openclaw/openclaw.json
Agent dir : ~/.openclaw/agents/main/agent
Default : ollama/qwen3.5:4b
Fallbacks (0) : -
Image model : -
Image fallbacks (0): -
Aliases (0) : -
Configured models (2): ollama/qwen3.5:4b, openai-codex/gpt-5.4

Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (1): openai-codex (1)

  • ollama effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | ollama:default=marker(ollama-local) | models.json=OL...EY | source=models.json: ~/.openclaw/agents/main/agent/models.json
  • openai-codex effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=1, token=0, api_key=0) | openai-codex:samvolont@outlook.com=OAuth (samvolont@outlook.com)

OAuth/token status

empty geyserBOT
#

5 hours is definitely not normal.

From your output, the most likely causes are:

  1. OpenClaw is hitting Ollama via the wrong endpoint (/v1)
  2. qwen3.5:4b is too weak for OpenClaw’s tool/context workload (can appear to “hang”)
  3. Mac mini (gateway) → Windows Ollama network path is slow/failing

Please run these 3 checks:

openclaw config get models.providers.ollama

You want:

  • baseUrl like http://<windows-ip>:11434 (no /v1)
  • api: "ollama"

time curl -s http://<windows-ip>:11434/api/tags > /dev/null

Run this from the Mac mini. If this is slow/fails, it’s network/firewall.

openclaw agent --message "Reply with exactly OK"
openclaw logs --follow

If this hangs too, paste ~30 log lines around that run.


Also important: with local 4B models, OpenClaw can stall on tool-calling/formatting even if direct Ollama chat looks fine. Add a fallback model so it doesn’t get stuck forever when Ollama fails.

full jolt
#

the config:
emma@Mac-mini-van-Sam ~ % openclaw config get models.providers.ollama

🦞 OpenClaw 2026.4.2 (d74a122) — iMessage green bubble energy, but for everyone.

{
"baseUrl": "hidden",
"apiKey": "OPENCLAW_REDACTED",
"api": "ollama",
"models": [
{
"id": "kimi-k2.5:cloud",
"name": "kimi-k2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "minimax-m2.5:cloud",
"name": "minimax-m2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "glm-5:cloud",
"name": "glm-5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen3.5:9b",
"name": "qwen3.5:9b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 262144,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen3.5:4b",
"name": "qwen3.5:4b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 262144,
"maxTokens": 8192,
"api": "ollama"
}
]
}
emma@Mac-mini-van-Sam ~ %

empty geyserBOT
#

Nice, this config dump is super helpful — and I think I see the main problem.

Your Ollama endpoint config is correct (api: "ollama" and no /v1), but both local Qwen models are set to contextWindow: 262144.

OpenClaw maps that to Ollama options.num_ctx, and on a 3050 Ti that can make generation extremely slow (or look hung for hours).
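As a rough illustration of why a 262144-token window is so heavy: the attention key/value cache grows linearly with num_ctx. The sketch below uses made-up architecture numbers (36 layers, 8 KV heads, head dimension 128, 2 bytes per element). They are assumptions for illustration, not the real qwen3.5 figures, and actual memory use depends on the model architecture and on how Ollama allocates and quantizes the cache.

```shell
# Rough KV-cache size estimate in MiB.
# Formula: 2 (keys + values) * layers * kv_heads * head_dim * context * bytes_per_element
kv_cache_mib() {
  layers=$1; kv_heads=$2; head_dim=$3; ctx=$4; bytes=$5
  echo $(( 2 * layers * kv_heads * head_dim * ctx * bytes / 1048576 ))
}

# Hypothetical architecture values, NOT taken from the qwen3.5 model card.
kv_cache_mib 36 8 128 262144 2   # prints 36864 (about 36 GiB)
kv_cache_mib 36 8 128 16384 2    # prints 2304  (about 2.25 GiB)
```

Under these assumed numbers, the cache alone at the full window is far larger than the memory on a 3050 Ti-class GPU, which would push work onto system RAM and the CPU.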

Fix (quick)

Lower context window + output tokens for local models:

openclaw config set models.providers.ollama.models[3].contextWindow 16384 --strict-json
openclaw config set models.providers.ollama.models[4].contextWindow 8192 --strict-json
openclaw config set models.providers.ollama.models[3].maxTokens 2048 --strict-json
openclaw config set models.providers.ollama.models[4].maxTokens 1024 --strict-json
openclaw gateway restart

(From your pasted order: [3]=qwen3.5:9b, [4]=qwen3.5:4b.)

Then test

openclaw agent --message "Reply with exactly OK"

If that works fast, you found it.


If you still see long waits after this, next check is network latency Mac→Windows:

time curl -s http://100.115.73.115:11434/api/tags > /dev/null

Docs:

full jolt
#

I got an error on the commands:
zsh: no matches found: models.providers.ollama.models[3].contextWindow
zsh: no matches found: models.providers.ollama.models[4].contextWindow
zsh: no matches found: models.providers.ollama.models[3].maxTokens
zsh: no matches found: models.providers.ollama.models[4].maxTokens

empty geyserBOT
#

Yep — that’s zsh globbing on [].
Use quotes around the config path (or escape brackets).
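For illustration, both forms below pass the bracketed path through to the command as literal text. show_arg is a throwaway helper that just prints its first argument; it stands in for openclaw config set here.

```shell
# Throwaway helper: print the first argument exactly as the command receives it.
show_arg() { printf '%s\n' "$1"; }

# Single quotes: the shell does not treat [3] as a glob pattern.
show_arg 'models.providers.ollama.models[3].contextWindow'

# Backslash escapes: same result without quotes.
show_arg models.providers.ollama.models\[3\].contextWindow
```

Both calls print the same path, brackets included.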

Run this exactly:

openclaw config set 'models.providers.ollama.models[3].contextWindow' 16384 --strict-json
openclaw config set 'models.providers.ollama.models[4].contextWindow' 8192 --strict-json
openclaw config set 'models.providers.ollama.models[3].maxTokens' 2048 --strict-json
openclaw config set 'models.providers.ollama.models[4].maxTokens' 1024 --strict-json
openclaw gateway restart

Then verify:

openclaw config get models.providers.ollama

If you want, paste that output and I’ll check that the values landed correctly.

full jolt
#

emma@Mac-mini-van-Sam ~ % openclaw config get models.providers.ollama

🦞 OpenClaw 2026.4.2 (d74a122) — I can't fix your code taste, but I can fix your build and your backlog.

{
"baseUrl": "http://100.115.73.115:11434",
"apiKey": "OPENCLAW_REDACTED",
"api": "ollama",
"models": [
{
"id": "kimi-k2.5:cloud",
"name": "kimi-k2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "minimax-m2.5:cloud",
"name": "minimax-m2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "glm-5:cloud",
"name": "glm-5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen3.5:9b",
"name": "qwen3.5:9b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 16384,
"maxTokens": 2048,
"api": "ollama"
},
{
"id": "qwen3.5:4b",
"name": "qwen3.5:4b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 8192,
"maxTokens": 1024,
"api": "ollama"
}
]
}
emma@Mac-mini-van-Sam ~ %

empty geyserBOT
#

Perfect — those values are now set correctly ✅

Next step: test if latency is fixed.

openclaw agent --message "Reply with exactly OK"

Then in Discord thread, send:

/new

(so you start a fresh session and don’t carry old heavy context)

If it’s still slow, run this from the Mac mini and paste time/output:

time curl -s http://100.115.73.115:11434/api/chat \
-H 'Content-Type: application/json' \
-d '{"model":"qwen3.5:4b","messages":[{"role":"user","content":"Reply with OK"}],"stream":false,"options":{"num_ctx":8192}}'

If this curl is fast but OpenClaw is still slow, we’ll inspect openclaw logs --follow next.

full jolt
#

error with the openclaw command:
emma@Mac-mini-van-Sam ~ % openclaw agent --message "Reply with exactly OK"

🦞 OpenClaw 2026.4.2 (d74a122) — Self-hosted, self-updating, self-aware (just kidding... unless?).

Gateway agent failed; falling back to embedded: Error: Pass --to <E.164>, --session-id, or --agent to choose a session
Error: Pass --to <E.164>, --session-id, or --agent to choose a session
emma@Mac-mini-van-Sam ~ %

empty geyserBOT
#

Ah yep — that error is just because openclaw agent needs a session selector.

Use:

openclaw agent --agent main --message "Reply with exactly OK"

(or with JSON output)

openclaw agent --agent main --message "Reply with exactly OK" --json

You can also just test directly in this Discord thread now:

  1. send /new
  2. send Reply with exactly OK

If it still hangs after the context-window fix, next step is openclaw logs --follow while you send that test message.

full jolt
#

Now we get another error:
emma@Mac-mini-van-Sam ~ % openclaw agent --agent main --message "Reply with exactly OK"

🦞 OpenClaw 2026.4.2 (d74a122) — No $999 stand required.



Gateway agent failed; falling back to embedded: GatewayClientRequestError: FailoverError: Model context window too small (8192 tokens). Minimum is 16000.
17:33:55+02:00 [agent/embedded] low context window: ollama/qwen3.5:4b ctx=8192 (warn<32000) source=modelsConfig
17:33:55+02:00 [agent/embedded] blocked model (context window too small): ollama/qwen3.5:4b ctx=8192 (min=16000) source=modelsConfig
17:33:55+02:00 [diagnostic] lane task error: lane=main durationMs=571 error="FailoverError: Model context window too small (8192 tokens). Minimum is 16000."
17:33:55+02:00 [diagnostic] lane task error: lane=session:agent:main:main durationMs=573 error="FailoverError: Model context window too small (8192 tokens). Minimum is 16000."
17:33:55+02:00 [model-fallback/decision] model fallback decision: decision=candidate_failed requested=ollama/qwen3.5:4b candidate=ollama/qwen3.5:4b reason=unknown next=none
FailoverError: Model context window too small (8192 tokens). Minimum is 16000.
emma@Mac-mini-van-Sam ~ %

empty geyserBOT
#

Great debug output — this explains it exactly.

OpenClaw enforces a minimum context window of 16000, so 8192 gets blocked.
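Going by the log lines above (min=16000, warn<32000), the check behaves roughly like this sketch. classify_ctx is a hypothetical name used for illustration, and the thresholds are simply the ones this OpenClaw version printed, so they may differ in other versions.

```shell
# Hypothetical re-creation of the threshold check seen in the logs.
classify_ctx() {
  ctx=$1
  if [ "$ctx" -lt 16000 ]; then
    echo "blocked"   # below the hard minimum, model is rejected
  elif [ "$ctx" -lt 32000 ]; then
    echo "warn"      # presumably accepted, but logged as a low context window
  else
    echo "ok"
  fi
}

classify_ctx 8192    # prints: blocked
classify_ctx 16384   # prints: warn
classify_ctx 32768   # prints: ok
```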

Set 4b to 16384 (just above minimum), restart, and retest:

openclaw config set 'models.providers.ollama.models[4].contextWindow' 16384 --strict-json
openclaw gateway restart
openclaw agent --agent main --message "Reply with exactly OK"

If you want extra safety (so it doesn’t hard-fail), add a fallback too:

openclaw models fallbacks add ollama/qwen3.5:9b

You were very close — this is now a config threshold issue, not a mystery hang.

full jolt
#

emma@Mac-mini-van-Sam ~ % openclaw agent --agent main --message "Reply with exactly OK"

🦞 OpenClaw 2026.4.2 (d74a122) — If something's on fire, I can't extinguish it—but I can write a beautiful postmortem.



LLM request failed: network connection error.
emma@Mac-mini-van-Sam ~ %

empty geyserBOT
#

Nice — we’ve moved past the model config issue.
Now this is a pure connectivity problem to Ollama (network connection error).

Please run this from the Mac mini:

curl -sS --max-time 8 http://100.115.73.115:11434/api/tags
  • If this fails or times out: OpenClaw can’t reach the Ollama host.
  • If this works: share the output and we’ll check the OpenClaw logs next.

Then on the Windows Ollama machine, verify Ollama is actually up:

ollama list

If curl from Mac fails, likely causes are:

  • Windows Ollama not running
  • Ollama not listening for remote connections
  • firewall/Tailscale route blocking port 11434
  • Tailscale IP changed (100.x address drift)
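curl's exit code can help tell these causes apart. The mapping below is a rough sketch: the exit codes (6, 7, 28) are curl's documented ones, but the suggested causes are guesses for this particular setup, and explain_curl_exit is a hypothetical helper.

```shell
# Hypothetical helper: turn a curl exit code into a likely cause for this setup.
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;
    6)  echo "could not resolve host: check the hostname" ;;
    7)  echo "could not connect: Ollama not running, or not listening for remote connections" ;;
    28) echo "timed out: firewall, Tailscale down, or wrong IP" ;;
    *)  echo "other curl error ($1)" ;;
  esac
}

# Usage (run on the Mac mini, with the same URL as the curl command above):
#   curl -sS --max-time 8 -o /dev/null http://100.115.73.115:11434/api/tags
#   explain_curl_exit $?
explain_curl_exit 28
```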

Also useful while testing:

openclaw logs --follow

Paste the result of the Mac curl and we can pinpoint exactly which of those it is.

full jolt
#

The network connection error was because Tailscale wasn't active on the Mac 😅

empty geyserBOT
#

Haha yep, that would do it 😄
Nice catch — that explains the network error perfectly.

So your final fix set is:

  • Tailscale active on Mac + Windows
  • Ollama context windows lowered (and qwen3.5:4b kept at >=16000, e.g. 16384)

You should be in much better shape now.

Glad that helped! 🎉
If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution.
This helps Answer Overflow index the solution for future users.

full jolt
#

Reply from the agent:
emma@Mac-mini-van-Sam ~ % openclaw agent --agent main --message "Reply with exactly OK"

🦞 OpenClaw 2026.4.2 (d74a122) — Greetings, Professor Falken



OK
emma@Mac-mini-van-Sam ~ %

empty geyserBOT
#

Perfect — that confirms the model path is working again ✅ (OK response from openclaw agent).

You should now be good in Discord too (use /new once if the old thread context was huge).
