Long wait time using ollama local model | Friends of the Crustacean 🦞🤝 | Page 1

full jolt Apr 3, 2026, 2:48 PM

Hi, I'm trying to use OpenClaw with Ollama qwen3.5:4b, but the response time is very long. I waited 5 hours and still got nothing.
But when I use the model without OpenClaw, it responds in 1 minute. I run OpenClaw on an M1 Mac mini and Ollama on a Windows 11 PC with 32 GB RAM, an RTX 3050 Ti, and an 11th-gen Intel Core i7.
Is there any way to fix this?

thorny totemBOT Apr 3, 2026, 2:48 PM

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

full jolt Apr 3, 2026, 2:52 PM

@empty geyser

empty geyserBOT Apr 3, 2026, 2:54 PM

Yeah — 5 hours is a hang, not normal latency.

Given your setup (OpenClaw on Mac, Ollama on Windows), the two most common causes are:

1. Wrong Ollama endpoint mode. If your OpenClaw config points to ...:11434/v1 (OpenAI-compatible mode), tool calling can stall or hang. Use the native Ollama API instead: http://<windows-host>:11434 (no /v1) with api: "ollama".
2. Model too small for agentic tool use. qwen3.5:4b can be okay for plain chat, but OpenClaw tool orchestration is harder than direct prompt chat. Small local models often loop or stall.
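
As a rough illustration of the endpoint point above, a small shell helper can turn an OpenAI-compatible URL into the native one by dropping a trailing /v1. The function name native_ollama_url and the host name windows-host are made up for this sketch; they are not part of OpenClaw or Ollama.

```shell
# Hypothetical helper: strip a trailing "/v1" (OpenAI-compatible path)
# so the URL points at the native Ollama API root.
native_ollama_url() {
  url=${1%/}       # drop a single trailing slash, if present
  url=${url%/v1}   # drop the OpenAI-compatible suffix, if present
  printf '%s\n' "$url"
}

native_ollama_url "http://windows-host:11434/v1"
```

The result is the value you would put in baseUrl, alongside api: "ollama".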

Please run these on the Mac and paste output:

openclaw --version
openclaw status
openclaw models status

Then run logs while sending one short test message:

openclaw logs --follow

If you have explicit Ollama provider config, it should look like this shape:

{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://<windows-host>:11434",
        api: "ollama",
        apiKey: "ollama-local"
      }
    }
  }
}

Then:

openclaw gateway restart

Docs:

Ollama provider: https://docs.openclaw.ai/providers/ollama
Local models reality check: https://docs.openclaw.ai/gateway/local-models
Troubleshooting ladder: https://docs.openclaw.ai/gateway/troubleshooting
FAQ quick checks: https://docs.openclaw.ai/help/faq

If you paste status/models status/log snippets, I’ll pinpoint the exact blocker.

full jolt Apr 4, 2026, 3:26 PM

version: OpenClaw 2026.4.2 (d74a122)

The status output is too big to send on Discord.

3: emma@Mac-mini-van-Sam ~ % openclaw models status

🦞 OpenClaw 2026.4.2 (d74a122) — If you can describe it, I can probably automate it—or at least make it funnier.

Config : ~/.openclaw/openclaw.json
Agent dir : ~/.openclaw/agents/main/agent
Default : ollama/qwen3.5:4b
Fallbacks (0) : -
Image model : -
Image fallbacks (0): -
Aliases (0) : -
Configured models (2): ollama/qwen3.5:4b, openai-codex/gpt-5.4

Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (1): openai-codex (1)

ollama effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | ollama:default=marker(ollama-local) | models.json=OL...EY | source=models.json: ~/.openclaw/agents/main/agent/models.json
openai-codex effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=1, token=0, api_key=0) | openai-codex:samvolont@outlook.com=OAuth (samvolont@outlook.com)

OAuth/token status

openai-codex
- openai-codex:samvolont@outlook.com (samvolont@outlook.com) ok expires in 8d
  emma@Mac-mini-van-Sam ~ %

empty geyserBOT Apr 4, 2026, 3:27 PM

5 hours is definitely not normal.

From your output, the most likely causes are:

1. OpenClaw is hitting Ollama via the wrong endpoint (/v1)
2. qwen3.5:4b is too weak for OpenClaw’s tool/context workload (can appear to “hang”)
3. Mac mini (gateway) → Windows Ollama network path is slow/failing

Please run these 3 checks:

openclaw config get models.providers.ollama

You want:

baseUrl like http://<windows-ip>:11434 (no /v1)
api: "ollama"

time curl -s http://<windows-ip>:11434/api/tags > /dev/null

Run this from the Mac mini. If this is slow/fails, it’s network/firewall.

openclaw agent --message "Reply with exactly OK"
openclaw logs --follow

If this hangs too, paste ~30 log lines around that run.

Also important: with local 4B models, OpenClaw can stall on tool-calling/formatting even if direct Ollama chat looks fine. Add a fallback model so it doesn’t get stuck forever when Ollama fails.

If you paste models.providers.ollama + the log snippet, I can give you the exact config fix.

full jolt Apr 4, 2026, 3:28 PM

the config:
emma@Mac-mini-van-Sam ~ % openclaw config get models.providers.ollama

🦞 OpenClaw 2026.4.2 (d74a122) — iMessage green bubble energy, but for everyone.

{
"baseUrl": "hidden",
"apiKey": "OPENCLAW_REDACTED",
"api": "ollama",
"models": [
{
"id": "kimi-k2.5:cloud",
"name": "kimi-k2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "minimax-m2.5:cloud",
"name": "minimax-m2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "glm-5:cloud",
"name": "glm-5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen3.5:9b",
"name": "qwen3.5:9b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 262144,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen3.5:4b",
"name": "qwen3.5:4b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 262144,
"maxTokens": 8192,
"api": "ollama"
}
]
}
emma@Mac-mini-van-Sam ~ %

empty geyserBOT Apr 4, 2026, 3:29 PM

Nice, this config dump is super helpful — and I think I see the main problem.

Your Ollama endpoint config is correct (api: "ollama" and no /v1), but both local Qwen models are set to contextWindow: 262144.

OpenClaw maps that to Ollama options.num_ctx, and on a 3050 Ti that can make generation extremely slow (or look hung for hours).
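
To get a feel for why a 262144-token window is so heavy, here is a back-of-the-envelope KV-cache estimate for a plain transformer with grouped-query attention. The layer, head, and precision numbers are placeholder assumptions, not the real qwen3.5 architecture, and the function name kv_cache_mib is invented for this sketch.

```shell
# Placeholder architecture values (assumptions, NOT qwen3.5's real ones).
LAYERS=32        # transformer layers
KV_HEADS=8       # key/value heads (grouped-query attention)
HEAD_DIM=128     # dimension per head
BYTES=2          # bytes per cached element (fp16)

# KV cache size in MiB for a given context length:
# 2 (keys + values) * layers * kv_heads * head_dim * bytes * tokens
kv_cache_mib() {
  echo $(( 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES * $1 / 1048576 ))
}

kv_cache_mib 262144   # prints 32768 (about 32 GiB)
kv_cache_mib 16384    # prints 2048 (about 2 GiB)
```

With these placeholder numbers, the cache at 262144 tokens is on the order of the whole 32 GB of system RAM mentioned at the top of the thread, while 16384 tokens needs roughly 2 GiB.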

Fix (quick)

Lower context window + output tokens for local models:

openclaw config set models.providers.ollama.models[3].contextWindow 16384 --strict-json
openclaw config set models.providers.ollama.models[4].contextWindow 8192 --strict-json
openclaw config set models.providers.ollama.models[3].maxTokens 2048 --strict-json
openclaw config set models.providers.ollama.models[4].maxTokens 1024 --strict-json
openclaw gateway restart

(From your pasted order: [3]=qwen3.5:9b, [4]=qwen3.5:4b.)

Then test

openclaw agent --message "Reply with exactly OK"

If that works fast, you found it.

If you still see long waits after this, next check is network latency Mac→Windows:

time curl -s http://100.115.73.115:11434/api/tags > /dev/null

If you want, paste the new openclaw config get models.providers.ollama after changes and I’ll sanity-check it.

full jolt Apr 4, 2026, 3:30 PM

I got an error from the commands:
zsh: no matches found: models.providers.ollama.models[3].contextWindow
zsh: no matches found: models.providers.ollama.models[4].contextWindow
zsh: no matches found: models.providers.ollama.models[3].maxTokens
zsh: no matches found: models.providers.ollama.models[4].maxTokens

empty geyserBOT Apr 4, 2026, 3:31 PM

Yep — that’s zsh globbing on [].
Use quotes around the config path (or escape brackets).
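
For reference, here is a small sketch of the two ways to pass the bracketed path as a literal argument. Both forms produce the same string; the variable names are only for illustration.

```shell
# Single quotes: the shell passes the brackets through untouched.
quoted='models.providers.ollama.models[3].contextWindow'

# Backslash escapes: each bracket is escaped individually.
escaped=models.providers.ollama.models\[3\].contextWindow

# Both spellings yield the same literal argument.
if [ "$quoted" = "$escaped" ]; then
  echo "same argument"
fi
```

zsh also has a noglob precommand modifier (noglob openclaw config set ...) that turns off globbing for a single command, if you prefer not to quote each path.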

Run this exactly:

openclaw config set 'models.providers.ollama.models[3].contextWindow' 16384 --strict-json
openclaw config set 'models.providers.ollama.models[4].contextWindow' 8192 --strict-json
openclaw config set 'models.providers.ollama.models[3].maxTokens' 2048 --strict-json
openclaw config set 'models.providers.ollama.models[4].maxTokens' 1024 --strict-json
openclaw gateway restart

Then verify:

openclaw config get models.providers.ollama

If you want, paste that output and I’ll check that the values landed correctly.

full jolt Apr 4, 2026, 3:32 PM

emma@Mac-mini-van-Sam ~ % openclaw config get models.providers.ollama

🦞 OpenClaw 2026.4.2 (d74a122) — I can't fix your code taste, but I can fix your build and your backlog.

{
"baseUrl": "http://100.115.73.115:11434",
"apiKey": "OPENCLAW_REDACTED",
"api": "ollama",
"models": [
{
"id": "kimi-k2.5:cloud",
"name": "kimi-k2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "minimax-m2.5:cloud",
"name": "minimax-m2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "glm-5:cloud",
"name": "glm-5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen3.5:9b",
"name": "qwen3.5:9b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 16384,
"maxTokens": 2048,
"api": "ollama"
},
{
"id": "qwen3.5:4b",
"name": "qwen3.5:4b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 8192,
"maxTokens": 1024,
"api": "ollama"
}
]
}
emma@Mac-mini-van-Sam ~ %

empty geyserBOT Apr 4, 2026, 3:32 PM

Perfect — those values are now set correctly ✅

Next step: test if latency is fixed.

openclaw agent --message "Reply with exactly OK"

Then, in the Discord thread, send:

/new

(so you start a fresh session and don’t carry old heavy context)

If it’s still slow, run this from the Mac mini and paste time/output:

time curl -s http://100.115.73.115:11434/api/chat \
-H 'Content-Type: application/json' \
-d '{"model":"qwen3.5:4b","messages":[{"role":"user","content":"Reply with OK"}],"stream":false,"options":{"num_ctx":8192}}'

If this curl is fast but OpenClaw is still slow, we’ll inspect openclaw logs --follow next.

full jolt Apr 4, 2026, 3:33 PM

error with the openclaw command:
emma@Mac-mini-van-Sam ~ % openclaw agent --message "Reply with exactly OK"

🦞 OpenClaw 2026.4.2 (d74a122) — Self-hosted, self-updating, self-aware (just kidding... unless?).

Gateway agent failed; falling back to embedded: Error: Pass --to <E.164>, --session-id, or --agent to choose a session
Error: Pass --to <E.164>, --session-id, or --agent to choose a session
emma@Mac-mini-van-Sam ~ %

empty geyserBOT Apr 4, 2026, 3:33 PM

Ah yep — that error is just because openclaw agent needs a session selector.

Use:

openclaw agent --agent main --message "Reply with exactly OK"

(or with JSON output)

openclaw agent --agent main --message "Reply with exactly OK" --json

You can also just test directly in this Discord thread now:

send /new

send Reply with exactly OK

If it still hangs after the context-window fix, next step is openclaw logs --follow while you send that test message.

full jolt Apr 4, 2026, 3:34 PM

Now we get another error:
emma@Mac-mini-van-Sam ~ % openclaw agent --agent main --message "Reply with exactly OK"

🦞 OpenClaw 2026.4.2 (d74a122) — No $999 stand required.

│
◇
Gateway agent failed; falling back to embedded: GatewayClientRequestError: FailoverError: Model context window too small (8192 tokens). Minimum is 16000.
17:33:55+02:00 [agent/embedded] low context window: ollama/qwen3.5:4b ctx=8192 (warn<32000) source=modelsConfig
17:33:55+02:00 [agent/embedded] blocked model (context window too small): ollama/qwen3.5:4b ctx=8192 (min=16000) source=modelsConfig
17:33:55+02:00 [diagnostic] lane task error: lane=main durationMs=571 error="FailoverError: Model context window too small (8192 tokens). Minimum is 16000."
17:33:55+02:00 [diagnostic] lane task error: lane=session:agent:main:main durationMs=573 error="FailoverError: Model context window too small (8192 tokens). Minimum is 16000."
17:33:55+02:00 [model-fallback/decision] model fallback decision: decision=candidate_failed requested=ollama/qwen3.5:4b candidate=ollama/qwen3.5:4b reason=unknown next=none
FailoverError: Model context window too small (8192 tokens). Minimum is 16000.
emma@Mac-mini-van-Sam ~ %

empty geyserBOT Apr 4, 2026, 3:34 PM

Great debug output — this explains it exactly.

OpenClaw enforces a minimum context window of 16000, so 8192 gets blocked.
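
The two thresholds visible in the log above (a hard minimum of 16000 and a warning below 32000) can be sketched as a quick pre-check before editing the config. This only illustrates those numbers; the function name classify_ctx is hypothetical, and this is not how OpenClaw itself implements the check.

```shell
MIN_CTX=16000    # from the log: "(min=16000)"
WARN_CTX=32000   # from the log: "(warn<32000)"

# Classify a contextWindow value against the two thresholds.
classify_ctx() {
  if [ "$1" -lt "$MIN_CTX" ]; then
    echo "blocked"
  elif [ "$1" -lt "$WARN_CTX" ]; then
    echo "warn"
  else
    echo "ok"
  fi
}

classify_ctx 8192     # prints: blocked
classify_ctx 16384    # prints: warn
classify_ctx 262144   # prints: ok
```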

Set 4b to 16384 (just above minimum), restart, and retest:

openclaw config set 'models.providers.ollama.models[4].contextWindow' 16384 --strict-json
openclaw gateway restart
openclaw agent --agent main --message "Reply with exactly OK"

If you want extra safety (so it doesn’t hard-fail), add a fallback too:

openclaw models fallbacks add ollama/qwen3.5:9b

You were very close — this is now a config threshold issue, not a mystery hang.

full jolt Apr 4, 2026, 3:36 PM

emma@Mac-mini-van-Sam ~ % openclaw agent --agent main --message "Reply with exactly OK"

🦞 OpenClaw 2026.4.2 (d74a122) — If something's on fire, I can't extinguish it—but I can write a beautiful postmortem.

│
◇
LLM request failed: network connection error.
emma@Mac-mini-van-Sam ~ %

empty geyserBOT Apr 4, 2026, 3:37 PM

Nice — we’ve moved past the model config issue.
Now this is a pure connectivity problem to Ollama (network connection error).

Please run this from the Mac mini:

curl -sS --max-time 8 http://100.115.73.115:11434/api/tags

If this fails or times out, OpenClaw can’t reach the Ollama host.
If this works, share the output and we’ll check the OpenClaw logs next.

Then on the Windows Ollama machine, verify Ollama is actually up:

ollama list

If curl from Mac fails, likely causes are:

Windows Ollama not running
Ollama not listening for remote connections
firewall/Tailscale route blocking port 11434
Tailscale IP changed (100.x address drift)
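
The exit status of the curl check already narrows down which of these causes applies. The sketch below maps a few standard curl exit codes to a short label; the function name explain_curl_exit and the suggested interpretations in the comments are assumptions for illustration, not OpenClaw output.

```shell
# Map a curl exit status to a rough diagnosis (hypothetical helper).
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;
    6)  echo "could not resolve host" ;;
    7)  echo "could not connect" ;;   # often: Ollama down, not listening remotely, or no route
    28) echo "timed out" ;;           # often: firewall drop or VPN/Tailscale path down
    *)  echo "other curl error ($1)" ;;
  esac
}

# Example with a fixed code; in practice pass "$?" right after the curl call.
explain_curl_exit 28
```

In practice this could be chained as: curl -sS --max-time 8 http://100.115.73.115:11434/api/tags >/dev/null; explain_curl_exit $?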

Also useful while testing:

openclaw logs --follow

Paste the result of the Mac curl and we can pinpoint exactly which of those it is.

full jolt Apr 4, 2026, 3:40 PM

The network connection error was because Tailscale wasn't active on the Mac 😅

empty geyserBOT Apr 4, 2026, 3:40 PM

Haha yep, that would do it 😄
Nice catch — that explains the network error perfectly.

So your final fix set is:

Tailscale active on Mac + Windows
Ollama context windows lowered (and qwen3.5:4b kept at >=16000, e.g. 16384)

You should be in much better shape now.

Glad that helped! 🎉
If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution.
This helps Answer Overflow index the solution for future users.

full jolt Apr 4, 2026, 3:56 PM

Reply from the agent:
emma@Mac-mini-van-Sam ~ % openclaw agent --agent main --message "Reply with exactly OK"

🦞 OpenClaw 2026.4.2 (d74a122) — Greetings, Professor Falken

│
◇
OK
emma@Mac-mini-van-Sam ~ %

empty geyserBOT Apr 4, 2026, 3:56 PM

Perfect — that confirms the model path is working again ✅ (OK response from openclaw agent).

You should now be good in Discord too (use /new once if the old thread context was huge).
