# Long wait time using Ollama local model
@empty geyser
Yeah — 5 hours is a hang, not normal latency.
Given your setup (OpenClaw on Mac, Ollama on Windows), the two most common causes are:
- Wrong Ollama endpoint mode
  If your OpenClaw config points to ...:11434/v1 (OpenAI-compatible mode), tool calling can stall/hang.
  Use the native Ollama API instead: http://<windows-host>:11434 (no /v1) and api: "ollama".
- Model too small for agentic tool use
  qwen3.5:4b can be okay for plain chat, but OpenClaw tool orchestration is harder than direct prompt chat. Small local models often loop or stall.
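For the endpoint cause above, here is a minimal sketch of how you might normalize a base URL before putting it in the config. The function name `normalize_ollama_base_url` is made up for illustration and is not part of OpenClaw or Ollama; it only strips a trailing `/v1` so the URL points at the native API root.

```shell
# Hypothetical helper (not an OpenClaw or Ollama command): strip a trailing
# "/v1" and an optional trailing slash from an Ollama base URL.
normalize_ollama_base_url() {
  url=$1
  url=${url%/}     # drop one trailing slash, if present
  url=${url%/v1}   # drop a trailing /v1 (OpenAI-compatible path), if present
  printf '%s\n' "$url"
}

# The host below is the same placeholder used in the thread.
normalize_ollama_base_url 'http://<windows-host>:11434/v1'
normalize_ollama_base_url 'http://<windows-host>:11434'
```

Both calls should print the same URL without `/v1`, which is the value to use for baseUrl together with api: "ollama".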
Please run these on the Mac and paste output:
openclaw --version
openclaw status
openclaw models status
Then run logs while sending one short test message:
openclaw logs --follow
If you have explicit Ollama provider config, it should look like this shape:
{
models: {
providers: {
ollama: {
baseUrl: "http://<windows-host>:11434",
api: "ollama",
apiKey: "ollama-local"
}
}
}
}
Then:
openclaw gateway restart
Docs:
- Ollama provider: https://docs.openclaw.ai/providers/ollama
- Local models reality check: https://docs.openclaw.ai/gateway/local-models
- Troubleshooting ladder: https://docs.openclaw.ai/gateway/troubleshooting
- FAQ quick checks: https://docs.openclaw.ai/help/faq
If you paste status/models status/log snippets, I’ll pinpoint the exact blocker.
version: OpenClaw 2026.4.2 (d74a122)
status is too big to send on discord.
3: emma@Mac-mini-van-Sam ~ % openclaw models status
🦞 OpenClaw 2026.4.2 (d74a122) — If you can describe it, I can probably automate it—or at least make it funnier.
Config : ~/.openclaw/openclaw.json
Agent dir : ~/.openclaw/agents/main/agent
Default : ollama/qwen3.5:4b
Fallbacks (0) : -
Image model : -
Image fallbacks (0): -
Aliases (0) : -
Configured models (2): ollama/qwen3.5:4b, openai-codex/gpt-5.4
Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (1): openai-codex (1)
- ollama effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | ollama:default=marker(ollama-local) | models.json=OL...EY | source=models.json: ~/.openclaw/agents/main/agent/models.json
- openai-codex effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=1, token=0, api_key=0) | openai-codex:samvolont@outlook.com=OAuth (samvolont@outlook.com)
OAuth/token status
- openai-codex
- openai-codex:samvolont@outlook.com (samvolont@outlook.com) ok expires in 8d
emma@Mac-mini-van-Sam ~ %
5 hours is definitely not normal.
From your output, the most likely causes are:
- OpenClaw is hitting Ollama via the wrong endpoint (/v1)
- qwen3.5:4b is too weak for OpenClaw’s tool/context workload (can appear to “hang”)
- Mac mini (gateway) → Windows Ollama network path is slow/failing
Please run these 3 checks:
openclaw config get models.providers.ollama
You want:
- baseUrl like http://<windows-ip>:11434 (no /v1)
- api: "ollama"
time curl -s http://<windows-ip>:11434/api/tags > /dev/null
Run this from the Mac mini. If this is slow/fails, it’s network/firewall.
openclaw agent --message "Reply with exactly OK"
openclaw logs --follow
If this hangs too, paste ~30 log lines around that run.
Also important: with local 4B models, OpenClaw can stall on tool-calling/formatting even if direct Ollama chat looks fine. Add a fallback model so it doesn’t get stuck forever when Ollama fails.
Docs:
- https://docs.openclaw.ai/providers/ollama
- https://docs.openclaw.ai/gateway/local-models
- https://docs.openclaw.ai/gateway/troubleshooting
If you paste models.providers.ollama + the log snippet, I can give you the exact config fix.
the config:
emma@Mac-mini-van-Sam ~ % openclaw config get models.providers.ollama
🦞 OpenClaw 2026.4.2 (d74a122) — iMessage green bubble energy, but for everyone.
{
"baseUrl": "hidden",
"apiKey": "OPENCLAW_REDACTED",
"api": "ollama",
"models": [
{
"id": "kimi-k2.5:cloud",
"name": "kimi-k2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "minimax-m2.5:cloud",
"name": "minimax-m2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "glm-5:cloud",
"name": "glm-5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen3.5:9b",
"name": "qwen3.5:9b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 262144,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen3.5:4b",
"name": "qwen3.5:4b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 262144,
"maxTokens": 8192,
"api": "ollama"
}
]
}
emma@Mac-mini-van-Sam ~ %
Nice, this config dump is super helpful — and I think I see the main problem.
Your Ollama endpoint config is correct (api: "ollama" and no /v1), but both local Qwen models are set to contextWindow: 262144.
OpenClaw maps that to Ollama options.num_ctx, and on a 3050 Ti that can make generation extremely slow (or look hung for hours).
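To get a feel for why a huge num_ctx hurts, here is a rough back-of-the-envelope sketch of key/value cache size. It assumes a standard transformer with grouped-query attention and a 2-byte (fp16) cache. The layer and head numbers are made-up placeholders, not the real qwen3.5 architecture, and `kv_cache_mib` is a hypothetical helper name.

```shell
# Rough KV-cache estimate, in MiB:
#   2 (keys and values) * layers * kv_heads * head_dim * bytes_per_elem * ctx
# All model dimensions passed below are illustrative placeholders.
kv_cache_mib() {
  ctx=$1; layers=$2; kv_heads=$3; head_dim=$4; bytes_per_elem=$5
  echo $(( 2 * layers * kv_heads * head_dim * bytes_per_elem * ctx / 1024 / 1024 ))
}

kv_cache_mib 262144 32 8 128 2   # prints 32768 (about 32 GiB)
kv_cache_mib 16384 32 8 128 2    # prints 2048  (about 2 GiB)
```

Under these assumptions, the cache alone at 262144 tokens is far larger than the memory of a laptop-class GPU, so the runtime would likely have to spill to system RAM or CPU, which fits the "looks hung" symptom.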
Fix (quick)
Lower context window + output tokens for local models:
openclaw config set models.providers.ollama.models[3].contextWindow 16384 --strict-json
openclaw config set models.providers.ollama.models[4].contextWindow 8192 --strict-json
openclaw config set models.providers.ollama.models[3].maxTokens 2048 --strict-json
openclaw config set models.providers.ollama.models[4].maxTokens 1024 --strict-json
openclaw gateway restart
(From your pasted order: [3]=qwen3.5:9b, [4]=qwen3.5:4b.)
Then test
openclaw agent --message "Reply with exactly OK"
If that works fast, you found it.
If you still see long waits after this, next check is network latency Mac→Windows:
time curl -s http://100.115.73.115:11434/api/tags > /dev/null
Docs:
- https://docs.openclaw.ai/providers/ollama
- https://docs.openclaw.ai/gateway/local-models
- https://docs.openclaw.ai/cli/config
If you want, paste the new openclaw config get models.providers.ollama after changes and I’ll sanity-check it.
i got an error on the commands: zsh: no matches found: models.providers.ollama.models[3].contextWindow
zsh: no matches found: models.providers.ollama.models[4].contextWindow
zsh: no matches found: models.providers.ollama.models[3].maxTokens
zsh: no matches found: models.providers.ollama.models[4].maxTokens
Yep — that’s zsh globbing on [].
Use quotes around the config path (or escape brackets).
Run this exactly:
openclaw config set 'models.providers.ollama.models[3].contextWindow' 16384 --strict-json
openclaw config set 'models.providers.ollama.models[4].contextWindow' 8192 --strict-json
openclaw config set 'models.providers.ollama.models[3].maxTokens' 2048 --strict-json
openclaw config set 'models.providers.ollama.models[4].maxTokens' 1024 --strict-json
openclaw gateway restart
Then verify:
openclaw config get models.providers.ollama
If you want, paste that output and I’ll check that the values landed correctly.
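For context on the zsh error above: zsh treats `[3]` as a filename pattern and aborts when nothing matches, so the command never reaches openclaw. A small sketch showing that quoting and escaping hand the same literal string to the program (the variable names are only for illustration):

```shell
# Single quotes keep the brackets literal.
quoted='models.providers.ollama.models[3].contextWindow'
# Backslash-escaping each bracket gives the same string.
escaped=models.providers.ollama.models\[3\].contextWindow

printf '%s\n' "$quoted"
printf '%s\n' "$escaped"
```

Both lines print the identical path, so either form works as the argument to openclaw config set.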
emma@Mac-mini-van-Sam ~ % openclaw config get models.providers.ollama
🦞 OpenClaw 2026.4.2 (d74a122) — I can't fix your code taste, but I can fix your build and your backlog.
{
"baseUrl": "http://100.115.73.115:11434",
"apiKey": "OPENCLAW_REDACTED",
"api": "ollama",
"models": [
{
"id": "kimi-k2.5:cloud",
"name": "kimi-k2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "minimax-m2.5:cloud",
"name": "minimax-m2.5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "glm-5:cloud",
"name": "glm-5:cloud",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen3.5:9b",
"name": "qwen3.5:9b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 16384,
"maxTokens": 2048,
"api": "ollama"
},
{
"id": "qwen3.5:4b",
"name": "qwen3.5:4b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 8192,
"maxTokens": 1024,
"api": "ollama"
}
]
}
emma@Mac-mini-van-Sam ~ %
Perfect — those values are now set correctly ✅
Next step: test if latency is fixed.
openclaw agent --message "Reply with exactly OK"
Then, in the Discord thread, send:
/new
(so you start a fresh session and don’t carry old heavy context)
If it’s still slow, run this from the Mac mini and paste time/output:
time curl -s http://100.115.73.115:11434/api/chat \
-H 'Content-Type: application/json' \
-d '{"model":"qwen3.5:4b","messages":[{"role":"user","content":"Reply with OK"}],"stream":false,"options":{"num_ctx":8192}}'
If this curl is fast but OpenClaw is still slow, we’ll inspect openclaw logs --follow next.
error with the openclaw command:
emma@Mac-mini-van-Sam ~ % openclaw agent --message "Reply with exactly OK"
🦞 OpenClaw 2026.4.2 (d74a122) — Self-hosted, self-updating, self-aware (just kidding... unless?).
Gateway agent failed; falling back to embedded: Error: Pass --to <E.164>, --session-id, or --agent to choose a session
Error: Pass --to <E.164>, --session-id, or --agent to choose a session
emma@Mac-mini-van-Sam ~ %
Ah yep — that error is just because openclaw agent needs a session selector.
Use:
openclaw agent --agent main --message "Reply with exactly OK"
(or with JSON output)
openclaw agent --agent main --message "Reply with exactly OK" --json
You can also just test directly in this Discord thread now:
- send
/new
- send
Reply with exactly OK
If it still hangs after the context-window fix, next step is openclaw logs --follow while you send that test message.
now we get another error: emma@Mac-mini-van-Sam ~ % openclaw agent --agent main --message "Reply with exactly OK"
🦞 OpenClaw 2026.4.2 (d74a122) — No $999 stand required.
│
◇
Gateway agent failed; falling back to embedded: GatewayClientRequestError: FailoverError: Model context window too small (8192 tokens). Minimum is 16000.
17:33:55+02:00 [agent/embedded] low context window: ollama/qwen3.5:4b ctx=8192 (warn<32000) source=modelsConfig
17:33:55+02:00 [agent/embedded] blocked model (context window too small): ollama/qwen3.5:4b ctx=8192 (min=16000) source=modelsConfig
17:33:55+02:00 [diagnostic] lane task error: lane=main durationMs=571 error="FailoverError: Model context window too small (8192 tokens). Minimum is 16000."
17:33:55+02:00 [diagnostic] lane task error: lane=session:agent:main:main durationMs=573 error="FailoverError: Model context window too small (8192 tokens). Minimum is 16000."
17:33:55+02:00 [model-fallback/decision] model fallback decision: decision=candidate_failed requested=ollama/qwen3.5:4b candidate=ollama/qwen3.5:4b reason=unknown next=none
FailoverError: Model context window too small (8192 tokens). Minimum is 16000.
emma@Mac-mini-van-Sam ~ %
Great debug output — this explains it exactly.
OpenClaw enforces a minimum context window of 16000, so 8192 gets blocked.
Set 4b to 16384 (just above minimum), restart, and retest:
openclaw config set 'models.providers.ollama.models[4].contextWindow' 16384 --strict-json
openclaw gateway restart
openclaw agent --agent main --message "Reply with exactly OK"
If you want extra safety (so it doesn’t hard-fail), add a fallback too:
openclaw models fallbacks add ollama/qwen3.5:9b
You were very close — this is now a config threshold issue, not a mystery hang.
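If you want to sanity-check a value before setting it, here is a small sketch of the two thresholds visible in the log lines above (min=16000 and warn<32000). The function name `check_context_window` is hypothetical, and treating exactly 16000 as allowed is an assumption based on the "Minimum is 16000" wording.

```shell
# Hypothetical pre-check mirroring the thresholds from the log output:
#   below 16000 -> blocked, below 32000 -> warning, otherwise ok.
check_context_window() {
  ctx=$1
  if [ "$ctx" -lt 16000 ]; then
    echo "blocked"
  elif [ "$ctx" -lt 32000 ]; then
    echo "warn"
  else
    echo "ok"
  fi
}

check_context_window 8192    # prints "blocked"
check_context_window 16384   # prints "warn"
```

So 16384 should get past the hard block, though it may still produce the low-context warning.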
emma@Mac-mini-van-Sam ~ % openclaw agent --agent main --message "Reply with exactly OK"
🦞 OpenClaw 2026.4.2 (d74a122) — If something's on fire, I can't extinguish it—but I can write a beautiful postmortem.
│
◇
LLM request failed: network connection error.
emma@Mac-mini-van-Sam ~ %
Nice — we’ve moved past the model config issue.
Now this is a pure connectivity problem to Ollama (network connection error).
Please run this from the Mac mini:
curl -sS --max-time 8 http://100.115.73.115:11434/api/tags
- If this fails or times out: OpenClaw can’t reach the Ollama host.
- If this works: share output + we check OpenClaw logs next.
Then on the Windows Ollama machine, verify Ollama is actually up:
ollama list
If curl from Mac fails, likely causes are:
- Windows Ollama not running
- Ollama not listening for remote connections
- firewall/Tailscale route blocking port 11434
- Tailscale IP changed (100.x address drift)
Also useful while testing:
openclaw logs --follow
Paste the result of the Mac curl and we can pinpoint exactly which of those it is.
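To narrow down which cause applies, the curl exit status is often enough. A minimal sketch, where `explain_curl_exit` is a hypothetical helper and the mapping from exit code to likely cause is a rough guide rather than a definitive diagnosis:

```shell
# Hypothetical helper: translate common curl exit codes into a short label.
# Codes 6, 7 and 28 are documented in the curl man page.
explain_curl_exit() {
  case $1 in
    0)  echo "reachable" ;;
    6)  echo "could not resolve host" ;;
    7)  echo "connection failed" ;;   # service down, not listening remotely, or blocked
    28) echo "timed out" ;;           # often a route/firewall/VPN problem
    *)  echo "other curl error" ;;
  esac
}

# Example with the same check as in the thread (run from the Mac mini):
#   curl -sS --max-time 8 http://100.115.73.115:11434/api/tags >/dev/null
#   explain_curl_exit $?
```

The live curl call is left commented out so the snippet can be pasted without touching the network.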
the network connection error was because my Tailscale wasn't active on the Mac 😅
Haha yep, that would do it 😄
Nice catch — that explains the network error perfectly.
So your final fix set is:
- Tailscale active on Mac + Windows
- Ollama context windows lowered (and qwen3.5:4b kept at >=16000, e.g. 16384)
You should be in much better shape now.
Glad that helped! 🎉
If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution.
This helps Answer Overflow index the solution for future users.
reply from the agent: emma@Mac-mini-van-Sam ~ % openclaw agent --agent main --message "Reply with exactly OK"
🦞 OpenClaw 2026.4.2 (d74a122) — Greetings, Professor Falken
│
◇
OK
emma@Mac-mini-van-Sam ~ %
Perfect — that confirms the model path is working again ✅ (OK response from openclaw agent).
You should now be good in Discord too (use /new once if the old thread context was huge).