# OpenClaw disconnects from external Ollama server


slender flower

I have an Ollama VM and an OpenClaw VM. I set up OpenClaw to use my local Ollama VM through the API. At first everything works and the agent does what it is told, but then I randomly get an error: either "This response is taking longer than expected. Send another message to continue." or "run error: LLM request timed out." I verified that the Ollama server was running the WHOLE time and was responding to API calls.

fallow needle (BOT)

First thing: what OpenClaw version are you on? Please run:

openclaw --version

This sounds less like “Ollama went down” and more like OpenClaw’s model request idle/HTTP watchdog firing. A server can still answer /api/tags or quick curl calls while a real /api/chat generation stalls long enough that OpenClaw aborts it.

A few things to check:

  1. Make sure you’re using Ollama’s native API URL, not OpenAI-compatible /v1:

models: {
  providers: {
    ollama: {
      api: "ollama",
      baseUrl: "http://YOUR_OLLAMA_VM:11434",
      timeoutSeconds: 300
    }
  }
}

No /v1 at the end. OpenClaw’s Ollama provider docs call this out because /v1 can break tool calling. https://docs.openclaw.ai/providers/ollama

  2. Increase the provider timeout, not just the whole agent timeout. For slow remote/local models, models.providers.<id>.timeoutSeconds extends the provider HTTP request + streaming idle watchdog. Start with 300, maybe 600 if the model is large/slow. Docs: https://docs.openclaw.ai/concepts/model-providers

  3. Run the lean smoke test from the OpenClaw VM, using the exact model ref:

OLLAMA_API_KEY=ollama-local \
  openclaw infer model run \
    --local \
    --model ollama/YOUR_MODEL \
    --prompt "Reply with exactly: pong" \
    --json

If that works but normal agent turns fail, the model is probably struggling with the full agent/tool prompt or long context rather than basic connectivity.

  4. Grab logs around the failure:
openclaw status --all
openclaw logs --follow

Also worth checking Ollama VM resources during a failing turn: CPU/GPU saturation, RAM/swap, context length, and whether /api/chat is producing streaming chunks or sitting quiet for minutes. Small/quantized Ollama models can start fine and then randomly choke once the session context/tools grow. 🦞

slender flower

The resources are fine the entire time; there is no overallocation.
openclaw --version
OpenClaw 2026.5.6 (c97b9f7)
"models": {
"mode": "merge",
"providers": {
"ollama": {
"baseUrl": "http://192.168.1.231:11434",
"api": "ollama",
"timeoutSeconds": 600,
"apiKey": "OLLAMA_API_KEY",

10:15:51+00:00 warn fetch-timeout {"subsystem":"fetch-timeout"} {"timeoutMs":300000,"elapsedMs":300000,"operation":"fetchWithSsrFGuard","url":"http://192.168.1.231:11434/api/chat"} fetch timeout reached; aborting operation
10:15:51+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"c430edbc-e5c9-4b44-b7e5-40dce280680f","isError":true,"error":"LLM request timed out.","failoverReason":"timeout","model":"glm-4.7-flash:latest","provider":"ollama","rawErrorPreview":"request timed out","rawErrorHash":"sha256:93deb3798f1b","providerRuntimeFailureKind":"timeout"} embedded run agent end
10:15:51+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_failover_decision","tags":["error_handling","failover","assistant","surface_error"],"runId":"c430edbc-e5c9-4b44-b7e5-40dce280680f","stage":"assistant","decision":"surface_error","failoverReason":"timeout","profileFailureReason":null,"provider":"ollama","model":"glm-4.7-flash:latest","sourceProvider":"ollama","sourceModel":"glm-4.7-flash:latest","profileId":"sha256:9c018ec112cf","fallbackConfigured":false,"timedOut":false,"aborted":false,"rawErrorPreview":"request timed out","rawErrorHash":"sha256:93deb3798f1b","providerRuntimeFailureKind":"timeout"} embedded run failover decision
10:15:51+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=main durationMs=1615607 error="FailoverError: LLM request timed out."
10:15:51+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=session:agent:main:main durationMs=1615609 error="FailoverError: LLM request timed out."

fallow needle (BOT)

That log is the important bit:

timeoutMs: 300000 ... /api/chat

So OpenClaw did not use your timeoutSeconds: 600 for that request — it aborted at 300s. If the active config had applied, I’d expect timeoutMs: 600000.

Two likely causes:

  1. Gateway is still running with older config / wasn’t restarted after the edit.
  2. The active provider/model config isn’t the same one you pasted — e.g. model selected from a different provider entry/profile, or an inline model entry/default is winning.

Please run:

openclaw models status
openclaw config get models.providers.ollama
openclaw gateway restart

Then trigger one failing request again and check whether the log says timeoutMs:600000 or still 300000.
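A small sketch for that check, assuming the fetch-timeout log lines keep the JSON shape from the paste above (a "timeoutMs":<number> field). The helper names extract_timeout_ms and check_timeout_applied are made up for illustration; they are not OpenClaw commands.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of OpenClaw): compare the timeout that a
# fetch-timeout log line reports with the timeoutSeconds you configured.

# Read log lines on stdin; print the timeoutMs value of each fetch-timeout line.
extract_timeout_ms() {
  sed -n '/fetch-timeout/s/.*"timeoutMs":\([0-9][0-9]*\).*/\1/p'
}

# Usage: check_timeout_applied <configured timeoutSeconds> < saved-log-file
# Looks only at the most recent fetch-timeout line in the input.
check_timeout_applied() {
  local configured_ms logged_ms
  configured_ms=$(( $1 * 1000 ))
  logged_ms=$(extract_timeout_ms | tail -n 1)
  if [ -z "$logged_ms" ]; then
    echo "no-timeout-logged"
  elif [ "$logged_ms" -eq "$configured_ms" ]; then
    echo "applied (timeoutMs=$logged_ms)"
  else
    echo "not-applied (log says $logged_ms, config implies $configured_ms)"
  fi
}
```

Save the relevant log lines to a file and run check_timeout_applied 600 < that-file. With the line you pasted earlier it would report not-applied, because 600 seconds should show up as 600000 ms.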

If it still says 300000 after restart, that smells like a config-resolution bug or a different provider path. In that case, please paste the output of:

openclaw models status --json
openclaw config get agents.defaults.model
openclaw config get agents.defaults.models
openclaw config get models.providers.ollama

One more diagnostic: from the OpenClaw VM, test the exact /api/chat path, not just /api/tags:

time curl -N http://192.168.1.231:11434/api/chat \
  -H 'Content-Type: application/json' \
  -d '{"model":"glm-4.7-flash:latest","stream":true,"messages":[{"role":"user","content":"Reply with exactly: pong"}]}'

If that can sit silent for 5+ minutes before the first chunk, OpenClaw is behaving as configured and the model/request is stalling. If curl streams quickly but OpenClaw times out, then we should focus on OpenClaw’s provider config/resolution. 🦞
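The time command only gives you the total, so here is a rough sketch for measuring how long the stream actually sits silent. It assumes the native /api/chat stream arrives as one JSON object per line; longest_gap is a hypothetical helper name, not something shipped with OpenClaw or Ollama.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of OpenClaw or Ollama): read a streamed
# response line by line on stdin and print the longest silent gap in whole
# seconds, including the wait before the first line arrives.
longest_gap() {
  local last now gap max=0 line
  last=$(date +%s)
  while IFS= read -r line; do
    now=$(date +%s)
    gap=$(( now - last ))
    if [ "$gap" -gt "$max" ]; then max=$gap; fi
    last=$now
  done
  echo "$max"
}

# Example (not executed here): pipe the same curl test into the helper.
# curl -sN http://192.168.1.231:11434/api/chat \
#   -H 'Content-Type: application/json' \
#   -d '{"model":"glm-4.7-flash:latest","stream":true,"messages":[{"role":"user","content":"Reply with exactly: pong"}]}' \
#   | longest_gap
```

If the printed number gets close to the timeoutMs value from the fetch-timeout log (divided by 1000), the stall is on the Ollama side. Resolution is whole seconds, which is enough for this purpose.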

slender flower

openclaw models status
openclaw config get models.providers.ollama
openclaw gateway restart

🦞 OpenClaw 2026.5.6 (c97b9f7) — It's not "failing," it's "discovering new ways to configure the same thing wrong."

Config : ~/.openclaw/openclaw.json
Agent dir : ~/.openclaw/agents/main/agent
Default : ollama/glm-4.7-flash:latest
Fallbacks (0) : -
Image model : -
Image fallbacks (0): -
Aliases (0) : -
Configured models (5): ollama/glm-4.7-flash:latest, openai/to qwen3-coder, ollama/qwen3-coder, ollama/qwen3-coder:30b, ollama/qwen2.5-coder:32b

Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (0): -

  • ollama effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | ollama:default=marker(ollama-local) | models.json=OL...EY | source=models.json: ~/.openclaw/agents/main/agent/models.json | synthetic=plugin-owned | source=models.providers.ollama (synthetic local key)

OAuth/token status

  • none

🦞 OpenClaw 2026.5.6 (c97b9f7) — I've read more man pages than any human should—so you don't have to.

{
"baseUrl": "http://192.168.1.231:11434",
"apiKey": "OPENCLAW_REDACTED",
"api": "ollama",
"timeoutSeconds": 600,

The time command output after running it twice:
real 0m7.104s
user 0m0.004s
sys 0m0.003s

real 0m1.524s
user 0m0.002s
sys 0m0.006s

fallow needle (BOT)

Good data. Two separate things here:

  • Your tiny direct /api/chat curl is fine, so basic network/Ollama availability is not the issue.
  • The original failing OpenClaw request was a full agent prompt request and hit a fetch watchdog at 300s.

The part I’m still suspicious of: your config says timeoutSeconds: 600, but the failing log showed timeoutMs:300000. After the openclaw gateway restart, please check the next failure log. If it now says timeoutMs:600000, the config is applying and the model is simply going silent too long on full agent turns. If it still says 300000, that’s likely a config-resolution/timeout propagation bug worth filing: https://github.com/openclaw/openclaw/issues

For a better reproduction than the tiny curl, run this from the OpenClaw VM:

OLLAMA_API_KEY=ollama-local \
  openclaw infer model run \
    --local \
    --model ollama/glm-4.7-flash:latest \
    --prompt "Reply with exactly: pong" \
    --json

If infer model run works but chat-agent turns fail, then GLM is probably choking on the larger agent/tool context. Things to try next:

/model ollama/qwen3-coder:30b

or temporarily reduce tool/context pressure for that Discord thread/agent. Local Ollama models can look “healthy” on small prompts and still stall once OpenClaw sends tools + history + system context.

Also: your configured list has openai/to qwen3-coder in it — that looks like a typo/stray model ref. Probably unrelated to this timeout, but I’d clean it up later so model selection doesn’t get weird.

slender flower

So how can I fix the problem? Why is it timing out randomly? For the first 2-3 minutes it does the task perfectly, and then I just get "run error: LLM request timed out." or "This response is taking longer than expected. Send another message to continue."

fallow needle (BOT)

The “random” part is usually because OpenClaw is not making one single Ollama call for the whole task.

A normal agent turn can look like:

  1. model call decides what to do
  2. tool runs
  3. model call reads tool result
  4. another tool runs
  5. model call summarizes / continues

So it can work perfectly for 2–3 minutes, then a later /api/chat call has much more history + tool schemas + tool output in the prompt. That call may sit silent while Ollama evaluates the huge prompt/context. If no bytes arrive before the watchdog, OpenClaw aborts it as LLM request timed out.
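To put a number on that prompt evaluation, Ollama's native API reports prompt_eval_count and prompt_eval_duration (in nanoseconds) in the final streamed object. A minimal sketch, assuming the stream is compact JSON with one object per line; prompt_eval_stats is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read an Ollama /api/chat stream on stdin and print how
# many prompt tokens were evaluated and how long that took.
# Assumes compact JSON, one object per line, with the last one marked "done":true.
prompt_eval_stats() {
  local final count ns
  final=$(sed -n '/"done":true/p' | tail -n 1)
  count=$(printf '%s\n' "$final" | sed -n 's/.*"prompt_eval_count":\([0-9][0-9]*\).*/\1/p')
  ns=$(printf '%s\n' "$final" | sed -n 's/.*"prompt_eval_duration":\([0-9][0-9]*\).*/\1/p')
  # Ollama reports durations in nanoseconds; convert to whole seconds.
  echo "prompt_tokens=${count:-unknown} prompt_eval_seconds=$(( ${ns:-0} / 1000000000 ))"
}
```

Pipe a curl -sN call against /api/chat into prompt_eval_stats, once with a tiny prompt and once with a long one, to see how prompt size changes the silent period. This only measures your own curl requests; it does not see the requests OpenClaw sends.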

I’d fix it in this order:

1. Confirm your timeout actually applies after restart

Next time it fails, check the log line. You want to see:

timeoutMs:600000

If it still says timeoutMs:300000, your provider timeout is not being applied to the active run, and that should be filed as a bug with your config + logs: https://github.com/openclaw/openclaw/issues

2. Turn on local model lean mode

This is the most practical fix for Ollama/local models choking on full agent context:

openclaw config set agents.defaults.experimental.localModelLean true
openclaw gateway restart

It removes the heaviest default tools (browser, cron, message) from the prompt, which often stops local models from stalling mid-task. Docs: https://docs.openclaw.ai/concepts/experimental-features#local-model-lean-mode


3. Use a fresh thread/session after changing it

Old sessions may already have a huge transcript/tool history. Test in a new thread or clear/switch session so you’re not carrying the same bloated context forward.

4. If it still stalls, reduce the task/tool surface

Ask it to do smaller phases, or temporarily use a narrower tool allowlist / fewer tools for that agent. If openclaw infer model run --gateway ... works but real agent tasks fail, the transport is fine — the model is struggling with agent-sized prompts.

5. Try another Ollama model for agentic work

Some local models are fine at chat but unreliable with tools. Your config has qwen3-coder:30b; I’d test that against the same task. If it survives longer, GLM is probably the weak link for this workload.

Short version: your Ollama VM can be healthy and still have a model call go silent because the later OpenClaw prompt is much heavier than your quick curl test. Lean mode + fresh session is the first real fix I’d try. 🦞

slender flower

How do I clear the entire history?

fallow needle (BOT)

If you just want this Discord thread/channel to start fresh, send this as a standalone message:

/reset

or:

/new

That starts a new session for the chat. It does not delete old transcript files, but the next prompt won’t carry the old conversation context. Docs: https://docs.openclaw.ai/concepts/session

If you mean truly wipe stored session history on disk, do a backup first, then reset sessions from the OpenClaw VM:

openclaw backup create
openclaw reset --dry-run
openclaw reset --scope config+creds+sessions --yes --non-interactive
openclaw gateway restart

Careful: config+creds+sessions also removes config/credentials, not just chat history. Session files live under:

~/.openclaw/agents/<agentId>/sessions/

For your Ollama timeout test, I’d start with just /reset or /new here, then retry after enabling lean mode.
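If you want to see how much transcript history is stored before deciding whether a full wipe is worth it, here is a read-only sketch. It assumes the sessions layout given above; list_session_sizes is a hypothetical helper name, not an OpenClaw command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the size in KiB of every agent's sessions
# directory under an OpenClaw home (defaults to ~/.openclaw). Read-only.
list_session_sizes() {
  local root dir
  root=${1:-$HOME/.openclaw}
  for dir in "$root"/agents/*/sessions; do
    # Skip the unexpanded glob when no agent has a sessions directory.
    [ -d "$dir" ] || continue
    du -sk "$dir"
  done
}
```

Run list_session_sizes with no arguments on the OpenClaw VM. It deletes nothing, so it is safe to run before any reset.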

slender flower

After clearing the chat, opening a new one in the TUI, and applying the fixes you suggested, I tried this:

prompt: can you check if apache2 is installed and if it is in the document root create a little cool CV website for myself. i am a sys admin

response: can you check if apache2 is installed and if it is in the document root create a little cool CV website for myself. i am a sys admin
It spun for around 30 seconds and then showed: This response is taking longer than expected. Send another message to continue.

10:15:51+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"c430edbc-e5c9-4b44-b7e5-40dce280680f","isError":true,"error":"LLM request timed out.","failoverReason":"timeout","model":"glm-4.7-flash:latest","provider":"ollama","rawErrorPreview":"request timed out","rawErrorHash":"sha256:93deb3798f1b","providerRuntimeFailureKind":"timeout"} embedded run agent end
10:15:51+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_failover_decision","tags":["error_handling","failover","assistant","surface_error"],"runId":"c430edbc-e5c9-4b44-b7e5-40dce280680f","stage":"assistant","decision":"surface_error","failoverReason":"timeout","profileFailureReason":null,"provider":"ollama","model":"glm-4.7-flash:latest","sourceProvider":"ollama","sourceModel":"glm-4.7-flash:latest","profileId":"sha256:9c018ec112cf","fallbackConfigured":false,"timedOut":false,"aborted":false,"rawErrorPreview":"request timed out","rawErrorHash":"sha256:93deb3798f1b","providerRuntimeFailureKind":"timeout"} embedded run failover decision
10:15:51+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=main durationMs=1615607 error="FailoverError: LLM request timed out."
10:15:51+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=session:agent:main:main durationMs=1615609 error="FailoverError: LLM request timed out."

fallow needle (BOT)

One catch: the log you pasted is still timestamped 10:15:51, so it looks like the old failure, not the new TUI run. For the new run, grab fresh logs immediately after it happens:

openclaw logs --follow

Then retry the prompt and watch for the new runId + fetch-timeout line.

Also: “This response is taking longer than expected. Send another message to continue.” is not always the same as “the model failed.” In the TUI/CLI it can mean the client-side wait timed out while the gateway run is still going. If you send another message like continue or check /status, the run may still be active.

For TUI, start it with a longer wait timeout:

openclaw tui --timeout-ms 600000

or if using the CLI agent path:

openclaw agent --timeout 600 "can you check if apache2 is installed..."

Now, for that Apache prompt specifically: it requires tool use (exec, filesystem writes, maybe elevated permissions). Local Ollama models often pass simple chat tests but struggle with OpenClaw’s full agent/tool loop. So I’d test in this order:

openclaw infer model run --gateway --model ollama/glm-4.7-flash:latest --prompt "Reply with exactly: pong" --json

Then test tool-calling with something tiny in TUI:

Run `which apache2 || which apachectl || echo no-apache` and tell me the output. Do not create files.

If that hangs or just echoes your prompt, GLM is not handling the tool loop reliably. Try switching to your Qwen coder model:

/model ollama/qwen3-coder:30b

Then /reset and retry the tiny command test.

If Qwen works and GLM doesn’t, the fix is “don’t use GLM for tool-enabled OpenClaw agent work.” If both fail, paste the fresh logs from the new run, especially any line with fetch-timeout, model.call, or tool_call.

slender flower

I have this issue with all models.

fallow needle (BOT)

If it happens with all Ollama models, I’d stop treating this as a bad model and split the two symptoms:

1. This response is taking longer than expected... after ~30s

That is usually the client wait timeout, not necessarily the agent failing. The Gateway may still be working. In TUI, start it with a longer wait:

openclaw tui --timeout-ms 600000

Then retry. While it spins, check from another shell:

openclaw status
openclaw logs --follow

2. Actual LLM request timed out

That only matters if you see a fresh fetch-timeout log for the new run. Your pasted log was still the old 10:15:51 run. For the new run, look for:

fetch-timeout ... timeoutMs:600000 ... /api/chat

If it still says 300000 after restart, that’s a bug/config-resolution issue. If it says 600000, then OpenClaw is waiting correctly and Ollama is leaving the /api/chat request silent too long.

For your Apache/CV task, also test whether tool calling works at all:

Run `date` using the shell tool and tell me the output. Do not do anything else.

If that also just spins/echoes, the issue is in the agent/tool loop, not Apache or the model choice.


One more important check: run the Gateway-path model test, not only raw curl:

OLLAMA_API_KEY=ollama-local \
openclaw infer model run \
  --gateway \
  --model ollama/glm-4.7-flash:latest \
  --prompt "Reply with exactly: pong" \
  --json

If --gateway works but TUI tool tasks don’t, focus on tool/runtime config. If --gateway also hangs or times out across all models, focus on OpenClaw → Ollama transport/config and file an issue with the fresh logs: https://github.com/openclaw/openclaw/issues

slender flower

Well, I tried openclaw tui --timeout-ms 600000 and ran the same prompt against the same model. It says "hobnobbing.... 5m 26 | connected", then it starts streaming and says connected, it does that for a minute, and then:
This response is taking longer than expected. Send another message to continue.

fallow needle (BOT)

Ah — that specific behavior is now clearly the TUI streaming watchdog, not necessarily Ollama disconnecting.

I checked the TUI code: that exact message is shown after 30 seconds without assistant text deltas while a run is active. --timeout-ms 600000 raises the agent timeout, but it does not change that TUI “no streaming text for 30s” watchdog.

So what’s happening is likely:

  1. OpenClaw sends a big tool/agent request to Ollama.
  2. Ollama is still working, but sends no assistant text for >30s — common with local models during prompt eval/tool planning.
  3. TUI says:
This response is taking longer than expected. Send another message to continue.

That message by itself does not prove the model failed.
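To keep the two limits apart, here is a simplified sketch. The 30 second TUI watchdog and the 600 second provider timeout are the values from this thread; classify_silence is a hypothetical helper, and it glosses over the fact that the TUI counts assistant text deltas while the provider watchdog counts bytes from Ollama.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given a silent gap in seconds, say which limit it trips.
# Usage: classify_silence <gap seconds> [tui limit, default 30] [provider limit, default 600]
classify_silence() {
  local gap tui_limit provider_limit
  gap=$1
  tui_limit=${2:-30}
  provider_limit=${3:-600}
  if [ "$gap" -ge "$provider_limit" ]; then
    echo "provider timeout: expect 'LLM request timed out' in the logs"
  elif [ "$gap" -ge "$tui_limit" ]; then
    echo "TUI watchdog only: run may still be active on the gateway"
  else
    echo "no limit reached"
  fi
}
```

For example, classify_silence 45 lands in the TUI-only case: the message appears in the TUI, but nothing has actually timed out on the gateway.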

What to do right now:

  • After that message appears, send:
continue
  • In another shell, watch whether the run is still active or actually failed:
openclaw status
openclaw logs --follow

If there is no fresh LLM request timed out / fetch-timeout line, the Gateway is probably still working and TUI just gave up waiting for text.

For a real workaround, use a surface that tolerates long silent local-model turns better, or keep sending continue in TUI. For actual fixing, this is worth a GitHub issue: TUI’s streaming watchdog is hardcoded too aggressively for local Ollama models that can legitimately go silent while evaluating/tool-planning.