#Increase undici timeout

near raft
#

I’m trying to run OpenClaw on an Oracle A1 VPS (4 CPUs and 24GB of memory) with a local LLM (ollama/qwen3.5:9b). I don’t mind if the agent is slow, because I will give it asynchronous tasks anyway and it can take hours or days to run them.

I increased the OpenClaw timeout, but the HTTP client undici seems to have a hardcoded timeout somewhere: the request fails if the model cannot ingest the context and produce the first output token in less than 5 minutes.

How can I increase this one as well?

snow dragonBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

grim flameBOT
#

First, check openclaw --version.

This is usually not just one timeout. For slow local Ollama runs, the important knobs are:

  • agents.defaults.timeoutSeconds = whole agent run
  • agents.defaults.llm.idleTimeoutSeconds = how long OpenClaw waits for the first/next model chunk

Try setting both explicitly, then restart the gateway:

openclaw config set agents.defaults.timeoutSeconds 21600
openclaw config set agents.defaults.llm.idleTimeoutSeconds 21600
openclaw gateway restart

If you truly do not care about hung model calls, you can disable the LLM idle watchdog entirely:

openclaw config set agents.defaults.llm.idleTimeoutSeconds 0
openclaw gateway restart

Also, there was a real Ollama-related fix in 2026.4.14: slow embedded Ollama runs could still hit the global undici stream cutoff instead of honoring the configured run timeout. So if you're on an older version, update first.

If you paste:

openclaw --version
openclaw status --all
openclaw logs --follow

and the exact timeout line, I can tell you which layer is still firing.

Also, gently, qwen3.5:9b on 4 A1 CPUs is going to be pretty brutal on first-token latency with big contexts, so a smaller quant/model may help more than just increasing timeouts.

near raft
#

06:04:31+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"d8914715-5c17-43cb-b0e5-c4678399d6b1","isError":true,"error":"LLM request failed: network connection error.","failoverReason":"timeout","model":"qwen3.5:9b","provider":"ollama","rawErrorPreview":"fetch failed | Headers Timeout Error","rawErrorHash":"sha256:7aa8f0ebe2ea","providerRuntimeFailureKind":"timeout"} embedded run agent end
06:09:39+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"d8914715-5c17-43cb-b0e5-c4678399d6b1","isError":true,"error":"LLM request failed: network connection error.","failoverReason":"timeout","model":"qwen3.5:9b","provider":"ollama","rawErrorPreview":"fetch failed | Headers Timeout Error","rawErrorHash":"sha256:7aa8f0ebe2ea","providerRuntimeFailureKind":"timeout"} embedded run agent end
06:09:39+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_failover_decision","tags":["error_handling","failover","assistant","surface_error"],"runId":"d8914715-5c17-43cb-b0e5-c4678399d6b1","stage":"assistant","decision":"surface_error","failoverReason":"timeout","profileFailureReason":null,"provider":"ollama","model":"qwen3.5:9b","sourceProvider":"ollama","sourceModel":"qwen3.5:9b","profileId":"sha256:9c018ec112cf","fallbackConfigured":false,"timedOut":false,"aborted":false,"rawErrorPreview":"fetch failed | Headers Timeout Error","rawErrorHash":"sha256:7aa8f0ebe2ea","providerRuntimeFailureKind":"timeout"} embedded run failover decision

grim flameBOT
#

Yep, that log is the smoking gun.

fetch failed | Headers Timeout Error means you’re hitting the undici HTTP-layer timeout, not the normal OpenClaw timeoutSeconds watchdog.

The important bit is this:

  • There is not a separate public config knob just for undici headers/body timeout
  • On newer OpenClaw, that timeout is supposed to be derived from the embedded run timeout
  • This specific Ollama wiring bug was fixed in 2026.4.14

So if you’re seeing a hard ~5 minute cutoff, my bet is you’re on an older build.
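
The ~5 minute figure can be read straight off the pasted log. Here is a minimal Python sketch, assuming the line layout shown above (timestamp, level, subsystem, then two JSON objects); the payloads are trimmed to the one field used here, and `parse_line` is a name made up for this example, not an OpenClaw API.

```python
import json
from datetime import datetime

# Two of the pasted log lines, with the JSON payload trimmed to the fields used here.
LINES = [
    '06:04:31+00:00 warn agent/embedded {"subsystem":"agent/embedded"} '
    '{"event":"embedded_run_agent_end","rawErrorPreview":"fetch failed | Headers Timeout Error"} '
    'embedded run agent end',
    '06:09:39+00:00 warn agent/embedded {"subsystem":"agent/embedded"} '
    '{"event":"embedded_run_agent_end","rawErrorPreview":"fetch failed | Headers Timeout Error"} '
    'embedded run agent end',
]


def parse_line(line):
    """Return (timestamp, payload) for a line shaped like the ones pasted above."""
    when = datetime.strptime(line.split(" ", 1)[0], "%H:%M:%S%z")
    decoder = json.JSONDecoder()
    # The first JSON object is the {"subsystem": ...} tag; the second is the event payload.
    _, end = decoder.raw_decode(line, line.index("{"))
    payload, _ = decoder.raw_decode(line, line.index("{", end))
    return when, payload


first, second = (parse_line(line) for line in LINES)
gap_seconds = int((second[0] - first[0]).total_seconds())
print(gap_seconds, second[1]["rawErrorPreview"])
```

The two failures are 308 seconds apart. That would be consistent with undici's documented default `headersTimeout` of 300 seconds plus a few seconds of overhead, although the log alone does not prove which timer fired.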

What I’d do

  1. Check version:

    openclaw --version

  2. If it’s older than 2026.4.14, update first:
    https://docs.openclaw.ai/install/updating

  3. After updating, set a large run timeout and LLM idle timeout:

    openclaw config set agents.defaults.timeoutSeconds 21600
    openclaw config set agents.defaults.llm.idleTimeoutSeconds 21600
    openclaw gateway restart

  4. If you really want “wait forever for first token” behavior:

    openclaw config set agents.defaults.llm.idleTimeoutSeconds 0
    openclaw gateway restart

If you paste openclaw --version, I can tell you immediately whether this is just “needs update” or something weirder.

near raft
#

OpenClaw 2026.4.15 (041266a)

grim flameBOT
#

Ah, thanks, that changes the diagnosis.

Since you’re on 2026.4.15, you already have the known Ollama timeout fix.
And your error is specifically:

fetch failed | Headers Timeout Error

So this is not the normal agents.defaults.llm.idleTimeoutSeconds watchdog firing. It is dying before the HTTP response headers arrive.

What I think is happening

For Ollama, OpenClaw sends num_ctx from the model’s configured/discovered context window. On a 4-core A1 CPU, if that window is large, qwen3.5:9b can spend forever ingesting prompt/context before it sends anything back, and then undici times out waiting for headers.

So the best workaround is to shrink the model context hard.
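
Before changing any config, it may be worth confirming that Ollama itself needs more than 5 minutes to start answering, independent of OpenClaw. This is a rough Python sketch that times how long the response headers take to arrive from Ollama's `/api/generate` endpoint. It assumes Ollama is listening on its default port 11434 on the same machine. The helper names are made up for this example, and the prompt you pass in should be about as long as what the agent really sends.

```python
import http.client
import json
import time


def build_body(model, prompt, num_ctx):
    """JSON body for Ollama's /api/generate with an explicit context size."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": True,
        "options": {"num_ctx": num_ctx},
    })


def seconds_until_headers(model, prompt, num_ctx, host="127.0.0.1", port=11434):
    """Send one request and return (HTTP status, seconds until headers arrived)."""
    # timeout=None means the socket blocks indefinitely, so this client never gives up early.
    conn = http.client.HTTPConnection(host, port, timeout=None)
    started = time.monotonic()
    conn.request(
        "POST",
        "/api/generate",
        body=build_body(model, prompt, num_ctx),
        headers={"Content-Type": "application/json"},
    )
    response = conn.getresponse()  # returns once the status line and headers are read
    elapsed = time.monotonic() - started
    status = response.status
    conn.close()
    return status, elapsed


# Example call (needs a running Ollama, so it is left commented out):
# print(seconds_until_headers("qwen3.5:9b", "your long prompt here", 16384))
```

If the measured time is well above 300 seconds, the model is simply too slow for the current context size. If it is comfortably below, the cutoff is more likely coming from somewhere else.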

Try this

Add an explicit Ollama model entry with a much smaller context window, for example 8192 or 16384:

{
  models: {
    providers: {
      ollama: {
        models: [
          {
            id: "qwen3.5:9b",
            contextWindow: 8192,
            maxTokens: 1024
          }
        ]
      }
    }
  }
}

Then restart:

openclaw gateway restart

I’d start with 8192 on that VPS.

Important nuance

Because this is a headers timeout, changing only:

  • agents.defaults.timeoutSeconds
  • agents.defaults.llm.idleTimeoutSeconds

probably won’t solve this specific failure. Those help after the model request is underway, but here the client is still waiting for the first HTTP response.

If it still dies at ~5 minutes after lowering context

Then I’d call it a likely bug in the guarded Ollama fetch path, and it should go to GitHub:

https://github.com/openclaw/openclaw/issues

If you want, paste your current Ollama model config and I can suggest a sane contextWindow/maxTokens for that A1 box.

near raft
#

Isn’t 8192 too low?

grim flameBOT
#

Yep, good catch, my 8192 suggestion was too low.

On the OpenClaw side, the hard floor is actually 16,000 tokens, and it starts warning below 32,000. So for your setup I’d try:

  • 16384 first
  • then 24576 or 32768 if needed

So something like:

{
  models: {
    providers: {
      ollama: {
        models: [
          {
            id: "qwen3.5:9b",
            contextWindow: 16384,
            maxTokens: 1024
          }
        ]
      }
    }
  }
}

Then restart the gateway.
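
To sanity-check a candidate value before editing the config, the two thresholds quoted in this thread (hard floor at 16,000 tokens, warning below 32,000) can be written as a tiny Python check. This is only a sketch of those figures; the function name and the returned labels are made up here and are not OpenClaw's actual validation code.

```python
HARD_FLOOR = 16_000  # per this thread: smaller windows are not valid for OpenClaw
WARN_BELOW = 32_000  # per this thread: OpenClaw starts warning below this size


def classify_context_window(tokens):
    """Label a contextWindow value using the thresholds quoted in this thread."""
    if tokens < HARD_FLOOR:
        return "below floor"
    if tokens < WARN_BELOW:
        return "warning"
    return "ok"


for candidate in (8192, 16384, 24576, 32768):
    print(candidate, classify_context_window(candidate))
```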

So the short answer is:

  • Yes, you should stay at or above ~16k
  • No, 8192 is not valid for OpenClaw here
  • For that A1 box, 16384 is the smallest sensible value to try

If it still times out at 16384, I’d try 24576 once, and if it still hard-fails at ~5 minutes, I’d start suspecting a real bug rather than just oversized context.
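
One way to tell "oversized context" apart from "real bug" is a back-of-the-envelope estimate of how long prompt ingestion takes. The Python sketch below assumes the cutoff is 300 seconds and uses a made-up prompt-eval rate of 40 tokens/s. That rate is a placeholder, not a measurement of this VPS, so swap in a number measured on your own box. It also treats a completely full window as the worst case.

```python
HEADERS_TIMEOUT_S = 300  # the ~5 minute cutoff seen in the log


def ingest_seconds(prompt_tokens, tokens_per_second):
    """Estimated time to ingest a prompt at a constant prompt-eval rate."""
    return prompt_tokens / tokens_per_second


def max_prompt_tokens(tokens_per_second, budget_s=HEADERS_TIMEOUT_S):
    """Largest prompt that can be ingested before the budget runs out."""
    return int(tokens_per_second * budget_s)


ASSUMED_RATE = 40.0  # hypothetical tokens/s; replace with your measured rate

for window in (16384, 24576, 32768):
    secs = ingest_seconds(window, ASSUMED_RATE)
    verdict = "fits within" if secs < HEADERS_TIMEOUT_S else "exceeds"
    print(f"{window} tokens: about {secs:.0f}s, {verdict} {HEADERS_TIMEOUT_S}s")

print("largest prompt that fits:", max_prompt_tokens(ASSUMED_RATE))
```

If even a full 16384-token prompt cannot be ingested inside the budget at your measured rate, the timeout is expected behavior for that hardware rather than a sign of a bug.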

near raft
#

What should I write in the issue report? Can you write it for me?

grim flameBOT
#

Yep, I can draft it. I can’t file it for you, but this should be close to paste-ready for https://github.com/openclaw/openclaw/issues.

Title

Ollama local model still hits ~5 minute "Headers Timeout Error" on 2026.4.15 despite increased OpenClaw timeouts

Body

## Summary

When running OpenClaw against a slow local Ollama model, the request still appears to die at about 5 minutes with:

`fetch failed | Headers Timeout Error`

This happens even after increasing the OpenClaw timeout, and on a version that already includes the Ollama timeout fix (`2026.4.15`).

It looks like the failure happens before the first response token/chunk arrives, so this seems to be the HTTP/undici layer rather than the normal OpenClaw LLM idle timeout.

## Environment

- OpenClaw: `2026.4.15 (041266a)`
- Host: Oracle A1 VPS
- CPU / RAM: 4 CPUs, 24 GB RAM
- Model provider: local Ollama
- Model: `qwen3.5:9b`
- Use case: async/background agent tasks, so slow execution is acceptable

## What I expected

If I increase the OpenClaw timeout, I expect slow local Ollama runs to keep waiting for the first token/chunk instead of failing after about 5 minutes.

## What actually happens

The run fails after about 5 minutes with a timeout that looks like an undici headers timeout:

```text
06:04:31+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"d8914715-5c17-43cb-b0e5-c4678399d6b1","isError":true,"error":"LLM request failed: network connection error.","failoverReason":"timeout","model":"qwen3.5:9b","provider":"ollama","rawErrorPreview":"fetch failed | Headers Timeout Error","rawErrorHash":"sha256:7aa8f0ebe2ea","providerRuntimeFailureKind":"timeout"} embedded run agent end

06:09:39+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"d8914715-5c17-43cb-b0e5-c4678399d6b1","isError":true,"error":"LLM request failed: network connection error.","failoverReason":"timeout","model":"qwen3.5:9b","provider":"ollama","rawErrorPreview":"fetch failed | Headers Timeout Error","rawErrorHash":"sha256:7aa8f0ebe2ea","providerRuntimeFailureKind":"timeout"} embedded run agent end

06:09:39+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_failover_decision","tags":["error_handling","failover","assistant","surface_error"],"runId":"d8914715-5c17-43cb-b0e5-c4678399d6b1","stage":"assistant","decision":"surface_error","failoverReason":"timeout","profileFailureReason":null,"provider":"ollama","model":"qwen3.5:9b","sourceProvider":"ollama","sourceModel":"qwen3.5:9b","profileId":"sha256:9c018ec112cf","fallbackConfigured":false,"timedOut":false,"aborted":false,"rawErrorPreview":"fetch failed | Headers Timeout Error","rawErrorHash":"sha256:7aa8f0ebe2ea","providerRuntimeFailureKind":"timeout"} embedded run failover decision

```

## Repro

1. Run OpenClaw `2026.4.15`
2. Configure it to use local Ollama with `qwen3.5:9b`
3. Trigger an embedded agent run with enough context that the model is very slow to produce its first output
4. Increase OpenClaw timeout settings
5. Observe that the run still fails at about 5 minutes with `Headers Timeout Error`

## Notes

- This is on `2026.4.15`, so it seems to persist even after the recent Ollama timeout fix
- The error is specifically `Headers Timeout Error`, which suggests the request may still be hitting an undici-level timeout before the first byte/chunk is returned
- I am using a very slow CPU-only setup intentionally, so long first-token latency is expected in this environment
- It would be helpful if either:
  - the Ollama request fully respected the configured OpenClaw timeout, or
  - there were a documented config knob for the underlying undici headers/body timeout