How do I increase the LLM HTTP request timeout for Ollama? | Friends of the Crustacean 🦞🤝

low lynx Apr 4, 2026, 10:23 PM

How do I increase the LLM HTTP request timeout for Ollama?
Running gemma4:e4b locally. It responds fine via direct Ollama API (~11s for simple prompts) but OpenClaw times out at exactly 60 seconds. agents.defaults.timeoutSeconds: 180 didn't help — that appears to be the agent turn timeout, not the LLM request timeout. Is there a config key to raise the per-request HTTP timeout for local Ollama calls?

lavish ivyBOT Apr 4, 2026, 10:23 PM

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

fierce sirenBOT Apr 4, 2026, 10:25 PM

Yep — you’re hitting the LLM idle stream timeout (default is 60s), not agents.defaults.timeoutSeconds.

Use this key:

openclaw config set agents.defaults.llm.idleTimeoutSeconds 180
openclaw gateway restart

Or in config:

{
  agents: {
    defaults: {
      llm: {
        idleTimeoutSeconds: 180
      }
    }
  }
}

Notes:

agents.defaults.timeoutSeconds = overall agent run timeout.
agents.defaults.llm.idleTimeoutSeconds = max silence between streamed model chunks.
0 disables the idle timeout entirely.

If it still dies at 60s, please share:

openclaw --version

openclaw logs --follow

and check the log output for the message "LLM idle timeout (60s): no response from model".

Docs: https://docs.openclaw.ai/gateway/configuration and https://docs.openclaw.ai/providers/ollama
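
If the log is noisy, a small filter can pull out just the idle-timeout lines. This is a rough Python sketch: the marker string is taken from the message quoted above, and the assumption that each occurrence sits on a single log line is mine, not something the docs confirm. The name `find_idle_timeouts` is hypothetical.

```python
from typing import Iterable, Iterator

# Assumption: the log line contains this text, as in the message quoted above.
IDLE_TIMEOUT_MARKER = "LLM idle timeout"


def find_idle_timeouts(lines: Iterable[str]) -> Iterator[str]:
    """Yield each log line that mentions the idle timeout, without its newline.

    Written as a generator so it also works on a stream that never ends,
    such as piped output from a command that keeps following the log.
    """
    for line in lines:
        if IDLE_TIMEOUT_MARKER in line:
            yield line.rstrip("\n")


# Example use with piped input (save as a script and pipe the logs into it):
#   import sys
#   for hit in find_idle_timeouts(sys.stdin):
#       print(hit)
```

If nothing matches while the request still dies at 60s, the cutoff is probably coming from somewhere other than the idle timeout.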

low lynx Apr 4, 2026, 10:27 PM

Is it fairly normal for a local model to take this long searching the web for a simple item? I'm only running on a Mac mini 16gpu.

fierce sirenBOT Apr 4, 2026, 10:27 PM

Yeah — for web search tasks with a local model, that can be totally normal.

Even for a “simple” question, OpenClaw usually does multiple steps (decide to search → call web tool(s) → read snippets → synthesize), so it’s not just one fast completion. Local models are often much slower on this tool-using loop than on plain chat.

Quick tuning tips:

Keep agents.defaults.llm.idleTimeoutSeconds higher (like 180+) so long first-token delays don’t get cut off.
Use a smaller/faster local model for tool-heavy tasks.
Keep context/token budgets modest (large context can slow local inference a lot).
For fast/reliable web+tools, cloud models are usually much better today.
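
To pick an idle timeout from data rather than by guessing, you can measure how long Ollama stays silent on your hardware. Below is a minimal Python sketch against Ollama's streaming `/api/generate` endpoint on its default port 11434. The helper names `parse_chunk` and `measure_stream` are hypothetical. A bare prompt sent this way is much smaller than what an agent sends (system prompt, tool definitions, history), so treat the numbers as a lower bound.

```python
import json
import time
import urllib.request
from typing import Optional, Tuple


def parse_chunk(line: bytes) -> Tuple[str, bool]:
    """Parse one NDJSON line from the stream into (text, done)."""
    obj = json.loads(line)
    return obj.get("response", ""), bool(obj.get("done", False))


def measure_stream(
    model: str,
    prompt: str,
    host: str = "http://localhost:11434",
) -> Tuple[Optional[float], float]:
    """Return (seconds until first text, longest silence between chunks)."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    start = last = time.monotonic()
    first_text_at: Optional[float] = None
    longest_gap = 0.0
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # one JSON object per line
            if not line.strip():
                continue
            now = time.monotonic()
            longest_gap = max(longest_gap, now - last)
            last = now
            text, done = parse_chunk(line)
            if first_text_at is None and text:
                first_text_at = now - start
            if done:
                break
    return first_text_at, longest_gap


# Example (needs a running Ollama server with the model pulled):
#   first, gap = measure_stream("gemma4:e4b", "Why is the sky blue?")
#   print(f"first text after {first:.1f}s, longest silence {gap:.1f}s")
```

If the longest silence gets anywhere near the configured idleTimeoutSeconds, raise the setting with some headroom.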

If you want, paste:

openclaw --version
openclaw status --all

and I can suggest tighter settings for your exact setup.

Model guidance docs: https://docs.openclaw.ai/concepts/models
Community notes: https://docs.openclaw.ai/concepts/models#model-discussions-community-notes

Also since you’re asking about “what’s normal in practice,” it’s worth asking in #users-helping-users too for real-world hardware comparisons.

low lynx Apr 4, 2026, 10:31 PM

So you’re suggesting switching to the 2B versus the 4B for things like a simple web search, correct? Is that what you’re insinuating?

fierce sirenBOT Apr 4, 2026, 10:31 PM

Not insinuating — just describing the speed/quality tradeoff.

Short version:

2B: usually faster, but weaker tool-use/reasoning quality.
4B: usually slower, but more reliable answers.

So yes, for speed-first simple lookups, a smaller model can help.
But with your setup, I’d do this order first:

1. Keep 4B and set:
   agents.defaults.llm.idleTimeoutSeconds: 180
2. Test again.
3. If still too slow, try 2B for those tasks.

That way you’re changing one thing at a time and can see what actually improves latency.

low lynx Apr 4, 2026, 10:33 PM

You are amazing. Thank you so much.

fierce sirenBOT Apr 4, 2026, 10:33 PM

Hell yeah, glad it helped! 🦐

If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution.
That helps Answer Overflow index it so the next person finds it fast.

haughty cove Apr 6, 2026, 2:14 PM

@low lynx May I know your specs? I am currently on an i9 12900 with 32 GB RAM and no GPU.

fierce sirenBOT Apr 6, 2026, 2:15 PM

You’re correct: agents.defaults.timeoutSeconds is the agent turn timeout, not the model HTTP/stream idle timeout.

For the 60s cutoff, use:

{
  agents: {
    defaults: {
      llm: {
        idleTimeoutSeconds: 180 // or 0 to disable idle timeout
      }
    }
  }
}

CLI equivalent:
