#Openclaw chat extremely slow

median quarry

Hello. I'm using a desktop with 16GB RAM and a Ryzen 3200G (no dedicated GPU), running Ubuntu, to try OpenClaw with local models.
After installing and configuring OpenClaw with Ollama and gemma4:e4b, I get responses within seconds in the TUI and when using Ollama directly in the terminal, but in the webchat it takes more than 20 minutes, or sometimes fails to answer at all.
I tried uninstalling and reinstalling, and checked all dependencies and settings with the help of AI, but the issue is the same. I even tried installing other, smaller models to test (mistral, gemma4:e2b and others), but still the same.
What could be the issue?

desert needleBOT

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

dense daggerBOT

This is usually one of a few things — and the important bit is that “Ollama replies fast in terminal/TUI” doesn’t always mean the full WebChat agent path is healthy.

First: what does openclaw --version show? Version mismatch/old builds cause a lot of weird behavior, so check that early.

Most likely causes:

  1. WebChat is running the full agent loop, not a tiny Ollama prompt. It loads system/project context, tools, memory/session history, and may ask the model to reason about tool use. Small local models on CPU-only hardware can be fast for plain chat but painfully slow or unreliable as an OpenClaw agent.

  2. WebChat may not be using the model you think it is. In WebChat, run /model status and confirm it says the exact ollama/... model and the expected Ollama base URL. Also check openclaw models status from the same environment that runs the gateway.

  3. Make sure Ollama is configured with the native endpoint, not /v1. For OpenClaw + Ollama it should be like:
    baseUrl: "http://127.0.0.1:11434"
    not http://127.0.0.1:11434/v1.
    Docs: https://docs.openclaw.ai/providers/ollama

  4. Long/old WebChat session history can make this worse. Try a fresh WebChat session/thread, or compact/reset the session if it has grown large.
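For points 2 and 3, here is a minimal sketch of two checks, assuming the default Ollama address shown above. The function names are made up for illustration and are not part of OpenClaw or Ollama; /api/ps is Ollama's own endpoint for listing the models currently loaded in memory.

```shell
# Hypothetical helper: returns 0 when the URL looks like a native Ollama
# base URL, 1 when it ends in /v1 (the OpenAI-compatible route).
is_native_ollama_url() {
  local url="${1%/}"   # drop a single trailing slash before checking
  case "$url" in
    */v1) return 1 ;;
    *)    return 0 ;;
  esac
}

# Hypothetical helper: prints the Ollama endpoint that lists loaded models.
ollama_ps_url() {
  printf '%s/api/ps\n' "${1%/}"
}
```

While a WebChat turn is hanging, running curl -s "$(ollama_ps_url http://127.0.0.1:11434)" in a second terminal should show which model Ollama actually has loaded, which is a quick way to confirm WebChat is using the model you expect.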

Good quick test:

OLLAMA_API_KEY=ollama-local openclaw infer model run \
  --local \
  --model ollama/<your-model>:latest \
  --prompt "Reply with exactly: pong" \
  --json

If that is fast, but WebChat is slow, the issue is probably agent/tool/session overhead or model tool-following capability — not raw Ollama speed.
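To put a number on "fast" rather than eyeballing it, a small timing wrapper can help. This is only a sketch: elapsed_seconds is a hypothetical helper name, it measures whole seconds with date, and it ignores the exit status of the command it runs.

```shell
# Hypothetical helper: runs the given command, discards its output,
# and prints the elapsed wall-clock time in whole seconds.
# The command's exit status is ignored; only the duration is reported.
elapsed_seconds() {
  local start end
  start=$(date +%s)
  "$@" >/dev/null 2>&1 || true
  end=$(date +%s)
  printf '%s\n' "$((end - start))"
}
```

For example, elapsed_seconds ollama run <your-model> "Reply with exactly: pong" gives a raw Ollama baseline, and wrapping the openclaw infer model run command above the same way gives the OpenClaw figure to compare against it.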

My honest recommendation: on a Ryzen 3200G with 16GB RAM and no GPU, local models are okay for simple text, but they’re often a rough fit for OpenClaw’s full agent mode. If you want reliable WebChat/tool use, try a cloud model as the primary and keep Ollama for experiments or lower-stakes local tasks.

Also useful docs:
https://docs.openclaw.ai/concepts/models
https://docs.openclaw.ai/web/webchat

median quarry

openclaw --version
OpenClaw 2026.5.7

openclaw models status

🦞 OpenClaw 2026.5.7 (eeef486) — Your personal assistant, minus the passive-aggressive calendar reminders.

Config : ~/.openclaw/openclaw.json
Agent dir : ~/.openclaw/agents/main/agent
Default : ollama/gemma2:9b
Fallbacks (0) : -
Image model : -
Image fallbacks (0): -
Aliases (0) : -
Configured models (5): ollama/llama3.2:3b, openai/ollama:gemma2:9b, ollama/gemma2:9b, ollama/gemma4:e2b, ollama/gemma4:e4b

Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (0): -

  • ollama effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | ollama:default=marker(ollama-local) | models.json=OL...EY | source=models.json: ~/.openclaw/agents/main/agent/models.json | synthetic=plugin-owned | source=models.providers.ollama (synthetic local key)

OAuth/token status

  • none

/models status
Unknown provider: status

Available providers:

  • ollama
  • openai

Use: /models <provider>

/models ollama status
Models (ollama · 🔑 api-key (ollama:default)) — showing 1-4 of 4 (page 1/1)

  • ollama/gemma2:9b
  • ollama/gemma4:e2b
  • ollama/gemma4:e4b
  • ollama/llama3.2:3b

Switch: /model <provider/model>
All: /models list ollama all

  1. There is no /v1 in the baseUrl, so it seems correct

  2. This is a fresh install, and I always restart the gateway and start the webchat with /new

dense daggerBOT

Thanks — that actually narrows it down a lot.

Nothing in that output screams “broken install”. I’d focus on model/runtime pressure, not reinstalling.

A few concrete things I’d change/test:

  1. Your default is ollama/gemma2:9b. On a Ryzen 3200G + 16GB RAM + no GPU, that’s likely too heavy for the full OpenClaw agent prompt even if it answers normal Ollama prompts quickly. Try making ollama/llama3.2:3b the default first:
openclaw models set ollama/llama3.2:3b
openclaw gateway restart

Then start a fresh WebChat /new and ask something tiny like “reply pong”.

  2. Run the narrow OpenClaw→Ollama test, not just raw Ollama:
OLLAMA_API_KEY=ollama-local openclaw infer model run \
  --local \
  --model ollama/llama3.2:3b \
  --prompt "Reply with exactly: pong" \
  --json

If this is fast but WebChat is slow, that confirms the problem is the agent/tool/session layer, not Ollama connectivity.

  3. Enable the “lean local model” path. OpenClaw has a config option specifically for this kind of setup:
{
  agents: {
    defaults: {
      experimental: {
        localModelLean: true,
      },
      model: { primary: "ollama/llama3.2:3b" }
    }
  }
}

That removes some heavier tool surface from local-model agent turns. Docs: https://docs.openclaw.ai/providers/ollama

  4. If it still hangs, cap the Ollama context. Auto-discovered local models can expose more context than your machine can comfortably run. In explicit model config, use something conservative like num_ctx: 2048 or 4096 for this hardware. Huge context on CPU can look exactly like “WebChat takes 20 minutes”.

  5. One oddity: you have openai/ollama:gemma2:9b listed too. Your default is the correct native ollama/... provider, so it may not matter — but if you previously tried OpenAI-compatible Ollama config, remove/ignore that route. Native Ollama should be api: "ollama" and base URL without /v1.
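One way to cap the context on the Ollama side (point 4) is a Modelfile that derives a variant of the model with a fixed num_ctx. This is a sketch, not something from the OpenClaw docs: write_capped_modelfile and the variant name below are hypothetical, and whether OpenClaw keeps that value or sends its own num_ctx with each request is an assumption you would need to verify.

```shell
# Hypothetical helper: writes a Modelfile that derives a variant of
# a base model with a capped context window.
#   $1 = base model tag, $2 = context size in tokens, $3 = output path
write_capped_modelfile() {
  local base_model="$1" num_ctx="$2" out_file="$3"
  printf 'FROM %s\nPARAMETER num_ctx %s\n' "$base_model" "$num_ctx" > "$out_file"
}
```

Usage would look like write_capped_modelfile llama3.2:3b 2048 ./Modelfile followed by ollama create llama3.2-3b-ctx2k -f ./Modelfile (the name llama3.2-3b-ctx2k is just an example), after which the new variant can be tested with ollama run before pointing anything else at it.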

My honest read: 2026.5.7 is current enough, your provider is basically resolving, and this hardware is just getting crushed by the full agent turn — especially with gemma2:9b. I’d test llama3.2:3b + localModelLean + small num_ctx before touching install/dependencies again.

If that still fails, paste openclaw status --all and the slow-turn log tail from openclaw logs --follow — that should show whether it’s waiting on Ollama, tool setup, memory, or a timeout.

median quarry
  1. Done, using llama3.2:3b.
  2. After running that command I get this error: "Error: No text output returned for provider "ollama" model "llama3.2:3b"."
  3. Done. After testing again (prompted "reply pong"), there's an error after a minute or so: "[assistant turn failed before producing content]"
dense daggerBOT

Good catch — that error changes the diagnosis.

If openclaw infer model run --local ... returns “No text output returned”, then this is failing before the full WebChat agent layer. So don’t keep reinstalling WebChat/gateway; we need to find why OpenClaw’s Ollama call is getting an empty response.

Please test the native Ollama API directly:

curl -s http://127.0.0.1:11434/api/chat \
  -H 'Content-Type: application/json' \
  -d '{
    "model":"llama3.2:3b",
    "messages":[{"role":"user","content":"Reply with exactly: pong"}],
    "stream":false,
    "options":{"num_ctx":2048}
  }' | jq

What matters is whether the JSON has:

"message": { "content": "pong" }
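If you would rather not read the JSON by eye, the same check can be scripted. This is a rough sketch that assumes jq is installed (the curl command above already pipes to it); has_chat_content is a hypothetical helper name, and it only handles the non-streaming response shape.

```shell
# Hypothetical helper: reads a non-streaming Ollama /api/chat response on stdin.
# Prints the assistant text and returns 0 if it is non-empty,
# returns 1 if the text is empty or missing, 2 if the input is not valid JSON.
has_chat_content() {
  local content
  content=$(jq -r '.message.content // ""') || return 2
  [ -n "$content" ] && printf '%s\n' "$content"
}
```

Piping the curl output into has_chat_content instead of jq prints the reply when there is one; a non-zero exit status with no output points at the empty-response case described here.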

If that direct /api/chat call is also empty or weird, it’s an Ollama/model/template issue. Try:

ollama pull llama3.2:3b
ollama show llama3.2:3b

and check journalctl --user -u ollama -f or however Ollama is running on your Ubuntu box.

If the direct /api/chat call returns pong, then OpenClaw is the part mishandling the provider/config. Next checks:

openclaw infer model run --gateway \
  --model ollama/llama3.2:3b \