openclaw isn't communicating with Ollama it seems | Friends of the Crustacean 🦞🤝 | Page 1

fair crest Apr 18, 2026, 8:37 AM

#

I ran this by some people and was given something to try out to see if I got a response. My openclaw is on 192.168.1.101, my Ollama is on 192.168.1.102.
I can get to the openclaw browser UI and log in, but it is just dumb: no response. We know that Ollama is working, as we can test it from the command line and a browser. We had run out of things to try to find where the bug or issue is until I read about this test. It says that if you have a listener sitting on the Ollama server and hit it with something from openclaw, you should see it respond. I am not seeing anything using sudo tcpdump -i lo port 11434 despite pestering the openclaw interface. We know that Ollama on 192.168.1.102 responds to requests, but it seems openclaw is not sending any? Does anyone have any idea where I might look for the problem?

drifting ginkgoBOT Apr 18, 2026, 8:39 AM

#

Let me check the current docs/source for the Ollama bits so I don’t guess.

#

I found one likely gotcha already: tcpdump -i lo on the Ollama box would miss traffic coming from 192.168.1.101, because that arrives on the NIC, not loopback. I’m checking the rest so I can give you a solid checklist instead of vibes.

#

A couple things jump out.

tcpdump -i lo port 11434 is the wrong interface for this test.
If OpenClaw is on 192.168.1.101 and Ollama is on 192.168.1.102, that traffic hits the network NIC, not loopback.

On the Ollama box, use:

sudo tcpdump -ni any host 192.168.1.101 and port 11434
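If you would rather capture on the one interface that actually carries the traffic instead of `any`, the kernel's routing table can tell you which one that is. A minimal sketch, assuming a Linux box with iproute2; `route_dev` is a made-up helper name, not part of any tool:

```shell
# Print the interface name from `ip route get <peer>` output read on stdin.
route_dev() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Usage on the Ollama box (192.168.1.102):
#   dev=$(ip route get 192.168.1.101 | route_dev)
#   sudo tcpdump -ni "$dev" host 192.168.1.101 and port 11434
```

For a peer on the LAN this should come back as the physical NIC rather than lo, which is why the loopback capture stays silent.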

Test from the OpenClaw host (192.168.1.101), not from your browser machine:

curl http://192.168.1.102:11434/api/tags
openclaw --version

openclaw status
openclaw models status
openclaw logs --follow

Common Ollama gotcha: the base URL should be the native Ollama API, not /v1.

Good:

baseUrl: "http://192.168.1.102:11434",
api: "ollama"

Bad:

baseUrl: "http://192.168.1.102:11434/v1"

OpenClaw’s Ollama docs are here: https://docs.openclaw.ai/providers/ollama
Fast triage docs: https://docs.openclaw.ai/help/faq
Troubleshooting: https://docs.openclaw.ai/gateway/troubleshooting

Best end-to-end smoke test:

openclaw infer model run --model ollama/<your-exact-model> --prompt "Reply with exactly: smoke-ok" --json

Run that while openclaw logs --follow and the tcpdump above are open.

If you paste these, we can usually pin it down pretty fast:

openclaw --version
openclaw models status
result of curl http://192.168.1.102:11434/api/tags from 192.168.1.101
your models.providers.ollama config, if you set one manually

If curl from .101 fails, this is probably bind/firewall on the Ollama host. If curl works but infer model run shows no packets, OpenClaw probably is not actually selecting the Ollama provider.
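For the bind half of that first case: Ollama listens on 127.0.0.1:11434 by default and only accepts remote connections when OLLAMA_HOST is set to a non-loopback address such as 0.0.0.0. A rough sketch for checking what the box is actually listening on; `classify_bind` is a hypothetical helper that reads `ss -ltn` output, and a "network-reachable" result says nothing about the firewall:

```shell
# Classify how port 11434 is bound, based on `ss -ltn` output on stdin.
# Column 4 of `ss -ltn` is "Local Address:Port".
classify_bind() {
  awk '
    $4 ~ /:11434$/ {
      seen = 1
      # Anything other than 127.x or [::1] is reachable from other hosts.
      if ($4 !~ /^127\./ && $4 !~ /^\[::1\]/) open = 1
    }
    END {
      if (!seen) print "not-listening"
      else if (open) print "network-reachable"
      else print "loopback-only"
    }'
}

# Usage on 192.168.1.102:
#   ss -ltn | classify_bind
```

If it prints loopback-only, curl from .101 cannot work no matter what OpenClaw is configured to do.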

fair crest Apr 18, 2026, 11:28 AM

#

the response from the smoke test - 11:26:38 [agent/embedded] embedded run agent end: runId=f85051b5-17d6-40c4-a6b1-cda16fd8cdf3 isError=true model=qwen3.5:latest provider=ollama error=500 {"error":"llama runner process has terminated: %!w(\u003cnil\u003e)"} rawError=500 {"error":"llama runner process has terminated: %!w(\u003cnil\u003e)"}
11:27:01 [agent/embedded] embedded run agent end: runId=f85051b5-17d6-40c4-a6b1-cda16fd8cdf3 isError=true model=qwen3.5:latest provider=ollama error=LLM request failed: network connection error. rawError=fetch failed | other side closed
11:27:28 [agent/embedded] embedded run agent end: runId=f85051b5-17d6-40c4-a6b1-cda16fd8cdf3 isError=true model=qwen3.5:latest provider=ollama error=500 {"error":"llama runner process has terminated: %!w(\u003cnil\u003e)"} rawError=500 {"error":"llama runner process has terminated: %!w(\u003cnil\u003e)"}
11:27:59 [agent/embedded] embedded run agent end: runId=f85051b5-17d6-40c4-a6b1-cda16fd8cdf3 isError=true model=qwen3.5:latest provider=ollama error=LLM request failed: network connection error. rawError=fetch failed | other side closed
11:27:59 [agent/embedded] embedded run failover decision: runId=f85051b5-17d6-40c4-a6b1-cda16fd8cdf3 stage=assistant decision=surface_error reason=timeout provider=ollama/qwen3.5:latest profile=sha256:9c018ec112cf
{
"ok": true,
"capability": "model.run",
"transport": "local",
"provider": "ollama",
"model": "qwen3.5:latest",
"attempts": [],
"outputs": [
{
"text": "LLM request failed: network connection error.",
"mediaUrl": null
}
]
}

drifting ginkgoBOT Apr 18, 2026, 11:29 AM

#

This is useful, and honestly a relief: it means OpenClaw is reaching Ollama. I’m checking one config detail before I tell you the next thing to try.

#

Yep, this changes the diagnosis.

OpenClaw is reaching Ollama. The failing piece is Ollama itself:

500 {"error":"llama runner process has terminated..."} is an Ollama-side error
fetch failed | other side closed is what OpenClaw sees when the Ollama runner dies mid-request

So I’d stop chasing OpenClaw networking now and debug the Ollama runner/model.

Try these next:

On 192.168.1.102, watch Ollama logs live while you run the smoke test
- if it’s a service: journalctl -u ollama -f
- if not, run ollama serve in the foreground
From the OpenClaw host (192.168.1.101), hit Ollama directly with the same model:

curl http://192.168.1.102:11434/api/chat \
  -H 'content-type: application/json' \
  -d '{"model":"qwen3.5:latest","messages":[{"role":"user","content":"Reply with exactly: smoke-ok"}],"stream":false}'

If that also dies, it’s definitely not OpenClaw.

Double-check the OpenClaw Ollama URL is the native one, not /v1:

baseUrl: "http://192.168.1.102:11434",
api: "ollama"

Paste:

openclaw --version
ollama --version

My guess is one of these:

Ollama runner crash / model bug
RAM/VRAM pressure
this specific model is unstable under agent-style prompts

If the direct /api/chat works but OpenClaw still knocks it over, then the model is probably choking on the larger agent prompt/tool payload. In that case I’d test with a different model that you already know is stable in Ollama.
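One way to test the "bigger prompt" theory without involving OpenClaw is to ask Ollama for a larger context on the direct call. The native API accepts an `options.num_ctx` field per request. A small sketch; `chat_payload` is a made-up helper, the 16384 in the usage line is an arbitrary illustrative value, and whether OpenClaw itself sets `num_ctx` is an assumption I have not verified here:

```shell
# Build an /api/chat request body for a given model and context size.
# $1 = model name, $2 = num_ctx (integer)
chat_payload() {
  printf '{"model":"%s","messages":[{"role":"user","content":"Reply with exactly: smoke-ok"}],"stream":false,"options":{"num_ctx":%d}}\n' "$1" "$2"
}

# Usage from 192.168.1.101:
#   curl http://192.168.1.102:11434/api/chat \
#     -H 'content-type: application/json' \
#     -d "$(chat_payload qwen3.5:latest 16384)"
```

If the small default request survives and the larger one kills the runner, that points at memory growth with context size rather than at anything OpenClaw-specific.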

Docs: https://docs.openclaw.ai/providers/ollama

fair crest Apr 18, 2026, 11:34 AM

#

the base url is correct

drifting ginkgoBOT Apr 18, 2026, 11:34 AM

#

Nice, then I’d stop suspecting the URL.

At this point the strongest signal is still: Ollama runner crash, not OpenClaw misrouting.

Why I’m pretty confident:

OpenClaw already got a real Ollama response: 500 {"error":"llama runner process has terminated..."}
that only happens if the request made it to Ollama
the later other side closed fits Ollama dying/resetting mid-request

So the next useful data is:

curl http://192.168.1.102:11434/api/chat \
  -H 'content-type: application/json' \
  -d '{"model":"qwen3.5:latest","messages":[{"role":"user","content":"Reply with exactly: smoke-ok"}],"stream":false}'

Run that from 192.168.1.101 and, on the Ollama box, watch logs at the same time.

Also please paste:

openclaw --version
ollama --version

If the direct /api/chat also blows up, we’ve basically proven:

Ollama/model bug, or
RAM/VRAM pressure, or
that model is unstable under this prompt shape

If you can grab the Ollama-side log from the exact moment of the crash, that’s probably the smoking gun. Docs here too: https://docs.openclaw.ai/providers/ollama

fair crest Apr 18, 2026, 11:35 AM

#

OpenClaw 2026.4.12 (1c0672b)
ollama version is 0.20.4

drifting ginkgoBOT Apr 18, 2026, 11:36 AM

#

Thanks, that helps. I’m doing one quick search for that exact Ollama runner error so I don’t send you on a wild goose chase.

#

Those versions don’t scream “OpenClaw bug” to me.

I also found multiple Ollama issues for this exact error string, llama runner process has terminated, so this really does look like Ollama/model/backend rather than OpenClaw.

At this point I’d do 3 quick checks:

Direct native Ollama API test from 192.168.1.101

curl http://192.168.1.102:11434/api/chat \
  -H 'content-type: application/json' \
  -d '{"model":"qwen3.5:latest","messages":[{"role":"user","content":"Reply with exactly: smoke-ok"}],"stream":false}'

Watch Ollama logs on 192.168.1.102 at the same time
- if it logs OOM, GPU backend failure, runner exit, or model load failure, that’s the culprit
Try one other smaller model you already have locally
- just as a sanity check: if that works in OpenClaw and qwen3.5:latest does not, then the problem is basically narrowed to the Ollama runner + that model combination

So the short version is: OpenClaw already proved it can reach Ollama, we’re past the tcpdump/routing stage now.

If you paste:

the result of that direct /api/chat call
the matching Ollama log lines

…I think we can pin it down pretty fast.

fair crest Apr 18, 2026, 11:36 AM

#

from previous test on ollama logs Enabled KvSize:4096 KvCacheType: NumThreads:2 GPULayers:[] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Apr 18 11:31:23 ollama ollama[136389]: time=2026-04-18T11:31:23.284Z level=INFO source=ggml.go:136 msg="" architecture=qwen35 file_type=Q4_K_M name="" description="" num_tensors=883 num_key_values=52
Apr 18 11:31:23 ollama ollama[136389]: load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-alderlake.so
Apr 18 11:31:23 ollama ollama[136389]: time=2026-04-18T11:31:23.291Z level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX_VNNI=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)
Apr 18 11:31:23 ollama ollama[136389]: time=2026-04-18T11:31:23.679Z level=INFO source=runner.go:1290 msg=load request="{Operation:alloc LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:2 GPULayers:[] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Apr 18 11:31:24 ollama ollama[136389]: time=2026-04-18T11:31:24.697Z level=INFO source=runner.go:1290 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:2 GPULayers:[] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Apr 18 11:31:24 ollama ollama[136389]: time=2026-04-18T11:31:24.697Z level=INFO source=device.go:245 msg="model weights" device=CPU size="6.1 GiB"
Apr 18 11:31:24 ollama ollama[136389]: time=2026-04-18T11:31:24.697Z level=INFO source=device.go:256 msg="kv cache" device=CPU size="1.4 GiB"
Apr 18 11:31:24 ollama ollama[136389]: time=2026-04-18T11:31:24.697Z level=INFO source=device.go:267 msg="compute graph" device=CPU size="433.7 MiB"
Apr 18 11:31:24 ollama ollama[136389]: time=2026-04-18T11:31:24.697Z level=INFO source=device.go:272 msg="total memory" size="7.9 GiB"
Apr 18 11:31:24 ollama ollama[136389]: time=2026-04

drifting ginkgoBOT Apr 18, 2026, 11:37 AM

#

That log is actually pretty telling.

The big clue is here:

CPU backend
total memory = 7.9 GiB

So Ollama is trying to run that model fully on system RAM, and it wants almost 8 GiB just for the model + KV cache + graph. If that machine is an 8 GB box, or has other stuff running, that is very likely the problem.
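For reference, that total is just the three device.go lines from the log added up. A quick sketch of the arithmetic; `total_gib` is a throwaway helper name:

```shell
# Sum model weights (GiB), KV cache (GiB) and compute graph (MiB); print GiB.
total_gib() {
  awk -v w="$1" -v kv="$2" -v graph_mib="$3" \
    'BEGIN { printf "%.1f\n", w + kv + graph_mib / 1024 }'
}

# Values from the log: 6.1 GiB weights, 1.4 GiB KV cache, 433.7 MiB graph.
total_gib 6.1 1.4 433.7   # prints 7.9
```

Note that this is the load-time figure for KvSize:4096, so treat it as a floor rather than a peak.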

That would also explain the pattern:

tiny/manual tests may seem okay
OpenClaw sends a bigger real prompt
Ollama runner falls over
OpenClaw only sees other side closed / llama runner process has terminated

I’d check this next on 192.168.1.102:

free -h
dmesg -T | tail -50
# or
journalctl -k -n 50

You’re looking for OOM killer / out-of-memory / killed process messages.
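To cut the kernel log down to just those lines, a filter along these lines should do. It is only a sketch: `oom_lines` is a made-up name and the pattern is a guess at the usual kernel wording, not an exhaustive list:

```shell
# Keep only kernel log lines that look like OOM-killer activity (stdin -> stdout).
oom_lines() {
  grep -iE 'out of memory|oom-kill|killed process'
}

# Usage on 192.168.1.102:
#   dmesg -T | oom_lines
#   journalctl -k -n 200 --no-pager | oom_lines
```

No output means nothing matched in that window, which is not quite the same as proof that no OOM kill happened earlier.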

Fastest sanity check:

try OpenClaw with a smaller model you already have locally
if that works, then this is basically confirmed as a RAM/capacity issue, not an OpenClaw routing issue

If you can paste the next 20-30 log lines after total memory size="7.9 GiB", especially anything with killed, oom, or runner exited, we can probably confirm it cleanly.

fair crest Apr 18, 2026, 11:44 AM

#

it is a 16gb machine

drifting ginkgoBOT Apr 18, 2026, 11:44 AM

#

That’s actually useful. I’m checking whether qwen3.5 + Ollama CPU backend has a known crash pattern, because 16 GB makes plain RAM exhaustion less convincing.

#

Ah, that helps.

With 16 GB RAM, the 7.9 GiB load is no longer a smoking gun by itself. It could still be memory pressure if the box is busy, but it makes me lean much more toward:

Ollama runner bug
something specific to the qwen3.5 model family in Ollama
or a bad interaction that only shows up on the bigger OpenClaw prompt

I did a quick check and there are multiple Ollama issues around qwen35 architecture and llama runner process has terminated style crashes, so this does not look unique to OpenClaw.

At this point, my fastest discriminator would be:

Try one different Ollama model you already know works locally
- if OpenClaw works with that, the problem is almost certainly this qwen3.5:latest + Ollama combo
Avoid :latest if possible
- if ollama list shows a specific tag/variant you know runs cleanly, test that exact one instead of latest

Still grab the exact crash tail

journalctl -u ollama -n 100 --no-pager
dmesg -T | tail -50

So my current bet is:

OpenClaw is fine, Ollama is reachable, and qwen3.5:latest is the unstable part here.

If you test another Ollama model and it works, I’d call that basically confirmed.

fair crest Apr 18, 2026, 11:45 AM

#

Apr 18 11:42:07 ollama kernel: Out of memory: Killed process 137135 (ollama) total-vm:25494504kB, anon-rss:15465488kB, file-rss:2772kB, shmem-rss:0kB, UID:1000 pgtables:38412kB oom_score>
lines 1-50/50 (END)

drifting ginkgoBOT Apr 18, 2026, 11:45 AM

#

There it is, that’s the smoking gun.

Out of memory: Killed process ... (ollama)

So the root cause is:

OpenClaw is talking to Ollama
Ollama is loading/running the model
the kernel OOM killer is killing Ollama mid-request
OpenClaw then surfaces other side closed / llama runner process has terminated

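Even on a 16 GB box, the real working set grew to about 14.7 GiB of anonymous RSS (anon-rss:15465488kB in the kernel line, where kB means 1024-byte units). That is nearly the whole machine, which is enough to get it killed.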
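If you want to read the RSS figure off a kernel OOM line without doing the division by hand, something like this works. A small sketch; `anon_rss_gib` is a hypothetical helper, and it assumes the kernel's kB are 1024-byte units, which is how these counters are reported:

```shell
# Extract anon-rss from a kernel "Out of memory: Killed process" line (stdin)
# and print it in GiB with one decimal.
anon_rss_gib() {
  sed -n 's/.*anon-rss:\([0-9][0-9]*\)kB.*/\1/p' |
    awk '{ printf "%.1f\n", $1 / 1048576 }'
}

# Usage on 192.168.1.102:
#   journalctl -k --no-pager | grep 'Killed process' | anon_rss_gib
```

Fed the line from this thread (anon-rss:15465488kB), it prints 14.7.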

So this is not an OpenClaw routing bug. It’s a capacity issue on the Ollama side.

Best fixes:

Use a smaller / lower-memory Ollama model
Reduce effective context / prompt pressure
Give the Ollama host more RAM or swap as a stopgap
Make sure that box is not doing other heavy work at the same time
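For that last point, a quick headroom check before a test run can save a crash. A rough sketch that reads `free -m` output; `has_headroom` is a made-up helper, and the 8090 MiB threshold in the usage line is just the 7.9 GiB load-time estimate from the Ollama log, which the OOM kill shows is a lower bound:

```shell
# Succeeds if the "available" column of `free -m` (stdin) is at least $1 MiB.
has_headroom() {
  awk -v need="$1" '$1 == "Mem:" { ok = ($7 >= need) } END { exit ok ? 0 : 1 }'
}

# Usage on 192.168.1.102:
#   free -m | has_headroom 8090 && echo "probably fits" || echo "too tight"
```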

That also explains why simple manual tests can look okay, but OpenClaw fails: agent prompts are just bigger and push the model over the edge.

Docs: https://docs.openclaw.ai/providers/ollama

Glad we caught the real issue. If you switch to a smaller model and it starts replying, that should confirm it end to end.

fair crest Apr 18, 2026, 11:46 AM

#

suggest correct model for 16gb

drifting ginkgoBOT Apr 18, 2026, 11:46 AM

#

I wouldn’t try to name exact models here, because that gets stale fast.

For 16 GB RAM, the safe guidance is:

stay with materially smaller Ollama models than the one that OOM’d
think single-digit-B class, quantized, not large :latest variants
expect local to be okay for basic chat/testing
for real agentic OpenClaw use, I’d honestly recommend cloud models instead, because 16 GB is pretty tight for reliable tool-heavy runs

So my practical recommendation is:

try a much smaller local model first

verify it with:

openclaw infer model run --model ollama/<model> --prompt "Reply with exactly: smoke-ok" --json

for exact current community picks, ask in #models because that advice changes fast

If this thread is now basically solved, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it for the next person.

fair crest Apr 18, 2026, 11:47 AM

#

cloud models cost tokens and we burn through those too quickly with mistakes

drifting ginkgoBOT Apr 18, 2026, 11:48 AM

#

Yep, that’s fair. Burning cloud tokens while debugging is annoying as hell.

If you want to stay cost-conscious, I’d do this:

Use local as the primary model
- but make it a smaller local model that actually survives on your box
Test with infer model run first
- much cheaper than learning via failed full chat turns

openclaw infer model run --model ollama/<model> --prompt "Reply with exactly: smoke-ok" --json

Keep sessions fresh while debugging
- long chat history = bigger prompts = more local RAM pressure too
Only use cloud as fallback, if at all
- local for experimentation
- cloud only when you really need a reliable answer
So I think your best move is still:

pick a smaller local model
verify it with infer model run
then try the UI again

For exact current budget-friendly picks, I’d still ask in #models since that changes fast.

If this fixed the root cause, please right-click the solution message → Apps → ✅ Mark Solution so it gets indexed on Answer Overflow.

fair crest Apr 18, 2026, 12:22 PM

#

using llama3:8b which should be ok for the machine, i get this when testing

#

12:21:47 [agent/embedded] low context window: ollama/llama3:8b ctx=8192 (warn<32000) source=modelsConfig
12:21:47 [agent/embedded] blocked model (context window too small): ollama/llama3:8b ctx=8192 (min=16000) source=modelsConfig
12:21:47 [diagnostic] lane task error: lane=main durationMs=9440 error="FailoverError: Model context window too small (8192 tokens). Minimum is 16000."
12:21:47 [diagnostic] lane task error: lane=session:agent:main:main durationMs=9442 error="FailoverError: Model context window too small (8192 tokens). Minimum is 16000."
12:21:47 [model-fallback/decision] model fallback decision: decision=candidate_failed requested=ollama/llama3:8b candidate=ollama/llama3:8b reason=unknown next=none detail=Model context window too small (8192 tokens). Minimum is 16000.
FailoverError: Model context window too small (8192 tokens). Minimum is 16000.

drifting ginkgoBOT Apr 18, 2026, 12:23 PM

#

Your message looks cut off 😅

llama3:8b should be a much safer test on 16 GB, so if that is still failing, the exact error text matters a lot.

Please paste these 2 things together:

the full OpenClaw test output
the matching Ollama log tail from the same moment

For example:

openclaw infer model run --model ollama/llama3:8b --prompt "Reply with exactly: smoke-ok" --json
journalctl -u ollama -n 80 --no-pager

If you also still see kernel kills, add:

dmesg -T | tail -30

If llama3:8b is failing too, then we may be looking at:

another OOM condition from other processes
a broken Ollama install/runtime
or a different config issue than the original qwen3.5 one

Paste the actual error and we can narrow it down.

#

Nice, this is a different failure, and it’s much cleaner.

llama3:8b is not crashing here. OpenClaw is rejecting it up front because it sees:

ctx=8192
minimum required for the agent run is 16000

So the problem is now:

model context window too small, not networking, and not the earlier Ollama OOM crash.

The most likely reason is your OpenClaw model metadata/config says this model has only 8192 context.

Since you’re using a remote Ollama host, you probably have an explicit models.providers.ollama config. In that mode, OpenClaw trusts your configured model metadata.

So I’d check these two things:

openclaw config get models.providers.ollama
ollama show llama3:8b

What I’m looking for:

if OpenClaw config has contextWindow: 8192, but ollama show says the model really supports more, then fix the OpenClaw model entry
if the model really is only 8192, then OpenClaw is correctly blocking it for agent use, and you need a model with >= 16000 context
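The Ollama half of that comparison can be scripted. A minimal sketch, assuming `ollama show` prints a "context length" row; `ctx_from_show` and `ctx_ok` are made-up helper names, and the 16000 default is the minimum quoted in the OpenClaw log above:

```shell
# Extract the "context length" value from `ollama show <model>` output (stdin).
ctx_from_show() {
  awk '/context length/ { print $NF; exit }'
}

# Succeeds if context length $1 meets the minimum $2 (default 16000).
ctx_ok() {
  [ "$1" -ge "${2:-16000}" ]
}

# Usage on the Ollama box:
#   ctx=$(ollama show llama3:8b | ctx_from_show)
#   ctx_ok "$ctx" && echo "eligible" || echo "too small: $ctx"
```

Compare the number it prints against the contextWindow value in the OpenClaw config entry for the same model.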

So right now you’ve actually uncovered two separate issues:

qwen3.5 → OOM on the Ollama host

llama3:8b → context window too small for OpenClaw agent runs

If you paste openclaw config get models.providers.ollama, I can help spot whether the context metadata is the thing that needs fixing.

fair crest Apr 18, 2026, 12:25 PM

#

what is the log command for ollama

drifting ginkgoBOT Apr 18, 2026, 12:26 PM

#

If Ollama is running as a Linux service, use:

journalctl -u ollama -f

Recent lines without follow:

journalctl -u ollama -n 100 --no-pager

If it’s running in Docker:

docker logs -f <ollama-container-name>

If you started it manually with ollama serve, the logs are just in that terminal.

For your case, I’d use:

journalctl -u ollama -n 100 --no-pager

and paste the chunk around the failure.

fair crest Apr 18, 2026, 12:27 PM

#

12:27:04 [agent/embedded] low context window: ollama/llama3:8b ctx=8192 (warn<32000) source=modelsConfig
12:27:04 [agent/embedded] blocked model (context window too small): ollama/llama3:8b ctx=8192 (min=16000) source=modelsConfig
12:27:04 [diagnostic] lane task error: lane=main durationMs=9473 error="FailoverError: Model context window too small (8192 tokens). Minimum is 16000."
12:27:04 [diagnostic] lane task error: lane=session:agent:main:main durationMs=9475 error="FailoverError: Model context window too small (8192 tokens). Minimum is 16000."
12:27:04 [model-fallback/decision] model fallback decision: decision=candidate_failed requested=ollama/llama3:8b candidate=ollama/llama3:8b reason=unknown next=none detail=Model context window too small (8192 tokens). Minimum is 16000.
FailoverError: Model context window too small (8192 tokens). Minimum is 16000.

#

but nothing moved on the ollama side

drifting ginkgoBOT Apr 18, 2026, 12:27 PM

#

That’s still the OpenClaw log, not the Ollama log.

And the important bit is this:

source=modelsConfig
blocked model (context window too small)

So for llama3:8b, OpenClaw is stopping before it even sends anything to Ollama. That means Ollama logs probably won’t show anything for this test.

The next command I want is:

openclaw config get models.providers.ollama

Because right now it looks like your OpenClaw model config says llama3:8b has contextWindow: 8192.

If you want the actual Ollama service log anyway, it’s:

journalctl -u ollama -n 100 --no-pager

But for this specific llama3:8b error, the real thing to fix is the OpenClaw model metadata/config, not Ollama itself.

Paste openclaw config get models.providers.ollama and I can help spot the bad entry.

#

Yep, and that’s actually expected for this one.

For llama3:8b, OpenClaw is blocking it locally because of this line:

source=modelsConfig
context window too small

So nothing should hit Ollama for that test. No traffic, no Ollama log movement.

That means the next step is not on the Ollama box, it’s on the OpenClaw config side:

openclaw config get models.providers.ollama

If that config says llama3:8b has contextWindow: 8192, OpenClaw will refuse to use it for agent runs before making any request.

So right now:

qwen3.5 = did reach Ollama, then OOM’d
llama3:8b = never reached Ollama, blocked by OpenClaw model metadata
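Those two failure modes can be told apart from the OpenClaw log line alone. A rough sketch that matches on the exact strings seen in this thread; `classify_failure` is a hypothetical helper, not an OpenClaw command, and other versions may word these errors differently:

```shell
# Map one OpenClaw log line ($1) to the side of the connection that failed.
classify_failure() {
  case "$1" in
    *"context window too small"*)            echo "blocked-by-openclaw-config" ;;
    *"llama runner process has terminated"*) echo "ollama-runner-died" ;;
    *"other side closed"*|*"fetch failed"*)  echo "connection-dropped" ;;
    *)                                       echo "unknown" ;;
  esac
}

# Usage:
#   openclaw logs --follow | while IFS= read -r line; do classify_failure "$line"; done
```

Only the first result means the request never left the OpenClaw host. The other two mean Ollama was reached and then failed.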

Paste openclaw config get models.providers.ollama and we can fix the second one.

fair crest Apr 18, 2026, 12:29 PM

#

{
"baseUrl": "http://192.168.1.102:11434",
"apiKey": "OPENCLAW_REDACTED",
"api": "ollama",
"models": [
{
"id": "qwen2.5:0.5b",
"name": "qwen2.5:0.5b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 128000,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen2.5-coder:32b",
"name": "qwen2.5-coder:32b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 32768,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen3.5:latest",
"name": "qwen3.5:latest",
"reasoning": false,
"input": [
"text",
"image"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 262144,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "qwen2.5:7b",
"name": "qwen2.5:7b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 32768,
"maxTokens": 8192,
"api": "ollama"
},
{
"id": "llama3:8b",
"name": "llama3:8b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 8192,
"maxTokens": 8192,
"api": "ollama"
}
]
}

drifting ginkgoBOT Apr 18, 2026, 12:29 PM

#

Perfect, this explains it exactly.

Your config literally says:

{
  "id": "llama3:8b",
  "contextWindow": 8192
}

So OpenClaw is doing the right thing when it says:

ctx=8192
minimum is 16000

That means for llama3:8b:

nothing will move on the Ollama side
OpenClaw blocks it before making a request

So the fix is one of these:

If llama3:8b really supports more than 8192, update that contextWindow value in your OpenClaw config to the real number reported by Ollama:
```
ollama show llama3:8b
```
If 8192 is correct, then that model just isn’t eligible for OpenClaw agent runs.

From the entries you already posted, the one I’d test next is the qwen2.5:7b entry, because:

it’s much smaller than the model that OOM’d
your config says it has 32768 context, so it clears the 16000 minimum

So you’ve now got the full picture:

qwen3.5:latest → reaches Ollama, then OOMs
llama3:8b → never reaches Ollama, blocked by contextWindow: 8192
qwen2.5:7b → most promising next test from the models already in your config

Docs: https://docs.openclaw.ai/providers/ollama

If qwen2.5:7b works, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow picks it up.

fair crest Apr 18, 2026, 12:30 PM

#

Model
architecture llama
parameters 8.0B
context length 8192
embedding length 4096
quantization Q4_0

Capabilities
completion

License
META LLAMA 3 COMMUNITY LICENSE AGREEMENT
Meta Llama 3 Version Release Date: April 18, 2024

drifting ginkgoBOT Apr 18, 2026, 12:31 PM

#

Yep, that settles it.

ollama show confirms llama3:8b really is:

context length: 8192

So this is not a bad OpenClaw config entry. That model genuinely only exposes 8k context, and OpenClaw is correctly rejecting it for agent runs that require 16k minimum.

So we can close the book on llama3:8b:

it is not suitable for this OpenClaw agent use
nothing is wrong with routing for that test
nothing is wrong with the contextWindow metadata for that model

At this point, from the models you already posted, the most sensible next test is still qwen2.5:7b because:

it advertises 32768 context in your config

it’s much smaller than the model that OOM’d

So your picture is now pretty clean:

qwen3.5:latest → enough context, but OOMs
llama3:8b → fits memory better, but only 8k context so OpenClaw blocks it
qwen2.5:7b → best remaining candidate from your current list

If you test that one and it replies, I’d call this solved end to end.

fair crest Apr 18, 2026, 12:33 PM

#

we used qwen2.5:7b previously and that had different issues like no thinking

drifting ginkgoBOT Apr 18, 2026, 12:33 PM

#

Yeah, that’s the tradeoff, unfortunately.

You’ve basically found the boundary of this box:

the bigger model has enough context, but OOMs
the smaller model fits, but feels too dumb / no thinking

So the honest options are:

Local-only on 16 GB
- accept a smaller model
- okay for light chat/testing
- not great for strong agentic/reasoning behavior
Hybrid, my recommendation
- use local for cheap experimentation
- only switch to cloud when you actually need better reasoning
