#Response time

tulip orbit
#

Running OpenClaw 2026.4.15 on Mac Mini M4 Pro 64GB with ollama/qwen3-32b-ctx32k locally. Two issues:

Response never reaches Telegram: Logs show chat.history ~1200ms meaning the model responds in ~1 second, but Telegram receives the message after 3+ minutes. Logs then show typing TTL reached (2m) and llm-idle-timeout: produced no reply before idle watchdog; retrying same model. The model is fast but something is swallowing the response before it reaches Telegram.

Context window stuck at 16384: Cannot get OpenClaw to use 32k context despite trying: custom Ollama modelfile with num_ctx 32768 (confirmed correct with ollama show), openclaw config set agents.defaults.contextTokens 32768, and launchctl setenv OLLAMA_NUM_CTX 32768. Logs always show ctx=16384 source=agentContextTokens warn<32000.

Any idea what's blocking both of these?

desert ivyBOT
#

Let me check the local docs/source so I don’t make up config keys.

#

I found the context cap logic in source. Now I’m checking the local-model docs because the Telegram symptom smells like a local model/tool-streaming mismatch, not Telegram itself.

#

Yep, this looks like two separate issues.

  1. The 3+ minute Telegram delay is probably not Telegram
    I checked the runtime code, and this sequence:

    • typing TTL reached (2m)
    • [llm-idle-timeout] ... produced no reply before the idle watchdog; retrying same model

    means OpenClaw never got a usable streamed reply chunk in time, so it waited 120s, retried, and then you saw a message later.

    Also, chat.history ~1200ms is just the history fetch path, not proof the model reply already reached dispatch.

    For local Qwen/Ollama, the usual suspects are the full agent prompt/tool surface, not Telegram delivery. I’d test, in this order:

    openclaw config set agents.defaults.llm.idleTimeoutSeconds 300
    openclaw config set agents.defaults.experimental.localModelLean true
    openclaw gateway restart

If you are using a custom OpenAI-compatible provider in front of Ollama instead of native ollama/*, also check:

  • models.providers.<provider>.models[].compat.requiresStringContent: true
  • models.providers.<provider>.models[].compat.supportsTools: false (diagnostic)
  • models.providers.<provider>.injectNumCtxForOpenAICompat is not false

  2. ctx=16384 source=agentContextTokens is the smoking gun
    That log only happens when OpenClaw is actively capping the model below its detected window.

    So if the log says:

    ctx=16384 source=agentContextTokens

then the running gateway still sees agents.defaults.contextTokens = 16384 somewhere.
OLLAMA_NUM_CTX does not affect that warning.

I’d check:

openclaw gateway status
openclaw config get agents.defaults.contextTokens
openclaw models status

My strongest guess is config path drift or stale service state on macOS, meaning the daemon is not using the config you edited. openclaw gateway status should show the config path the service is actually using.

If you paste the output of those three commands, I can narrow it down pretty fast.

tulip orbit
#

Service: LaunchAgent (loaded)
File logs: /tmp/openclaw/openclaw-2026-04-19.log
Command: /opt/homebrew/opt/node/bin/node /opt/homebrew/lib/node_modules/openclaw/dist/index.js gateway --port [redacted]
Service file: ~/Library/LaunchAgents/ai.openclaw.gateway.plist
Service env: OPENCLAW_GATEWAY_PORT=[redacted]
Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json
Gateway: bind=loopback (127.0.0.1), port=[redacted] (service args)
Probe target: ws://127.0.0.1:[redacted]
Dashboard: http://127.0.0.1:[redacted]/
Probe note: Loopback-only gateway; only local clients can connect.
Runtime: running (pid 34817, state active)
RPC probe: ok
32768
Config : ~/.openclaw/openclaw.json
Agent dir : ~/.openclaw/agents/main/agent
Default : ollama/qwen3-32b-ctx32k:latest
Fallbacks (0) : -
Image model : -
Image fallbacks (0): -
Aliases (2) : Minimax -> minimax/MiniMax-M2.5, GPT -> openai/gpt-5.1-codex
Configured models (3): minimax/MiniMax-M2.5, openai/gpt-5.1-codex, openai/gpt-5
Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (0): -

  • custom-127-0-0-1-[redacted] effective=models.json:o...a | models.json=o...a | source=models.json: ~/.openclaw/agents/main/agent/models.json
  • minimax effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | minimax:default=[redacted]
  • ollama effective=missing:missing | models.json=marker(ollama-local) | source=models.json: ~/.openclaw/agents/main/agent/models.json
  • openai effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | openai:default=[redacted] | env=[redacted]| source=env: OPENAI_API_KEY
    i deleted some lines to make this fit
desert ivyBOT
#

Nice, this rules out my first guess about config-path drift.

What jumps out now is this part:

  • CLI config = service config, both ~/.openclaw/openclaw.json
  • agents.defaults.contextTokens prints 32768
  • but auth/model status also shows ~/.openclaw/agents/main/agent/models.json
  • and there is a custom-127-0-0-1-[redacted] provider living there

So I’m now pretty suspicious that both problems are coming from the agent-local models.json / custom localhost provider path, not Telegram itself.

Why I think that

  1. The Telegram delay still looks like LLM idle timeout, not send failure
    chat.history is not the model reply time.
    typing TTL reached + [llm-idle-timeout] ... retrying same model means OpenClaw waited for a streamed model response, got nothing usable, then retried. That fits a local OpenAI-compatible / custom-provider quirk a lot better than a Telegram delivery bug.

  2. Your 32k config is being seen, but something else is still shaping the runtime
    Since openclaw config get agents.defaults.contextTokens returns 32768, the remaining likely place for stale model metadata/caps is ~/.openclaw/agents/main/agent/models.json.

  3. You appear to have two local paths at once

    • native ollama/*
    • plus a custom localhost provider

    If that custom provider points at Ollama’s OpenAI-compatible surface, that would also explain why the usual local-backend compat issues show up.

Next safe check

Can you run this and paste the output?

rg -n 'custom-127|qwen3-32b-ctx32k|contextTokens|contextWindow|baseUrl|injectNumCtxForOpenAICompat|requiresStringContent|supportsTools' \
  ~/.openclaw/agents/main/agent/models.json \
  ~/.openclaw/openclaw.json

I especially want to see:

  • the custom-127-... block
  • the ollama block
  • any contextTokens: 16384
  • any contextWindow: 16384
  • whether injectNumCtxForOpenAICompat is off
  • whether compat flags are present
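
A flat grep gives line numbers but not which provider each contextWindow belongs to. As a rough sketch, assuming both files are strict JSON (the pasted lines look that way, but it is not confirmed), a short Python walk can print the full JSON path next to each value. The function names below are made up for illustration and are not part of OpenClaw.

```python
import json
from pathlib import Path

# Keys worth locating; taken from the grep pattern above.
KEYS = {"contextWindow", "contextTokens", "baseUrl"}

def find_keys(node, path=""):
    """Yield (json_path, value) for every key in KEYS, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key in KEYS:
                yield child, value
            yield from find_keys(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_keys(item, f"{path}[{index}]")

def report(file_path):
    """Print every match in one config file (assumed to be strict JSON)."""
    config = json.loads(Path(file_path).expanduser().read_text())
    for json_path, value in find_keys(config):
        print(f"{file_path}: {json_path} = {value}")
```

Calling report("~/.openclaw/openclaw.json") and then report("~/.openclaw/agents/main/agent/models.json") prints one line per match, so a 16384 sitting under a custom provider is easy to tell apart from the 32768 on the Ollama model.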

My current bet

If that custom provider is targeting Ollama, the clean fix is probably to pick one path only, preferably native ollama/*, and not run both native Ollama and a custom localhost OpenAI-compatible provider against the same model.

#

Also, small clue, your service-env dump does not show OLLAMA_NUM_CTX, so I would not trust launchctl setenv as the thing fixing this anyway.

Docs for the compat bits I’m thinking of: https://docs.openclaw.ai/gateway/local-models

tulip orbit
#

PART 1/2 :/Users/jman/.openclaw/agents/main/agent/models.json:4: "baseUrl": "https://api.minimax.io/anthropic/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:22: "contextWindow": 200000,
/Users/jman/.openclaw/agents/main/agent/models.json:29: "baseUrl": "http://127.0.0.1:8000/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:35: "baseUrl": "http://127.0.0.1:11434/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:40: "id": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/agents/main/agent/models.json:41: "name": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/agents/main/agent/models.json:43: "contextWindow": 32768,
/Users/jman/.openclaw/agents/main/agent/models.json:49: "baseUrl": "https://openrouter.ai/api/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:66: "contextWindow": 200000,
/Users/jman/.openclaw/agents/main/agent/models.json:82: "contextWindow": 1048576,
/Users/jman/.openclaw/agents/main/agent/models.json:99: "contextWindow": 262144,
/Users/jman/.openclaw/agents/main/agent/models.json:105: "custom-127-0-0-1-11434": {
/Users/jman/.openclaw/agents/main/agent/models.json:106: "baseUrl": "http://127.0.0.1:11434/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:123: "contextWindow": 16384,
/Users/jman/.openclaw/agents/main/agent/models.json:129: "baseUrl": "https://api.minimax.io/anthropic/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:147: "contextWindow": 200000,
/Users/jman/.openclaw/agents/main/agent/models.json:154: "baseUrl": "https://openrouter.ai/v1",

#

part 2/2 /Users/jman/.openclaw/agents/main/agent/models.json:171: "contextWindow": 131072,
/Users/jman/.openclaw/agents/main/agent/models.json:188: "contextWindow": 131072,
/Users/jman/.openclaw/agents/main/agent/models.json:205: "contextWindow": 262144,
/Users/jman/.openclaw/agents/main/agent/models.json:215: "baseUrl": "https://chatgpt.com/backend-api/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:235: "contextWindow": 272000,
/Users/jman/.openclaw/agents/main/agent/models.json:257: "contextWindow": 272000,
/Users/jman/.openclaw/agents/main/agent/models.json:279: "contextWindow": 272000,
/Users/jman/.openclaw/openclaw.json:36: "baseUrl": "https://api.minimax.io/anthropic",
/Users/jman/.openclaw/openclaw.json:53: "contextWindow": 200000,
/Users/jman/.openclaw/openclaw.json:58: "custom-127-0-0-1-11434": {
/Users/jman/.openclaw/openclaw.json:59: "baseUrl": "http://127.0.0.1:11434",
/Users/jman/.openclaw/openclaw.json:76: "contextWindow": 16384,
/Users/jman/.openclaw/openclaw.json:82: "baseUrl": "http://127.0.0.1:11434",
/Users/jman/.openclaw/openclaw.json:87: "id": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/openclaw.json:88: "name": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/openclaw.json:90: "contextWindow": 32768,
/Users/jman/.openclaw/openclaw.json:99: "model": "ollama/qwen3-32b-ctx32k:latest",
/Users/jman/.openclaw/openclaw.json:110: "contextTokens": 32768,
/Users/jman/.openclaw/openclaw.json:164: "custom-127-0-0-1-11434":

tulip orbit
#

hello?

desert ivyBOT
#

Yep, still here, sorry. 🦐

Your last dump was useful. I think the next place to look is still:

~/.openclaw/agents/main/agent/models.json

because:

  • config path drift is ruled out
  • agents.defaults.contextTokens really is 32768
  • but runtime still behaves like there is a 16k cap
  • and you have that extra custom-127-0-0-1-... provider in agent-local model state

So my current theory is:

  1. Telegram delay: first model attempt is stalling on the local/custom provider path, then the idle watchdog retries, so the reply shows up minutes later.
  2. 16k context: stale/local model metadata in models.json is still forcing 16384 somewhere.
#

If you want, paste just this read-only grep:

rg -n 'custom-127|qwen3-32b-ctx32k|contextTokens|contextWindow|baseUrl|injectNumCtxForOpenAICompat|requiresStringContent|supportsTools' \
  ~/.openclaw/agents/main/agent/models.json \
  ~/.openclaw/openclaw.json

If that file is huge, even just the custom-127... block + the ollama block is enough.

tulip orbit
#

part 1/2 jman@johnnys-Mini ~ % grep -n 'custom-127\|qwen3-32b-ctx32k\|contextTokens\|contextWindow\|baseUrl\|injectNumCtxForOpenAICompat\|requiresStringContent\|supportsTools' \
  ~/.openclaw/agents/main/agent/models.json \
  ~/.openclaw/openclaw.json
/Users/jman/.openclaw/agents/main/agent/models.json:4: "baseUrl": "https://api.minimax.io/anthropic/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:22: "contextWindow": 200000,
/Users/jman/.openclaw/agents/main/agent/models.json:29: "baseUrl": "http://127.0.0.1:11434/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:34: "id": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/agents/main/agent/models.json:35: "name": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/agents/main/agent/models.json:37: "contextWindow": 32768,
/Users/jman/.openclaw/agents/main/agent/models.json:53: "baseUrl": "https://openrouter.ai/api/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:70: "contextWindow": 200000,
/Users/jman/.openclaw/agents/main/agent/models.json:86: "contextWindow": 1048576,
/Users/jman/.openclaw/agents/main/agent/models.json:103: "contextWindow": 262144,
/Users/jman/.openclaw/agents/main/agent/models.json:110: "baseUrl": "https://api.minimax.io/anthropic/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:128: "contextWindow": 200000,
/Users/jman/.openclaw/agents/main/agent/models.json:135: "baseUrl": "https://openrouter.ai/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:152: "contextWindow": 131072,
/Users/jman/.openclaw/agents/main/agent/models.json:169: "contextWindow": 131072,
/Users/jman/.openclaw/agents/main/agent/models.json:186: "contextWindow": 262144,
/Users/jman/.openclaw/agents/main/agent/models.json:196: "baseUrl": "https://chatgpt.com/backend-api/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:216: "contextWindow": 272000,

#

part 2/2 272000,
/Users/jman/.openclaw/agents/main/agent/models.json:260: "contextWindow": 272000,
/Users/jman/.openclaw/agents/main/agent/models.json:270: "baseUrl": "http://127.0.0.1:8000/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:275: "custom-127-0-0-1-11434": {
/Users/jman/.openclaw/agents/main/agent/models.json:276: "baseUrl": "http://127.0.0.1:11434/v1",
/Users/jman/.openclaw/agents/main/agent/models.json:293: "contextWindow": 16384,
/Users/jman/.openclaw/openclaw.json:36: "baseUrl": "https://api.minimax.io/anthropic",
/Users/jman/.openclaw/openclaw.json:53: "contextWindow": 200000,
/Users/jman/.openclaw/openclaw.json:58: "custom-127-0-0-1-11434": {
/Users/jman/.openclaw/openclaw.json:59: "baseUrl": "http://127.0.0.1:11434",
/Users/jman/.openclaw/openclaw.json:76: "contextWindow": 16384,
/Users/jman/.openclaw/openclaw.json:82: "baseUrl": "http://127.0.0.1:11434",
/Users/jman/.openclaw/openclaw.json:87: "id": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/openclaw.json:88: "name": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/openclaw.json:90: "contextWindow": 32768,
/Users/jman/.openclaw/openclaw.json:99: "model": "ollama/qwen3-32b-ctx32k:latest",
/Users/jman/.openclaw/openclaw.json:108: "ollama/qwen3-32b-ctx32k:latest": {
/Users/jman/.openclaw/openclaw.json:113: "contextTokens": 32768,
/Users/jman/.openclaw/openclaw.json:167: "custom-127-0-0-1-11434": {

desert ivyBOT
#

Part 1 already tells us a lot.

The key lines are these:

  • ~/.openclaw/agents/main/agent/models.json:29: "baseUrl": "http://127.0.0.1:11434/v1"
  • model qwen3-32b-ctx32k
  • contextWindow: 32768

So yes, you do have a custom OpenAI-compatible provider pointed at Ollama’s /v1 surface, and that model entry itself already says 32k.

That means:

  1. The Telegram delay is still most likely the custom /v1 path stalling, then the 120s idle watchdog retries.
    This still does not look like Telegram swallowing a finished reply.

  2. The 16k cap is not coming from that model entry, because the model metadata there is already 32768.

#

So the remaining suspects are now:

  • stale session state / session override
  • or a 2026.4.15 bug on the Ollama OpenAI-compatible path

Best next test

Please try a brand new session after a gateway restart, then immediately send one tiny prompt.

Reason: if the old session/thread cached 16384, a fresh session should stop inheriting it.

Also, this strongly reinforces my earlier recommendation

If possible, stop routing Ollama through the custom http://127.0.0.1:11434/v1 provider and use the native ollama/* provider only.

That custom /v1 path is exactly where these local-backend compat issues tend to show up.

#

One tiny thing I still want to see

Can you paste just the block around that custom provider?

sed -n '24,45p' ~/.openclaw/agents/main/agent/models.json

I want to see whether it has api: "openai-completions" and whether there are any compat flags hiding nearby.

If you send part 2/2 too, I’ll keep digging.

tulip orbit
#

I cleaned up the file. there's still a pretty big mess from when i tried openrouter models

desert ivyBOT
#

Yep, we found the smoking gun. 🎯

You have two different definitions pointing at the same Ollama host, and one of them is still 16k:

  • native-ish ollama/qwen3-32b-ctx32k:latest path with contextWindow: 32768
  • custom provider custom-127-0-0-1-11434 with contextWindow: 16384

You pasted the 16k entry in both places:

  • ~/.openclaw/openclaw.json
  • ~/.openclaw/agents/main/agent/models.json

So the answer is basically:

What’s blocking both issues

  1. The stuck 16k context
    OpenClaw still has a custom localhost provider configured at 16384.
    That is why you keep seeing 16384.

  2. The Telegram 3-minute delay
    That same custom provider is the OpenAI-compatible /v1 path to Ollama, which is much more likely to hit the local backend compat / no-first-token / idle-watchdog problem. So OpenClaw waits 2 minutes, retries, and only then Telegram gets something.

What I would do

Use one path only.
Best option: keep the native ollama/* model, and remove/stop using custom-127-0-0-1-11434.

If you absolutely need the custom /v1 provider, then at minimum make it match the real model:

  • change its contextWindow from 16384 → 32768
  • if it still stalls, add compat hints on that custom model:
    • requiresStringContent: true
    • supportsTools: false (diagnostic, if needed)
#

Important

Because the stale 16k entry exists in both:

  • ~/.openclaw/openclaw.json
  • ~/.openclaw/agents/main/agent/models.json

I would:

  1. fix the config entry in openclaw.json
  2. restart the gateway
  3. start a brand new session
  4. re-check that neither file still shows custom-127-0-0-1-11434 at 16384

Also, you can ignore launchctl setenv OLLAMA_NUM_CTX for now. This is a config/provider overlap problem, not an env problem.

If you want, paste just the custom-127-0-0-1-11434 block from openclaw.json, and I’ll tell you exactly what to change.

#

Nice, that was very likely the root cause. 👍

The leftover OpenRouter/custom-provider clutter was probably keeping that stale 16k localhost provider alive.

I’d do this next, in order:

  1. Restart the gateway
  2. Start a brand new session/chat
  3. Re-run the quick check:
    grep -n 'custom-127-0-0-1-11434\|16384\|qwen3-32b-ctx32k' \
      ~/.openclaw/agents/main/agent/models.json \
      ~/.openclaw/openclaw.json
    
  4. Send one tiny test prompt and watch whether:
    • the ctx=16384 warning is gone
    • Telegram replies immediately instead of after the 2m idle timeout
#

If it still stalls after cleanup, then the next thing I’d try is lean local mode:

openclaw config set agents.defaults.experimental.localModelLean true
openclaw gateway restart

That trims the heavy default tool surface, which often helps local Ollama via /v1.

If you want, paste the first log lines after restart plus one test turn, and I’ll sanity check them.

tulip orbit
#

is this right? jman@johnnys-Mini ~ % grep -n 'custom-127-0-0-1-11434\|16384\|qwen3-32b-ctx32k' \
  ~/.openclaw/agents/main/agent/models.json \
  ~/.openclaw/openclaw.json
/Users/jman/.openclaw/agents/main/agent/models.json:34: "id": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/agents/main/agent/models.json:35: "name": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/agents/main/agent/models.json:170: "maxTokens": 16384
/Users/jman/.openclaw/agents/main/agent/models.json:275: "custom-127-0-0-1-11434": {
/Users/jman/.openclaw/agents/main/agent/models.json:293: "contextWindow": 16384,
/Users/jman/.openclaw/openclaw.json:58: "custom-127-0-0-1-11434": {
/Users/jman/.openclaw/openclaw.json:76: "contextWindow": 16384,
/Users/jman/.openclaw/openclaw.json:87: "id": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/openclaw.json:88: "name": "qwen3-32b-ctx32k",
/Users/jman/.openclaw/openclaw.json:99: "model": "ollama/qwen3-32b-ctx32k:latest",
/Users/jman/.openclaw/openclaw.json:108: "ollama/qwen3-32b-ctx32k:latest": {
/Users/jman/.openclaw/openclaw.json:167: "custom-127-0-0-1-11434": {

desert ivyBOT
#

Not yet, the bad entry is still there.

These are the important lines:

  • ~/.openclaw/agents/main/agent/models.json:275: "custom-127-0-0-1-11434"
  • ~/.openclaw/agents/main/agent/models.json:293: "contextWindow": 16384
  • ~/.openclaw/openclaw.json:58: "custom-127-0-0-1-11434"
  • ~/.openclaw/openclaw.json:76: "contextWindow": 16384

So the stale custom localhost provider is still live in both files.

What “right” should look like

For this cleanup, your grep should ideally show:

  • your ollama/qwen3-32b-ctx32k:latest lines
  • your qwen3-32b-ctx32k model lines

and not show:

  • custom-127-0-0-1-11434
  • contextWindow: 16384 for that localhost custom provider

One small note

This line is probably not the main problem:

  • models.json:170: "maxTokens": 16384

That is usually output-token limit metadata for some model entry.
The real smoking gun is still the custom provider + 16384 contextWindow.

So, short version

You still need to remove/disable the custom-127-0-0-1-11434 provider block from ~/.openclaw/openclaw.json, then restart so agent/models.json stops carrying it too.

#

If you want, paste just this chunk:

sed -n '58,80p' ~/.openclaw/openclaw.json

and I’ll tell you exactly which lines to cut.

tulip orbit
#

jman@johnnys-Mini ~ % sed -n '58,80p' ~/.openclaw/openclaw.json
"custom-127-0-0-1-11434": {
"baseUrl": "http://127.0.0.1:11434",
"apiKey": "ollama",
"api": "ollama",
"models": [
{
"id": "qwen3:32b",
"name": "qwen3:32b (Custom Provider)",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 16384,
"maxTokens": 4096
}
]
},

desert ivyBOT
#

Yep, that is the bad block.

You can remove that entire custom-127-0-0-1-11434 object, from:

"custom-127-0-0-1-11434": {

down through its closing:

},

Why:

  • it is a stale duplicate local Ollama provider
  • it points at qwen3:32b
  • it hardcodes contextWindow: 16384
  • your real target is already the native model:
    ollama/qwen3-32b-ctx32k:latest

So you do not want both.

After removing that block

  1. save ~/.openclaw/openclaw.json
  2. restart gateway
  3. start a new session
  4. run the grep again

After cleanup, this should be gone:

  • custom-127-0-0-1-11434
  • contextWindow: 16384 for that localhost custom provider

If it still shows up only in ~/.openclaw/agents/main/agent/models.json after restart, tell me, and I’ll help with the next cleanup step.
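
If hand-editing the JSON feels error-prone, the same removal can be sketched in Python. This is only an illustration under two assumptions: that openclaw.json is strict JSON, and that the block sits under models.providers (the path notation used earlier in this thread). remove_provider is a hypothetical helper, not an OpenClaw command. The grep also showed the provider name on other lines of openclaw.json, so this does not touch any other references to it.

```python
import json
import shutil
from pathlib import Path

def remove_provider(config_path, provider, section=("models", "providers")):
    """Delete `provider` under `section`, keeping a .bak copy of the file.

    Returns True if the block was found and removed, False otherwise.
    The default `section` is an assumption about the config layout.
    """
    path = Path(config_path).expanduser()
    config = json.loads(path.read_text())

    node = config
    for key in section:  # walk down to the providers map
        node = node.get(key, {}) if isinstance(node, dict) else {}
    if not isinstance(node, dict) or provider not in node:
        return False

    shutil.copy2(path, path.with_name(path.name + ".bak"))  # back up first
    del node[provider]
    path.write_text(json.dumps(config, indent=2) + "\n")
    return True
```

With the gateway stopped, remove_provider("~/.openclaw/openclaw.json", "custom-127-0-0-1-11434") would drop the block and leave openclaw.json.bak next to the original. Note that rewriting the file this way normalizes its indentation.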

tulip orbit
#

should i get rid of the vllm too?

desert ivyBOT
#

Not necessarily.

The thing I’d remove first is only the stale custom-127-0-0-1-11434 block.
That is the one clearly causing the 16k confusion.

For vLLM:

  • keep it if you actually use it for a different backend
  • remove or disable it temporarily if you want the cleanest possible test setup

My recommendation for debugging this specific issue:

  1. keep only the native ollama path for this qwen model
  2. remove the stale custom localhost Ollama block
  3. restart
  4. test again
#

So, short answer:

  • No, vLLM is not the main culprit
  • but yes, simplifying helps if you are not actively using it right now

If you want the least-chaotic setup while testing, I’d temporarily aim for:

  • ollama/qwen3-32b-ctx32k:latest
  • no stale custom localhost Ollama provider
  • everything else optional later
tulip orbit
#

is it possible that it's not saving??

desert ivyBOT
#

Yes, absolutely.

At this point there are only two real possibilities:

  1. the edit never actually saved
  2. something is rewriting it after you save

The fastest way to tell is:

1. Stop the gateway first

openclaw gateway stop

2. Edit ~/.openclaw/openclaw.json, save it, then immediately check it

sed -n '58,80p' ~/.openclaw/openclaw.json
grep -n 'custom-127-0-0-1-11434\|16384' ~/.openclaw/openclaw.json

3. Interpret the result

  • If the block is still there immediately, it did not save.
  • If the block is gone, your edit saved correctly.
  • If it is gone at first, then comes back after startup, something is regenerating it.

4. Then start the gateway again

openclaw gateway start
#

A small but important point:

  • ~/.openclaw/agents/main/agent/models.json can be regenerated/runtime-managed
  • ~/.openclaw/openclaw.json is the one I’d trust as the source of truth here

So yes, I’d verify openclaw.json while the gateway is stopped. That removes a lot of guesswork.
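
The three cases from step 3 can also be told apart mechanically. Here is a minimal sketch, assuming you take one snapshot right after saving (gateway stopped) and one after the gateway has started again. snapshot and diagnose are hypothetical helper names, and the check is a plain byte search for the provider name, nothing OpenClaw-specific.

```python
import hashlib
from pathlib import Path

def snapshot(config_path, needle="custom-127-0-0-1-11434"):
    """Return (sha256 hex digest, whether `needle` occurs) for a file."""
    data = Path(config_path).expanduser().read_bytes()
    return hashlib.sha256(data).hexdigest(), needle.encode() in data

def diagnose(after_save, after_restart):
    """Map two snapshots onto the cases listed in step 3."""
    if after_save[1]:
        return "edit did not save"
    if after_restart[1]:
        return "something regenerated the block after startup"
    if after_save[0] != after_restart[0]:
        return "block is gone, but the file was rewritten after startup"
    return "edit saved and survived the restart"
```

The extra fourth outcome covers the case where the block stays gone but the gateway still rewrites the file, which is worth knowing before trusting further manual edits.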

If you want, paste the output of those two checks right after saving, and I’ll tell you which case you’re in.

tulip orbit
#

It saved. Still taking multiple minutes to respond to "hello"

#

Log file: /tmp/openclaw/openclaw-2026-04-24.log
02:40:10+00:00 info gateway {"subsystem":"gateway"} signal SIGTERM received
02:40:10+00:00 info gateway {"subsystem":"gateway"} received SIGTERM; shutting down
02:40:10+00:00 info gmail-watcher {"subsystem":"gmail-watcher"} gmail watcher stopped
02:50:01+00:00 info Gateway LaunchAgent was installed but not loaded; re-bootstrapped launchd service.
02:50:06+00:00 info gateway {"subsystem":"gateway"} loading configuration…
02:50:06+00:00 info gateway {"subsystem":"gateway"} resolving authentication…
02:50:06+00:00 info gateway {"subsystem":"gateway"} starting...
02:50:13+00:00 info gateway {"subsystem":"gateway"} starting HTTP server...
02:50:14+00:00 info gateway/canvas {"subsystem":"gateway/canvas"} canvas host mounted at http://127.0.0.1:18789/__openclaw__/canvas/ (root /Users/jman/.openclaw/canvas)
02:50:14+00:00 info gateway/health-monitor {"subsystem":"gateway/health-monitor"} started (interval: 300s, startup-grace: 60s, channel-connect-grace: 120s)
02:50:14+00:00 info gateway {"subsystem":"gateway"} agent model: ollama/qwen3-32b-ctx32k:latest
02:50:14+00:00 info gateway {"subsystem":"gateway"} ready (6 plugins: acpx, browser, device-pair, phone-control, talk-voice, telegram; 7.8s)
02:50:14+00:00 info gateway {"subsystem":"gateway"} log file: /tmp/openclaw/openclaw-2026-04-24.log
02:50:14+00:00 info gateway {"subsystem":"gateway"} starting channels and sidecars...
02:50:14+00:00 info gateway/hooks {"subsystem":"gateway/hooks"} loaded 4 internal hook handlers

#

02:50:17+00:00 info bonjour: advertised gateway fqdn=johnny’s Mac mini (OpenClaw)._openclaw-gw._tcp.local. host=openclaw.local. port=18789 state=announcing
02:50:19+00:00 warn plugins/self-hosted-provider-setup {"subsystem":"plugins/self-hosted-provider-setup"} Failed to discover vLLM models: TypeError: fetch failed
02:50:20+00:00 info plugins {"subsystem":"plugins"} embedded acpx runtime backend registered (cwd: /Users/jman/.openclaw/workspace)
02:50:20+00:00 info browser/server {"subsystem":"browser/server"} Browser control listening on http://127.0.0.1:18791/ (auth=token)
02:50:20+00:00 info gateway/heartbeat {"subsystem":"gateway/heartbeat"} {"intervalMs":1800000} heartbeat: started
02:50:20+00:00 info cron {"module":"cron","storePath":"/Users/jman/.openclaw/cron/jobs.json"} {"enabled":true,"jobs":0,"nextWakeAtMs":null} cron: started
02:50:20+00:00 info gateway/channels/telegram {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
02:50:21+00:00 info plugins {"subsystem":"plugins"} embedded acpx runtime backend ready
02:50:42+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ chat.history 1223ms conn=cc620807…e286 id=a396ac28…7d83
02:53:20+00:00 warn tools {"subsystem":"tools"} tools.allow allowlist contains unknown entries (group:memory). These entries won't match any tool unless the plugin is enabled.
02:55:19+00:00 info typing TTL reached (2m); stopping typing indicator
02:55:21+00:00 warn agent/embedded {"subsystem":"agent/embedded"} [llm-idle-timeout] ollama/qwen3-32b-ctx32k:latest produced no reply before the idle watchdog; retrying same model

#

The model is timing out on every fresh inference attempt. Can you explain why chat.history looks fast but actual inference triggers the watchdog every time?

tulip orbit
#

Hey did you see these messages? here are some new logs: 0:59:38+00:00 info typing TTL reached (2m); stopping typing indicator
00:59:39+00:00 warn agent/embedded {"subsystem":"agent/embedded"} [llm-idle-timeout] ollama/qwen3-32b-ctx32k:latest produced no reply before the idle watchdog; retrying same model
01:01:10+00:00 error gateway/channels/telegram {"subsystem":"gateway/channels/telegram"} [telegram] Polling stall detected (no completed getUpdates for 222.67s); forcing restart. [diag inFlight=0 outcome=ok startedAt=1777337830976 finishedAt=1777337848104 durationMs=17128 offset=0]
01:01:25+00:00 error gateway/channels/telegram {"subsystem":"gateway/channels/telegram"} [telegram] Polling runner stop timed out after 15s; forcing restart cycle.
01:01:25+00:00 error gateway/channels/telegram {"subsystem":"gateway/channels/telegram"} [telegram][diag] polling cycle finished reason=polling stall detected inFlight=0 outcome=ok startedAt=1777337830976 finishedAt=1777337848104 durationMs=17128 offset=0
01:01:25+00:00 error gateway/channels/telegram {"subsystem":"gateway/channels/telegram"} Telegram polling runner stopped (polling stall detected); restarting in 30s.
01:01:39+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_failover_decision","tags":["error_handling","failover","assistant","surface_error"],"runId":"eb189911-0dfd-4628-b6f4-3687974117b0","stage":"assistant","decision":"surface_error","failoverReason":"timeout","profileFailureReason":"timeout","provider":"ollama","model":"qwen3-32b-ctx32k:latest","sourceProvider":"ollama","sourceModel":"qwen3-32b-ctx32k:latest","fallbackConfigured":false,"timedOut":true,"aborted":true} embedded run failover decision
01:01:40+00:00 error gateway/channels/telegram {"subsystem":"gateway/channels/telegram"} telegram sendMessage failed: Network request for 'sendMessage' failed!
01:01:40+00:00 error gateway/channels/telegram

desert ivyBOT
#

Yes — these logs make the picture clearer.

Two separate things are happening:

  1. chat.history 1223ms is not the model response.
    That’s the UI/gateway fetching transcript history. The actual model run is the later line:

    [llm-idle-timeout] ... produced no reply before the idle watchdog
    

    So OpenClaw is not seeing assistant output from Ollama before the watchdog fires.

  2. Telegram is also having transport trouble.
    These lines point to Telegram API/network instability, not model latency:

    Polling stall detected ...
    telegram sendMessage failed: Network request for 'sendMessage' failed!

That usually means DNS/IPv6/proxy/TLS reachability to api.telegram.org, or a Telegram polling bug in the older build.
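
To keep the two problems apart when reading a long log, it can help to bucket lines by the marker strings quoted in this thread. The following is a rough sketch, not an OpenClaw tool: the category names are invented, the markers are just substrings copied from the pasted logs, and the timestamp format is assumed to match those lines (it does not handle a log that crosses midnight).

```python
import re
from datetime import datetime

# Substrings copied from the log lines quoted in this thread.
MARKERS = {
    "history_fetch": "chat.history",
    "model_idle_timeout": "[llm-idle-timeout]",
    "telegram_poll_stall": "Polling stall detected",
    "telegram_send_failed": "sendMessage failed",
}
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2})\+00:00 ")

def classify(lines):
    """Return a list of (time, category) for lines that match a marker."""
    events = []
    for line in lines:
        stamp = TIMESTAMP.match(line)
        if not stamp:
            continue
        for category, needle in MARKERS.items():
            if needle in line:
                when = datetime.strptime(stamp.group(1), "%H:%M:%S")
                events.append((when, category))
                break
    return events

def seconds_between(events, first, second):
    """Seconds from the first `first` event to the first later `second` event."""
    start = next((t for t, c in events if c == first), None)
    if start is None:
        return None
    end = next((t for t, c in events if c == second and t >= start), None)
    return None if end is None else (end - start).total_seconds()
```

On the excerpt pasted earlier, chat.history lands at 02:50:42 and the idle timeout at 02:55:21, so seconds_between(events, "history_fetch", "model_idle_timeout") comes out at 279 seconds. That gap is the model run, not the history fetch.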

For the 16k context: ctx=16384 source=agentContextTokens means OpenClaw is still seeing an OpenClaw-side cap of 16k. The Ollama Modelfile alone won’t fix OpenClaw’s prompt budget. Also OLLAMA_NUM_CTX is probably not the useful env knob here; prefer explicit OpenClaw params.num_ctx or Ollama’s OLLAMA_CONTEXT_LENGTH.

I’d do this in order:

openclaw update status
openclaw update --channel stable
openclaw doctor
openclaw channels status --probe --json
#

Then verify the model path separately:

openclaw infer model run --local --model ollama/qwen3-32b-ctx32k:latest --prompt "Reply exactly: pong" --json
openclaw infer model run --gateway --model ollama/qwen3-32b-ctx32k:latest --prompt "Reply exactly: pong" --json
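
It can also be worth timing Ollama on its own, with OpenClaw out of the loop. The sketch below streams one completion from Ollama's native /api/generate endpoint and reports how long the first visible answer text takes. The host and model name are taken from this thread; probe and first_token_latency are hypothetical helper names. One caveat: Qwen3 may emit reasoning before its answer, and depending on the Ollama version that text may arrive in a separate field, so this measures time to first answer text rather than time to first byte.

```python
import json
import time
import urllib.request

def first_token_latency(ndjson_lines, started_at, clock=time.monotonic):
    """Seconds from `started_at` to the first non-empty "response" chunk.

    Returns None if the stream ends without any answer text.
    """
    for raw in ndjson_lines:
        if not raw.strip():
            continue
        chunk = json.loads(raw)
        if chunk.get("response"):
            return clock() - started_at
        if chunk.get("done"):
            break
    return None

def probe(model, prompt="Reply exactly: pong", num_ctx=32768,
          host="http://127.0.0.1:11434", timeout=600):
    """Stream one completion from Ollama and time the first answer text."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": True,
        "options": {"num_ctx": num_ctx},  # request the 32k window explicitly
    }).encode()
    request = urllib.request.Request(
        f"{host}/api/generate", data=body,
        headers={"Content-Type": "application/json"})
    started_at = time.monotonic()  # start before connecting, not after headers
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return first_token_latency(response, started_at)
```

Running probe("qwen3-32b-ctx32k:latest") with a tiny prompt gives a baseline. If that number is small but the agent still hits the idle watchdog, the likely difference is the much larger prompt and tool surface OpenClaw sends, which is what lean mode is meant to trim.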

And make sure config has all three pieces aligned:

agents: {
  defaults: {
    contextTokens: 32768,
    timeoutSeconds: 300,
    experimental: {
      localModelLean: true,
    },
  },
},
models: {
  providers: {
    ollama: {
      baseUrl: "http://127.0.0.1:11434",
      api: "ollama",
      contextWindow: 32768,
      maxTokens: 8192,
      models: [
        {
          id: "qwen3-32b-ctx32k:latest",
          params: {
            num_ctx: 32768,