#Local llm pc requirements

distant skiff
#

Should the following pc be able to run qwen3:32b locally?
GMKtec NucBox K10
∙ CPU: Intel Core i9 13900HK (14C/20T, up to 5.4GHz)
∙ RAM: 64GB DDR5
∙ Storage: 1TB SSD
∙ GPU: Intel Iris Xe Graphics (integrated)

subtle patrolBOT
#

Yes — it can run Qwen3 32B locally on that box, but only in quantized form and it’ll be CPU/RAM-bandwidth limited (so: usable for experimentation, not “snappy ChatGPT speed”).

What will / won’t work

  • FP16 / full precision: basically no on 64GB (32B fp16 weights alone are ~64GB, and you still need KV cache + overhead).
  • 8-bit: it fits (weights are roughly 32–35GB), but each generated token has to read about twice as much data as with 4-bit, so expect it to be noticeably slower on CPU.
  • 4-bit (recommended): yes. A 32B model in 4-bit is typically ~16–20GB for weights + some overhead, so 64GB system RAM is enough.
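
To sanity-check those sizes, here is a rough back-of-the-envelope sketch in Python. It counts weights only (no KV cache or runtime overhead), and the 4.5 bits per weight used for “4-bit” is my assumption to allow for quantization scales; real files vary by quant type.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal GB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# 32 billion parameters at three precisions.
# 4.5 bits/weight is an assumed average for a 4-bit quant, not a measured value.
for label, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4.5)]:
    print(f"{label:>5}: ~{weight_gb(32, bits):.0f} GB")
```

With those inputs it lands at about 64, 32, and 18 GB, which is why only the 4-bit option leaves comfortable headroom on a 64GB machine.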

Performance expectations (important)

  • With no discrete GPU (Intel Iris Xe iGPU isn’t very helpful for big LLM inference in most common stacks), expect something like low single-digit tokens/sec on CPU for 32B Q4, sometimes worse depending on context length and settings.
  • It’ll run, but longer prompts / big context windows will slow it down further because the KV cache grows with context.
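
Why CPU-only ends up in single digits: generating each token means reading essentially all of the weights from RAM, so memory bandwidth sets a ceiling. A hedged sketch of that ceiling follows; the 70 GB/s bandwidth and the 18 GB model size are assumptions for illustration, not measurements of this machine.

```python
def max_tokens_per_sec(mem_bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode speed if every token reads all weights once."""
    return mem_bandwidth_gb_s / weights_gb


# Both numbers below are assumed, illustrative values.
assumed_bandwidth_gb_s = 70.0   # hypothetical dual-channel DDR5 figure
assumed_q4_weights_gb = 18.0    # rough size of a 32B model at 4-bit

ceiling = max_tokens_per_sec(assumed_bandwidth_gb_s, assumed_q4_weights_gb)
print(f"ceiling: ~{ceiling:.1f} tokens/sec")
```

Real throughput usually sits below this ceiling, and a smaller model raises it proportionally, which is the main reason 8B/14B feel so much smoother on the same box.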

Practical advice

  • If your goal is “feels fast”, you usually want an NVIDIA GPU with lots of VRAM (e.g., 24GB+ for this class of model).
  • If you’re okay with “runs but slow”, start with Q4 quantization (llama.cpp/Ollama style) and keep context moderate.
  • If you want a smoother local experience on that machine, consider Qwen 14B / 8B tiers instead.

If you tell me what runtime you plan to use (Ollama vs llama.cpp vs something else) and what “acceptable speed” means to you (coding assistant vs occasional Q&A), I can suggest the best quant + settings.

distant skiff
#

What are realistic use cases for qwen 14b / 8b and what types of use cases would push it overboard

subtle patrolBOT
#

For local use, 8B/14B models are the sweet spot: fast enough on CPU (or modest GPU), cheap enough to run often, and “smart enough” for a lot of daily tasks. The main limits are reasoning depth, long-context synthesis, and high-precision coding/debugging.

Realistic “this works well” use cases

Qwen ~8B (good when you want speed)

  • Everyday chat + brainstorming (ideas, outlines, alternatives, pros/cons)
  • Rewrite/cleanup: emails, docs, tone changes, shortening/expanding text
  • Summaries of short/medium content (a few pages worth)
  • Light coding help: explain snippets, write small functions, regex, config examples
  • Basic RAG/Q&A if you feed it tight, relevant chunks (it can be a solid “answer formatter”)

Qwen ~14B (noticeably better “thinking”, still runnable locally)

  • Everything above, but better at:
    • Following instructions + structured outputs (JSON-ish, step lists, templates)
    • More reliable coding: small features, refactors in a file, debugging from a stack trace
    • Multi-step tasks: “compare these options, choose one, justify, produce final output”
    • RAG with a bit more messiness (it tolerates noisier retrieved context)

What tends to push 8B/14B “overboard”

These are the common failure modes:

  1. Very long context / lots of documents
  • “Read these 20 pages and find subtle contradictions”
  • Long transcripts + long instructions + long output
    Even if it fits in context, quality drops: it misses details or blends facts.
  2. High-precision reasoning
  • Hard math, tricky logic puzzles, detailed planning with many constraints
    They’ll often give an answer confidently, but make a small fatal mistake.
  3. Serious coding in a real codebase
  • “Add a feature across 20 files”, “diagnose a complex concurrency bug”, “understand this architecture from multiple modules”
    Smaller models lose the thread and invent details unless you heavily guide them (or use stronger models).
  4. High-stakes factuality
  • Legal/medical/finance “must be correct” answers without ground truth docs
    Hallucination risk is the real limiter, not raw speed.
  5. Agent/tool reliability
  • If you want consistent tool-calling / step-by-step execution without supervision, smaller models can be flaky (depends on the runtime, prompt discipline, and quantization).

Rule of thumb

  • If you mostly do single-file coding, writing, summarizing, Q&A with RAG → 14B feels like “daily driver”.
  • If you want fast + cheap + good enough → 8B.
  • If you want deep reasoning, large refactors, or messy multi-doc synthesis → that’s where you start wanting bigger than 14B (or a cloud model).

If you tell me your top 2–3 tasks (e.g. “coding assistant”, “RAG over PDFs”, “chat”), I can say whether 8B is enough or 14B is worth it, and what context length/quant to aim for.

distant skiff
#

Is it possible that the 14b takes over 5 minutes to respond on this machine? Or is it likely that my files are corrupted?

subtle patrolBOT
#

Yes, it’s possible for a 14B model to take 5+ minutes on that machine — but only under certain conditions. For a short prompt (like “hi” in a fresh chat), 5 minutes is not normal and usually points to settings/runtime issues rather than “corrupted files”.

When 5+ minutes can be normal

  1. Huge prompt / huge context being fed in

    • If your UI/RAG is injecting a lot of text (logs, docs, multiple files), the model has to “prefill” (process) all of it before it can emit the first token.
    • With 10k–50k tokens of input, “minutes to first token” can happen on CPU.
  2. Too-high precision / weak quantization choice

    • If you accidentally loaded something like FP16 or a very heavy quant (or a format/backend that doesn’t use good CPU kernels), it can crawl.
  3. Swapping / memory pressure

    • If RAM is getting close to full and the OS starts swapping, performance falls off a cliff.
  4. Low-power / throttling

    • Mini PCs can thermal throttle hard; if clocks drop a lot under sustained load, everything slows.

When it suggests “something’s wrong”

  • If even a tiny prompt in a new session takes 5 minutes to produce the first token, that’s usually:
    • wrong backend (not using AVX/BLAS optimizations),
    • thread count set weirdly low,
    • stuck download/verify step in the runner,
    • or the app is feeding a massive hidden system prompt/context.

Corrupted model files are less likely — corruption usually causes load errors / crashes, not just “super slow but still works”.

Quick way to narrow it down

  1. Try a fresh chat, prompt: Say "test" and stop.
  2. While it’s “thinking”, check:
    • Is CPU pegged near 100%? (then it’s likely just crunching a big prefill or slow quant)
    • Is RAM near the limit / swap active?
  3. Tell me what you’re using to run it (Ollama / LM Studio / llama.cpp) and (if you know) the quantization (Q4/Q5/Q8/etc) + your context length setting.
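
If the machine runs Linux (an assumption at this point), the RAM/swap check in step 2 can be scripted instead of eyeballed. This is a minimal sketch that reads the standard `/proc/meminfo` fields; the function names are mine, purely for illustration.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value in KiB}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_pressure(info: dict[str, int]) -> tuple[float, int]:
    """Return (percent of RAM in use, KiB of swap in use)."""
    total = info["MemTotal"]
    available = info["MemAvailable"]
    ram_used_pct = 100 * (total - available) / total
    swap_used_kib = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return ram_used_pct, swap_used_kib


meminfo = Path("/proc/meminfo")
if meminfo.exists():  # only present on Linux
    pct, swap_kib = memory_pressure(parse_meminfo(meminfo.read_text()))
    print(f"RAM used: {pct:.0f}%  swap used: {swap_kib / 1024:.0f} MiB")
```

Run it while the model is “thinking”: RAM near 100% together with growing swap usage points at memory pressure rather than a slow model.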

With those 3 details I can tell you if the timing is expected and what to change to get it down to seconds.

distant skiff
#

Ollama
Q4
262144 length
Context token cap at 50000
All I said was “say ‘hi’” and it’s going on 10 minutes without a response

subtle patrolBOT
#

That’s almost certainly the context length setting, not corrupted files.

In Ollama (llama.cpp under the hood), num_ctx isn’t just a “cap”: it drives how big the KV cache is, and a lot of backends effectively allocate/prepare memory sized for that max. At 262,144 ctx, a 14B model can need tens of gigabytes for the KV cache alone, on top of the weights and runtime overhead, which can push a 64GB machine to its limit (or make it start swapping like crazy). Result: you get “no output for 10 minutes” even for say "hi".
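
To put a rough number on that, here is a hedged estimate of KV cache size versus context length. The layer count, KV head count, head dimension, and 16-bit cache are assumed values for a 14B-class model with grouped-query attention; check the model card before trusting them, and note that the true total also depends on the runtime.

```python
def kv_cache_gib(num_ctx: int, layers: int = 40, kv_heads: int = 8,
                 head_dim: int = 128, bytes_per_value: int = 2) -> float:
    """Estimated KV cache size in GiB for a given context length.

    The defaults are assumptions for illustration, not confirmed specs.
    """
    # Each token stores one key and one value vector per layer per KV head.
    bytes_per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return num_ctx * bytes_per_token / 2**30


for num_ctx in (8192, 16384, 262144):
    print(f"num_ctx={num_ctx:>6}: ~{kv_cache_gib(num_ctx):.2f} GiB")
```

Under those assumptions the cache grows linearly with num_ctx: about 1.25 GiB at 8k versus about 40 GiB at 262k, before counting the weights or any compute buffers.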

What to do

  1. Drop context length hard: try 8192 (or 16384 if you really want).
  2. Restart Ollama after changing it (so it actually reloads with the new ctx).
  3. Test again with a truly fresh prompt.

If you’re using “Ollama + a UI”

Many UIs have a “context length / num_ctx” slider. Set it to 8k/16k. Your “context token cap at 50000” doesn’t save you if the runner is still configured for 262k max.

If you want 50k+ context

On that machine, it’s usually not the right approach. Use RAG/chunking (retrieve only the relevant parts) instead of trying to stuff 50k tokens into the prompt window.

If you tell me which UI you’re using (Open WebUI, LM Studio, etc.), I can point to the exact setting name/location to change.

distant skiff
#

Can you explain this very simply for me pretty please. Like I’m your grandma

subtle patrolBOT
#

Sure.

The simple version

“Context length” is how much the AI is allowed to remember at once (how big its “working memory” is).

You set it to 262,144, which is like telling your computer:

“Before you answer even ‘hi’, please set up a giant table big enough to hold an entire library.”

Your PC can’t comfortably set up a table that big with only 64GB of RAM, so it starts struggling (it uses the SSD as emergency memory, which is much slower). That’s why you get no response for 10 minutes.

It’s not that the model is corrupted — it’s just trying to prepare for an absurdly huge memory setting.

What to do (easy fix)

  1. Change context length from 262144 to 8192 (or 16384).
  2. Restart Ollama (or restart whatever app/UI you’re using with Ollama).
  3. Try again: say "hi" should respond normally.

Why your “cap at 50,000” didn’t help

Even if you promise you won’t use the whole library, the computer still tries to prepare the huge table because you told it the max could be 262,144.

If you tell me what app you’re using on top of Ollama (Open WebUI, LM Studio, etc.), I’ll tell you exactly where that setting is.

distant skiff
#

So what’s the trade off?

subtle patrolBOT
#

Trade‑off is basically “how much it can hold in its head” vs “how fast and smooth it runs.”

If you set a small context (like 8k–16k)

Pros

  • Replies start fast
  • Uses less RAM (so your PC stays responsive)
  • Less chance of the system going into slow “swap to SSD” mode

Cons

  • It “forgets” older parts of the conversation sooner
  • You can’t paste huge documents and expect it to consider all of it at once

If you set a big context (like 32k, 50k, 100k+)

Pros

  • You can include more chat history / larger documents
  • Better for “analyze this long report” if your hardware can handle it

Cons

  • Uses a lot more RAM (can exceed your 64GB quickly)
  • Slower to start responding (it has to read/process more before it can talk)
  • If you push it too far, it can get unusable (minutes of waiting, or it thrashes the SSD)

Practical advice for your machine (64GB RAM, CPU-only)

  • Default to 8k or 16k
  • Go to 32k only if you truly need it and it still feels okay
  • Avoid 50k+ unless you’re doing something very specific and accept it being slow
  • 262k is basically “server settings”, not mini‑PC settings

If you tell me what you’re trying to do (chat? coding? summarizing PDFs? long documents?), I’ll suggest a context number that won’t melt your machine.

distant skiff
#

Lowered to 16384 and it says “request aborted” after 10 minutes

subtle patrolBOT
#

“request aborted after 10 minutes” usually means the app/UI gave up waiting (timeout) — not that the model finished and returned an error.

At 16k context, a 14B Q4 model should not take 10 minutes to answer “hi”, so something is still off. The three most common causes:

  1. Your UI is still sending a huge context anyway

    • Even if you changed the slider, the chat might still contain a ton of hidden text (system prompt, previous messages, RAG snippets).
    • Fix: start a brand new chat (not the same thread/history) and try again.
  2. Ollama is still effectively using the giant context setting

    • Sometimes the change doesn’t apply until you restart Ollama.
    • Fix: restart Ollama, then retry.
  3. Your machine is swapping (RAM full)

    • If RAM fills up, it starts using the SSD as “fake RAM”, and everything becomes painfully slow.
    • Fix: watch Task Manager / Activity Monitor: if Memory is near 100% and Disk is busy, that’s the culprit.

Quick isolation test (important)

Run it without any UI, directly:

  • Open a terminal and run: ollama run qwen3:14b
  • Then type: say hi

If this responds quickly, your UI settings/timeout are the issue (very common).
If this also hangs, it’s Ollama/system resource (swap, stuck model load, etc.).
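
If you would rather time it than judge by feel, the same isolation test can go through Ollama’s HTTP API with an explicit num_ctx. This is a sketch assuming Ollama is listening on its default local port; the helper names and the 120-second timeout are my own choices.

```python
import json
import time
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434/api/generate"  # assumed default address


def build_request(model: str, prompt: str, num_ctx: int = 8192) -> dict:
    """Request body for a single non-streamed generation with a fixed context size."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }


def timed_generate(model: str, prompt: str, num_ctx: int = 8192,
                   timeout_s: float = 120.0) -> tuple[float, str]:
    """Send one request and return (seconds elapsed, reply text)."""
    body = json.dumps(build_request(model, prompt, num_ctx)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    start = time.monotonic()
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        reply = json.load(resp)
    return time.monotonic() - start, reply.get("response", "")
```

Calling `timed_generate("qwen3:14b", "say hi")` with nothing else in the loop gives a clean baseline; if that returns in seconds while the chat front end takes minutes, the front end is what is adding the delay.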

If you tell me which UI you’re using (Open WebUI?) + your OS, and paste the output of ollama ps, I can pinpoint which of the above it is.

distant skiff
#

I have been using telegram

subtle patrolBOT
#

Got it — that’s a big clue.

Telegram isn’t the “UI” that runs the model. It’s just the chat app. So your setup is really:

Telegram (phone/desktop) → some bot/bridge/server → Ollama → bot/bridge → Telegram

That means the “request aborted after 10 minutes” could be happening in the Telegram bot/bridge layer (timeouts are very common), even if Ollama is still grinding away.

The fastest way to figure out where the problem is

On the same machine that’s running Ollama, open a terminal and run:

  • ollama run qwen3:14b
  • then type: say hi

If that responds quickly:

  • Ollama is fine
  • Your Telegram integration is timing out / not streaming / waiting too long for first output

If that still hangs:

  • then it’s an Ollama/performance setting issue (RAM/swap/context still too high, etc.)

One more important Telegram gotcha

Many Telegram bot frameworks don’t stream tokens. They wait for the full answer before sending anything back. So even a small slowdown can look like “nothing is happening” until it hits a timeout.
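
For contrast, this is roughly what consuming a streamed reply looks like. When streaming is on, Ollama’s native API sends one JSON object per line, each carrying a piece of text in `response` and a `done` flag; the sketch below only handles the parsing side, and how a bridge forwards the pieces to Telegram is left open.

```python
import json
from typing import Iterable, Iterator


def iter_tokens(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from a newline-delimited JSON stream."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # ignore blank keep-alive lines
        chunk = json.loads(raw)
        if chunk.get("response"):
            yield chunk["response"]
        if chunk.get("done"):
            break


# Illustrative sample of what a stream might contain.
sample_stream = [
    b'{"response": "Hi", "done": false}\n',
    b'{"response": "!", "done": false}\n',
    b'{"response": "", "done": true}\n',
]
print("".join(iter_tokens(sample_stream)))
```

A bot that waits for the whole reply behaves as if it joined everything first, so the user sees nothing until the last line arrives; a streaming bot can show the first fragment as soon as it is produced.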

Tell me:

  1. Is Ollama running on the GMKtec mini PC?
  2. What’s the “bridge” between Telegram and Ollama (a Telegram bot you set up? Open WebUI plugin? something else)?

With that, I can tell you exactly where to adjust the timeout / streaming behavior.

distant skiff
#

Over 2 min and counting waiting directly on ollama no gui or telegram

subtle patrolBOT
#

2 minutes on a direct terminal run is still “something’s wrong / misconfigured” for a 14B Q4 saying hi.

The most likely explanation: Ollama is still configured to use that gigantic context internally, so even though you think you lowered it, it’s still trying to set up a huge memory buffer and crawling (or swapping).

Do this simplest test (no chat history)

Cancel the current run (Ctrl+C), then run a one-shot prompt:

ollama run qwen3:14b "say hi"

That avoids any hidden “conversation history” inside the interactive session.

Check what Ollama thinks the context is

Run:

ollama show qwen3:14b

Look for something like num_ctx / context settings in the output. If it’s still huge, that’s the smoking gun.

Quick fix: make a “normal context” copy of the model

Create a file called Modelfile with this inside:

FROM qwen3:14b
PARAMETER num_ctx 8192

Then run:

ollama create qwen3:14b-8k -f Modelfile
ollama run qwen3:14b-8k "say hi"

If that suddenly responds quickly, we’ve confirmed the issue was the oversized context setting.

Also: watch RAM

While it’s “thinking”, open Task Manager / Activity Monitor:

  • If RAM is near full and disk is busy, it’s swapping → everything becomes minutes.

If you paste the output of ollama show qwen3:14b here, I can tell you immediately if the context/config is still stuck on a huge value.

distant skiff
#

It responded quickly to “ollama run qwen3:14b “say hi””. I’m talking seconds

subtle patrolBOT
#

Perfect — that tells us two important things:

  1. Your model files are fine (not corrupted).
  2. Your PC can run Qwen3 14B Q4 just fine: seconds-to-first-token is exactly what we’d expect.

So the “10 minutes → request aborted” problem is not Ollama itself. It’s whatever layer you were using before (Telegram bridge / GUI) doing one (or more) of these:

What’s probably happening

  • It’s sending a giant amount of text along with your “say hi” (old chat history, hidden system prompt, RAG snippets, etc.).
  • Or it’s still telling Ollama to use a massive context length (like 262k), which makes it allocate huge memory and crawl.
  • Or the Telegram/GUI layer doesn’t stream and has a 10-minute timeout, so if it doesn’t get output fast enough, it aborts.

What to do next (practical)

  • In the Telegram/GUI tool, set context / num_ctx to 8k or 16k (not 262k).
  • Start a brand new chat (old history can silently bloat the prompt).
  • Look for a setting like “stream responses” (turn on if available) and/or increase request timeout if it exists.

If you tell me exactly what you’re using to connect Telegram to Ollama (name of the bot/bridge/app), I can tell you where those settings live.

distant skiff
#

It’s the telegram bot

subtle patrolBOT
#

Got it — then the bottleneck is almost certainly how the Telegram bot is calling Ollama, not Ollama itself.

Since ollama run ... "say hi" is fast, the Telegram bot is probably doing one of these:

1) Re-sending the entire chat history every time

Many bots build a big “messages so far” payload and send it on every prompt. As the Telegram chat gets longer, the bot keeps sending more and more text, so even “say hi” turns into “say hi + 40 pages of previous chat”, which gets slow.

Fix: look for a bot command like /reset, /new, clear history, etc.
If you control the bot code/config: limit history to “last N messages” or “last N tokens”.
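
If you do control the bot code, history trimming can be as small as the sketch below. It is a generic illustration, not any particular bot’s implementation: messages are assumed to be `{"role", "content"}` dicts, and character count stands in as a crude proxy for tokens.

```python
def trim_history(messages: list[dict], max_messages: int = 10,
                 max_chars: int = 8000) -> list[dict]:
    """Keep system messages plus the most recent turns that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    kept = []
    budget = max_chars
    # Walk backwards from the newest message so recent turns win.
    for message in reversed(rest[-max_messages:]):
        cost = len(message["content"])
        if cost > budget and kept:
            break  # out of budget, but always keep the newest message
        kept.append(message)
        budget -= cost
    return system + kept[::-1]
```

With a cap like this, “say hi” stays a short prompt no matter how long the Telegram chat has been running.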

2) It’s still setting a gigantic context length (num_ctx) in the API call

Even if you lowered it somewhere else, the bot may still be sending num_ctx: 262144 (or similar) in the Ollama request.

Fix: set num_ctx to 8192 (or 16384) in the bot config.

3) Bot has a hard timeout (10 minutes) + no streaming

So it waits… then aborts.

Fix: enable streaming (if the bot supports it) or increase the timeout. But honestly, if “say hi” was timing out, that’s usually because of #1 or #2.

To be specific: which Telegram bot is it (name/repo/link), or is it something you wrote yourself? If you paste the config snippet (redact tokens), I can point to the exact setting to change (history length / num_ctx / timeout).

distant skiff
#

It’s just BotFather

subtle patrolBOT
#

BotFather isn’t actually “the bot” that talks to Ollama.

Think of BotFather like the clerk at the DMV: it helps you create a bot and gives you the token/password. After that, something else has to run 24/7 (a program, Docker container, server, etc.) that:

reads your Telegram messages → sends them to Ollama → sends the reply back to Telegram

So if you’re getting “request aborted” / slow replies, that’s coming from the program you’re using as the bridge, not from BotFather.

Two quick questions to find it:

  1. What did you install to connect Telegram to Ollama? (a GitHub project? Docker? Open WebUI? OpenClaw? something else)
  2. Where did you paste the BotFather token? (that location usually is the bridge app)

If you tell me that, I can tell you exactly where to change the settings (history length / context / timeout).

distant skiff
#

Openclaw

Openclaw config file

subtle patrolBOT
#

Ahh, that explains everything.

In your case Telegram is “just the chat surface” — the thing actually talking to Ollama is OpenClaw. And OpenClaw can (depending on config) accidentally force Ollama into that insane context size again.

What’s happening (simple)

If your OpenClaw config points Ollama at the OpenAI-compatible endpoint (.../v1) and/or uses api: "openai-completions", OpenClaw will often inject num_ctx based on your configured contextWindow.

So if you told OpenClaw something like “contextWindow = 262144”, it will effectively tell Ollama:

“allocate a 262k context KV cache”

…and you’re right back to the “10 minutes then abort” problem.

Docs call this out here (especially the warning about not using /v1 for Ollama with OpenClaw):
https://docs.openclaw.ai/providers/ollama

What to fix in openclaw.json

Open your OpenClaw config (usually ~/.openclaw/openclaw.json) and look for your Ollama provider.

You want the native Ollama API, like:

models: {
  providers: {
    ollama: {
      baseUrl: "http://127.0.0.1:11434", // IMPORTANT: no /v1
      api: "ollama",                     // native Ollama API
      apiKey: "ollama-local"
    }
  }
}

Then either:

  • remove any huge contextWindow override you added, or
  • set it to something sane like 8192 or 16384 for this machine.

After editing, restart:

openclaw gateway restart

If you paste this snippet, I’ll tell you exactly what line is causing it

Paste (redacting tokens) the models.providers.ollama block + the model entry you’re using (the part where contextWindow is set). That’s the part that’s almost certainly forcing 262144 again.

distant skiff
#

"custom-127-0-0-1-11434": {
"baseUrl": "http://127.0.0.1:11434/v1",
"apiKey": "ollama",
"api": "openai-completions",
"models": [
{
"id": "qwen3:14b",
"name": "qwen3:14b (Custom Provider)",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 16384,
"maxTokens": 4096
}
]

subtle patrolBOT
#

Yep — that config snippet is the problem.

You’re currently using Ollama’s OpenAI-compatible endpoint:

  • baseUrl: "http://127.0.0.1:11434/v1"
  • api: "openai-completions"

OpenClaw’s docs are pretty explicit: don’t use /v1 with Ollama in OpenClaw — it can break streaming/tool calling and generally behaves weird compared to Ollama’s native API.
Source: https://docs.openclaw.ai/providers/ollama (see the warning at the top)

Fix (minimal change)

Keep your provider name the same, just change these two lines:

"custom-127-0-0-1-11434": {
  "baseUrl": "http://127.0.0.1:11434",   // <- remove /v1
  "api": "ollama",                       // <- switch to native
  "apiKey": "ollama",
  ...
}

Leave your model block as-is (your contextWindow: 16384 is fine).

Then restart the gateway:

openclaw gateway restart

One more thing (Telegram session may still be “poisoned”)

In Telegram, send /new to start a fresh session, then try “say hi” again.

If it’s still slow after that, paste your agents.defaults.model.primary line too — I want to make sure you’re actually targeting custom-127-0-0-1-11434/qwen3:14b (and not some other model/provider silently).

distant skiff
#

i made those changes - no response or even a "typing" signal.

subtle patrolBOT
#

Two key things here:

  1. Telegram “typing” isn’t guaranteed (OpenClaw can do draft streaming, but it depends on channels.telegram.streaming etc.), so don’t rely on that as the signal.
  2. No reply at all usually means OpenClaw is erroring before it can send anything back (or the gateway/channel isn’t running).

Most common mistake after that edit

If you changed only the baseUrl (removed /v1) but left:

"api": "openai-completions"

then OpenClaw will try to call OpenAI-style endpoints on a non-OpenAI URL and it will basically 404/fail → no response.

For native Ollama it must be both:

baseUrl: "http://127.0.0.1:11434"
api: "ollama"
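
A quick way to catch that mismatch before restarting anything is a small check over the provider block. This sketch only encodes the pairing rule described in this thread (native `ollama` without `/v1`, `openai-completions` with it); the field names come from the snippets above, and the 32768 warning threshold is my own assumption for a 64GB CPU-only machine.

```python
def check_provider(provider: dict) -> list[str]:
    """Return a list of likely problems in a provider config block."""
    problems = []
    base_url = provider.get("baseUrl", "").rstrip("/")
    api = provider.get("api", "")
    uses_v1 = base_url.endswith("/v1")

    if api == "ollama" and uses_v1:
        problems.append("api is 'ollama' but baseUrl ends with /v1")
    if api == "openai-completions" and not uses_v1:
        problems.append("api is 'openai-completions' but baseUrl has no /v1")

    for model in provider.get("models", []):
        ctx = model.get("contextWindow")
        # Threshold is an assumed comfort limit, not an official value.
        if isinstance(ctx, int) and ctx > 32768:
            problems.append(f"contextWindow {ctx} is very large for this machine")
    return problems
```

Load the config, pass in your provider block, and an empty list means the URL and API style at least agree with each other.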

Fast debug checklist (takes 60 seconds)

On the machine running OpenClaw:

  1. Restart gateway (just to be sure config applied):
openclaw gateway restart
  2. Check if Telegram channel + gateway are healthy:
openclaw status
openclaw channels status
  3. Tail logs, then send a Telegram message (“hi”) and watch what error shows up:
openclaw logs --follow

Next thing I need from you

Paste your updated provider block (the same snippet you pasted before, after your edits). That will tell immediately if the api/baseUrl combination is mismatched.

(If you can also paste openclaw status --all, it’s safe to share—tokens are redacted—and it’ll usually point right at the failure.)

distant skiff
#

"custom-127-0-0-1-11434": {
"baseUrl": "http://127.0.0.1:11434",
"apiKey": "ollama",
"api": "ollama",
"models": [
{
"id": "qwen3:14b",
"name": "qwen3:14b (Custom Provider)",
"reasoning": false,
"input": [
"text"
],
"cost": {
how long do i have to wait after typing hi for the error to show up in the logs?

subtle patrolBOT
#

If you have openclaw logs --follow running before you send the message, you should normally see something in the logs within 1–5 seconds of sending “hi” (at least an “inbound Telegram update/message” line).

If you see nothing at all after ~10–15 seconds, that usually means the Gateway isn’t receiving Telegram updates (channel not running, polling stuck, wrong config file, bot token issue, etc.) — not a model/Ollama error.

A couple nuances:

  • If the “error” is a model timeout, then yeah, the log line might only show up after the timeout period (could be minutes). But you’d still typically see an inbound message log immediately.
  • For fast failures (bad URL, 404, connection refused), you usually see the error almost immediately.

Quick next check (while you’ve got this open): run in another terminal:

openclaw channels status

If Telegram is unhealthy or not started, it’ll usually say so right there.

distant skiff
#

Gateway reachable.

  • Telegram default: enabled, configured, running, out:4m ago, mode:polling, token:config

Tip: status --deep adds gateway health probes to status output (requires a reachable gateway).

subtle patrolBOT
#

Cool — that output means Telegram is up and polling, and the Gateway can talk to Telegram (it even shows an outbound send happened ~4 minutes ago).

At this point we need to answer one question:

Is OpenClaw seeing your “hi” message at all?

1) Check inbound (this is the key step)

Run this, then immediately send “hi” to the bot in Telegram:

openclaw logs --follow --json

You should see an inbound Telegram event within a couple seconds. If you paste the first ~20 lines that appear after you send “hi”, we’ll know exactly where it’s getting stuck (access control vs model vs something else).

2) Common “nothing happens” causes

  • You’re chatting in a group and didn’t mention the bot. By default groups often require @botname mention. Try a DM, or mention it once.
  • Model selection mismatch: your provider is named custom-127-0-0-1-11434, so the model id is typically custom-127-0-0-1-11434/qwen3:14b. If your config still says agents.defaults.model.primary: "ollama/qwen3:14b" (or something else), the agent may be failing to pick a working model.

If you can paste your agents.defaults.model.primary line (or the whole agents: { defaults: { model: ... }} block), I can confirm that part too.

distant skiff
#

"type":"meta","file":"/tmp/openclaw/openclaw-2026-03-04.log","cursor":103635,"size":103635}
{"type":"log","time":"2026-03-04T05:30:39.205Z","level":"info","subsystem":"gateway/reload","message":"{"subsystem":"gateway/reload"} config change detected; evaluating reload (models.providers.custom-127-0-0-1-11434.baseUrl, models.providers.custom-127-0-0-1-11434.api, models.providers.custom-127-0-0-1-11434.models, agents.defaults.compaction)","raw":"{"0":"{\"subsystem\":\"gateway/reload\"}","1":"config change detected; evaluating reload (models.providers.custom-127-0-0-1-11434.baseUrl, models.providers.custom-127-0-0-1-11434.api, models.providers.custom-127-0-0-1-11434.models, agents.defaults.compaction)","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"gateway/reload\"}","parentNames":["openclaw"],"date":"2026-03-04T05:30:39.204Z","logLevelId":3,"logLevelName":"INFO","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-04T05:30:39.205Z"}"}
{"type":"log","time":"2026-03-04T05:30:39.210Z","level":"info","subsystem":"gateway/reload","message":"{"subsystem":"gateway/reload"} config change applied (dynamic reads: models.providers.custom-127-0-0-1-11434.baseUrl, models.providers.custom-127-0-0-1-11434.api, models.providers.custom-127-0-0-1-11434.models, agents.defaults.compaction)","raw":"{"0":"{\"subsystem\":\"gateway/reload\"}","1":"config change applied (dynamic reads: models.providers.custom-127-0-0-1-11434.baseUrl,

"agents": {
"defaults": {
"model": "custom-127-0-0-1-11434/qwen3:14b",
"models": {
"minimax/MiniMax-M2.5": {
"alias": "Minimax"
},
"openai/gpt-5.1-codex": {
"alias": "GPT"
},
"openai/gpt-5": {},
},
"workspace": "/home/johnny-mangual/.openclaw/workspace",
"contextTokens": 50000,
"memorySearch": {
"sources": [
"memory",
"sessions"
],
"experimental": {
"sessionMemory": true
},
"query": {
"hybrid": {
"enabled": true,

subtle patrolBOT
#

The log lines you pasted are only showing “config change applied” — I don’t see any Telegram inbound message in that snippet yet.

So either:

  • you didn’t send “hi” after starting openclaw logs --follow --json, or
  • OpenClaw is not receiving your Telegram message (wrong chat / group-without-mention / wrong bot token / pairing/access block).

Do this exact test

  1. In one terminal:
openclaw logs --follow --json
  2. Immediately (within a few seconds) send a DM to the bot in Telegram (not a group), just: hi

  3. Wait ~5 seconds and paste the new log lines that appear after you sent hi.

You should see something that mentions Telegram / an update / a message. If you see nothing at all, OpenClaw isn’t seeing the message.
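
Since the JSON log is noisy, a small filter can make the Telegram lines easier to spot. This is a sketch under two assumptions: each log line is a standalone JSON object with a `subsystem` field, and Telegram entries have “telegram” somewhere in that field. Lines that do not parse are skipped rather than treated as errors.

```python
import json
from typing import Iterable, Iterator


def telegram_entries(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed log entries whose subsystem mentions Telegram."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if not isinstance(entry, dict):
            continue
        if "telegram" in str(entry.get("subsystem", "")).lower():
            yield entry
```

One way to use it is to pipe `openclaw logs --follow --json` into a script that loops over `telegram_entries(sys.stdin)` and prints each entry’s `time` and `message`; if nothing prints after you send hi, the message never reached the gateway.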

Quick “DM access” check (takes 5 seconds)

Run:

openclaw pairing list telegram
  • If DM policy is pairing and you aren’t approved, it’ll show up there.
  • If it’s approved and still nothing shows in logs, we’re likely talking to the wrong bot username/token or you’re messaging in a group where mention/privacy mode blocks it.

Paste whatever openclaw pairing list telegram prints (it’s safe).

Nice — your agents.defaults.model: "custom-127-0-0-1-11434/qwen3:14b" looks correct. So model selection isn’t obviously the issue.

At this point, if you DM “hi” and get absolutely nothing back, it’s usually one of these:

1) OpenClaw isn’t seeing your DM (or it’s dropping it for auth)

Even if Telegram is “running”, it might be ignoring your message due to DM policy / pairing / allowlist.

Run:

openclaw pairing list telegram

If you see a pending code, approve it:

openclaw pairing approve telegram <CODE>

(From the docs: Telegram DMs default to dmPolicy: "pairing".)
https://docs.openclaw.ai/channels/telegram

2) You’re not actually DM’ing the bot (group mention requirement)

If you’re testing in a group, by default it often requires an @botname mention. Try a direct DM to the bot to eliminate that variable.

3) Get the Telegram-specific logs (much easier to read than the full log)

Right after you send “hi”, run:

openclaw channels logs --channel telegram

…and paste the last chunk it prints. That usually contains the exact “why” (ignored sender, pairing required, mention required, etc.).


One more performance note (not the “no reply” issue, but important): you have contextTokens: 50000 + session memory enabled. Once it’s replying again, if it feels slow, we should probably lower that for a local 14B model. But first let’s confirm it’s receiving and authorizing your DM.

distant skiff
#

Log file: /tmp/openclaw/openclaw-2026-03-04.log
Channel: telegram
2026-03-04T05:30:55.781Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
2026-03-04T05:35:22.043Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
2026-03-04T05:45:23.945Z info {"subsystem":"gateway/channels/telegram"} telegram sendMessage ok chat=7506591428 message=394

subtle patrolBOT
#

That channel log is actually useful:

  • OpenClaw is running a Telegram bot named @Nacho_2_bot
  • It successfully sent a Telegram message at 05:45:23 (sendMessage ok chat=…)

So the “no typing” part isn’t the key issue — the real question is: are you messaging the same bot, and is your DM allowed?

1) First sanity check

In Telegram, make sure you are DM’ing exactly: @Nacho_2_bot
(If you accidentally DM a different bot you created earlier, OpenClaw will never see it.)

2) Check DM policy / pairing (this often results in “silence”)

Run these and paste the output:

openclaw config get channels.telegram.dmPolicy
openclaw config get channels.telegram.allowFrom
openclaw pairing list telegram

What we’re looking for:

  • If dmPolicy is pairing: pairing list should show a pending code the first time you DM.
  • If dmPolicy is allowlist: your Telegram numeric user id must be in allowFrom or it will ignore you.

(From the docs: allowFrom must be a numeric Telegram user ID, not a username.)
https://docs.openclaw.ai/channels/telegram#dm-policy

3) One more log request (to confirm inbound)

Right after you send “hi” in the DM, run:

openclaw channels logs --channel telegram

and paste the newest lines. If we don’t see any “received update/message” type lines around that time, it means OpenClaw isn’t receiving your DMs (wrong bot / auth / polling issue).

distant skiff
#

openclaw config get channels.telegram.dmPolicy

🦞 OpenClaw 2026.3.1 (2a8ac97) — I'm the reason your shell history looks like a hacker-movie montage.

pairing
johnny-mangual@johnny-mangual-NucBox-K10:~$ openclaw config get channels.telegram.allowFrom
openclaw pairing list telegram

🦞 OpenClaw 2026.3.1 (2a8ac97)
If it's repetitive, I'll automate it; if it's hard, I'll bring jokes and a rollback plan.

Config path not found: channels.telegram.allowFrom

🦞 OpenClaw 2026.3.1 (2a8ac97) — The UNIX philosophy meets your DMs.

│
◇ Doctor warnings ──────────────────────────────────────────────────────────────────────────╮
│ │
│ - channels.telegram.groupPolicy is "allowlist" but groupAllowFrom (and allowFrom) is │
│ empty — all group messages will be silently dropped. Add sender IDs to │
│ channels.telegram.groupAllowFrom or channels.telegram.allowFrom, or set groupPolicy to │
│ "open". │
│ │
├────────────────────────────────────────────────────────────────────────────────────────────╯
No pending telegram pairing requests.
johnny-mangual@johnny-mangual-NucBox-K10:~$ openclaw channels logs --channel telegram

Log file: /tmp/openclaw/openclaw-2026-03-04.log
Channel: telegram
2026-03-04T05:30:55.781Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
2026-03-04T05:35:22.043Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
2026-03-04T05:45:23.945Z info {"subsystem":"gateway/channels/telegram"} telegram sendMessage ok chat=7506591428 message=394
johnny-mangual@

subtle patrolBOT
#

That Doctor warning is the smoking gun:

channels.telegram.groupPolicy is "allowlist" but groupAllowFrom (and allowFrom) is empty — all group messages will be silently dropped.

So if you’re testing the bot in a Telegram group, OpenClaw is currently configured to drop everything from groups, with no reply and no pairing prompt (pairing is DM-only).

Fix (pick one)

Option A (simplest): allow groups openly (good for a private group you control)

In your openclaw.json:

channels: {
  telegram: {
    groupPolicy: "open",
  }
}

Restart:

openclaw gateway restart

Option B (safer): allow only you in groups

Add your Telegram numeric user id to groupAllowFrom:

channels: {
  telegram: {
    groupPolicy: "allowlist",
    groupAllowFrom: ["tg:123456789"], // <- your Telegram user id
  }
}

(IDs must be numeric; usernames won’t work. Docs: https://docs.openclaw.ai/channels/telegram#dm-policy)
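
If you want to sanity-check an entry before putting it in the allowlist, here is a minimal sketch. The helper name is hypothetical (it is not an OpenClaw command), and it assumes the optional "tg:" prefix from the config example above is accepted as written there.

```shell
# Hypothetical helper (not part of OpenClaw): check that an allowlist entry
# is a numeric Telegram user id, optionally written with the "tg:" prefix
# used in the config example above.
is_numeric_tg_id() {
  id=${1#tg:}                      # strip an optional "tg:" prefix
  case "$id" in
    ''|*[!0-9]*) return 1 ;;       # empty, or contains a non-digit
    *) return 0 ;;
  esac
}

# Try a few example entries; the ids here are placeholders.
for entry in "tg:123456789" "123456789" "@some_username"; do
  if is_numeric_tg_id "$entry"; then
    echo "ok:       $entry"
  else
    echo "rejected: $entry"
  fi
done
```

A username-style entry is rejected, which matches the "usernames won't work" rule.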

Quick check: are you DM’ing or in a group?

  • DM: pairing applies. You’d see a pending request in openclaw pairing list telegram the first time.
  • Group: pairing does not apply. GroupPolicy/allowlists decide everything.

If you tell me whether you’re trying to use it in a DM or a group, I’ll give you the exact minimal config block (including requireMention if needed).

distant skiff
#

dm

subtle patrolBOT
#

If you’re testing in a DM, the groupPolicy/groupAllowFrom warning doesn’t apply, which is good.

Given:

  • dmPolicy: pairing
  • pairing list telegram shows no pending requests
  • channel logs show it can send (sendMessage ok …)

…then either you’re already paired or OpenClaw is not actually receiving your DM updates.

Quick sanity check

Are you DM’ing the same bot OpenClaw is running?

Your logs say it’s:

@Nacho_2_bot

If you DM a different bot, OpenClaw will never see it.

The decisive test (DM inbound)

  1. Run this:
openclaw logs --follow --json
  2. DM @Nacho_2_bot the message: hi

You should see an inbound Telegram log line within a couple seconds.
If you see nothing at all, then OpenClaw isn’t receiving updates (polling conflict, webhook set somewhere, wrong bot token, etc.).

If inbound logs are still missing, run this next

This checks if a webhook is set (webhook breaks polling):

openclaw channels capabilities --channel telegram

(or if you prefer the Telegram API route, we can do getWebhookInfo too)
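
For the Telegram API route, a rough sketch follows. It relies on the Bot API's getWebhookInfo method, which reports an empty "url" when no webhook is configured. TELEGRAM_BOT_TOKEN is an assumed variable name that you would export yourself, and the helper name is made up for illustration.

```shell
# Sketch: decide from a getWebhookInfo response whether a webhook is set.
# Telegram reports an empty "url" when no webhook is configured.
webhook_is_set() {
  # $1 = raw JSON body returned by getWebhookInfo
  printf '%s' "$1" | grep -Eq '"url"[[:space:]]*:[[:space:]]*"[^"]+"'
}

# Only call the API when a token is provided. TELEGRAM_BOT_TOKEN is an
# assumed variable name, not something OpenClaw sets for you.
if [ -n "${TELEGRAM_BOT_TOKEN:-}" ]; then
  body=$(curl -sS "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getWebhookInfo")
  if webhook_is_set "$body"; then
    echo "webhook is set: polling will not receive updates"
  else
    echo "no webhook set: polling should be able to receive updates"
  fi
fi
```

Treat the token like a password: don't paste the command with the real token into chat.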

Tell me: when you DM hi, do you see any new log line in openclaw logs --follow --json? Even a warning/error.

distant skiff
#

{"type":"log","time":"2026-03-04T05:55:26.622Z","level":"debug","subsystem":"diagnostic","message":"{"subsystem":"diagnostic"} run registered: sessionId=64f3438e-f003-4ded-9842-552c35ef937c totalActive=1","raw":"{"0":"{\"subsystem\":\"diagnostic\"}","1":"run registered: sessionId=64f3438e-f003-4ded-9842-552c35ef937c totalActive=1","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"diagnostic\"}","parentNames":["openclaw"],"date":"2026-03-04T05:55:26.621Z","logLevelId":2,"logLevelName":"DEBUG","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-04T05:55:26.622Z"}"}
{"type":"log","time":"2026-03-04T05:55:26.623Z","level":"debug","subsystem":"agent/embedded","message":"{"subsystem":"agent/embedded"} embedded run prompt start: runId=cfae997b-bcf4-480c-906c-28466fa786ca sessionId=64f3438e-f003-4ded-9842-552c35ef937c","raw":"{"0":"{\"subsystem\":\"agent/embedded\"}","1":"embedded run prompt start: runId=cfae997b-bcf4-480c-906c-28466fa786ca sessionId=64f3438e-f003-4ded-9842-552c35ef937c","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"agent/embedded\"}","parentNames":["openclaw"],"date":"2026-03-04T05:55:26.623Z","logLevelId":2,"logLevelName":"DEBUG","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-04T05:55:26.623Z"}"}
{"type":"log","time":"2026-03-04T05:55:26.626Z","level":"debug","subsystem":"agent/embedded","message":"{"subsystem":"agent/embedded"} embedded run agent start: runId=cfae997b-bcf4-480c-906c-28466fa786ca","raw":"{"0":"{\"subsystem\":\"agent/embedded\"}","1":"embedded run agent start: runId=cfae997b-bcf4-480c-906c-28466fa786ca","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"agent/embedded\"}","parentNames":["openclaw"],"date":"2026-03-04T05:55:26.626Z","logLevelId":2,"logLevelName":"DEBUG","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"202

#

T05:55:26.626Z"}"}
{"type":"log","time":"2026-03-04T05:57:25.715Z","level":"info","message":"typing TTL reached (2m); stopping typing indicator","raw":"{"0":"typing TTL reached (2m); stopping typing indicator","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"openclaw","date":"2026-03-04T05:57:25.715Z","logLevelId":3,"logLevelName":"INFO","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1010:46","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1010","fileColumn":"46","fileLine":"1010","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1010","method":"console.log"}},"time":"2026-03-04T05:57:25.715Z"}"}

subtle patrolBOT
#

Good news: those debug lines mean OpenClaw did receive your DM and it started a run:

  • run registered … totalActive=1
  • embedded run prompt start …

So we’re past “Telegram isn’t working” — now it’s “the run is taking a long time”.

Most likely reason it’s slow right now

In the config you pasted earlier you have:

  • contextTokens: 50000
  • memorySearch.sources: ["memory","sessions"]
  • experimental.sessionMemory: true

That combo can make the prompt/context huge (or trigger extra retrieval/indexing work), which is brutal for a local 14B model.

Quick speed fix (recommended for local 14B)

Change your agent defaults to something like:

agents: {
  defaults: {
    contextTokens: 16384, // or even 8192
    memorySearch: {
      experimental: { sessionMemory: false },
      sources: ["memory"]
    }
  }
}

Then:

openclaw gateway restart

One thing I need from the logs

After you send “hi”, wait ~10–20 seconds and paste the next few log lines after embedded run prompt start — especially anything mentioning ollama, llm, memorySearch, or an error/timeout.

That will tell us if it’s:

  • spending forever building context / memory retrieval, or
  • actually calling Ollama and getting stuck there.
#

Those log lines show the run starts, but then it goes quiet — that usually means it’s spending a long time in “build a giant prompt + run the model” (and it doesn’t emit many debug logs while doing it).

Given your config, the biggest foot-gun is:

  • contextTokens: 50000
  • memorySearch.sources: ["memory","sessions"] with sessionMemory: true

That can make every DM effectively “hi + a ton of history”, which is painfully slow on a local 14B.

Fast fix (copy/paste)

Run these on the gateway host:

openclaw config set agents.defaults.contextTokens 16384 --strict-json
openclaw config set agents.defaults.memorySearch.experimental.sessionMemory false --strict-json
openclaw config set agents.defaults.memorySearch.sources '["memory"]' --strict-json
openclaw gateway restart

Then in Telegram DM, send:

  • /new
  • then hi

Why this helps

It forces OpenClaw to keep the “working memory” closer to what your Ollama model can actually handle quickly, instead of trying to drag along 50k tokens worth of old session text.
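
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the prompt really approaches the configured cap, and the prompt-processing speeds are placeholders, not measurements from your machine.

```shell
# Rough prefill estimate: seconds = prompt tokens / prompt-processing speed.
# The speeds used below are placeholders, not measurements of this machine.
estimate_prefill_seconds() {
  # $1 = prompt tokens, $2 = assumed prefill speed in tokens per second
  echo $(( $1 / $2 ))
}

# Compare the old cap (50000) with the proposed one (16384).
for tokens in 50000 16384; do
  for speed in 20 50; do
    echo "$tokens tokens at $speed tok/s -> ~$(estimate_prefill_seconds "$tokens" "$speed") s"
  done
done
```

Even at the optimistic placeholder speed, a prompt near 50k tokens spends many minutes in prefill before the first reply token, while one near 16k is roughly three times cheaper.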

If after doing that it’s still slow, the next thing I’d check is openclaw models status to confirm OpenClaw is actually using custom-127-0-0-1-11434/qwen3:14b and not falling back.

#

That line is actually a good sign:

typing TTL reached (2m); stopping typing indicator

Meaning: OpenClaw did start processing your DM and even started a typing indicator. It just hasn’t produced a reply within 2 minutes, so OpenClaw hit its typing TTL and stopped sending the indicator.

So we’re back to “it’s working, but it’s WAY too slow”.

Do this now

  1. In the Telegram DM, send:
  • stop
  • then /new

(“stop” is an abort trigger; it should cancel the stuck run.)

  2. Apply the speed config changes + restart:
openclaw config set agents.defaults.contextTokens 16384 --strict-json
openclaw config set agents.defaults.memorySearch.experimental.sessionMemory false --strict-json
openclaw config set agents.defaults.memorySearch.sources '["memory"]' --strict-json
openclaw gateway restart
  3. DM the bot: hi

Why this should fix it

Right now your bot is likely trying to shove a lot of context/history into every request (50k tokens + session memory), which turns even “hi” into a huge prompt that a local 14B model will take minutes to chew through.

If after this it’s still taking >30 seconds for “hi”, tell me what your CPU usage looks like during the run (is it pegged at 100% or mostly idle?). That tells us whether it’s “slow prefill” vs “stuck call”.

distant skiff
#

im ready to send hi but i see it says "typing" after telling me "agent was aborted"

subtle patrolBOT
#

That’s normal.

The typing indicator is “best effort” and it can lag a bit behind reality — especially if:

  • a run was aborted mid-flight, and
  • Telegram is still showing the last “typing…” state until it times out / gets updated.

If OpenClaw already told you “agent was aborted”, you’re safe to proceed.

Do this order (clean test)

  1. Send (as a standalone message in the DM):
/new
  2. Then send:
hi

If you still see “typing…” after the abort, ignore it — it should clear on its own (there’s a TTL). If it never clears and it never responds, tell me whether you already ran the config changes + openclaw gateway restart (the contextTokens/memorySearch tweaks).

distant skiff
#

nada...

subtle patrolBOT
#

If you sent /new and then hi in the DM and got nothing, the two most common causes are:

  1. There’s still a run stuck “active” and your new messages are being queued/ignored
  2. The Gateway/channel is fine, but we need to see what the logs say at the moment you send hi

Fast reset (clears stuck runs)

On the machine running OpenClaw:

openclaw gateway restart

Then verify it’s back:

openclaw status

Confirm the “speed settings” actually applied

openclaw config get agents.defaults.contextTokens
openclaw config get agents.defaults.memorySearch.sources
openclaw config get agents.defaults.memorySearch.experimental.sessionMemory

Then do the decisive log test again

  1. Run:
openclaw logs --follow --json
  2. Immediately DM the bot: hi

Paste the first new log lines that appear right after you send it.

If we don’t see an inbound message / run start line, then Telegram isn’t reaching the agent. If we do see a run start, we’ll know exactly where it’s stalling.

distant skiff
#

is this anything /subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-04T06:05:31.780Z"}"}
{"type":"log","time":"2026-03-04T06:05:31.785Z","level":"info","subsystem":"gateway/ws","message":"{"subsystem":"gateway/ws"} ⇄ res ✓ status 922ms conn=7e6ed3c8…88f5 id=61b4d803…0cfd","raw":"{"0":"{\"subsystem\":\"gateway/ws\"}","1":"⇄ res ✓ status 922ms conn=7e6ed3c8…88f5 id=61b4d803…0cfd","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"gateway/ws\"}","parentNames":["openclaw"],"date":"2026-03-04T06:05:31.785Z","logLevelId":3,"logLevelName":"INFO","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-04T06:05:31.785Z"}"}
{"type":"notice","message":"Log tail truncated (increase --max-bytes)."}

subtle patrolBOT
#

That line:

subsystem: "gateway/ws" ⇄ res ✓ status …

is basically just “the Gateway answered a status request over WebSocket” (often from you running openclaw status, the Control UI, etc.). It’s a good sign (gateway is alive), but it’s not your Telegram “hi” being processed, and not an Ollama call.

The important logs for your DM will mention subsystems like:

  • gateway/channels/telegram (inbound update / message handling)
  • agent/embedded (run start)
  • something with ollama / llm (model request)
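
If the JSON tail is too noisy to read, a small filter like this can help. It is only a sketch: it matches the top-level "subsystem" field with grep, and the ollama/llm part of the pattern is a guess at how those subsystems are named.

```shell
# Keep only JSON log lines whose "subsystem" field belongs to one of the
# three interesting families. The ollama/llm match is a guess at the naming.
filter_dm_logs() {
  grep -Ei '"subsystem":"(gateway/channels/telegram|agent/embedded|[^"]*(ollama|llm)[^"]*)"'
}

# Offline demo on two shortened sample lines (fields trimmed for brevity).
printf '%s\n' \
  '{"type":"log","level":"info","subsystem":"gateway/ws","message":"res status"}' \
  '{"type":"log","level":"debug","subsystem":"agent/embedded","message":"embedded run agent start"}' \
  | filter_dm_logs
```

With the live tail you would pipe into it instead, for example: openclaw logs --follow --json | filter_dm_logs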

How to grab the right lines

  1. Start the tail with a bigger buffer (your output says it truncated):
openclaw logs --follow --json --max-bytes 2000000
  2. Then immediately DM the bot: hi

  3. Paste the first new lines that show up after you send hi (we’re looking for agent/embedded / telegram).

Also: did you already run the config changes to drop contextTokens + turn off session memory? If not, do that first—otherwise even a “hi” can still take minutes.

distant skiff
#

yes i did those things. no new lines showed up

subtle patrolBOT
#

If no new lines show up at all when you DM hi, that means OpenClaw is not seeing the inbound Telegram message (so it never even starts a run).

Since outbound worked earlier, the top things to check next are webhook vs polling and polling health.

1) Run the Telegram probe (safe to paste)

On the gateway host:

openclaw channels capabilities --channel telegram

Paste the output. This usually reveals stuff like “webhook is set” (which will break polling) or other Telegram-side constraints.

2) Confirm polling is actually receiving anything

Run:

openclaw channels status
openclaw channels logs --channel telegram

Then DM the bot hi again and immediately re-run:

openclaw channels logs --channel telegram

If the log still shows only startup + old sendMessage, it’s not receiving updates.

3) Sanity check: are you DM’ing the right bot?

Your gateway logs say the bot is:

@Nacho_2_bot

If you DM a different bot, you’ll get exactly “no new lines”.

Reply with the output of openclaw channels capabilities --channel telegram and we can usually pinpoint it in one step.

distant skiff
#

{"type":"log","time":"2026-03-04T06:10:25.857Z","level":"warn","subsystem":"agent/embedded","message":"{"subsystem":"agent/embedded"} embedded run agent end: runId=abd485e4-0b13-4331-83ee-700fa362160b isError=true error=fetch failed","raw":"{"0":"{\"subsystem\":\"agent/embedded\"}","1":"embedded run agent end: runId=abd485e4-0b13-4331-83ee-700fa362160b isError=true error=fetch failed","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"agent/embedded\"}","parentNames":["openclaw"],"date":"2026-03-04T06:10:25.857Z","logLevelId":4,"logLevelName":"WARN","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-04T06:10:25.857Z"}"}
{"type":"log","time":"2026-03-04T06:10:27.864Z","level":"debug","subsystem":"agent/embedded","message":"{"subsystem":"agent/embedded"} embedded run agent start: runId=abd485e4-0b13-4331-83ee-700fa362160b","raw":"{"0":"{\"subsystem\":\"agent/embedded\"}","1":"embedded run agent start: runId=abd485e4-0b13-4331-83ee-700fa362160b","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"agent/embedded\"}","parentNames":["openclaw"],"date":"2026-03-04T06:10:27.863Z","logLevelId":2,"logLevelName":"DEBUG","path":{"fullFilePath":"file

#

22.22.0","hostname":"unknown","name":"{\"subsystem\":\"agent/embedded\"}","parentNames":["openclaw"],"date":"2026-03-04T06:10:27.863Z","logLevelId":2,"logLevelName":"DEBUG","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-04T06:10:27.864Z"}"}

subtle patrolBOT
#

Yes — that’s very relevant.

embedded run agent end … isError=true error=fetch failed

means: OpenClaw tried to make an HTTP request and the network call itself failed (timeout / connection refused / connection reset). It’s not a “model answered but wrong” type of error — it’s “couldn’t reach the server”.

In your setup, the #1 suspect is the Ollama call to:

http://127.0.0.1:11434 (your custom provider)

Fast way to confirm what’s failing

  1. From the same machine, run:
curl -sS http://127.0.0.1:11434/api/tags | head

If this fails/hangs, OpenClaw will fail too → it’s an Ollama reachability issue.

  2. Run:
openclaw models status

and paste the part that mentions your provider custom-127-0-0-1-11434 (it usually prints a clearer reason than just “fetch failed”).

Common causes

  • Ollama isn’t running (or restarted/crashed)
  • Ollama is running but bound differently (less likely if ollama run … works)
  • OpenClaw is running in a different network namespace (Docker) where 127.0.0.1 isn’t the host
  • Request is hanging until timeout (so you see “typing…” then eventually “fetch failed”)

Post the output of the curl /api/tags test + the relevant chunk of openclaw models status and we can pinpoint it quickly.

#

That snippet doesn’t really add anything new — it’s just more metadata for the same run.

We need to confirm what “fetch failed” is failing to reach.

Please run these two commands and paste the outputs

1) Test Ollama HTTP directly:

curl -sS http://127.0.0.1:11434/api/tags | head

2) Test using Node’s fetch (this matches what OpenClaw uses):

node -e "fetch('http://127.0.0.1:11434/api/tags').then(r=>r.text()).then(t=>console.log(t.slice(0,200))).catch(e=>{console.error(e); process.exit(1)})"

Interpretation:

  • If both fail → Ollama isn’t reachable on that URL/port (or not running).
  • If curl works but node fetch fails → it’s usually proxy/env/network-stack weirdness, and the Node error output will tell us the real reason.
  • If both work → then the “fetch failed” is likely happening on a different URL (and we’ll pull that from openclaw models status next).
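
If you'd rather script that interpretation, here is a small sketch that maps the two exit codes onto the cases above. The function name is made up, and the fourth branch (node works, curl fails) isn't covered by the list, so treat it as a catch-all.

```shell
# Map the two exit codes (0 = success) onto the cases listed above.
interpret_reachability() {
  # $1 = exit code of the curl test, $2 = exit code of the node fetch test
  if [ "$1" -ne 0 ] && [ "$2" -ne 0 ]; then
    echo "both failed: Ollama is not reachable on that URL/port"
  elif [ "$1" -eq 0 ] && [ "$2" -ne 0 ]; then
    echo "curl ok, node failed: check proxy/env settings and the Node error"
  elif [ "$1" -eq 0 ] && [ "$2" -eq 0 ]; then
    echo "both ok: the failing fetch is probably a different URL"
  else
    echo "node ok, curl failed: unusual, re-run the curl test"
  fi
}

# How you might feed it (left as comments so nothing hits the network here):
#   curl -sS -o /dev/null http://127.0.0.1:11434/api/tags; c=$?
#   node -e "fetch('http://127.0.0.1:11434/api/tags').catch(()=>process.exit(1))"; n=$?
#   interpret_reachability "$c" "$n"
```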

After that, if needed, run:

openclaw models status

and paste just the section for custom-127-0-0-1-11434.

distant skiff
#

It just responded to hi after 20+ minutes. Like a proper response

subtle patrolBOT
#

Good — that means the whole chain is working (Telegram → OpenClaw → Ollama → back to Telegram).

But 20+ minutes for “hi” is not remotely normal, even on CPU. That usually means it was doing one of:

  • chewing through a huge prompt/context (history/memory/tools), or
  • thrashing/loading something cold (model unload/reload, swapping), or
  • retrying behind the scenes until something finally returned.

Quick sanity check

Send one more DM like:

  • hi again

If the second reply comes back in seconds / <1 minute, then the first one was a one-time cold-start / backlog situation.

If it’s still many minutes, then we need to reduce what OpenClaw is stuffing into each request (because a local 14B on CPU can’t handle “agent-style” giant prompts quickly).

Two knobs that matter most (for local CPU)

Confirm these are actually set how we intended:

openclaw config get agents.defaults.contextTokens
openclaw config get agents.defaults.memorySearch.sources
openclaw config get agents.defaults.memorySearch.experimental.sessionMemory

For a local 14B, I’d keep contextTokens at 8k–16k, and I’d keep session memory search off.

Also worth reading (it’s blunt, but accurate about local models + agent prompts):
https://docs.openclaw.ai/gateway/local-models

Tell me how fast the second “hi again” is, and paste those 3 config values — then we can tune it to something usable.

distant skiff
#

second hi takes forever. context tokens and memory search are as they should be

subtle patrolBOT
#

If the second “hi” is also taking forever, then it’s not a one-time warmup — it’s your per-message workload.

On this hardware (CPU + iGPU), a local 14B can answer fast in Ollama when the prompt is small. OpenClaw’s prompts are often not small (system prompt + tools + skills + session plumbing), so the model ends up “reading a book” before it can say “hi”.

Quick A/B test (proves whether it’s “OpenClaw prompt bloat”)

Temporarily shrink the tool prompt by setting a minimal tool profile:

openclaw config set tools.profile "minimal"
openclaw gateway restart

Then DM the bot: hi

  • If that becomes fast, the slowdown is mainly tool/skill/system prompt size (expected with local CPU models).
  • If it’s still slow, then something else is wrong (we’d go back to Ollama/network/system load).

Tool profiles are documented here (minimal/messaging/full):
https://docs.openclaw.ai/gateway/configuration-reference#tools

Practical reality check

OpenClaw + Telegram + “agent with tools” is a much heavier prompt than “ollama run say hi”. On CPU-only boxes, that often means minutes even for simple messages.
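
To measure that baseline yourself, one option is to call Ollama directly and turn its timing fields into tokens per second. This sketch assumes the non-streaming /api/generate response, which reports prompt_eval_count and eval_count together with prompt_eval_duration and eval_duration in nanoseconds. The numbers in the demo call are made up.

```shell
# Convert an Ollama token count and duration (nanoseconds) to tokens/second.
tokens_per_sec() {
  # $1 = token count, $2 = duration in nanoseconds
  LC_ALL=C awk -v n="$1" -v ns="$2" 'BEGIN { printf "%.1f\n", n / (ns / 1e9) }'
}

# Getting the real numbers (left as a comment so nothing runs against Ollama here):
#   curl -sS http://127.0.0.1:11434/api/generate \
#     -d '{"model":"qwen3:14b","prompt":"hi","stream":false}'
# then read prompt_eval_count / prompt_eval_duration (prefill) and
# eval_count / eval_duration (generation) from the JSON reply.

# Demo with made-up numbers: 26 prompt tokens processed in 1.3 s.
tokens_per_sec 26 1300000000
```

Dividing OpenClaw's expected prompt size by the measured prefill rate gives a rough floor for how long a reply through the bot can take.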

If you want it usable locally on that machine, the usual approach is:

  • keep tools/skills very limited (or off) for the local model, or
  • use a hosted model for the “agent brain” and keep local models for simple offline chat.

Try the tools.profile="minimal" test and tell me how long “hi” takes after that.

distant skiff
#

at one minute and counting

subtle patrolBOT
#

1 minute (with tools.profile="minimal") is already a huge improvement vs 20 minutes — so that strongly suggests the slowdown is prompt size / “agent baggage”, not Telegram.

If it still takes ~1–2 minutes even in minimal mode, the next biggest chunk of prompt bloat is usually skills.

Next speed test: disable bundled skills (shrinks the system prompt a lot)

Run:

openclaw config set skills.allowBundled '[]' --strict-json
openclaw gateway restart

Then DM: hi

  • If that becomes “fast-ish” (seconds to ~20s), then you’ve confirmed the bottleneck was the skill/tool prompt size.
  • If it’s still ~1–2 minutes, then you’re basically at the limit of what that CPU-only box can do for an “assistant-style” prompt with a 14B model.

Also: once you do get a reply, try a third message right away (“hi3”). If hi3 is much faster, then part of the delay is still just the model warming up / caching.

Tell me how long “hi” takes after disabling bundled skills.

distant skiff
#

no you misinterpreted. it still hasn't returned anything to hi. I meant i was going on over 1 minute. now im at 3:30 and still going

subtle patrolBOT
#

Got it — thanks for clarifying. If you’re at 3:30 for a plain “hi”, that means the model is spending ages just processing the prompt (prefill). On CPU, that can happen if the prompt is still huge.

Even with tools.profile="minimal", OpenClaw can still be feeding a lot of system text (skills, policies, etc.) + whatever context it keeps.

Do this “make it tiny” test (to prove it’s prompt size)

  1. Abort the current run (DM the bot): stop

  2. On the gateway machine, set these and restart:

openclaw config set agents.defaults.contextTokens 4096 --strict-json
openclaw config set tools.profile "minimal"
openclaw config set skills.allowBundled '[]' --strict-json
openclaw gateway restart
  3. In Telegram DM:
  • send /new
  • then hi

If that drops your “hi” latency from minutes to something reasonable, we’ve confirmed the core issue is OpenClaw prompt bloat vs CPU-only local model. Then we can re-enable things gradually (messaging tools, a small set of skills, etc.) until it becomes too slow again.

If it’s still taking minutes even with that ultra-minimal setup, then we’re looking at a deeper runtime issue (Ollama hanging under OpenClaw load, swapping, etc.) and I’d want to see CPU/RAM usage while it’s running.
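
For the RAM side of that check on Linux, a minimal sketch is to read swap usage from /proc/meminfo while the run is in progress. The helper name is made up; it takes an optional file path only so it can be tried on sample data.

```shell
# Report swap in use (kB) from a meminfo-style file; defaults to /proc/meminfo.
swap_used_kb() {
  awk '/^SwapTotal:/ { total = $2 }
       /^SwapFree:/  { free = $2 }
       END { printf "%d\n", total - free }' "${1:-/proc/meminfo}"
}

# Only read the live value where /proc/meminfo exists (Linux).
if [ -r /proc/meminfo ]; then
  echo "swap in use: $(swap_used_kb) kB"
fi
```

If that number keeps growing while the bot is working on a reply, the box is probably swapping. For the CPU side, top or vmstat 1 in a second terminal shows whether the cores are pegged or mostly idle.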

distant skiff
#

says agent failed before reply: model context window too small (4096 tokens). minimum is 16000. logs: openclaw logs --follow

subtle patrolBOT
#

That error is actually very specific and helpful:

model context window too small (4096 tokens). minimum is 16000.

Meaning: OpenClaw is currently talking to Ollama in a way where Ollama is using a 4k context, but OpenClaw needs ~16k+ just to run (system prompt + safety + tooling scaffold).

So even if your model could support 16k, Ollama is effectively running it at 4096 right now.

Fix (recommended): create an Ollama variant with 16k context

On the machine with Ollama:

  1. Create a file named Modelfile:
FROM qwen3:14b
PARAMETER num_ctx 16384
  2. Create the new model:
ollama create qwen3:14b-16k -f Modelfile
  3. Update OpenClaw config so the model id matches (both places):
  • in your provider’s models: [{ id: ... }] set id: "qwen3:14b-16k"
  • set agents.defaults.model to custom-127-0-0-1-11434/qwen3:14b-16k
  • keep contextWindow: 16384
  4. Restart gateway:
openclaw gateway restart

Then in Telegram DM do /new and try again.

Why this happens

Ollama defaults to a smaller context unless it’s explicitly set (either by the model’s num_ctx or by a client injecting it). OpenClaw is currently seeing/using 4096, so it refuses to run.

If you paste your updated provider models[] entry + agents.defaults.model line after you make the -16k model, I’ll sanity-check it before you restart.

distant skiff
#

FROM: command not found
PARAMETER: command not found

subtle patrolBOT
#

Yep — that happened because those lines are not shell commands. They’re the contents of a file called Modelfile that Ollama reads.

Do it like this (Linux/macOS)

Run these exactly in your terminal:

cat > Modelfile <<'EOF'
FROM qwen3:14b
PARAMETER num_ctx 16384
EOF

ollama create qwen3:14b-16k -f Modelfile

Then verify it exists:

ollama list | grep qwen3
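
Seeing the name in the list only proves the model was created. To check that the context setting was baked in, you can inspect its parameters with ollama show. The sketch below assumes the --parameters output is one "name value" pair per line, which I believe is the usual layout but haven't confirmed for your version. The helper name is made up.

```shell
# Extract num_ctx from "ollama show <model> --parameters"-style text on stdin.
# Assumes one "name value" pair per line.
read_num_ctx() {
  awk '$1 == "num_ctx" { print $2 }'
}

# Offline demo on sample text in that assumed layout.
printf '%s\n' 'num_ctx                        16384' | read_num_ctx
```

Against the real model: ollama show qwen3:14b-16k --parameters | read_num_ctx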

If you’re on Windows

Create a text file named Modelfile (no extension) with:

FROM qwen3:14b
PARAMETER num_ctx 16384

Then run:

ollama create qwen3:14b-16k -f Modelfile

After you’ve created it, update OpenClaw to use qwen3:14b-16k (and restart the gateway). If you paste your updated OpenClaw model/provider snippet, I’ll sanity-check it before you restart.

distant skiff
#

im on linux, i did the list | grep qwen3

subtle patrolBOT
#

Nice — can you paste the actual output of:

ollama list | grep qwen3

I’m looking to confirm you now have something like qwen3:14b-16k (or similar) in the list.

Next step (once the new model shows up)

Update OpenClaw to point at the new Ollama model id and put contextTokens back to ≥16000 (because setting it to 4096 is what triggered the “minimum is 16000” failure).

Example commands (adjust the model id to whatever your ollama list shows):

openclaw config set 'models.providers.custom-127-0-0-1-11434.models[0].id' "qwen3:14b-16k"
openclaw config set agents.defaults.model "custom-127-0-0-1-11434/qwen3:14b-16k"
openclaw config set agents.defaults.contextTokens 16384 --strict-json
openclaw gateway restart

Then DM the bot: /new → hi.

Paste your ollama list | grep qwen3 output and I’ll tell you the exact model id string to use in those config lines.

distant skiff
#

it has not responded yet after 1 min. it's not even showing that it's typing

subtle patrolBOT
#

1 minute with no typing doesn’t necessarily mean it’s broken — but it does mean we need to look at the logs again to see whether:

  • OpenClaw even started a run for that message, or
  • it failed immediately (context too small, model not found, etc.), or
  • it started and is just slow again.

Do this exact check

On the gateway host:

openclaw logs --follow --json

Then DM the bot: hi

Within a few seconds you should see either:

  • an agent/embedded … start line (run started), or
  • an error line explaining why it didn’t start.

If you paste the first ~10 new log lines that appear right after you send hi, I can tell you which case it is.

Quick config sanity (since we changed a lot)

Also paste these outputs:

openclaw config get agents.defaults.model
openclaw config get agents.defaults.contextTokens

If contextTokens accidentally stayed at 4096, OpenClaw will refuse to run (it needs at least 16000).
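
As a rough way to check that value from a script, here is a small sketch. It assumes openclaw config get may print a banner line before the value, so it keeps only the last non-empty line. Both helper names are hypothetical, and the 16000 threshold is the minimum mentioned above.

```shell
# last_nonblank: print the last non-empty line read from stdin.
# (Assumption: the value is the last non-empty line of the command output.)
last_nonblank() {
  awk 'NF { last = $0 } END { if (last != "") print last }'
}

# context_ok: succeed only when the argument is a number >= 16000.
context_ok() {
  [ "$1" -ge 16000 ] 2>/dev/null
}

# Usage (needs the openclaw CLI, so it is left commented out here):
#   tokens=$(openclaw config get agents.defaults.contextTokens | last_nonblank)
#   if context_ok "$tokens"; then echo "contextTokens OK"; else echo "too small: $tokens"; fi
```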

distant skiff
#

the logs do not update after a few seconds. it has been 1 minute since sending hi and the logs have not updated yet

subtle patrolBOT
#

If openclaw logs --follow --json doesn’t print anything new after you DM hi, that usually means the Gateway never saw the Telegram message (so it never even started a run).

Before we chase Telegram, let’s make sure your log tail is actually “live”:

1) Prove the log tail is working

Keep openclaw logs --follow --json running, and in another terminal run:

openclaw status

You should see new log lines appear in the tail (often gateway/ws … status).

  • If nothing appears, your tail session isn’t actually following live logs (wrong terminal, command exited, etc.).
  • If you do see new lines from openclaw status, then the tail is fine → Telegram DM isn’t reaching OpenClaw.

2) If the tail is fine, check Telegram receive health

Run:

openclaw channels status
openclaw channels capabilities --channel telegram

Paste those outputs.

Common causes when send works but receive doesn’t:

  • A webhook is set for that bot (polling won’t receive updates)
  • Another process/OpenClaw instance is also polling the same bot token (Telegram “getUpdates” conflicts)

Also double-check you’re DM’ing the exact bot your logs show (@Nacho_2_bot).
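
If you want to look at the webhook state from Telegram's side, the Bot API has a getWebhookInfo method that reports the configured webhook URL and a pending_update_count. Here is a minimal sketch for reading those two fields. The helper names and the TELEGRAM_BOT_TOKEN variable are hypothetical, and the grep/sed parsing assumes the compact single-line JSON that Telegram normally returns (jq would be more robust if you have it).

```shell
# pending_updates: extract pending_update_count from a getWebhookInfo reply on stdin.
pending_updates() {
  grep -o '"pending_update_count":[0-9]*' | head -n 1 | cut -d: -f2
}

# webhook_url: extract the "url" field (prints an empty line when no webhook is set).
webhook_url() {
  sed -n 's/.*"url":"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage (needs your bot token and network access, so it is left commented out):
#   info=$(curl -sS "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getWebhookInfo")
#   printf '%s\n' "$info" | webhook_url
#   printf '%s\n' "$info" | pending_updates
```

An empty URL means polling is allowed. A pending count that keeps growing while you send messages would suggest updates are queuing up without being consumed.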

distant skiff
#

the bot is right. Gateway reachable.

  • Telegram default: enabled, configured, running, out:6m ago, mode:polling, token:config

Tip: status --deep adds gateway health probes to status output (requires a reachable gateway).

🦞 OpenClaw 2026.3.1 (2a8ac97) — IPC, but it's your phone.

│
◇ Doctor warnings ──────────────────────────────────────────────────────────────────────────╮
│ │
│ - channels.telegram.groupPolicy is "allowlist" but groupAllowFrom (and allowFrom) is │
│ empty — all group messages will be silently dropped. Add sender IDs to │
│ channels.telegram.groupAllowFrom or channels.telegram.allowFrom, or set groupPolicy to │
│ "open". │
│ │
├────────────────────────────────────────────────────────────────────────────────────────────╯
Telegram default
Support: chatTypes=direct,group,channel,thread polls reactions threads media nativeCommands blockStreaming
Actions: send, broadcast, react, delete, edit, topic-create
Bot: @Nacho_2_bot (8762250895)
Flags: joinGroups=true readAllGroupMessages=false inlineQueries=false
Webhook: none

subtle patrolBOT
#

That output is actually great — it rules out the big obvious blocker:

  • Webhook: none ✅ (so polling can work)
  • Bot identity matches ✅
  • Telegram channel is “running” ✅

So if you DM hi and (a) you get no typing and (b) the logs don’t show any new lines, the gateway is basically not receiving Telegram updates right now (even though it can send).

Two common causes:

1) Another process is polling the same bot token

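Telegram long polling (getUpdates) supports only one consumer per bot token. If you have any other bot runner, old OpenClaw instance, or test script using that same token, Telegram answers the competing requests with a 409 Conflict error, and either side can “steal” updates from the other.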

Quick check: do you have OpenClaw running anywhere else with the same token (another terminal, another machine, a Docker container)?

2) Polling is failing due to network/IPv6 weirdness

OpenClaw’s Telegram docs call this out (broken IPv6 egress can make getUpdates flake). Easiest forced-IPv4 fix:

openclaw config set channels.telegram.network.autoSelectFamily false --strict-json
openclaw gateway restart

Then DM hi again and immediately run:

openclaw channels logs --channel telegram

If it still receives nothing, the “two pollers” situation is the next thing to hunt down (it’s the most common “send works, receive doesn’t” failure mode).

distant skiff
#
  1. I just transferred from another machine, but I ran a stop command on it, so I thought it wasn't running anymore. I deleted a bunch of lines, but here are some:
    /tmp/openclaw/openclaw-2026-03-04.log
    Channel: telegram
    2026-03-04T05:30:55.781Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
    2026-03-04T05:35:22.043Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
    2026-03-04T05:45:23.945Z info {"subsystem":"gateway/channels/telegram"} telegram sendMessage ok chat=7506591428 message=394
    2026-03-04T05:55:25.436Z info {"subsystem":"gateway/channels/telegram"} telegram sendMessage ok chat=7506591428 message=398
    2026-03-04T06:01:29.467Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
    2026-03-04T06:05:22.995Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
    2026-03-04T06:15:25.694Z info {"subsystem":"gateway/channels/telegram"} telegram sendMessage ok chat=7506591428 message=407
    2026-03-04T06:25:26.665Z info {"subsystem":"gateway/channels/telegram"} telegram sendMessage ok chat=7506591428 message=409
    2026-0
    2026-03-04T07:24:16.231Z info {"subsystem":"gateway/channels/telegram"} telegram sendMessage ok chat=7506591428 message=415
    2026-03-04T07:24:16.429Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
    2026-03-04T07:55:22.728Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
    2026-03-04T08:30:22.741nels/telegram"} [default] starting provider (@Nacho_2_bot)
    2026-03-04T23:25:22.793Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
    2026-03-05T00:00:32.107Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider
    2026-03-05T00:00:42.595Z error {"subsystem":"gateway/channels/telegram"} telegram deleteWebhook failed: Network request for 'deleteWebhook' failed!
    2
subtle patrolBOT
#

Yeah — this strongly points to the “two machines are fighting over the same Telegram bot token” situation.

Two key clarifications:

  1. Sending “stop” in chat only aborts the current agent run.
    It does not stop the old machine’s Gateway service (the thing that polls Telegram).

  2. With Telegram long polling (getUpdates), you really want exactly one active poller per bot token.

What to do

On the old machine (the one you migrated from), run:

openclaw gateway status
openclaw gateway stop

Then verify it’s actually down:

openclaw gateway status

(If you installed it as a service, this is the important part — otherwise it keeps running in the background.)

On the new machine, do:

openclaw gateway restart

About that log line

telegram deleteWebhook failed: Network request for 'deleteWebhook' failed!

That’s a separate network/DNS/IPv6 issue talking to api.telegram.org (the docs mention IPv6 weirdness). But first fix the two pollers problem — that alone often resolves the “no inbound messages” symptom.

After you stop the old gateway, if inbound is still flaky, we’ll apply the IPv4 forcing fix (channels.telegram.network.autoSelectFamily: false) you already saw.

distant skiff
#

can i just turn the old machine off?

subtle patrolBOT
#

Yes — powering the old machine off will absolutely stop it from polling Telegram, and that’s the quickest way to eliminate the “two pollers” problem.

After you turn it off:

  1. On the new machine run:
openclaw gateway restart
  2. Then DM the bot again.

If you ever turn the old machine back on later, it’ll likely conflict again unless you also run openclaw gateway stop (or uninstall/disable the service) on that old box.

distant skiff
#

Turned it off and restarted the gateway. Still nothing.

subtle patrolBOT
#

If the old machine is truly off and you still get zero inbound replies, then we’re likely dealing with the other issue that showed up in your logs:

deleteWebhook failed: Network request … failed

That points to Telegram API connectivity from the new machine (often IPv6/DNS weirdness on Node 22).

Apply the IPv4/polling stability fix (from the Telegram docs)

On the new machine:

openclaw config set channels.telegram.network.autoSelectFamily false --strict-json
openclaw config set channels.telegram.network.dnsResultOrder "ipv4first"
openclaw gateway restart

Then DM the bot again.

Immediately check Telegram channel logs

openclaw channels logs --channel telegram

If it’s still failing to poll, you should now see explicit network errors in there (getUpdates failures, DNS, etc.).

If you paste the fresh output of openclaw channels logs --channel telegram after that restart, we can pinpoint whether it’s DNS/IPv6, a proxy issue, or something else.

distant skiff
#

-03-04T23:25:22.793Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)
2026-03-05T00:00:32.107Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider
2026-03-05T00:00:42.595Z error {"subsystem":"gateway/channels/telegram"} telegram deleteWebhook failed: Network request for 'deleteWebhook' failed!
2026-03-05T00:00:42.598Z error {"subsystem":"gateway/channels/telegram"} Telegram webhook cleanup failed: Network request for 'deleteWebhook' failed!; retrying in 2.37s.
2026-03-05T00:35:22.792Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)

subtle patrolBOT
#

Those log lines are the core problem:

telegram deleteWebhook failed: Network request for 'deleteWebhook' failed!

That means the gateway cannot reliably reach api.telegram.org from this machine. If it can’t reach Telegram’s API, it can’t poll updates → you get “nothing” in DMs.

Before we touch any more OpenClaw settings, we need to answer: is this an OS/network problem or a Node/IPv6 selection problem?

Run these 2 tests on the gateway machine

(No token needed.)

curl -4 -I https://api.telegram.org
curl -6 -I https://api.telegram.org
  • If -4 works but -6 fails/hangs → your IPv6 is broken, and OpenClaw/Node may be tripping over it.
  • If both fail → your machine can’t reach Telegram at all (DNS/firewall/proxy/captive portal/VPN issue).
  • If both work → then it’s likely Node fetch behaving differently, and we’ll look at the OpenClaw channels.telegram.network.* settings and logs after restart.
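
The three cases above can be sketched as a small decision helper that maps the two curl exit codes to a diagnosis. diagnose_families is a hypothetical name, and the labels it prints are only illustrative.

```shell
# diagnose_families: map the exit codes of `curl -4` and `curl -6` to a rough diagnosis.
# Exit code 0 means curl got an HTTP response (a 302 counts as success here).
diagnose_families() {
  v4_status="$1"
  v6_status="$2"
  if [ "$v4_status" -eq 0 ] && [ "$v6_status" -ne 0 ]; then
    echo "ipv6-broken"        # IPv4 works, IPv6 fails or hangs
  elif [ "$v4_status" -ne 0 ] && [ "$v6_status" -ne 0 ]; then
    echo "no-connectivity"    # DNS/firewall/proxy/VPN problem
  elif [ "$v4_status" -eq 0 ] && [ "$v6_status" -eq 0 ]; then
    echo "both-ok"            # look at Node/OpenClaw settings next
  else
    echo "ipv4-broken"        # unusual: only IPv6 works
  fi
}

# Usage (needs network access, so it is left commented out here):
#   curl -4 -sS -I -m 10 -o /dev/null https://api.telegram.org; v4=$?
#   curl -6 -sS -I -m 10 -o /dev/null https://api.telegram.org; v6=$?
#   diagnose_families "$v4" "$v6"
```

The -m 10 timeout is there so a hanging IPv6 attempt counts as a failure instead of blocking the check.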

Also, please run this so we know the config fix actually stuck:

openclaw config get channels.telegram.network.autoSelectFamily
openclaw config get channels.telegram.network.dnsResultOrder

Paste the outputs of those four commands and we’ll zero in fast.

distant skiff
#

elegram.org
HTTP/2 302
server: nginx/1.18.0
date: Thu, 05 Mar 2026 04:15:52 GMT
content-type: text/html
content-length: 145
location: https://core.telegram.org/bots
strict-transport-security: max-age=31536000; includeSubDomains; preload
access-control-allow-origin: *
access-control-allow-methods: GET, POST, OPTIONS
access-control-expose-headers: Content-Length,Content-Type,Date,Server,Connection

curl: (7) Failed to connect to api.telegram.org port 443 after 10 ms: Couldn't connect to server
johnny-mangual@johnny-mangual-NucBox-K10:~$ openclaw config get channels.telegram.network.autoSelectFamily
openclaw config get channels.telegram.network.dnsResultOrder

🦞 OpenClaw 2026.3.1 (2a8ac97) — I'll butter your workflow like a lobster roll: messy, delicious, effective.

false

🦞 OpenClaw 2026.3.1 (2a8ac97) — iMessage green bubble energy, but for everyone.

ipv4first

subtle patrolBOT
#

Perfect — that test basically confirms the diagnosis:

  • IPv4 to api.telegram.org works (you got the 302)
  • IPv6 to api.telegram.org is broken (curl: (7) … Couldn't connect)
  • Your OpenClaw config is set to force the IPv4-ish behavior:
    • autoSelectFamily: false
    • dnsResultOrder: ipv4first

So now the goal is simply: make sure the running gateway process is actually using those settings (and not still doing IPv6).

Do this now

  1. Restart the gateway (so the Telegram client is re-created with the new network settings):
openclaw gateway restart
  2. Immediately check for fresh Telegram startup errors:
openclaw channels logs --channel telegram

You should not see deleteWebhook failed anymore.

If it still shows deleteWebhook failed

Then the service likely isn’t picking up the config for that process. Use the env overrides the Telegram doc mentions (these are read at process start):

Put this in ~/.openclaw/.env:

OPENCLAW_TELEGRAM_DISABLE_AUTO_SELECT_FAMILY=1
OPENCLAW_TELEGRAM_DNS_RESULT_ORDER=ipv4first

Then:

openclaw gateway restart

After that, DM the bot again.

If you paste the new channels logs --channel telegram lines after the restart, we’ll see immediately whether it’s still trying IPv6 or if it’s clean.

distant skiff
#

webhook still fails

subtle patrolBOT
#

If deleteWebhook is still failing after you proved IPv4 works with curl, the missing piece is almost always:

Node’s fetch (undici) is still trying IPv6 first and not falling back.

Let’s confirm that directly with a 10‑second test.

1) Test Node fetch to Telegram (same stack OpenClaw uses)

Run:

node -e "fetch('https://api.telegram.org').then(r=>console.log('OK', r.status)).catch(e=>console.error('FAIL', e))"

If that prints FAIL …, then OpenClaw will also fail.

2) Force IPv4 preference for Node and re-test

Run:

NODE_OPTIONS="--dns-result-order=ipv4first" node -e "fetch('https://api.telegram.org').then(r=>console.log('OK', r.status)).catch(e=>console.error('FAIL', e))"
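  • If this one prints OK 200 (or similar), then we’ve nailed it: you need that NODE_OPTIONS for the gateway service. (Node’s fetch follows the 302 redirect by default, so the final status is normally 200 rather than the 302 that curl -I showed.)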

3) Make it permanent for the OpenClaw service

Put this in ~/.openclaw/.env:

NODE_OPTIONS=--dns-result-order=ipv4first
OPENCLAW_TELEGRAM_DISABLE_AUTO_SELECT_FAMILY=1
OPENCLAW_TELEGRAM_DNS_RESULT_ORDER=ipv4first

Then:

openclaw gateway restart
openclaw channels logs --channel telegram

If you paste the output of steps (1) and (2), I can tell you immediately which path you’re on.

distant skiff
#

Nacho's responses to the earlier "hi" messages are just coming in now. What's the command for opening up the .env?

subtle patrolBOT
#

On Linux, ~/.openclaw/.env is just a normal text file. Easiest way to edit it:

mkdir -p ~/.openclaw
nano ~/.openclaw/.env

In nano:

  • save: Ctrl+O then Enter
  • exit: Ctrl+X

Then restart OpenClaw so the service picks up the env vars:

openclaw gateway restart

(If you prefer one-liner editing instead of an editor, tell me and I’ll give you a safe cat >> ~/.openclaw/.env version.)
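
For reference, a minimal sketch of that non-interactive approach. ensure_env_line is a hypothetical helper that appends a line only if the exact same line is not already in the file, so running it twice does not create duplicates.

```shell
# ensure_env_line: append a line to an env file unless that exact line already exists.
ensure_env_line() {
  env_file="$1"
  line="$2"
  mkdir -p "$(dirname "$env_file")"
  touch "$env_file"
  # -x: match whole lines only, -F: treat the line as a fixed string, not a regex
  grep -qxF -- "$line" "$env_file" || printf '%s\n' "$line" >> "$env_file"
}

# Usage:
#   ensure_env_line ~/.openclaw/.env 'NODE_OPTIONS=--dns-result-order=ipv4first'
#   ensure_env_line ~/.openclaw/.env 'OPENCLAW_TELEGRAM_DISABLE_AUTO_SELECT_FAMILY=1'
#   ensure_env_line ~/.openclaw/.env 'OPENCLAW_TELEGRAM_DNS_RESULT_ORDER=ipv4first'
```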

distant skiff
#

ok added

subtle patrolBOT
#

Now restart the gateway so it actually picks up the .env values:

openclaw gateway restart

Then confirm Telegram is clean (no more deleteWebhook failed):

openclaw channels logs --channel telegram

If that looks good, DM the bot /new and then hi again and see if replies come back normally (no 20‑minute backlog).

distant skiff
#

idk if I did it right:
subsystem":"gateway/channels/telegram"} [default] starting provider
2026-03-05T00:00:42.595Z error {"subsystem":"gateway/channels/telegram"} telegram deleteWebhook failed: Network request for 'deleteWebhook' failed!
2026-03-05T00:00:42.598Z error {"subsystem":"gateway/channels/telegram"} Telegram webhook cleanup failed: Network request for 'deleteWebhook' failed!; retrying in 2.37s.

subtle patrolBOT
#

Those lines are from 00:00:42Z — they’re old (from before you edited .env). openclaw channels logs is showing you historical entries, not just “right now”.

What we need to know is: after your .env change + restart, are there any new deleteWebhook failed lines?

Do this

  1. Restart again (to be 100% sure it picked up the env):
openclaw gateway restart
  2. Show only the most recent Telegram channel log lines:
openclaw channels logs --channel telegram | tail -n 30

(or if you prefer:)

openclaw logs --limit 50

If you still see deleteWebhook failed with a current timestamp, then the fix didn’t apply. If you don’t see it anymore, you’re good — those midnight errors were just history.
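
To separate history from fresh entries more reliably than eyeballing timestamps, here is a small sketch that keeps only log lines at or after a cutoff. It assumes each line starts with an ISO 8601 UTC timestamp like the ones in your paste, which sort correctly as plain strings. lines_since is a hypothetical helper name.

```shell
# lines_since: print only log lines whose leading timestamp is >= the cutoff.
# Pass the cutoff as a prefix such as 2026-03-05T04:28 (no seconds, no trailing Z).
lines_since() {
  cutoff="$1"
  awk -v cutoff="$cutoff" '
    # Only consider lines that start with a YYYY- timestamp; compare as strings.
    $1 ~ /^[0-9][0-9][0-9][0-9]-/ && ($1 "") >= (cutoff "")
  '
}

# Usage (needs the openclaw CLI, so it is left commented out here):
#   openclaw channels logs --channel telegram | lines_since 2026-03-05T04:28
```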

Also, as a safe check, paste your .env contents (in our case it shouldn’t contain any secrets):

cat ~/.openclaw/.env
distant skiff
#

04:28:37 error gateway {"subsystem":"gateway"} shutdown timed out; exiting without full cleanup
johnny-mangual@johnny-mangual-NucBox-K10:~$ cat ~/.openclaw/.env
NODE_OPTIONS=--dns-result-order=ipv4first
OPENCLAW_TELEGRAM_DISABLE_AUTO_SELECT_FAMILY=1
OPENCLAW_TELEGRAM_DNS_RESULT_ORDER=ipv4first

subtle patrolBOT
#

That .env looks correct — those are exactly the lines we want.

The message you saw:

shutdown timed out; exiting without full cleanup

just means the gateway was busy/hung when it tried to restart (common if a run is stuck waiting on a model/network call). It’s not a config syntax problem.

Do this now

  1. Check whether the gateway is actually running:
openclaw gateway status
  2. If it’s not running, start it:
openclaw gateway start

(or if it says it’s running but feels wedged, do a clean stop/start:)

openclaw gateway stop
openclaw gateway start
  3. Now check the latest Telegram logs (we want to see whether deleteWebhook failed is still happening after 04:28):
openclaw channels logs --channel telegram | tail -n 30

If you paste that tail output, we’ll know immediately whether the IPv4 env fix actually took effect.

distant skiff
#

2026-03-05T04:28:40.445Z info {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)

subtle patrolBOT
#

Nice — that timestamp (04:28:40Z) is the important part. It means your gateway restarted after the .env change and Telegram is starting cleanly on the new machine.

Now we just need to confirm the old error is gone.

1) Check if deleteWebhook failed still happens after 04:28:40

openclaw channels logs --channel telegram | tail -n 50

If you don’t see any new deleteWebhook failed lines after that 04:28:40 start line, Telegram connectivity is fixed.

2) Then test a DM and watch logs

  • DM the bot: hi
  • In another terminal:
openclaw logs --follow --json

You should see an inbound message + agent/embedded run start within a couple seconds. If you paste the first few lines after sending hi, we can verify it’s processing normally again.

distant skiff
#

{"type":"log","time":"2026-03-05T04:35:05.672Z","level":"warn","subsystem":"compaction-safeguard","message":"{"subsystem":"compaction-safeguard"} Compaction safeguard: cancelling compaction with no real conversation messages to summarize.","raw":"{"0":"{\"subsystem\":\"compaction-safeguard\"}","1":"Compaction safeguard: cancelling compaction with no real conversation messages to summarize.","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"compaction-safeguard\"}","parentNames":["openclaw"],"date":"2026-03-05T04:35:05.672Z","logLevelId":4,"logLevelName":"WARN","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-05T04:35:05.672Z"}"}
{"type":"log","time":"2026-03-05T04:35:05.673Z","level":"debug","subsystem":"agent/embedded","message":"{"subsystem":"agent/embedded"} embedded run agent start: runId=a8ef0149-7fb6-4d42-b622-8ebe02e3386e","raw":"{"0":"{\"subsystem\":\"agent/embedded\"}","1":"embedded run agent start: runId=a8ef0149-7fb6-4d42-b622-8ebe02e3386e","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"agent/embedded\"}","parentNames":["openclaw"],"date":"2026-03-05T04:35:05.673Z","logLevelId":2,"logLevelName":"DEBUG","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-D

subtle patrolBOT
#

Perfect — this is the “good” kind of log output:

  • Telegram DM made it into the gateway
  • The agent run actually started (agent/embedded … start)

And this line:

Compaction safeguard: cancelling compaction with no real conversation messages to summarize.

is harmless. It basically means OpenClaw considered doing a compaction step, noticed there’s nothing meaningful to summarize yet, and skipped it.

Now the next thing to watch for in the log stream is anything mentioning the model call (Ollama) or an error. If you paste the next ~10–20 lines after that embedded run agent start, we can see whether it’s:

  • actually calling Ollama and just slow, or
  • still hitting a network/context/config issue.
distant skiff
#

"type":"log","time":"2026-03-05T04:36:36.553Z","level":"debug","subsystem":"diagnostic","message":"{"subsystem":"diagnostic"} lane task done: lane=main durationMs=91001 active=0 queued=0","raw":"{"0":"{\"subsystem\":\"diagnostic\"}","1":"lane task done: lane=main durationMs=91001 active=0 queued=0","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"diagnostic\"}","parentNames":["openclaw"],"date":"2026-03-05T04:36:36.553Z","logLevelId":2,"logLevelName":"DEBUG","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-05T04:36:36.553Z"}"}
{"type":"log","time":"2026-03-05T04:36:36.554Z","level":"debug","subsystem":"diagnostic","message":"{"subsystem":"diagnostic"} lane task done: lane=session:agent:main:main durationMs=91005 active=0 queued=0","raw":"{"0":"{\"subsystem\":\"diagnostic\"}","1":"lane task done: lane=session:agent:main:main durationMs=91005 active=0 queued=0","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"diagnostic\"}","parentNames":["openclaw"],"date":"2026-03-05T04:36:36.553Z","logLevelId":2,"logLevelName":"DEBUG","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine

This is the latest. No answer yet.

subtle patrolBOT
#

That log line is actually telling us something important:

  • The bot did run (took ~91 seconds): durationMs=91001
  • And it thinks the lane is done: active=0 queued=0

So the “thinking” part finished — the missing piece is delivery to Telegram.
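
If you want to pull that number out of a pasted log line without reading the whole JSON blob, here is a minimal sketch. duration_seconds is a hypothetical helper; it takes the first durationMs=... value on stdin and converts it to whole seconds.

```shell
# duration_seconds: print the first durationMs value found on stdin, in whole seconds.
# (The same value appears more than once per log line, so only the first match is used.)
duration_seconds() {
  grep -o 'durationMs=[0-9]*' | head -n 1 | cut -d= -f2 | awk '{ printf "%d\n", $1 / 1000 }'
}

# Usage:
#   echo 'lane task done: lane=main durationMs=91001 active=0 queued=0' | duration_seconds
```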

Two likely reasons:

1) It did deliver, but as a Telegram “draft” (easy to miss)

By default, Telegram streaming in DMs can use native drafts (sendMessageDraft). That means the response may show up in your message compose box (where you type), not as a new chat bubble.

Check the DM with the bot: do you see any text sitting in the input field?

2) It finished, but never successfully sent a message

To confirm, check for a sendMessage ok line around 04:36:36:

openclaw channels logs --channel telegram | tail -n 50

Quick fix to make behavior obvious (disable draft streaming)

Temporarily turn Telegram streaming off so it always sends a normal message:

openclaw config set channels.telegram.streaming "off"
openclaw gateway restart

Then DM hi again. If you get a normal message bubble back, we’ve found the issue (draft streaming UI confusion).

distant skiff
#

Oh wait. It shows it answered in one minute

subtle patrolBOT
#

Nice — that’s the “normal” outcome for this setup.

On that machine, ~1 minute per turn with OpenClaw + a local 14B on CPU isn’t crazy, because OpenClaw’s prompt (system + safety + routing + whatever tools/skills are enabled) is way heavier than a plain ollama run "hi".

A couple practical takeaways:

  • If 14B is ~1 min/turn, then 32B locally on CPU/iGPU is going to feel painfully slow (think many minutes per reply).
  • If you want it faster, the levers are:
    • use a smaller local model, or
    • keep OpenClaw “light” (minimal tools/skills, smaller context), or
    • use a GPU / hosted model for the agent brain.

Also: keep the old machine’s gateway off (or openclaw gateway stop on it) so you don’t get back into the “two pollers” Telegram weirdness.

If you tell me what “acceptable latency” is for you (5s? 15s? 30s?), I can suggest the most realistic config path for this hardware.

distant skiff
#

I’m not getting notifications for it

subtle patrolBOT
#

That’s expected with how OpenClaw does Telegram streaming in DMs.

By default (channels.telegram.streaming: "partial"), OpenClaw can stream replies via native drafts (sendMessageDraft) and/or message edits. Draft updates/edits generally don’t trigger push notifications the way a brand‑new incoming message does.

Fix: disable Telegram streaming so replies are normal messages

On the gateway host:

openclaw config set channels.telegram.streaming "off"
openclaw gateway restart

Then try again — you should start getting normal Telegram notifications (assuming the chat isn’t muted).

Also check Telegram-side (easy gotcha)

In Telegram, open the DM with the bot → tap the name → make sure Notifications aren’t muted for that chat (and that your OS notifications for Telegram are enabled).

distant skiff
#

I asked what’s your most recent memory. No response for 3 minutes and counting

subtle patrolBOT
#

If it’s been 3 minutes, first: abort the run in Telegram by sending:

stop

Then: did you previously set tools.profile to minimal for speed? If yes, that’s likely why this question stalls — “memory” questions often need the memory tools (memory_search / memory_get), and minimal disables almost everything.

Enable just memory tools (without turning everything on)

On the gateway host:

openclaw config set tools.profile "messaging"
openclaw config set tools.allow '["group:memory"]' --strict-json
openclaw gateway restart

Then in Telegram DM do:

  • /new
  • ask again: “what’s your most recent memory?”

This keeps the bot lightweight (no browser/etc), but allows it to actually read/search its memory files.

distant skiff
#

I opened the dashboard in the web browser and it seems to hang. I just typed "/new" into it and it's been five minutes, and now it says "fetch failed".

type":"log","time":"2026-03-05T12:59:00.543Z","level":"debug","subsystem":"diagnostic","message":"{"subsystem":"diagnostic"} lane task done: lane=main durationMs=600175 active=0 queued=0","raw":"{"0":"{\"subsystem\":\"diagnostic\"}","1":"lane task done: lane=main durationMs=600175 active=0 queued=0","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"diagnostic\"}","parentNames":["openclaw"],"date":"2026-03-05T12:59:00.543Z","logLevelId":2,"logLevelName":"DEBUG","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-05T12:59:00.543Z"}"}
{"type":"log","time":"2026-03-05T12:59:00.545Z","level":"debug","subsystem":"diagnostic","message":"{"subsystem":"diagnostic"} lane task done: lane=session:agent:main:main durationMs=600179 active=0 queued=0","raw":"{"0":"{\"subsystem\":\"diagnostic\"}","1":"lane task done: lane=session:agent:main:main durationMs=600179 active=0 queued=0","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"diagnostic\"}","parentNames":["openclaw"],"date":"2026-03-05T12:59:00.545Z","logLevelId":2,"logLevelNa

subtle patrolBOT
#

In the Dashboard/WebChat, /new should be instant (it’s a gateway command; it shouldn’t wait on Ollama at all).

If it hung for minutes and then timed out with fetch failed + a durationMs=600175 (≈10 minutes), that means /new is being treated as a normal chat message and sent to the model, not as a command.

That usually happens when slash commands aren’t being applied for that sender/surface (disabled or not authorized).

Quick checks (run on the gateway host)

openclaw config get commands.text
openclaw config get commands.allowFrom

What you want:

  • commands.text should be true
  • commands.allowFrom should ideally be unset (or include the WebChat sender), otherwise commands can get treated as plain text and forwarded to the model.

Fast test in the WebChat UI

Send:

  • /whoami

If commands are working, it should reply immediately with your sender id. If it hangs, commands aren’t being applied there.

If you paste the output of those two openclaw config get … commands (and whether /whoami hangs), I can tell you exactly what to change.

distant skiff
#

Config path not found: commands.allowFrom

#

responds immediately to whoami with 🧭 Identity
Channel: webchat
User id: openclaw-control-ui
AllowFrom: openclaw-control-ui

subtle patrolBOT
#

Good — that means you don’t have commands.allowFrom set, so we’re not dealing with an explicit command allowlist.

Next check is whether commands are enabled and whether WebChat is considered “authorized” for commands.

1) Check command settings

Run:

openclaw config get commands.text
openclaw config get commands.useAccessGroups

2) Get your WebChat sender id (so we can allow commands there if needed)

Since /whoami in WebChat might be hanging, the easiest way is to look at logs:

  1. In a terminal:
openclaw logs --follow --json
  2. In the WebChat UI, send a simple message like: ping

In the log output, look for something like a from.id / sender id for that WebChat message and paste that line here. Once we have that id, we can (safely) set commands.allowFrom to allow commands for just you on WebChat, so /new won’t get forwarded to the model and time out.

If you paste the outputs of commands.text + commands.useAccessGroups, I can tell you which branch you’re on before we touch config.

#

Great — that proves WebChat commands are working and you’re authorized:

  • Channel: webchat
  • User id: openclaw-control-ui

So /new should be recognized.

What’s probably happening is simpler: there was already a run in progress, and /new doesn’t magically cancel it — it just resets the session for the next turn. The thing that actually cancels an in‑flight run is /stop (or plain stop).

In WebChat, when it “hangs”, do this:

  1. Send:
/stop
  2. Then send:
/new

If you still see fetch failed after ~10 minutes, that’s the model request timing out (your log showing ~600,000ms is the default 600s timeout).

If you want, paste the fetch failed log line that appears right after the timeout (it often includes which fetch target failed), and we can tell whether it’s Ollama hanging vs something else.

distant skiff
#

ngual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-05T13:57:30.879Z"}"}
{"type":"log","time":"2026-03-05T13:57:31.260Z","level":"info","subsystem":"gateway/channels/telegram","message":"{"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)","raw":"{"0":"{\"subsystem\":\"gateway/channels/telegram\"}","1":"[default] starting provider (@Nacho_2_bot)","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"gateway/channels/telegram\"}","parentNames":["openclaw"],"date":"2026-03-05T13:57:31.259Z","logLevelId":3,"logLevelName":"INFO","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-05T13:57:31.260Z"}"}

subtle patrolBOT
#

That line:

gateway/channels/telegram … starting provider (@Nacho_2_bot)

is normal — it just means the Telegram channel runner restarted (usually because you restarted the gateway, or the channel crashed and auto-recovered).

What matters is what comes immediately after:

  • If you see deleteWebhook failed / getUpdates failed / fetch failed right after → Telegram API connectivity is still broken.
  • If it starts cleanly with no errors → Telegram is fine, and any “hang” you’re seeing is almost certainly the model run (Ollama being slow / timing out).

If you paste the next ~10 lines after that “starting provider” line (especially any error lines), I can tell which bucket you’re in.

distant skiff
#

pm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-06T00:52:30.920Z"}"}
{"type":"log","time":"2026-03-06T00:52:31.302Z","level":"info","subsystem":"gateway/channels/telegram","message":"{"subsystem":"gateway/channels/telegram"} [default] starting provider (@Nacho_2_bot)","raw":"{"0":"{\"subsystem\":\"gateway/channels/telegram\"}","1":"[default] starting provider (@Nacho_2_bot)","_meta":{"runtime":"node","runtimeVersion":"22.22.0","hostname":"unknown","name":"{\"subsystem\":\"gateway/channels/telegram\"}","parentNames":["openclaw"],"date":"2026-03-06T00:52:31.302Z","logLevelId":3,"logLevelName":"INFO","path":{"fullFilePath":"file:///home/johnny-mangual/.npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170:14","fileName":"subsystem-DypCPrmP.js","fileNameWithLine":"subsystem-DypCPrmP.js:1170","fileColumn":"14","fileLine":"1170","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-DypCPrmP.js:1170","method":"logToFile"}},"time":"2026-03-06T00:52:31.302Z"}"}

subtle patrolBOT
#

That line by itself is not an error — it just means the Telegram runner restarted:

[default] starting provider (@Nacho_2_bot)

To see why it restarted (or whether it’s immediately failing), you need the lines right after it, especially any error lines.

Best way to grab that in one go:

openclaw channels logs --channel telegram | tail -n 200

If you want to narrow it further:

openclaw channels logs --channel telegram | tail -n 200 | grep -i error

Paste that output — if it’s restarting due to a conflict/network issue, the reason will be in those lines (e.g. getUpdates failed, deleteWebhook failed, Conflict: terminated by other getUpdates request, etc.).

distant skiff
#

no errors in that

subtle patrolBOT
#

If there are no errors after the “starting provider” line, then Telegram itself is probably fine — that line just means the Telegram runner (re)initialized (usually on gateway restart, config reload, or a reconnect).

So at that point:

  • If you are receiving replies: you can ignore it.
  • If you’re not receiving replies: then it’s not a Telegram “crash”, it’s either:
    1. the message isn’t being received/authorized, or
    2. the agent/model run is slow/timing out.

Quick way to tell which:

  1. Run this:
openclaw logs --follow --json
  2. Send a DM to the bot: ping
  • If you see agent/embedded … start, then Telegram is receiving and the model is the bottleneck.
  • If you see nothing at all, then Telegram isn’t delivering updates (but you’d usually see errors if that were the case).

What symptom are you seeing right now — “messages arrive but replies are slow”, or “no replies at all”?

distant skiff
#

I'm trying to message in the web browser but it's not working. Still says fetch failed

subtle patrolBOT
#

“fetch failed” in the browser almost always means the UI can’t reach (or can’t authenticate to) the Gateway WebSocket.

Two quick checks:

1) Are you opening the dashboard/WebChat on the same machine as the gateway?

Run on the gateway machine:

openclaw gateway status

If it’s listening on 127.0.0.1:18789 and you’re trying to open it from another computer/phone, it will fail. In that case you need an SSH tunnel (or Tailscale Serve/bind).

2) Use the “known good” URL + token flow

From the gateway machine run:

openclaw dashboard

Then in the UI, paste the token from:

openclaw config get gateway.auth.token

(If the browser got into a bad state, open in an incognito window or restart the browser.)

Tell me: are you opening the web UI locally on the NucBox, or from a different device?

distant skiff
#

Locally on the nuc

subtle patrolBOT
#

Cool — if you’re opening it locally, then “fetch failed” is usually one of two things:

  1. The UI isn’t connected/authenticated to the gateway WS, or
  2. The UI is connected, but the agent run failed and it’s showing the error text.

Quick way to tell which one you’re in

A. Gateway connection/auth check (Control UI)

  • In the dashboard, go to Control → Overview → Gateway Access
  • Paste the token from:
openclaw config get gateway.auth.token

Then click Connect.

(If it’s still weird, hard-refresh the page or open an incognito window — stale localStorage can wedge it.)

B. If it only fails when you send a chat message
Tail logs while you send a message from the browser:

openclaw logs --follow --json

If you see agent/embedded … error=fetch failed, then the model provider fetch is failing (usually Ollama hanging/timeout), not the browser UI.

Also: where exactly are you typing /new — the Chat tab message box, or the Agents/Tools panel? /new only behaves like a command in the actual chat surfaces; in other panels it can get treated as plain text and sent to the model.

distant skiff
#

In the chat tab.

Do you think all these problems go away if I drop again to the 8b?

subtle patrolBOT
#

Dropping to 8B will help with the “it takes forever / times out / UI feels hung” part, because you’re CPU‑bound and OpenClaw prompts are heavy. You’ll likely see a noticeable speedup (often multiple times faster).

But it won’t magically fix everything we hit:

What 8B will improve

  • Time‑to‑first‑token and overall latency (WebChat/Telegram feels less “stuck”)
  • Fewer 10‑minute timeouts just from being too slow

What 8B won’t fix (model size doesn’t matter)

  • Telegram polling issues from two gateways fighting over the same bot token
  • Telegram API problems from broken IPv6 / Node DNS behavior (deleteWebhook failed, etc.)
  • Any auth/token/UI connection issues (those are gateway/browser)

Tradeoff

8B will be less reliable (more mistakes, weaker instruction-following), especially once you re-enable tools/skills. If you keep it mostly chat + light tasks, it’s fine.

If your goal is “usable latency on this NucBox”, yeah — 8B is a very reasonable next step. If your goal is “tool-using assistant that feels snappy”, you’ll eventually want either a GPU box or a hosted model for the brain.

distant skiff
#

So 8b is not agentic is what you’re saying

subtle patrolBOT
#

Not exactly. 8B can still be “agentic” in the sense that it can plan + call tools + iterate — it’s just much less reliable at it, especially when you give it a big toolset and expect it to run unattended.

Think of “agentic” as a spectrum:

8B is often fine for

  • simple back-and-forth chat
  • summarizing, rewriting, drafting
  • very constrained tool use (one tool, one step, clear inputs/outputs)
  • “do X, then stop and ask me” workflows

8B tends to fall over with

  • long multi-step tasks with lots of branching
  • coordinating many tools (web + files + exec + messaging)
  • strict JSON/tool schemas (more malformed tool calls / hallucinations)
  • anything where being wrong has a high cost

So: 8B isn’t “non-agentic”, it’s just not the model you want for hands-off automation. If you keep it supervised and keep tools narrow, it can work.

distant skiff
#

So i can't drop my context window below 16000, is that normal?

subtle patrolBOT
#

Yeah — for OpenClaw that’s pretty normal.

OpenClaw isn’t “just a chat prompt”; it carries a fairly chunky system prompt + safety scaffolding + (optionally) tools/skills descriptions + session plumbing. Below a certain point, the gateway basically says “this model/session can’t even fit the minimum prompt budget safely”, and you’ll see a floor around ~16k.

A couple clarifications:

  • The limit you hit is usually the effective context budget OpenClaw needs, not some magical property of Qwen itself.
  • You can sometimes push it lower by stripping things down (minimal tools profile, fewer skills), but OpenClaw is fundamentally designed assuming large-context models.

If you want to see exactly where the tokens go, run this (in WebChat/Telegram — it’s a command and should be fast):

/context detail

It will show you the system prompt + tools + skills sizes, and you’ll immediately see why 4k/8k isn’t viable in your config.

distant skiff
#

So I dropped down to the 8b and still need this 16000 context window. Is it possible that my machine just isn't good enough because it's CPU only?

subtle patrolBOT
#

The 16,000 context requirement isn’t about your CPU — it’s about OpenClaw’s minimum prompt size/budget. Dropping from 14B → 8B changes speed, not the minimum context OpenClaw wants available.

So:

  • Needing ~16k context: normal for OpenClaw, regardless of model size or CPU/GPU.
  • CPU-only “not good enough”: depends on what you mean by “good enough”.

On your i9 + 64GB (CPU-only):

  • It will run 8B at 16k context, but expect higher latency (often tens of seconds to a minute) because OpenClaw prompts are big and CPU prompt-processing is slow.
  • 32B on CPU will be painful for interactive “assistant” use.

If you want it to feel snappy (single-digit seconds), you generally need a discrete GPU with plenty of VRAM or a hosted model.

If you run /context detail, you’ll see exactly why the baseline is so high — and what you can trim (tools/skills) to reduce the prompt and speed it up.

distant skiff
#

What is the fastest you would expect my current machine to answer "hey" with the 8b qwen

subtle patrolBOT
#

On that i9‑13900HK CPU‑only box, with OpenClaw + Ollama + an 8B Q4-ish model, the bottleneck is usually prompt processing (OpenClaw’s system prompt + whatever context it injects), not generating the word “hey”.

Realistic latency to first visible text:

  • Best case (warm model, fresh /new, minimal tools/skills): ~5–20 seconds
  • More typical (normal OpenClaw prompt + some history): ~20–90 seconds
  • Worst case (cold start / swapping / huge context): multiple minutes

If you want to sanity-check whether you’re in the “best case” setup, run /context detail — if the system/tools/skills prompt is huge, no 8B model on CPU will feel instant.

distant skiff
#

something claude noted when reviewing /context detail - message tool: 1,070 tok with 86 params — that's enormous for a tool schema

subtle patrolBOT
#

Yep — Claude’s observation is spot on. On smaller/local models, the tool schema can dominate the entire context, and message is one of the worst offenders because it has to describe a giant “send/edit/react/poll/components/etc” surface for multiple providers.

For a local 8B on CPU, paying ~1k tokens of schema every turn is brutal.

The pragmatic fix: don’t show the message tool to the model

Replies on Telegram/WebChat do not require the message tool (the gateway delivers replies itself). The message tool is mainly for “send to some other target/channel” behavior.

So you can safely restrict tools for your local provider to shrink the prompt.

Best pattern: keep your normal models “full”, but make the local Ollama provider “minimal”.

Example (edit to match your provider key):

tools: {
  profile: "full",
  byProvider: {
    "custom-127-0-0-1-11434": {
      profile: "minimal",
      allow: ["group:memory"], // optional: keep memory_search/memory_get
    }
  }
}

Then:

openclaw gateway restart

That should drop a lot of schema weight (including message) and make “hey” noticeably faster.

If you want, paste your current tools: block and I’ll tell you the smallest safe set for “chat + memory” on the local model.

distant skiff
#

"tools": {
"profile": "messaging",
"allow": [
"group:memory"
],

subtle patrolBOT
#

Nice — that config explains why the message tool schema is showing up:

tools: {
  profile: "messaging",
  allow: ["group:memory"]
}

profile: "messaging" includes the message tool by default, and that tool’s schema is huge.

Easiest win: keep the profile, but deny just message

Add a deny list:

openclaw config set tools.deny '["message"]' --strict-json
openclaw gateway restart

What this does:

  • Keeps the useful “messaging profile” stuff (sessions tools, etc.)
  • Removes the gigantic message tool schema from the model prompt
  • Doesn’t break normal replies on Telegram/WebChat (the gateway replies without the model needing the message tool)

If you want it even lighter (fastest)

You can go all the way down to:

  • tools.profile: "minimal"
  • plus group:memory

…but I’d try the deny ["message"] first since it’s a smaller change.

After restarting, run /context detail again — you should see the tool schema size drop a lot.

distant skiff
#

Is this something that should be reinstated when I go back to a larger LLM?

subtle patrolBOT
#

Only if you want the model to be able to do “agent-y messaging actions” (send to other chats, broadcast, edit/delete messages, reactions, etc.).

A few practical points:

  • Normal replies don’t need the message tool. Telegram/WebChat replies still work without it.
  • Keeping message disabled is safer (it removes an entire class of “oops it messaged someone” risks) and it shrinks the prompt.
  • A larger/stronger model can handle the extra schema better, but it’s still overhead.

Best practice (what I’d do)

Instead of a global tools.deny: ["message"] forever, make it model/provider-specific:

  • Keep your default profile “messaging” (for strong models)
  • Override your local Ollama provider to a lighter profile that doesn’t include message

Example shape:

tools: {
  profile: "messaging",
  allow: ["group:memory"],
  byProvider: {
    "custom-127-0-0-1-11434": { profile: "minimal", allow: ["group:memory"] }
  }
}

That way:

  • Big/hosted model gets the full messaging toolset if you want it
  • Local model stays lean (faster, less schema bloat)

So: you don’t have to reinstate it, but if you later want cross-chat “agent sends”, switch to the byProvider pattern instead of keeping a global deny.

distant skiff
#

made the change and now here's the context detail:
Context breakdown (detailed)
Workspace: /home/johnny-mangual/.openclaw/workspace
Bootstrap max/file: 20,000 chars
Bootstrap max/total: 150,000 chars
Sandbox: mode=off sandboxed=false
System prompt (run): 29,521 chars (~7,381 tok) (Project Context 18,491 chars (~4,623 tok))
Injected workspace files:

Skills list (system prompt text): 3,076 chars (~769 tok) (7 skills)
Skills: healthcheck, openai-image-gen, openai-whisper-api, skill-creator, weather, wordpress-publisher, xlsx-cn
Top skills (prompt entry size):
• wordpress-publisher: 649 chars (~163 tok)
• healthcheck: 491 chars (~123 tok)
• weather: 416 chars (~104 tok)
• xlsx-cn: 316 chars (~79 tok)
• skill-creator: 296 chars (~74 tok)
• openai-image-gen: 272 chars (~68 tok)
• openai-whisper-api: 249 chars (~63 tok)
Tool list (system prompt text): 429 chars (~108 tok)
Tool schemas (JSON): 5,012 chars (~1,253 tok) (counts toward context; not shown as text)
Tools: message, sessions_list, sessions_history, sessions_send, session_status
Top tools (schema size):
• message: 4,277 chars (~1,070 tok)
• sessions_send: 273 chars (~69 tok)
• sessions_list: 212 chars (~53 tok)
• sessions_history: 161 chars (~41 tok)
• session_status: 89 chars (~23 tok)
Top tools (summary text size):
• session_status: 207 chars (~52 tok)
• message: 132 chars (~33 tok)
• sessions_send: 84 chars (~21 tok)
• sessions_list: 54 chars (~14 tok)
• sessions_history: 36 chars (~9 tok)
Tools (param count):
• message: 86 params
• sessions_send: 5 params
• sessions_list: 4 params
• sessions_history: 3 params
• session_status: 2 params
Session tokens (cached): unknown / ctx=16384
Inline shortcut: a command token inside normal text (e.g. “hey /status”) that runs immediately (allowlisted senders only) and is stripped before the model sees the remaining message.

subtle patrolBOT
#

This /context detail output is super useful — and it shows the key problem:

  • Tool schemas: ~1,253 tok
  • message tool alone: ~1,070 tok (86 params)

So yeah, message is the bulk of your tool-schema tax.

But: your change didn’t take effect yet, because message is still listed under Tools:.

Verify what OpenClaw thinks your tool policy is

Run these on the gateway host and paste the output:

openclaw config get tools.profile
openclaw config get tools.allow
openclaw config get tools.deny
openclaw config get tools.byProvider

The “make it actually go away” command (global)

If tools.deny is empty/not set, set it like this:

openclaw config set tools.deny '["message"]' --strict-json
openclaw gateway restart

Then rerun:

/context detail

You should see message disappear from the Tools list and the schema tokens drop a lot.

Optional next cut: skills

You’re also spending ~769 tok on skills list (7 skills). That’s not crazy, but on CPU every token counts. Once message is gone, we can trim skills if needed.

distant skiff
#

get tools.profile
openclaw config get tools.allow
openclaw config get tools.deny

🦞 OpenClaw 2026.3.1 (2a8ac97) — Ah, the fruit tree company! 🍎

messaging

🦞 OpenClaw 2026.3.1 (2a8ac97)
Pairing codes exist because even bots believe in consent—and good security hygiene.

[
"group:memory"
]

🦞 OpenClaw 2026.3.1 (2a8ac97) — Chat APIs that don't require a Senate hearing.

Config path not found: tools.deny
johnny-mangual@johnny-mangual-NucBox-K10:~$ openclaw config get tools.byProvider

🦞 OpenClaw 2026.3.1 (2a8ac97)
I keep secrets like a vault... unless you print them in debug logs again.

{
"custom-127-0-0-1-11434": {
"allow": [
"group:memory"
],
"profile": "minimal"
}
}

subtle patrolBOT
#

This is actually super informative:

  • Global tools:
    • tools.profile = "messaging" (this is why message shows up)
    • tools.allow = ["group:memory"]
    • tools.deny is not set
  • You do have a per-provider override:
    • tools.byProvider.custom-127-0-0-1-11434.profile = "minimal"

So in theory, when the active model provider is custom-127-0-0-1-11434, the agent should not see message.

If /context detail is still showing message, one of these is true:

  1. That session is currently running on a different provider/model (pinned via /model), or
  2. There’s a bug/quirk where byProvider isn’t affecting the /context detail tool listing the way we expect.

Fastest “make it go away no matter what” fix

Just deny message globally:

openclaw config set tools.deny '["message"]' --strict-json
openclaw gateway restart

Then rerun:

/context detail

You should see the tool schema drop by ~1,070 tokens immediately.

Also check which model/provider the session is actually using

In the same chat where you ran /context detail, run:

/model status

If it’s not using custom-127-0-0-1-11434/..., that explains why the per-provider override didn’t help.

distant skiff
#

Current: custom-127-0-0-1-11434/qwen3:8b (selected)
Active: custom-127-0-0-1-11434/qwen3:14b-16k (runtime)
Default: custom-127-0-0-1-11434/qwen3:8b
Agent: main
Auth file: ~/.openclaw/agents/main/agent/auth-profiles.json

[custom-127-0-0-1-11434] endpoint: http://127.0.0.1:11434 api: ollama auth: o...a (models.json: ~/.openclaw/agents/main/agent/models.json)
• custom-127-0-0-1-11434/qwen3:8b

[minimax] endpoint: https://api.minimax.io/anthropic api: anthropic-messages auth: minimax:default=redacted (next, lastGood) (auth-profiles.json: ~/.openclaw/agents/main/agent/auth-profiles.json)
• minimax/MiniMax-M2.5 (Minimax)

[openai] endpoint: default auth: redacted (env: OPENAI_API_KEY)
• openai/gpt-5
• openai/gpt-5.1-codex (GPT)

subtle patrolBOT
#

That /model status output explains a lot:

  • Current (selected): …/qwen3:8b
  • Active (runtime): …/qwen3:14b-16k

So even though you picked 8B, OpenClaw is still actually running the 14B‑16k model right now.

That typically happens when either:

  1. you switched models while a run was already in-flight (the next turn uses the selected model), or
  2. the 8B model can’t satisfy OpenClaw’s minimum context requirements (so it falls back to the 14B‑16k variant that can).

To truly run 8B with OpenClaw, make an 8B “16k context” Ollama variant

On Linux:

cat > Modelfile8b <<'EOF'
FROM qwen3:8b
PARAMETER num_ctx 16384
EOF

ollama create qwen3:8b-16k -f Modelfile8b

Then point OpenClaw at qwen3:8b-16k (and keep contextWindow: 16384 in your provider config), restart the gateway, and in chat do /new.

After that, run /model status again — you want Active (runtime) to show the 8B-16k model, not the 14B-16k.

If you paste your models.providers.custom-127-0-0-1-11434.models array, I can tell you the exact config edit to add the new qwen3:8b-16k model alongside the 14B one.

distant skiff
#

how do I disable qwen3 thinking mode for a local Ollama model?

subtle patrolBOT
#

If you’re running it through OpenClaw, the easiest way is to turn thinking off at the session level:

Per-session (recommended)

Send as a standalone message in your chat (Telegram/WebChat/etc):

/think off

An alias also works:

/thinking off

Make it the default (persistent)

Set the agent default in config and restart the gateway:

openclaw config set agents.defaults.thinkingDefault "off"
openclaw gateway restart

Notes:

  • This controls OpenClaw’s “thinking level” directive. If the Ollama model is inherently a reasoning/thinking checkpoint, turning thinking “off” may reduce (or hide) reasoning behavior, but some models will still be slow because the prompt is large and CPU-only inference is the bottleneck.
distant skiff
#

is that 5k prompt injection, which i cant control, the hold up here at 16k context window?

subtle patrolBOT
#

It’s not “prompt injection” in the scary/untrusted-content sense — it’s OpenClaw’s system prompt scaffolding (plus tool/skill schemas + “project context” files).

But yes: that big prompt is a huge part of your latency on CPU. The model has to read/process those thousands of tokens (“prefill”) before it can output even “hey”, and CPU-only inference makes that step slow.

A couple key points:

  • The 16k context window requirement is basically a guardrail: OpenClaw needs enough room for
    system prompt + tools/skills + some chat history + your reply.
    It’s not because you always use 16k every turn.
  • The actual “hold up” is the prompt length you showed (system prompt ~7.4k tok + tool schemas + skills + any injected workspace files). That’s what gets processed every turn.

What you can control (and it helps)

  • Tools: removing the giant message tool schema is a big win (you already identified it).
  • Skills: disable/allowlist skills so fewer are described in the prompt.
  • Project context files: if you have big workspace files being injected, trim them or remove them from what gets included.

Keep using /context detail after each change — it’s the right way to see exactly what’s bloating the prompt.
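
To make the guardrail point concrete, here’s a rough budget sketch in Python using the numbers from your earlier /context detail output (system prompt ~7,381 tok, tool schemas ~1,253 tok, ctx=16384). It assumes the tool schemas are counted on top of the system prompt, which is my reading of the “counts toward context; not shown as text” note rather than something confirmed, and the function and variable names are purely illustrative.

```python
# Rough context-budget sketch; numbers come from the /context detail output.
CTX_WINDOW = 16_384          # ctx=16384
SYSTEM_PROMPT_TOK = 7_381    # "System prompt (run)"
TOOL_SCHEMA_TOK = 1_253      # "Tool schemas (JSON)", assumed to be additive


def headroom(ctx_window: int, *fixed_costs: int) -> int:
    """Tokens left for chat history plus the reply after fixed prompt costs."""
    return ctx_window - sum(fixed_costs)


baseline = SYSTEM_PROMPT_TOK + TOOL_SCHEMA_TOK
left_16k = headroom(CTX_WINDOW, SYSTEM_PROMPT_TOK, TOOL_SCHEMA_TOK)
left_8k = headroom(8_192, SYSTEM_PROMPT_TOK, TOOL_SCHEMA_TOK)

print(f"baseline prompt: {baseline} tok")
print(f"room for history + reply at 16k: {left_16k} tok")
print(f"room for history + reply at 8k: {left_8k} tok")
```

Under those assumptions an 8k window goes negative before any chat history is added, which is consistent with the ~16k floor you ran into.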

distant skiff
#

🧠 Context breakdown (detailed)
Workspace: /home/johnny-mangual/.openclaw/workspace
Bootstrap max/file: 20,000 chars
Bootstrap max/total: 150,000 chars
Sandbox: mode=off sandboxed=false
System prompt (run): 25,554 chars (~6,389 tok) (Project Context 15,318 chars (~3,830 tok))
Injected workspace files:

• AGENTS.md: OK | raw 5,645 chars (~1,412 tok) | injected 5,645 chars (~1,412 tok)
• SOUL.md: OK | raw 1,664 chars (~416 tok) | injected 1,664 chars (~416 tok)
• TOOLS.md: OK | raw 981 chars (~246 tok) | injected 981 chars (~246 tok)
• IDENTITY.md: OK | raw 852 chars (~213 tok) | injected 852 chars (~213 tok)
• USER.md: OK | raw 648 chars (~162 tok) | injected 648 chars (~162 tok)
• HEARTBEAT.md: OK | raw 119 chars (~30 tok) | injected 119 chars (~30 tok)
• BOOTSTRAP.md: OK | raw 1,449 chars (~363 tok) | injected 1,449 chars (~363 tok)
• MEMORY.md: OK | raw 3,294 chars (~824 tok) | injected 3,294 chars (~824 tok)
Skills list (system prompt text): 3,076 chars (~769 tok) (7 skills)
Skills: healthcheck, openai-image-gen, openai-whisper-api, skill-creator, weather, wordpress-publisher, xlsx-cn
Top skills (prompt entry size):
• wordpress-publisher: 649 chars (~163 tok)
• healthcheck: 491 chars (~123 tok)
• weather: 416 chars (~104 tok)
• xlsx-cn: 316 chars (~79 tok)
• skill-creator: 296 chars (~74 tok)
• openai-image-gen: 272 chars (~68 tok)
• openai-whisper-api: 249 chars (~63 tok)
Tool list (system prompt text): 185 chars (~47 tok)
Tool schemas (JSON): 89 chars (~23 tok) (counts toward context; not shown as text)
Tools: session_status
Top tools (schema size):
• session_status: 89 chars (~23 tok)
Top tools (summary text size):
• session_status: 207 chars (~52 tok)
Tools (param count):
• session_status: 2 params
Session tokens (cached): unknown / ctx=16384
Inline shortcut: a command token inside normal text (e.g. “hey /status”) that runs immediately (allowlisted senders only) and is stripped before the model sees the remaining message.

Did the message adjustment take?

subtle patrolBOT
#

Yes — it took.

Your /context detail now shows:

  • Tools: session_status only
  • Tool schemas (JSON): ~23 tokens
  • No message tool (and no 1,070‑token schema monster)

So the message tool is successfully removed from the prompt.

If you want the next big wins for speed, the largest remaining chunks are:

  • AGENTS.md (~1,412 tok)
  • MEMORY.md (~824 tok)
  • the skills list (~769 tok for 7 skills)

Trimming those two files and disabling (or allowlisting) skills is where the next noticeable latency improvements will come from on CPU.

distant skiff
#

Honestly, would getting rid of those allow a sub-one minute response?

subtle patrolBOT
#

Probably yes — prompt size is the main reason you’re waiting on CPU, because the model has to “read” all of that before it can answer.

Right now your baseline prompt is roughly:

  • ~6.4k tokens system prompt, including:
    • ~3.8k tokens of injected workspace files
    • ~0.77k tokens skills list

On an 8B CPU-only model, chopping a few thousand tokens off the front can absolutely be the difference between “~60–120s” and “~20–50s” for a tiny reply like “hey”.

Two reality checks though:

  • It’s not a guarantee (cold starts / swapping / load can still blow it up).
  • “Sub‑one minute” is realistic; “sub‑10 seconds consistently” is usually not on CPU-only with OpenClaw.

If you want the biggest wins with the least pain, start by trimming:

  1. AGENTS.md (1.4k tok)
  2. MEMORY.md (0.8k tok)
  3. disable/allowlist skills so the skills list is tiny

Then rerun /context detail and time “hey” again.
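
As a quick sanity check on those numbers, this Python sketch totals the three trims from your latest /context detail and shows what share of the ~6,389-token system prompt they represent. The ceil(chars / 4) estimate is inferred from the chars/tok pairs in the output, not a documented formula, and it assumes the skills list is counted inside the system prompt total.

```python
import math


def est_tokens(chars: int) -> int:
    # Inferred from the output (e.g. 5,645 chars -> ~1,412 tok); an assumption.
    return math.ceil(chars / 4)


SYSTEM_PROMPT_CHARS = 25_554  # "System prompt (run)" after removing message
trims = {"AGENTS.md": 5_645, "MEMORY.md": 3_294, "skills list": 3_076}

total_tok = est_tokens(SYSTEM_PROMPT_CHARS)
saved_tok = sum(est_tokens(c) for c in trims.values())

# Largest contributors first.
for name, chars in sorted(trims.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} ~{est_tokens(chars):>5,} tok")
print(f"saved ~{saved_tok:,} of ~{total_tok:,} tok ({saved_tok / total_tok:.0%})")
```

If prefill time scales roughly linearly with prompt length, cutting about 47% of the prompt should cut a similar share of the wait before the first token. Treat that as an upper bound: it assumes the two files are removed entirely, and trimming them saves less.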

distant skiff
#

How do you disable skills

subtle patrolBOT
#

You can disable skills in config in two main ways.

Option A (most direct): disable specific skills

Set skills.entries.<skillKey>.enabled = false.

Example (disable the ones showing in your /context detail):

openclaw config set skills.entries.healthcheck.enabled false --strict-json
openclaw config set skills.entries.weather.enabled false --strict-json
openclaw config set skills.entries.skill-creator.enabled false --strict-json
openclaw config set skills.entries.openai-image-gen.enabled false --strict-json
openclaw config set skills.entries.openai-whisper-api.enabled false --strict-json
openclaw config set skills.entries.wordpress-publisher.enabled false --strict-json
openclaw config set skills.entries.xlsx-cn.enabled false --strict-json

Then restart:

openclaw gateway restart

Run /context detail again to confirm the skills list shrank.

Option B: allowlist bundled skills (bundled-only)

This only limits bundled skills (doesn’t affect skills living in your workspace/managed dirs):

openclaw config set skills.allowBundled '[]' --strict-json
openclaw gateway restart

For your case (where you have workspace skills like wordpress-publisher / xlsx-cn), Option A is the one that really removes them from the prompt.

distant skiff
#

sages to see if there's context. Wait, there's no prior conversation.","done":true,"done_reason":"length","context":[151644,872,198,35561,608,26865,151645,198,151644,77091,198,151667,198,32313,11,279,1196,1101,3208,330,35561,608,26865,3263,2938,594,264,2699,10322,13,2379,2578,387,7497,421,358,2776,29338,12440,13,6771,752,1779,279,3681,6605,311,1490,421,1052,594,2266,13,13824,11,1052,594,902,4867,10435,13],"total_duration":34787703100,"load_duration":29685400136,"prompt_eval_count":11,"prompt_eval_duration":491593690,"eval_count":50,"eval_duration":4507258571}

Found it. Look at the thinking field:

"Okay, the user just sent 'hey /think'"

Ollama is seeing hey /think as the prompt — thinking is still being injected at the Ollama level despite thinkingDefault off. The model is running in thinking mode every time.
Also notice load_duration: 29.6 seconds just to load the model — that's your cold start tax right there.
The /think flag is being appended somewhere in OpenClaw's prompt construction. Check:
openclaw config get agents.defaults.thinkingDefault
openclaw config get agents.defaults
And ask in the OpenClaw Discord — "why is /think being injected into prompts even with thinkingDefault set to off?" This looks like a bug or the config key is wrong

subtle patrolBOT
#

That snippet is actually telling you two separate things:

1) The “thinking field” showing hey /think is almost certainly your prompt, not “injection”

If you literally sent something like hey /think while benchmarking, and OpenClaw didn’t strip the directive (or you weren’t sending it as a standalone directive), then Ollama will see the raw text and the model will react to it.

For clean benchmarking, do it in two messages:

  1. Directive-only (sets the session state):
/think off
  2. Then your real test:
hey

Don’t include /think inline in the “hey” message while you’re timing performance.

2) load_duration: ~29.6s is your real “why does this feel slow?” culprit

That’s the model cold-starting/loading weights. If that happens often, you’ll never feel snappy even with an 8B.

Once the model is warm, you want to see load_duration near zero, with only prompt_eval_duration and eval_duration contributing.

If you tell me:

  • are you restarting the gateway a lot, and
  • does Ollama show the model staying loaded (ollama ps),

…I can point you to the best way to reduce cold starts (it’s usually “stop unloading the model” / increase keep-alive, or just avoid restarts).
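
If you want to break that down yourself, here’s a small Python sketch that converts the timing fields from the JSON you pasted into seconds. Ollama reports these durations in nanoseconds. The helper name summarize is just illustrative, and the dict contains only the fields visible in your snippet.

```python
NS_PER_S = 1_000_000_000  # Ollama reports durations in nanoseconds

# Timing fields copied from the pasted response.
resp = {
    "total_duration": 34787703100,
    "load_duration": 29685400136,
    "prompt_eval_count": 11,
    "prompt_eval_duration": 491593690,
    "eval_count": 50,
    "eval_duration": 4507258571,
}


def summarize(r: dict) -> dict:
    """Convert nanosecond timings to seconds and derive tokens/sec."""
    load_s = r["load_duration"] / NS_PER_S
    prefill_s = r["prompt_eval_duration"] / NS_PER_S
    gen_s = r["eval_duration"] / NS_PER_S
    total_s = r["total_duration"] / NS_PER_S
    return {
        "load_s": round(load_s, 2),
        "prefill_s": round(prefill_s, 2),
        "gen_s": round(gen_s, 2),
        "total_s": round(total_s, 2),
        "gen_tok_per_s": round(r["eval_count"] / gen_s, 1),
        "load_share": round(load_s / total_s, 2),
    }


stats = summarize(resp)
for key, value in stats.items():
    print(f"{key}: {value}")
```

For this run, model loading accounts for roughly 85% of the wall time, and generation itself runs at about 11 tokens/sec. Note that prompt_eval_count is only 11, so this particular request had a very short prompt and says little about how long a full OpenClaw prompt takes to prefill.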

distant skiff
#

i didnt include "/think" in my message

subtle patrolBOT
#

Got it. Then the most likely explanations are:

  1. It’s coming from earlier context in that same session (you or the system mentioned /think before), and the model is referring back to it.
  2. The model is hallucinating/mislabeling what the user sent inside its “thinking” text (reasoning models do this sometimes).
  3. Less likely, but possible: the UI/plugin you’re using is appending something (but we should prove it before assuming).

Fast way to rule out #1 and #2

Do a totally clean run:

  1. Send:
/new
  2. Then send (directive-only):
/think off
  3. Then send:
hey

If it still says “user sent hey /think” in the thinking field after a fresh /new, then it’s either hallucination or something is actually being appended.

I need one concrete artifact

Where exactly are you seeing this “thinking field”?

  • In raw Ollama JSON (from /api/chat output / logs)?
  • In OpenClaw logs?
  • In the UI somewhere?

If you can paste a small snippet showing the field name and the surrounding JSON (redact anything sensitive), I can tell whether it’s:

  • the model’s own reasoning text, or
  • something OpenClaw is passing through as part of the user message/history.
distant skiff
#

we got a response in under 3 minutes