#agent gets no reply at all


gentle spire
#

🦞 OpenClaw 2026.4.21 (f788c88)
🧠 Model: nvidia/minimaxai/minimax-m2.7 · 🔑 api-key (models.json)
🔄 Fallbacks: google/gemini-3.1-flash-lite-preview, google/gemini-3-flash-preview, nvidia/moonshotai/kimi-k2-thinking, ollama/minimax-m2.7:cloud, google/gemini-2.5-flash, google/gemini-2.5-flash-lite, ollama/kimi-k2-thinking:cloud, nvidia/z-ai/glm4.7, google/gemma-4-31b-it, openrouter/google/gemma-4-31b-it:free
🧮 Tokens: 154k in / 341 out · 💵 Cost: $0.0000
🗄️ Cache: 9% hit · 16k cached, 0 new
📚 Context: 51k/200k (26%) · 🧹 Compactions: 0
🧵 Session: agent:main:main • updated just now
⚙️ Runtime: direct · Think: adaptive
🪢 Queue: collect (depth 1)
connected | idle
agent main (Yumika) | session main (openclaw-tui) | nvidia/minimaxai/minimax-m2.7 | think adaptive | tokens 51k/200k (26%)

meager shadowBOT
#

Let me check the docs/source for this exact symptom before I guess.

#

Version looks current, so that part is fine.

Also, Queue: collect is normal, not the bug.

For Discord threads, the usual silent-fail causes are:

  1. mention gating, the bot is waiting for an @mention
  2. guild/channel not allowlisted
  3. Message Content Intent is off in the Discord dev portal
  4. bot is missing Send Messages in Threads permission

Please run these and paste the output:

openclaw status --all
openclaw channels status --probe
openclaw logs --follow

Then send one test message in that thread while logs are open.

What to look for in logs:

  • drop guild message (mention required) → mention gating blocked it
  • groupPolicy: allowlist / blocked → channel/user/guild allowlist mismatch
  • Missing Permissions → bot can see the thread but cannot reply there
  • no inbound event at all → usually Message Content Intent
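The log checks above can be sketched as a small lookup. This is only an illustration: the helper name `diagnose` and the `LOG_HINTS` table are hypothetical, and the matched substrings are just the phrases quoted in the list, not a guaranteed log format.

```python
# Hypothetical helper: maps the log phrases quoted above to likely causes.
# The substrings are assumptions taken from this thread, not a stable log schema.
LOG_HINTS = [
    ("mention required", "mention gating blocked the message"),
    ("groupPolicy", "channel/user/guild allowlist mismatch"),
    ("Missing Permissions", "bot can see the thread but cannot reply there"),
]


def diagnose(log_lines):
    """Return likely causes found in log_lines, in first-seen order."""
    if not log_lines:
        # No log lines at all after sending a message.
        return ["no inbound event: check Message Content Intent"]
    causes = []
    for line in log_lines:
        for needle, cause in LOG_HINTS:
            if needle in line and cause not in causes:
                causes.append(cause)
    return causes or ["no known blocker pattern found"]
```

Passing the lines captured around one failed message gives a first guess at the blocker. It does not replace reading the log itself.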

If you paste the channels status --probe block plus ~10-20 log lines around one failed message, I can usually tell you the exact blocker pretty fast.

gentle spire
#

I mean I'm running it in the TUI. I message the agent and get no response; it just stops there.

meager shadowBOT
#

I think this is a different path than Discord. Let me check the TUI/no-reply docs so I don’t send you down the wrong rabbit hole.

#

Ah, got it, this is TUI, not Discord routing.

Then the fastest checks are:

openclaw status
openclaw models status
openclaw logs --follow

And inside the TUI:

/status
/new

Then send one tiny prompt like hi.

What the result means:

  • no new log lines at all when you press Enter
    → TUI is not actually reaching the Gateway (wrong URL, auth, or session path)

  • logs show a run starts, then hangs/errors
    → model/provider/backend problem

  • TUI is connected but still shows nothing
    → reset the session with /new and try again

One important gotcha from the docs:

  • /deliver on is only for sending replies out to chat channels
  • it does not control whether the TUI itself should show the reply
gentle spire
#

Tasks: 1 active · 0 queued · 1 running · 233 issues · audit 4 errors · 204 warn · 332 tracked
Security audit
Summary: 3 critical · 2 warn · 2 info
CRITICAL Exec security=full is configured
Full exec trust is enabled for: main, research_agent, doc_scribe_agent, coder_agent, analyst_agent, audit_agent, manager_agent, finance_agent, translator_agent…
Fix: Prefer tools.exec.security="allowlist" with ask prompts, and reserve "full" for tightly scoped break-glass agents only.
CRITICAL Open channels can reach exec-enabled agents
Open DM/group access detected at: - channels.discord.dmPolicy Exec-enabled scopes: - main: security=full, host=gateway - research_agent: security=full, host=ga…
Fix: Tighten dmPolicy/groupPolicy to pairing or allowlist, or disable exec for agents reachable from shared/public channels.
CRITICAL Small models require sandboxing and web tools disabled
Small models (<=300B params) detected: - google/gemma-4-31b-it (31B) @ agents.defaults.model.fallbacks (unsafe; sandbox=off; web=[web_search, web_fetch, browse…
Fix: If you must use small models, enable sandboxing for all sessions (agents.defaults.sandbox.mode="all") and disable web_search/web_fetch/browser (tools.deny=["group:web","browser"]).

#

WARN Some gateway.nodes.denyCommands entries are ineffective
gateway.nodes.denyCommands uses exact node command-name matching only (for example system.run), not shell-text filtering inside a command payload. - Unknown …
Fix: Use exact command names (for example: canvas.present, canvas.hide, canvas.navigate, canvas.eval, canvas.snapshot, canvas.a2ui.push, canvas.a2ui.pushJSONL, canvas.a2ui.reset). If you need broader restrictions, remove risky command IDs from allowCommands/default workflows and tighten tools.exec policy.
WARN Potential multi-user setup detected (personal-assistant model warning)
Heuristic signals indicate this gateway may be reachable by multiple users: - channels.discord.groupPolicy="allowlist" with configured group targets - channels…
Fix: If users may be mutually untrusted, split trust boundaries (separate gateways + credentials, ideally separate OS users/hosts). If you intentionally run shared-user access, set agents.defaults.sandbox.mode="all", keep tools.fs.workspaceOnly=true, deny runtime/fs/web tools unless required, and keep personal/private identities + credentials off that runtime.
Full report: openclaw security audit
Deep probe: openclaw security audit --deep

meager shadowBOT
#

This is actually useful, it means the agent is running, not dead.

So the security audit output is not the reason it isn’t replying. Those are config warnings, but they don’t explain a single TUI turn hanging.

What your paste suggests:

  • TUI is connected
  • the turn entered the runtime
  • it is likely getting stuck on the model/provider call or that session state

Try this:

In the TUI

/abort
/new

Then send a tiny prompt like:

hi

In another terminal

openclaw models status
openclaw logs --follow

If it hangs again, paste the log lines right after sending hi.

I’d look for things like:

  • auth/provider errors
  • timeout / 429 / 5xx
  • model call starts but never completes
  • tool/schema errors

If it hangs in every session, restart the gateway once:

openclaw gateway restart

If you want, paste just:

  1. openclaw models status
  2. the openclaw logs --follow snippet from one stuck hi

That should narrow it down fast.

#

Yep, that is still the security audit output, not the stuck-turn log 🙂

For this bug, please ignore openclaw security audit for now. We need the live runtime log.

Do exactly this:

Terminal 1

openclaw logs --follow

TUI

/abort
/new
hi

Terminal 2

openclaw models status

Then paste:

  1. the output of openclaw models status
  2. the new lines that appear in openclaw logs --follow right after you send hi

If nothing appears in openclaw logs --follow after hi, say that too, because that points to a different problem.

Right now we’re missing the one thing that will actually identify the cause: the live log for the stuck message.

gentle spire
#

03:31:49+08:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 86ms conn=9cc93aa6…8116 id=1d6599fc…1555
03:34:24+08:00 info cron {"module":"cron","storePath":"/home/rr420/.openclaw/cron/jobs.json"} {"pruned":1,"retentionMs":86400000} cron-reaper: pruned 1 expired cron run session(s)
03:41:48+08:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 59ms conn=9cc93aa6…8116 id=0c3ece99…8773
only has old trades. Our 5 new ones are in the DB but not in the JSON. Let me fix this:

streaming watchdog: no stream updates for 30s; resetting status. The backend may have dropped this run silently — send a new message to
resync.

streaming watchdog: no stream updates for 30s; resetting status. The backend may have dropped this run silently — send a new message to
resync.

streaming watchdog: no stream updates for 30s; resetting status. The backend may have dropped this run silently — send a new message to
resync.

/status

🦞 OpenClaw 2026.4.21 (f788c88)
🧠 Model: nvidia/minimaxai/minimax-m2.7 · 🔑 api-key (models.json)
🔄 Fallbacks: google/gemini-3.1-flash-lite-preview, google/gemini-3-flash-preview, nvidia/moonshotai/kimi-k2-thinking,
ollama/minimax-m2.7:cloud, google/gemini-2.5-flash, google/gemini-2.5-flash-lite, ollama/kimi-k2-thinking:cloud, nvidia/z-ai/glm4.7,
google/gemma-4-31b-it, openrouter/google/gemma-4-31b-it:free
🧮 Tokens: 65.5m in / 317k out · 💵 Cost: $0.0000
🗄️ Cache: 5% hit · 3.2m cached, 0 new
📚 Context: 80k/200k (40%) · 🧹 Compactions: 3
🧵 Session: agent:main:main • updated just now
📌 Tasks: recently finished · cron · polymarket-live-price-refresh · Run: cd ~/my_projects/openclaw/skills/polymarket-signal && source
venv/bin/activate && python3 -c "from scripts.integra…
⚙️ Runtime: direct · Think: adaptive
🪢 Queue: collect (depth 3)

The agent just sits idle. It seems to have stopped and isn't doing anything. Why?

meager shadowBOT
#

That log line is the clue. I’m checking the exact watchdog behavior so I can tell you whether this is a UI-only stale state or an upstream model drop.

#

Yep, this finally explains it.

That streaming watchdog: no stream updates for 30s line means:

  • the TUI started a run
  • then it stopped receiving stream updates for 30s
  • so the TUI reset itself to idle
  • that is a client-side watchdog, not proof the run finished cleanly
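To make the watchdog behavior concrete, here is a minimal sketch of a client-side stream watchdog. It is not OpenClaw's implementation: the class name, method names, and injectable clock are all hypothetical, and only the 30s timeout comes from the log message.

```python
import time


class StreamWatchdog:
    """Illustrative only: flips status back to 'idle' when no stream
    update has arrived within timeout_s seconds."""

    def __init__(self, timeout_s=30.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so it can be tested without sleeping
        self.status = "idle"
        self.last_update = None

    def start_run(self):
        self.status = "running"
        self.last_update = self.clock()

    def on_stream_update(self):
        # Any stream delta counts as proof of life.
        self.last_update = self.clock()

    def tick(self):
        """Call periodically. Returns True if the watchdog fired."""
        if self.status != "running":
            return False
        if self.clock() - self.last_update >= self.timeout_s:
            # The client gives up waiting. The backend run may still be
            # stalled or failed; this only resets what the UI displays.
            self.status = "idle"
            return True
        return False
```

The point of the sketch is that `status` is purely a client-side view. Nothing in `tick` knows whether the run upstream finished, failed, or is still hanging.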

So the agent is not “choosing to do nothing”, it likely hit one of these:

  1. the provider/model stream stalled or was dropped
  2. the final event was lost after a WS hiccup / gateway restart
  3. that long main session got wedged

Also, Queue: collect (depth 3) means you now have 3 messages piled up behind it.

#

Do this first

In TUI:

/abort
/new
hi

That removes the long session/history as a variable.

Then test the model path directly

In a shell:

openclaw infer model run --model <the same model from /status> --prompt "hi" --json

If that also hangs, the problem is the provider/model path, not TUI.

If it keeps happening

Restart once:

openclaw gateway restart

How to read the result

  • fresh /new session works → old session was wedged
  • direct infer model run hangs → upstream provider/model issue
  • restart fixes it once, then it returns → dropped stream / transport issue
  • queue depth keeps growing → the active run is not really completing, even if TUI says idle

So the short answer to “why does it look idle?” is: the TUI watchdog gave up waiting for stream deltas and reset the status bar, even though that run likely stalled upstream.

If you paste the result of:

openclaw infer model run --model <same model> --prompt "hi" --json

I can tell you pretty quickly whether this is a session problem or a provider problem.

gentle spire
#

04:25:29+08:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ logs.tail 151ms conn=e83e79d7…18ef id=422c8638…6193
04:27:07+08:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"b72b7aeb-ccde-44f1-ac77-bd2bc05fc7b2","isError":true,"error":"⚠️ API rate limit reached. Please try again later.","failoverReason":"rate_limit","model":"minimaxai/minimax-m2.7","provider":"nvidia","rawErrorPreview":"429 status code (no body)","rawErrorHash":"sha256:bcb9b7be2645","providerRuntimeFailureKind":"rate_limit"} embedded run agent end

streaming watchdog: no stream updates for 30s; resetting status.
The backend may have dropped this run silently — send a new
message to resync.

streaming watchdog: no stream updates for 30s; resetting status.
The backend may have dropped this run silently — send a new
message to resync.
connected | idle
agent main (Yumika) | session main (openclaw-tui) | nvidia/minimaxai/minimax-m2.7 | think adaptive | tokens 95k/200k (48%)

In the TUI, the agent stopped mid-run again. Why? And why was there no fallback?

meager shadowBOT
#

Short version: it stopped because NVIDIA returned a 429 mid-run.

Your log already shows the real cause:

  • "providerRuntimeFailureKind":"rate_limit"
  • "failoverReason":"rate_limit"
  • "rawErrorPreview":"429 status code (no body)"

And the TUI line:

  • streaming watchdog: no stream updates for 30s

is just the client noticing the stream died, then resetting to idle. It is not the root cause.
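If you would rather pull those fields out of a log line than read them by eye, a small sketch follows. It assumes the line layout seen in the paste above (timestamp, level, subsystem, then JSON objects). `SAMPLE` is an abbreviated copy of that line, and both function names are hypothetical.

```python
import json

# Abbreviated copy of the pasted log line (some fields dropped for brevity).
SAMPLE = (
    '04:27:07+08:00 warn agent/embedded {"subsystem":"agent/embedded"} '
    '{"event":"embedded_run_agent_end",'
    '"runId":"b72b7aeb-ccde-44f1-ac77-bd2bc05fc7b2","isError":true,'
    '"failoverReason":"rate_limit","model":"minimaxai/minimax-m2.7",'
    '"provider":"nvidia","rawErrorPreview":"429 status code (no body)"} '
    'embedded run agent end'
)


def extract_json_objects(line):
    """Yield every JSON object embedded in a log line, left to right."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        start = line.find("{", pos)
        if start == -1:
            return
        try:
            obj, end = decoder.raw_decode(line, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here, keep scanning
            continue
        yield obj
        pos = end


def failure_summary(line):
    """Return the key failure fields of the first error object, or None."""
    keys = ("provider", "model", "failoverReason", "rawErrorPreview")
    for obj in extract_json_objects(line):
        if isinstance(obj, dict) and obj.get("isError"):
            return {k: obj.get(k) for k in keys}
    return None
```

Calling `failure_summary(SAMPLE)` returns the provider, model, failover reason, and raw error preview from the error object. A line with no error object returns `None`.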

Why no fallback?

  • In source, rate_limit is supposed to trigger failover
  • OpenClaw usually tries:
    1. another auth profile on the same provider
    2. then the next model in your fallback list
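The failover order described above can be sketched as two nested loops. This is a simplified illustration, not OpenClaw's code: `RateLimited`, `run_with_failover`, and the `call` callback are hypothetical names, and real failover logic likely handles more error kinds and limits.

```python
class RateLimited(Exception):
    """Stand-in for a provider returning HTTP 429."""


def run_with_failover(prompt, candidates, call):
    """Try each candidate in priority order.

    candidates: list of (model, [auth_profile, ...]) tuples.
    call(model, profile, prompt): returns a reply or raises RateLimited.
    Returns (reply, attempts), where attempts records every try made.
    """
    attempts = []
    for model, profiles in candidates:
        # Step 1: rotate through auth profiles on the same model/provider.
        for profile in profiles:
            attempts.append((model, profile))
            try:
                return call(model, profile, prompt), attempts
            except RateLimited:
                continue
        # Step 2: all profiles exhausted, fall through to the next model.
    raise RuntimeError(f"all candidates rate-limited: {attempts}")
```

With a sketch like this, a silent stop would mean either every candidate raised, or a reply was produced but never reached the client. Those are the cases the logs need to distinguish.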

So if you got no visible fallback reply, that usually means one of these:

  1. fallback never actually fired, which looks like a bug
  2. fallback fired, but the next candidate also failed
  3. the fallback run happened, but the final event got lost and TUI only showed the watchdog reset

So, this is not Queue: collect, and it’s not your 95k context either.

Please run one filtered log check right after reproducing:

openclaw logs --follow | rg 'b72b7aeb|model fallback decision|fallback_model|candidate_succeeded|FallbackSummaryError|rate-limit profile rotation cap'
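If `rg` (ripgrep) is not installed, the same filter can be approximated in Python. This sketch reuses the exact pattern from the command above; the function name `matching_lines` is hypothetical.

```python
import re

# Same alternation as the rg command above.
PATTERN = re.compile(
    r"b72b7aeb|model fallback decision|fallback_model|candidate_succeeded"
    r"|FallbackSummaryError|rate-limit profile rotation cap"
)


def matching_lines(lines):
    """Keep only the lines that match any of the fallback-related patterns."""
    return [line for line in lines if PATTERN.search(line)]
```

To use it on live output, pipe `openclaw logs --follow` into a script that passes each line from `sys.stdin` through `PATTERN.search`.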