#mid-session failure via telegram and dashboard
First thing: please paste the output of openclaw --version. Version mismatch is a very common cause of weird mid-session behavior.
For this symptom I’d start with a safe diagnostic bundle rather than guessing. Please run:
openclaw status --all
openclaw gateway status
openclaw channels status --probe
openclaw logs --follow
Then send the relevant log chunk from right after a simple Telegram/dashboard prompt stalls. Tokens should be redacted by status --all, but still skim for secrets before posting.
A few things to look for in the logs:
- Anthropic/API errors like overloaded_error, rate_limit_error, aborts, or request timeouts
- channel send failures for Telegram
- dashboard/control UI auth/connect errors
- agent/session finishing with text but no visible channel send
- pairing/allowlist drops
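If the log chunk is long, a plain grep pass can surface these signatures quickly. This is only a rough sketch: it assumes you saved the log output to a file (stall.log is a made-up name), and apart from overloaded_error and rate_limit_error the match strings are guesses at how the gateway words its log lines, so a clean result does not prove the log is clean.

```shell
# Hypothetical helper: scan a saved log chunk for the failure signatures listed above.
# Only overloaded_error and rate_limit_error are known error names; the other
# patterns are assumptions about log wording.
scan_log() {
  # -i: case-insensitive, -n: prefix matches with line numbers, -E: extended regex
  grep -inE 'overloaded_error|rate_limit_error|abort|timed? ?out|send.*fail|allowlist|pairing' "$1"
}

# Usage (assumed file name): scan_log stall.log
```

Still skim the surrounding lines by hand; the context around a match usually matters more than the match itself.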
Docs that match this triage path:
https://docs.openclaw.ai/help/faq
https://docs.openclaw.ai/gateway/troubleshooting
https://docs.openclaw.ai/channels/troubleshooting
✓ Config: /home/charles/.openclaw/openclaw.json
✓ Secret diagnostics (0)
✓ Restart sentinel: none
! Port 18789
Port 18789 is already in use.
- pid 374963 username: /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789 (127.0.0.1:18789)
- Gateway already running locally. Stop it (openclaw gateway stop) or use a different port.
! Tailscale exposure: REDACTED
✓ Skills: 17 eligible · 0 missing · /home/charles/.openclaw/workspace
✓ Plugin compatibility (none)
Gateway restart attempts (tail): /home/charles/.openclaw/logs/gateway-restart.log
[2026-04-30T22:18:45Z] openclaw restart attempt source=update target=openclaw-gateway.service
[2026-04-30T22:18:46Z] openclaw restart done source=update
[2026-05-09T13:34:24Z] openclaw restart attempt source=update target=openclaw-gateway.service
[2026-05-09T13:34:25Z] openclaw restart done source=update
✓ Channel issues (none)
That port warning is only suspicious if you were trying to start a second gateway. The listed process is the normal OpenClaw gateway already bound to 18789, so by itself it doesn’t explain Telegram/dashboard stalls.
I still need the version and the actual failure log lines. Please send:
openclaw --version
openclaw gateway status
openclaw channels status --probe
Then reproduce one tiny prompt that stalls and immediately grab the tail:
openclaw logs --tail 200
The useful part is the log section around the prompt: model request, abort/overload/timeout, session completion, and any Telegram/dashboard send error. If the logs show Anthropic overloaded_error/timeouts, that’s provider-side pressure; if they show the session completed but no visible send, that points at routing/tool-call delivery instead.
From gateway status:
Service config looks out of date or non-standard.
Service config issue: Gateway service PATH missing required dirs: /home/charles/.nvm/current/bin
Recommendation: run "openclaw doctor" (or "openclaw doctor --repair").
OpenClaw 2026.5.7 (eeef486)
Channel status:
Gateway event loop degraded: reasons=event_loop_utilization,cpu eventLoopDelayMaxMs=0 eventLoopUtilization=1 cpuCoreRatio=1.043
That’s a real lead. eventLoopUtilization=1 + cpuCoreRatio=1.043 means the gateway process is basically pegged, so Telegram/dashboard can look like they silently stall even if channels are configured fine.
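To watch whether those numbers recover after a restart, the two values can be pulled out of the status line with standard tools. A minimal sketch, assuming the line keeps the key=value layout shown above; the metric helper and the 0.9 threshold are made up for illustration and are not OpenClaw settings.

```shell
# Hypothetical helper: pull one key=value metric out of a status line.
metric() {
  # $1 = key name, $2 = status line; prints the value, or nothing if the key is absent
  printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1=//p"
}

line='Gateway event loop degraded: reasons=event_loop_utilization,cpu eventLoopDelayMaxMs=0 eventLoopUtilization=1 cpuCoreRatio=1.043'
elu=$(metric eventLoopUtilization "$line")
cpu=$(metric cpuCoreRatio "$line")

# awk does the floating-point comparison; 0.9 is an arbitrary illustrative threshold
awk -v elu="$elu" -v cpu="$cpu" 'BEGIN {
  if (elu >= 0.9 && cpu >= 0.9) print "gateway looks CPU-bound"
  else print "gateway has headroom"
}'
```

With the line from this thread both values are above the threshold, which matches the "basically pegged" reading.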
Do this in order:
openclaw doctor --repair
openclaw gateway restart
openclaw gateway status
openclaw channels status --probe
The PATH warning matters because the system service may be running with a different Node/npm/nvm environment than your shell. If doctor --repair doesn’t clear it, reinstall the gateway service from the Node version you actually use, then restart it.
Also: you’re on 2026.5.7. If the event-loop degradation comes back immediately after restart, update to the latest available build and check whether a stale/corrupt session is being resumed. We’ve seen 2026.5.7-era gateway stalls where clearing old session state fixed event-loop starvation.
Before deleting anything, grab the current evidence:
openclaw logs --tail 300
openclaw status --all
If you paste the log chunk right after restart, especially the first session/agent lines before CPU pegs, we can tell whether this is service PATH, a stuck session, model/tool loop, or provider timeouts.
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} webchat connected conn=31e5fea9-7e1e-4143-a5c3-f98ed9a07661 remote=127.0.0.1 client=openclaw-control-ui webchat v2026.5.7
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ commands.list 163ms conn=31e5fea9…7661 id=3fbbb25b…6d8b
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ chat.history 166ms conn=31e5fea9…7661 id=333b531d…55e2
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ health 198ms conn=31e5fea9…7661 id=7f213d6a…3cd4
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ node.list 216ms conn=31e5fea9…7661 id=b8c80708…6995
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ device.pair.list 217ms conn=31e5fea9…7661 id=3de791f7…578e
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ models.list 221ms conn=31e5fea9…7661 id=4a8df215…b7a9
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 227ms conn=31e5fea9…7661 id=f353dffc…1829
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} webchat connected conn=ea89f32f-e6a6-47a1-b781-6e924529d91d remote=127.0.0.1 client=openclaw-control-ui webchat v2026.5.7
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ chat.history 110ms conn=ea89f32f…d91d id=9ed6e136…e748
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ health 140ms conn=ea89f32f…d91d id=a1d41462…3e2c
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 61ms conn=ea89f32f…d91d id=e6074cc7…349c
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ device.pair.list 203ms conn=ea89f32f…d91d id=1e692b80…f2a7
18:11:37+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ node.list 204ms conn=ea89f32f…d91d id=a176195f…1d91
Those lines look healthy: Control UI connects, and gateway RPC calls are returning in ~60-227ms. That means the dashboard transport itself is alive at 18:11:37.
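For a longer capture, the same latency check can be done without eyeballing every line. A small sketch, assuming the gateway/ws response lines keep the format pasted above (a " res " marker and a field like 163ms); rpc_latency is a made-up helper name.

```shell
# Hypothetical helper: summarize RPC latencies from gateway/ws "res" lines on stdin.
rpc_latency() {
  awk '
    / res / {
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^[0-9]+ms$/) {
          ms = $i + 0                      # "163ms" + 0 evaluates to 163
          if (n == 0 || ms < min) min = ms
          if (ms > max) max = ms
          n++
        }
      }
    }
    END { if (n) printf "calls=%d min=%dms max=%dms\n", n, min, max }
  '
}

# Usage (assumed file name): rpc_latency < stall.log
```

If the max climbs into seconds while the gateway is degraded, that supports the event-loop explanation rather than a channel problem.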
The missing piece is what happens when you actually send the prompt. Please do this exact sequence so the log has the failure window:
openclaw logs --follow
In another terminal/UI, send one tiny prompt like hi from Telegram or dashboard. Wait until it stalls, then copy the log lines from the moment you sent it through ~30 seconds after.
I’m looking for one of these signatures:
- model/provider call starts, then Anthropic overload/abort/timeout
- session starts but never emits final assistant output
- assistant output exists but no Telegram/dashboard delivery happens
- event loop degraded appears right after session/model/tool work starts
If the only logs are more commands.list, health, sessions.list, etc., then the prompt may not be reaching the session runner at all, which is a different routing/UI issue.
My OpenClaw is responding now.
Good, that points back to the gateway/event-loop state rather than Telegram itself.
If it stays healthy after doctor --repair + restart, I’d leave it there. If it happens again, capture openclaw logs --tail 300 right after the stall and consider updating past 2026.5.7, since that version has had reports around gateway stalls/session state.
Glad that helped. If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow indexes it for the next person.
I will mark complete after one more short question. I had days of troubleshooting after upgrading from 5.4 to 5.7. I fear major troubleshooting when upgrading from 5.7 to 5.12.
That fear is reasonable. I would not do a 5.7 → 5.12 upgrade casually in the middle of needing the assistant working.
Safer path:
openclaw --version
openclaw status --all > ~/openclaw-status-before-upgrade.txt
cp -a ~/.openclaw ~/.openclaw.backup-before-5.12
Then upgrade when you have 30-60 minutes to test. After the upgrade, run:
openclaw doctor --repair
openclaw gateway restart
openclaw status --all
openclaw channels status --probe
Test one tiny prompt from each surface you care about: Telegram, dashboard, Discord, etc.
The key thing from your case: if 5.7 is now stable, you don’t need to rush the upgrade unless you’re hitting a known fixed bug. But if the event-loop degradation returns, I’d upgrade, because staying on a build that can wedge the gateway costs more time than a controlled upgrade with a backup.
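Since the backup is what makes the upgrade controlled, it is worth confirming the copy is complete before touching anything. A minimal sketch wrapping the cp -a step above; backup_and_verify is a made-up helper, and it assumes nothing is writing to the directory while it runs (stopping the gateway first with openclaw gateway stop is the safer order) and that the tree contains only regular files and directories.

```shell
# Hypothetical helper: copy a directory and confirm the copy matches the source.
backup_and_verify() {
  src=$1
  dest=$2
  # Never clobber an earlier backup
  if [ -e "$dest" ]; then
    echo "refusing to overwrite existing $dest" >&2
    return 1
  fi
  cp -a "$src" "$dest" || return 1
  # diff -r exits 0 only when both trees have identical contents
  diff -r "$src" "$dest" >/dev/null && echo "backup verified: $dest"
}

# Usage, mirroring the cp command above:
#   backup_and_verify ~/.openclaw ~/.openclaw.backup-before-5.12
```

If the verification fails, fix that before upgrading rather than after.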
Docs: https://docs.openclaw.ai/help/faq and https://docs.openclaw.ai/gateway/troubleshooting