#Discord Slow Listener Detected - Multiple errors


tribal hill
#

Hello @halcyon oar ,

Trying to debug an issue where my OpenClaw has been throwing multiple "Slow listener detected" warnings. This caused issues with cron today, along with an OpenClaw Gateway error: "Unexpected event order, got content_block_stop before "message_start"". Any ideas how to debug?

Running v2026.3.1

2026-03-05T07:13:33.332Z [discord] Slow listener detected: DiscordMessageListener took 71.4 seconds for event MESSAGE_CREATE
2026-03-05T07:19:13.891Z [discord] Slow listener detected: DiscordMessageListener took 40.8 seconds for event MESSAGE_CREATE
2026-03-05T07:23:59.075Z [discord] Slow listener detected: DiscordMessageListener took 38.2 seconds for event MESSAGE_CREATE
2026-03-05T07:25:53.218Z [discord] Slow listener detected: DiscordMessageListener took 35.8 seconds for event MESSAGE_CREATE
2026-03-05T08:00:37.850Z [agent/embedded] embedded run agent end: runId=9b5150d3-4730-4440-857a-deb7e1fc244e isError=true error=Unexpected event order, got content_block_stop before "message_start"
2026-03-05T09:00:29.841Z [agent/embedded] embedded run agent end: runId=60032f48-95fa-4ed2-9ede-1c790a5369e4 isError=true error=Unexpected event order, got content_block_stop before "message_start"
2026-03-05T04:12:04.828-05:00 [tools] exec failed: Command aborted by signal SIGKILL
2026-03-05T11:00:00.030Z [cron] payload.model 'kimi-coding/k2.5' not allowed, falling back to agent defaults
2026-03-05T11:01:41.856Z [cron] payload.model 'kimi-coding/k2.5' not allowed, falling back to agent defaults
2026-03-05T12:11:34.905Z [agent/embedded] embedded run agent end: runId=80c8e4c7-97c4-4200-ad59-52f2542dee14 isError=true error=Expected property name or '}' in JSON at position 1 (line 1 column 2)
2026-03-05T12:30:00.029Z [cron] payload.model 'kimi-coding/k2.5' not allowed, falling back to agent defaults
2026-03-05T08:48:32.528-05:00 [tools] read failed: ENOENT: no such file or directory, access '/Users/max/.npm-global/lib/node_modules/openclaw/docs/troubleshooting.md'
2026-03-05T13:50:11.818Z [discord] Slow listener detected: DiscordMessageListener took 116.9 seconds for event MESSAGE_CREATE
2026-03-05T13:51:25.981Z [discord] Slow listener detected: DiscordMessageListener took 74.2 seconds for event MESSAGE_CREATE
2026-03-05T13:53:24.239Z [discord] Slow listener detected: DiscordMessageListener took 33.1 seconds for event MESSAGE_CREATE
2026-03-05T08:55:09.159-05:00 [tools] cron failed: gateway timeout after 60000ms

jolly jettyBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

halcyon oarBOT
#

Those log lines are a few different problems layered on top of each other, but there is a clean way to triage.

1) “Slow listener detected … DiscordMessageListener took 70–116s”

This warning is basically: “your Discord MESSAGE_CREATE handler is taking longer than the event queue timeout.” That can be totally “expected” if a message triggers a long LLM run / tool run, but it also means your gateway is busy long enough that Discord ingestion + cron + RPC can start timing out.

There’s a canonical knob for this:

  • channels.discord.eventQueue.listenerTimeout (or per-account: channels.discord.accounts.<accountId>.eventQueue.listenerTimeout)

Docs + example config: https://docs.openclaw.ai/channels/discord (see the accordion about Long-running handlers time out or duplicate replies).

Even if you bump the timeout, you still want to find why it’s taking 70–116s (model latency, tools, CPU starvation, etc.).
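
Before changing any timeout, it can help to tally how long the handlers are actually running. The following is a minimal Python sketch, assuming the warning lines keep exactly the format quoted in this thread; the helper name slow_listener_durations is made up for illustration and is not part of OpenClaw.

```python
import re
from statistics import median

# Matches the "[discord] Slow listener detected" lines as quoted in this thread.
SLOW_RE = re.compile(
    r"^(?P<ts>\S+) \[discord\] Slow listener detected: "
    r"(?P<listener>\w+) took (?P<secs>[\d.]+) seconds for event (?P<event>\w+)"
)


def slow_listener_durations(lines):
    """Return the handler durations (in seconds) from slow-listener warnings."""
    durations = []
    for line in lines:
        match = SLOW_RE.match(line.strip())
        if match:
            durations.append(float(match.group("secs")))
    return durations


# Three warnings and one unrelated line, copied from the original post.
SAMPLE = """\
2026-03-05T07:13:33.332Z [discord] Slow listener detected: DiscordMessageListener took 71.4 seconds for event MESSAGE_CREATE
2026-03-05T07:19:13.891Z [discord] Slow listener detected: DiscordMessageListener took 40.8 seconds for event MESSAGE_CREATE
2026-03-05T04:12:04.828-05:00 [tools] exec failed: Command aborted by signal SIGKILL
2026-03-05T13:50:11.818Z [discord] Slow listener detected: DiscordMessageListener took 116.9 seconds for event MESSAGE_CREATE
""".splitlines()

durations = slow_listener_durations(SAMPLE)
print(f"count={len(durations)} median={median(durations):.1f}s max={max(durations):.1f}s")
```

Feeding it the whole gateway.err.log instead of the sample gives a quick picture of whether the slow runs are a steady baseline or occasional spikes.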

2) Cron timeouts + concurrency

You’ve also got:

  • cron failed: gateway timeout after 60000ms
  • plus cron runs failing with the “Unexpected event order …” error
#

If you have multiple jobs around the top of the hour, check:

Also: payload.model 'kimi-coding/k2.5' not allowed, falling back… won’t cause slowness by itself, but it can hide which model you are actually running and lead to unexpected runtime behavior. (Either allow it explicitly or remove that override so it’s not silently falling back.)

3) “Unexpected event order, got content_block_stop before message_start”

Those message_start / content_block_stop event names are Anthropic-streaming event types, so this usually means the gateway got a broken/partial/out-of-order stream (network/proxy hiccup, cancellation edge case, or a bug in the streaming parser).

Actionable next steps:

  1. Upgrade off 2026.3.1 to the latest 2026.3.x (if you haven’t already); this smells like the kind of thing that gets patched quickly.
  2. Grab the surrounding gateway logs for one runId (like 9b5150d3-...) with debug logging enabled and share them (or file a GitHub issue with that snippet + runId).

Logging ref: https://docs.openclaw.ai/logging (you can temporarily raise verbosity with OPENCLAW_LOG_LEVEL=debug).

4) Don’t ignore the SIGKILL

[tools] exec failed: Command aborted by signal SIGKILL is often the OS killing a process (commonly memory pressure / OOM, or container limits). If the machine is memory-thrashing, it would explain everything getting slow and timing out.

#

What I’d like you to paste (so we can pinpoint the bottleneck)

Run this ladder and paste outputs (or at least anything that looks “off”):

openclaw status --all
openclaw gateway status
openclaw doctor
openclaw channels status --probe
openclaw logs --follow

(That ladder is also in https://docs.openclaw.ai/help/troubleshooting.)

And specifically: 20–40 log lines before/after one Slow listener detected and one Unexpected event order… (with the runId). That’ll tell us whether this is “just slow model calls”, event-loop blockage, or an actual streaming bug.
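
If scrolling a large log by hand is painful, pulling the surrounding lines for one runId can be scripted. This is a rough Python sketch, not an OpenClaw feature: context_around is a hypothetical helper, and the filler lines only stand in for a real log so the example runs on its own.

```python
def context_around(lines, needle, before=30, after=30):
    """Return up to `before`/`after` lines of context around the first line containing `needle`."""
    for index, line in enumerate(lines):
        if needle in line:
            start = max(0, index - before)
            return lines[start:index + after + 1]
    return []  # needle not found


# Stand-in log: 100 filler lines, with one error line from this thread at index 50.
log_lines = [f"filler line {n}" for n in range(100)]
log_lines[50] = (
    "2026-03-05T08:00:37.850Z [agent/embedded] embedded run agent end: "
    "runId=9b5150d3-4730-4440-857a-deb7e1fc244e isError=true "
    'error=Unexpected event order, got content_block_stop before "message_start"'
)

snippet = context_around(log_lines, "runId=9b5150d3", before=30, after=30)
print(len(snippet))
print(snippet[0])
```

For a real log, replace log_lines with the lines of whichever log file your install writes (for example, read with pathlib.Path(...).read_text().splitlines()).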

tribal hill
#

Here are the outputs in separate messages:
openclaw status --all
🦞 OpenClaw 2026.3.1 (2a8ac97)
Your task has been queued; your dignity has been deprecated.

│ Version         │ 2026.3.1
│ OS              │ macos 15.6 (arm64)
│ Node            │ 22.22.0
│ Config          │ ~/.openclaw/openclaw.json
│ Dashboard       │ http://127.0.0.1:18789/
│ Tailscale       │ off · Running · mac-mini-1.tail983ed2.ts.net
│ Channel         │ stable (default)
│ Update          │ pnpm · npm update 2026.3.2
│ Gateway         │ local · ws://127.0.0.1:18789 (local loopback) · reachable 11ms · auth token
│ Security        │ Run: openclaw security audit --deep
│ Gateway self    │ Mac-mini.local (192.168.5.67) app 2026.3.1 macos 15.6
│ Gateway service │ LaunchAgent installed · loaded · running (pid 43011)
│ Node service    │ LaunchAgent not installed
│ Agents          │ 4 total · 0 bootstrapping · 0 active · 132 sessions

Channels
│ Channel  │ Enabled │ State │ Detail
├──────────┼─────────┼───────┼──────────────────────────────────────────────────────────
│ Telegram │ ON      │ SETUP │ no token
│ Discord  │ ON      │ OK    │ token config×3 (sha256:5f12d67f · len 72) · accounts 3/4

#

Discord accounts
│ Account │ Status │ Notes
├─────────┼────────┼──────────────
│ clark   │ OK     │ token:config
│ dex     │ OK     │ token:config
│ main    │ OK     │ token:config

Agents
│ Agent                    │ Bootstrap file │ Sessions │ Active  │ Store
├──────────────────────────┼────────────────┼──────────┼─────────┼─────────────────────────────────────────────────────
│ main (Max)               │ ABSENT         │ 105      │ 56m ago │ ~/.openclaw/agents/main/sessions/sessions.json
│ clark (Clark)            │ ABSENT         │ 15       │ 55m ago │ ~/.openclaw/agents/clark/sessions/sessions.json
│ dex (Dex)                │ ABSENT         │ 12       │ 2h ago  │ ~/.openclaw/agents/dex/sessions/sessions.json
│ max_debug (Max Debugger) │ ABSENT         │ 0        │ unknown │ ~/.openclaw/agents/max_debug/sessions/sessions.json

Diagnosis (read-only)

Gateway connection details:
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /Users/max/.openclaw/openclaw.json
Bind: loopback

✓ Config: /Users/max/.openclaw/openclaw.json
✓ Restart sentinel: none

#

Gateway last log line:
2026-02-03T01:43:55.696Z Gateway start blocked: set gateway.mode=local (current: unset) or pass --allow-unconfigured.
! Port 18789
Port 18789 is already in use.

  • pid 43011 max: openclaw-gateway (127.0.0.1:18789)
  • Gateway already running locally. Stop it (openclaw gateway stop) or use a different port.

✓ Tailscale: off · Running · mac-mini-1.tail983ed2.ts.net
  ips: 100.98.75.64, fd7a:115c:a1e0::af01:4ba6
✓ Skills: 42 eligible · 0 missing · /Users/max/.openclaw/workspace

Gateway logs (tail, summarized): /Users/max/.openclaw/logs

stderr: /Users/max/.openclaw/logs/gateway.err.log

2026-03-05T15:05:11.155Z [discord] Slow listener detected: DiscordMessageListener took 179.4 seconds for event MESSAGE_CREATE
2026-03-05T15:11:12.756Z [discord] Slow listener detected: DiscordMessageListener took 89.1 seconds for event MESSAGE_CREATE
2026-03-05T15:17:09.730Z [discord] Slow listener detected: DiscordMessageListener took 40.5 seconds for event MESSAGE_CREATE
2026-03-05T15:30:25.185Z [agent/embedded] embedded run agent end: runId=a53cfd50-7465-4c5d-b580-63e24abb60f5 isError=true error=Unexpected event order, got content_block_stop before "message_start"
2026-03-05T10:31:34.472-05:00 [tools] cron failed: invalid cron.update params: at /patch: unexpected property 'retry'; must have required property 'jobId'; at root: unexpected property 'id'; must match a schema in anyOf
2026-03-05T15:32:00.106Z [discord] Slow listener detected: DiscordMessageListener took 60.1 seconds for event MESSAGE_CREATE
… 16 lines omitted …
- waiting for locator('aria-ref=e709')
2026-03-05T15:12:10.624-05:00 [tools] browser failed: Error: refs=aria does not support selector/frame snapshots yet.
2026-03-05T20:13:59.996Z [discord] Slow listener detected: DiscordMessageListener took 270.4 seconds for event MESSAGE_CREATE
2026-03-05T20:15:02.475Z [discord] Slow listener detected: DiscordMessageListener took 42.9 seconds for event MESSAGE_CREATE
2026-03-05T20:15:33.879Z [discord] Slow listener detected: DiscordMessageListener took 31.8 seconds for event MESSAGE_CREATE
2026-03-05T20:25:39.886Z [discord] Slow listener detected: DiscordMessageListener took 466.1 seconds for event MESSAGE_CREATE
2026-03-05T20:29:41.112Z [discord] Slow listener detected: DiscordMessageListener took 58.7 seconds for event MESSAGE_CREATE
2026-03-05T20:34:56.507Z [discord] Slow listener detected: DiscordMessageListener took 55.7 seconds for event MESSAGE_CREATE
2026-03-05T20:39:28.816Z [discord] Slow listener detected: DiscordMessageListener took 162.6 seconds for event MESSAGE_CREATE
2026-03-05T20:42:09.002Z [discord] Slow listener detected: DiscordMessageListener took 47.5 seconds for event MESSAGE_CREATE
2026-03-05T20:42:41.059Z [discord] Slow listener detected: DiscordMessageListener took 36 seconds for event MESSAGE_CREATE
2026-03-05T15:43:36.465-05:00 [tools] read failed: Offset 860 is beyond end of file (818 lines total)
2026-03-05T20:47:48.995Z [discord] Slow listener detected: DiscordMessageListener took 260.8 seconds for event MESSAGE_CREATE
2026-03-05T21:00:30.065Z [agent/embedded] embedded run agent end: runId=4523fcaa-0b6d-484e-b6f3-312cfcd44a6b isError=true error=Unexpected event order, got content_block_stop before "message_start"
2026-03-05T21:15:24.494Z [discord] Slow listener detected: DiscordMessageListener took 63.5 seconds for event MESSAGE_CREATE

stdout: /Users/max/.openclaw/logs/gateway.log

2026-03-05T22:10:25.251Z [health-monitor] [discord:dex] health-monitor: restarting (reason: stuck)
2026-03-05T22:10:25.601Z [discord] [dex] Discord Message Content Intent is limited; bots under 100 servers can use it without verification.
2026-03-05T22:10:25.604Z [discord] [dex] starting provider (@Dex)
2026-03-05T22:10:26.096Z [discord] channels resolved: 1468289360620163085/1468756222139174913→1468289360620163085/1468756222139174913

halcyon oarBOT
#

openclaw status --all already shows the most important clue: you’re on 2026.3.1 and there’s a 2026.3.2 update available. I’d do that first, because the content_block_* / message_start ordering error smells like a streaming-parser edge case that could easily be a point-release fix.

0) First action: update + restart the LaunchAgent

On macOS, make sure you both update and restart the gateway service (so the LaunchAgent is actually running the new bits). After updating, re-run openclaw status --all and confirm it reports 2026.3.2.

1) “Slow listener detected … took 70–116s”

That warning is usually just “your Discord MESSAGE_CREATE handler ran longer than the listener timeout”.

Per docs, the canonical knob is:

  • channels.discord.eventQueue.listenerTimeout (or per account: channels.discord.accounts.<accountId>.eventQueue.listenerTimeout)

Docs: https://docs.openclaw.ai/channels/discord (accordion: Long-running handlers time out or duplicate replies)

Since your status shows multiple Discord accounts (3/4), double check you’re setting this on the account that’s actually receiving the events (often accounts.default), not just the single-account top-level.

2) The SIGKILL + cron “gateway timeout after 60000ms”

#

SIGKILL is a big red flag for resource pressure (often memory). If the machine is memory-thrashing, Node’s event loop can stall, which then cascades into:

  • slow Discord listeners
  • cron RPC timeouts
  • weird-looking mid-stream failures

On the Mac mini, check Activity Monitor around the timestamps for memory pressure / killed processes.

What I need next (to pinpoint it)

Paste these outputs (they’re short and usually immediately diagnostic):

openclaw gateway status
openclaw doctor
openclaw channels status --probe

And if you can, also paste your config snippet for:

#
  • channels.discord.defaultAccount (if set)
  • channels.discord.accounts.<...>.eventQueue (or channels.discord.eventQueue)

Once we see which Discord account + what the queue settings are (and whether 2026.3.2 changes the Anthropic stream error), we can decide if this is just “increase listenerTimeout + reduce load”, or something that needs a bug report with a specific runId.

tribal hill
#

max@Mac-mini ~ % openclaw channels status --probe

🦞 OpenClaw 2026.3.1 (2a8ac97) - Meta wishes they shipped this fast.

│
◇ Doctor warnings
│ - State dir migration skipped: target already exists (/Users/max/.openclaw). Remove or merge manually.
│
◇
Gateway reachable.

  • Telegram default: enabled, not configured, stopped, mode:polling, token:none, error:not configured
  • Discord clark: enabled, configured, running, disconnected, in:1h ago, out:2h ago, bot:@Clark, token:config, intents:content=limited, works, audit ok
  • Discord default: enabled, not configured, stopped, disconnected, token:none, error:not configured
  • Discord dex: enabled, configured, running, disconnected, in:1h ago, out:2h ago, bot:@Dex, token:config, intents:content=limited, works, audit ok
  • Discord main: enabled, configured, running, disconnected, in:1h ago, out:1h ago, bot:@Max, token:config, intents:content=limited, works, audit ok
halcyon oarBOT
#

Good signal from what you pasted:

  • Gateway is local loopback (ws://127.0.0.1:18789) and reachable
  • You have 3 Discord accounts (clark, dex, main) and 3 real agents wired up (main, clark, dex)
  • Update is available (2026.3.2); I’d still take that as step 1

1) Check for “triple work” (most common gotcha with multi-Discord-account setups)

If all three Discord bots are in the same guild/channel (or share overlapping allowlists), then every message can fan out into multiple MESSAGE_CREATE handlers, which compounds load fast and makes “Slow listener detected … 70–116s” much more likely.

So: are clark, dex, and main all present in the same server/channel(s)? If yes, tighten routing so only one of them is allowed there (or require mentions for the ones that shouldn’t respond).

2) The slow-listener warning is usually just “something inside the handler is blocking”

In your original excerpt you have this line:

[tools] cron failed: gateway timeout after 60000ms

That’s a classic “the handler waited ~60s on something” situation (and then you see a ~70s slow-listener warning). Even though the gateway is “reachable”, an RPC can still time out if the process is event-loop stalled (CPU/memory pressure) or saturated.

#

Actionable: in the log file, look around the exact slow-listener timestamps for any [tools] ... failed / [tools] ... timeout lines; that usually identifies the culprit immediately (cron tool, exec tool, web fetch, etc.).
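
That timestamp matching can be automated. Here is a hedged Python sketch: it assumes each warning is logged when the handler finishes, so the handler's window is taken as [timestamp - duration, timestamp], and it normalizes the mixed "Z" and "-05:00" timestamps seen in this log. The function names are illustrative only; the sample lines are copied from the log tail in this thread.

```python
import re
from datetime import datetime, timedelta

SLOW_RE = re.compile(
    r"^(?P<ts>\S+) \[discord\] Slow listener detected: "
    r"\w+ took (?P<secs>[\d.]+) seconds for event \w+"
)
TOOL_RE = re.compile(r"^(?P<ts>\S+) \[tools\] (?P<msg>.*(?:failed|timeout).*)$")


def parse_ts(ts):
    # The log mixes "Z" and "-05:00" offsets; older Pythons reject a bare "Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def tool_failures_during_slow_handlers(lines):
    """Pair each slow-listener warning with the [tools] failures inside its window."""
    slow, tools = [], []
    for line in lines:
        line = line.strip()
        if m := SLOW_RE.match(line):
            end = parse_ts(m.group("ts"))
            # Assumption: the warning is emitted when the handler ends.
            start = end - timedelta(seconds=float(m.group("secs")))
            slow.append((start, end, line))
        elif m := TOOL_RE.match(line):
            tools.append((parse_ts(m.group("ts")), m.group("msg")))
    return [
        (warning, [msg for ts, msg in tools if start <= ts <= end])
        for start, end, warning in slow
    ]


SAMPLE = """\
2026-03-05T15:12:10.624-05:00 [tools] browser failed: Error: refs=aria does not support selector/frame snapshots yet.
2026-03-05T20:13:59.996Z [discord] Slow listener detected: DiscordMessageListener took 270.4 seconds for event MESSAGE_CREATE
2026-03-05T20:15:02.475Z [discord] Slow listener detected: DiscordMessageListener took 42.9 seconds for event MESSAGE_CREATE
2026-03-05T15:43:36.465-05:00 [tools] read failed: Offset 860 is beyond end of file (818 lines total)
2026-03-05T20:47:48.995Z [discord] Slow listener detected: DiscordMessageListener took 260.8 seconds for event MESSAGE_CREATE
""".splitlines()

for warning, failures in tool_failures_during_slow_handlers(SAMPLE):
    print(warning.split(" ", 1)[0], "->", failures or "no tool failure in window")
```

A tool failure landing inside a window is only a hint, not proof: with several accounts running, a failure can belong to a different handler that overlapped in time.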

3) “Unexpected event order … content_block_stop before message_start”

That’s an Anthropic streaming event ordering error. Before going deeper, I’d do:

  1. Upgrade to 2026.3.2 and restart the LaunchAgent (so the running service is actually the updated build).
  2. If it still happens: grab ~50 lines before/after one of those runId=... blocks from the gateway log and share them.

4) Two things I still need from you

Can you paste these outputs?

openclaw gateway status
openclaw channels status --probe

And one quick config snippet (redact tokens) showing:

#
  • channels.discord.defaultAccount (if set)
  • channels.discord.accounts.<id>.eventQueue (or channels.discord.eventQueue)

With that we can tell:

  • which Discord account is producing the slow-listener logs
  • whether you’re accidentally processing the same message 2–3 times
  • whether you just need a listener timeout bump, or you’ve got a real event-loop stall (SIGKILL strongly hints resource pressure)
tribal hill
#

max@Mac-mini ~ % openclaw gateway status
openclaw channels status --probe

🦞 OpenClaw 2026.3.1 (2a8ac97)
Type the command with confidence - nature will provide the stack trace if needed.

│
◇ Doctor warnings
│ - State dir migration skipped: target already exists (/Users/max/.openclaw). Remove or merge manually.
│
◇
Service: LaunchAgent (loaded)
File logs: /tmp/openclaw/openclaw-2026-03-05.log
Command: /opt/homebrew/Cellar/node@22/22.22.0/bin/node /Users/max/.npm-global/lib/node_modules/openclaw/dist/index.js gateway --port 18789
Service file: ~/Library/LaunchAgents/ai.openclaw.gateway.plist
Service env: OPENCLAW_GATEWAY_PORT=18789

Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json

Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.

Runtime: running (pid 43011)
RPC probe: ok

Listening: 127.0.0.1:18789
Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting

🦞 OpenClaw 2026.3.1 (2a8ac97)
I'll refactor your busywork like it owes me money.

│
◇ Doctor warnings
│ - State dir migration skipped: target already exists (/Users/max/.openclaw). Remove or merge manually.
│
◇
Gateway reachable.

  • Telegram default: enabled, not configured, stopped, mode:polling, token:none, error:not configured
  • Discord clark: enabled, configured, running, disconnected, in:1h ago, out:2h ago, bot:@Clark, token:config, intents:content=limited, works, audit ok
  • Discord default: enabled, not configured, stopped, disconnected, token:none, error:not configured
  • Discord dex: enabled, configured, running, disconnected, in:1h ago, out:2h ago, bot:@Dex, token:config, intents:content=limited, works, audit ok
  • Discord main: enabled, configured, running, disconnected, in:1h ago, out:1h ago, bot:@Max, token:config, intents:content=limited, works, audit ok

Tip: status --deep adds gateway health probes to status output (requires a reachable gateway).

halcyon oarBOT
#

This is super helpful: the “tail, summarized” basically explains the why behind the slow listeners.

What’s actually causing the 1–8 minute “Slow listener detected” events

Your gateway log is showing tool runs that hang for a long time inside a Discord MESSAGE_CREATE handler, e.g.:

  • … waiting for locator('aria-ref=e709') → browser automation waiting on an element
  • [tools] browser failed: Error: refs=aria does not support selector/frame snapshots yet. → a browser call pattern that’s currently unsupported in that build
  • plus a bunch of long durations: 179s / 270s / 466s

If the Discord listener awaits the full agent/tool run (which it does), then any stuck browser locator (or any tool with a long timeout) will produce exactly the “Slow listener detected … took N seconds” warnings.

So the primary fix is: stop long/stuck tools from running in response to normal Discord messages, or raise the listener timeout to match reality.

Immediate mitigations (do these first)

  1. Update to 2026.3.2 and restart the LaunchAgent.
    You’re hitting a few sharp edges (browser refs=aria limitation + Anthropic stream ordering error) that are very likely point-release fixes.

  2. Increase the Discord listener timeout on the account(s) that are actually doing the work:
    Docs: https://docs.openclaw.ai/channels/discord (accordion “Long-running handlers…”)

    If your runs are taking 300–500s, a 120s timeout won’t help; it’ll just keep warning/restarting.

  3. Restrict tools in the busy channel(s) so a casual message can’t kick off browser/cron/etc.
    Tool restriction pattern is documented here: https://docs.openclaw.ai/channels/groups#groupchannel-tool-restrictions-optional
    (For Discord this lives under channels.discord.guilds.<guildId>.channels.<channelKey>.tools / toolsBySender.)

    Practical approach: deny browser (and maybe cron, exec) for the whole channel, then allow it only for yourself via toolsBySender.
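
To make the timeout point from the list above concrete, here is a small Python sketch that counts how many of the handler runs in the pasted log tail would still outlast a few candidate timeouts. The durations are copied from the "Slow listener detected" lines in that tail (16 lines were omitted there, so this is a partial sample). The candidate values are arbitrary, and the unit listenerTimeout expects is not stated in this thread, so check the docs before writing a number into the config.

```python
# Durations (seconds) copied from the "Slow listener detected" lines in the log tail.
observed = [179.4, 89.1, 40.5, 60.1, 270.4, 42.9, 31.8, 466.1,
            58.7, 55.7, 162.6, 47.5, 36.0, 260.8, 63.5]


def handlers_exceeding(durations, timeout_seconds):
    """Count handler runs that would still outlast a candidate timeout."""
    return sum(1 for d in durations if d > timeout_seconds)


# Candidate timeouts are illustrative values, not recommendations.
for candidate in (120, 300, 500):
    over = handlers_exceeding(observed, candidate)
    print(f"timeout {candidate}s: {over} of {len(observed)} handlers would still exceed it")
```

On this sample, a third of the runs outlast 120s, which is why trimming the long tool runs matters at least as much as raising the timeout.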

Two other concrete issues in your log

  • cron failed: invalid cron.update params … unexpected property 'retry' … must have required property 'jobId' … unexpected property 'id'
    This means something (a skill, prompt snippet, or manual command) is calling the cron.update tool with an old/wrong schema. That will fail every time until updated. If you tell me how you’re updating cron jobs (CLI? a skill? a message you send?), I can point you at the correct shape.

  • Unexpected event order, got content_block_stop before "message_start"
    Still looks like an Anthropic streaming edge case. I would re-check after:

    • upgrading to 2026.3.2
    • reducing “stuck” tool runs (browser waits, cron tool timeouts)

    If it persists, the next step is grabbing ~50 lines around one runId=... from the JSONL gateway log and filing it as a bug (include the runId + provider/model line right before the stream starts).

Quick question (important)

Are clark, dex, and main all present in the same Discord channels? If yes, you may be doing 2–3x the work per message unless the allowlists/mention rules are very tight; that alone can push everything into “stuck / health-monitor restarting” territory.

#

That channels status --probe output is a big deal: all 3 configured Discord accounts are currently running but disconnected, and haven’t had in/out traffic for ~1–2 hours.

That lines up perfectly with what you saw in the log tail:

  • multi-minute MESSAGE_CREATE handlers (179s…466s)
  • browser automation getting stuck (waiting for locator(...))
  • health monitor restarting an adapter (health-monitor: restarting (reason: stuck))

When the gateway gets tied up in long runs (especially browser waits), Discord can miss heartbeats / get wedged, then the health monitor restarts it, and you end up “running, disconnected”.

The practical fix path

1) Update + restart first

You’re on 2026.3.1 and have 2026.3.2 available. I would update and then openclaw gateway restart (or stop/start) so the LaunchAgent is definitely running the new build.

There are at least two things in your log that look like “fixed in a point release” material:

  • the Anthropic stream ordering error (content_block_stop before message_start)
  • the browser refs=aria … selector/frame snapshots limitation
#

2) Stop long browser runs from happening in normal Discord message handling

Your slow listener warnings are not “Discord is slow”, they’re “a Discord message triggered a long/stuck tool run”.

Fast mitigation: deny browser in the channel(s) where you don’t explicitly want UI automation (and only allow it in a dedicated channel / via per-sender override).

Tool restriction docs (applies to Discord via guilds.*.channels.* nesting):
https://docs.openclaw.ai/channels/groups#groupchannel-tool-restrictions-optional

3) Set a real listener timeout (or you’ll keep tripping the watchdog)

You’ve got handlers taking minutes, so a 30–120s listener timeout guarantees spam + restarts.

Config knob (per account):
channels.discord.accounts.<accountId>.eventQueue.listenerTimeout
Docs: https://docs.openclaw.ai/channels/discord (Long-running handlers accordion)

4) One more thing to clean up: phantom “Discord default”

Your probe shows:

  • Discord default: enabled, not configured…

…but your real accounts are clark/dex/main. This usually happens when the config still implicitly expects a default account. If you’re running multi-account, set:

  • channels.discord.defaultAccount to one of (main, dex, clark)

(That key is in the config reference, and it helps prevent the “default not configured” noise.)

Next: I need the disconnect reason

Can you paste the few log lines around when one account disconnects (search in gateway.err.log / gateway.log for discord + disconnect / close / ShardError)?

Also: are clark, dex, and main all in the same Discord guild/channel? If yes, you can easily be doing 3x the work per message unless routing/mention gating is very strict.

#

Yep, try openclaw update first.

  • On a git/source install, openclaw update is the “right” path (pull/build/doctor + restart).
  • On an npm/pnpm global install, openclaw update will try to update via your package manager; if it can’t detect the install cleanly, just do the global install update manually.

Given your status --all shows Update … 2026.3.2, the shortest path is:

openclaw update
# (it restarts by default)
openclaw status --all

If openclaw update complains / doesn't upgrade, then do:

#

Nice, this confirms the Gateway itself is healthy (RPC probe: ok), so we're not chasing a "gateway is down" problem.

The standout issue now is:

Discord clark/dex/main: running, disconnected (no in/out for ~1–2h)

That's consistent with what you saw earlier: long-running MESSAGE_CREATE handling + "stuck" restarts can wedge Discord connectivity.

What to do next (in order)

1) Update + restart (still step zero)

Since you're on 2026.3.1 and 2026.3.2 is available, update first and restart the service:

openclaw update
openclaw gateway restart
openclaw channels status --probe

If you update via pnpm instead, do the same gateway restart afterwards.

2) Capture the disconnect reason from logs (this will tell us what's really happening)

Right after a restart, tail logs and look for Discord lines that mention close/error/reconnect:

openclaw logs --follow

If you want to grep the file log directly (your gateway status shows it here):
/tmp/openclaw/openclaw-2026-03-05.log

Search for keywords like: discord, disconnect, close, ShardError, Invalid token, 429, rate limit.
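
If you'd rather not eyeball the whole file, here is a rough sketch of that keyword search as a shell function. The function name find_disconnects is made up for illustration, and it assumes the file log keeps the word "discord" on the relevant lines; the keyword list is just the one above.

```shell
# Print numbered log lines that mention a disconnect-type keyword AND Discord.
# Usage: find_disconnects /tmp/openclaw/openclaw-2026-03-05.log
find_disconnects() {
  log_file="$1"
  # Keyword filter first (with line numbers), then keep only Discord-related lines.
  # Note: "429" can also match inside long numeric IDs, so expect some noise.
  grep -n -i -E 'disconnect|close|ShardError|Invalid token|429|rate limit' "$log_file" \
    | grep -i 'discord'
}
```

The number before the first colon on each output line is the line number in the log file, which is handy for pulling surrounding context afterwards.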

#

Paste the first "disconnect/close" block you find for one account (e.g. [discord] [main] ...).

3) Prevent the "stuck" condition from happening again (root cause of slow listeners)

From your earlier tail, some Discord messages are triggering browser automation that waits for locators for minutes. That can:

  • trigger "Slow listener detected"
  • trip the health monitor "stuck" logic
  • and eventually lead to disconnect/restarts

Mitigations:

4) Optional cleanup: the phantom "Discord default"

Your probe shows Discord default: enabled, not configured. If you're using only clark/dex/main, set:

  • channels.discord.defaultAccount: "main" (or whichever you want)

That won't fix disconnects by itself, but it removes confusing noise and avoids the system trying to reason about a non-existent default account.

If you paste the discord "close/disconnect" log snippet after updating/restarting, we can narrow it down to "heartbeat timeout due to stalls" vs "auth/token" vs "Discord API/rate-limit" in one step.

tribal hill
#

Fixed a few things in the config to RequireMention = True for most channels which has fixed the slow discord listeners. Now I see one of my bots disconnecting and reconnecting because: health-monitor: restarting (reason: stuck)

00:09:24 info gateway/health-monitor {"subsystem":"gateway/health-monitor"} [discord:dex] health-monitor: restarting (reason: stuck)
00:09:24 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [dex] Discord Message Content Intent is limited; bots under 100 servers can use it without verification.
00:09:24 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [dex] starting provider (@Dex)
00:09:24 debug cron {"module":"cron","storePath":"/Users/max/.openclaw/cron/jobs.json"} {"nextAt":1772766000000,"delayMs":60000,"clamped":true} cron: timer armed
00:09:25 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} discord channels resolved: 1468289360620163085/1468756222139174913→1468289360620163085/1468756222139174913 (guild:Stackd Labs; channel:dev), 1468289360620163085/1473867115588227083→1468289360620163085/1473867115588227083 (guild:Stackd Labs; channel:dex)
00:09:25 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} logged in to discord as 1468757873453432875 (Dex)
00:10:25 debug cron {"module":"cron","storePath":"/Users/max/.openclaw/cron/jobs.json"} {"nextAt":1772766000000,"delayMs":60000,"clamped":true} cron: timer armed
00:11:25 debug cron {"module":"cron","storePath":"/Users/max/.openclaw/cron/jobs.json"} {"nextAt":1772766000000,"delayMs":60000,"clamped":true} cron: timer armed
00:12:25 debug cron {"module":"cron","storePath":"/Users/max/.openclaw/cron/jobs.json"} {"nextAt":1772766000000,"delayMs":60000,"clamped":true} cron: timer armed

halcyon oarBOT
#

Good progress. That log snippet is exactly what I'd expect after tightening requireMention: fewer heavy runs, fewer "slow listener" warnings.

Why you're now seeing health-monitor: restarting (reason: stuck)

That restart is the watchdog saying "this Discord account looks wedged / not making progress", so it force-restarts the Discord adapter. It's usually triggered by one of these:

  1. A long-running handler/tool run (browser waits like your earlier waiting for locator(...), cron tool timeouts, etc.)
  2. Event queue backpressure (bot is in a high-volume channel/guild and the queue can't keep up)
  3. A real Discord transport hiccup where reconnect logic doesn't recover cleanly

In your snippet, it restarts and logs in cleanly (logged in to discord as ... (Dex)), which suggests it's recoverable. The key question is: does this happen repeatedly (every few minutes / hours), or only after a specific "heavy" command?
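
To answer the "how often" question quickly, something like the sketch below can count stuck restarts per account. The function name count_stuck_restarts is hypothetical, and it assumes the file log carries the same [discord:<account>] prefix as the console line you pasted.

```shell
# Count "stuck" restarts per Discord account in a log file.
# Usage: count_stuck_restarts "/tmp/openclaw/openclaw-$(date +%F).log"
count_stuck_restarts() {
  log_file="$1"
  # Keep only the restart lines, pull out the account name from [discord:<account>],
  # then count occurrences per account, most frequent first.
  grep -F 'health-monitor: restarting (reason: stuck)' "$log_file" \
    | sed -n 's/.*\[discord:\([^]]*\)\].*/\1/p' \
    | sort | uniq -c | sort -rn
}
```

One restart a day reads very differently from one every few minutes, so the counts help decide which of the three causes above to chase first.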

Concrete next steps

  1. Update to 2026.3.2 (still strongly recommended). If this is a watchdog false-positive or a reconnect bug, point releases often fix it.

  2. Limit where dex is even listening, not just mention-gating.
    If dex can "see" busy channels, it still receives every MESSAGE_CREATE event and has to decide to ignore it. Prefer explicit channel allowlists under channels.discord.guilds.<guildId>.channels... for dex so it's only present where you actually use it.

  3. Make the Discord event queue timeout match reality for dex.
    If you ever do browser automation or anything that can take minutes, set:
    channels.discord.accounts.dex.eventQueue.listenerTimeout high enough that the watchdog doesn't think it's stuck mid-run.
    (Docs: https://docs.openclaw.ai/channels/discord)

  4. Grab 30–60 seconds of logs before a restart
    Right now we only see the restart + login. To diagnose the cause, we need the preceding context. Next time it happens, paste the ~50 lines before health-monitor: restarting (filtered to [discord] [dex] if possible). That's where you'll usually see:

  • a stuck browser locator wait
  • a tool timeout
  • a Discord close code / shard error
  • queue depth warnings

If you paste one "restart event" with the ~50 lines before it, we can tell whether this is "browser/tool run wedged dex" vs "dex is in a firehose channel" vs "Discord transport bug".

tribal hill
#

Already updated to 3.2.
How do I do 2/3/4?

halcyon oarBOT
#

Cool. If you're on 2026.3.2 now, here's how to do 2 / 3 / 4 concretely.

2) Limit where dex "listens"

There are two layers here:

A) OpenClaw allowlist (software routing)

Make sure Discord is in allowlist mode and only the guild/channels you want are allowed:

// ~/.openclaw/openclaw.json
{
  channels: {
    discord: {
      groupPolicy: "allowlist",
      guilds: {
        "1468289360620163085": { // Stackd Labs (guild id)
          // If you define channels, everything not listed is denied.
          channels: {
            dev: { allow: true, requireMention: true },
            dex: { allow: true, requireMention: true },
          },
        },
      },
    },
  },
}

Notes:

  • Channel keys are the slug names (no #). From your log those are dev and dex.
  • This allowlist applies at the Discord channel layer (it prevents processing outside those channels).
#

B) Discord-side visibility (best way to make it "per bot")

OpenClaw config is largely shared across your Discord accounts; it can't perfectly say "only this bot sees channel X" unless you split into separate gateway profiles.

So if you truly want only the Dex bot to receive traffic from #dex, the best move is Discord-side:

  • remove the other bots from that channel (permissions), or don't invite them to that guild.

After editing config, restart:

openclaw gateway restart
openclaw channels status --probe

3) Set the per-account listener timeout (so "stuck" doesn't trip mid-run)

In ~/.openclaw/openclaw.json:

#
{
  channels: {
    discord: {
      accounts: {
        dex: {
          eventQueue: {
            listenerTimeout: 300000, // example: 5 minutes; pick > your longest expected run
          },
        },
      },
    },
  },
}

Docs reference: https://docs.openclaw.ai/channels/discord (the "Slow listener detected ..." accordion)
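
A tiny helper for picking the number, assuming (as the "300000 = 5 minutes" example implies) that listenerTimeout is expressed in milliseconds. The function name minutes_to_ms is just for illustration.

```shell
# Convert a duration in minutes to milliseconds for listenerTimeout.
minutes_to_ms() {
  echo $(( $1 * 60 * 1000 ))
}

minutes_to_ms 5    # prints 300000
minutes_to_ms 10   # prints 600000
```

So if your longest browser run is around 8 minutes, a 10-minute value (600000) leaves some headroom.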

#

Then:

openclaw gateway restart

4) Grab the ~50 lines before a "stuck" restart

Easiest way (interactive):

openclaw logs --follow

When you see:
health-monitor: restarting (reason: stuck)
scroll up and copy/paste the ~50 lines above it (especially anything [discord] [dex] and any [tools] lines).

If you prefer file-based (since gateway status shows the file log path), you can do:

#
# find the line number(s)
grep -n "health-monitor: restarting (reason: stuck)" /tmp/openclaw/openclaw-2026-03-06.log

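# then print some context around one of those line numbers (set N to it)
N=1234  # example value only; use a line number printed by grep above
START=$(( N > 80 ? N - 80 : 1 ))  # clamp so sed never gets a start line below 1
sed -n "${START},$((N+20))p" /tmp/openclaw/openclaw-2026-03-06.log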

If you paste one such "context block", we can usually tell immediately whether it was:

  • a long browser/tool wait,
  • queue/backpressure,
  • or a Discord transport close/reconnect issue.
tribal hill
#

Looks like agent Clark is now showing as health-monitor: stuck

01:32:26 info gateway/reload {"subsystem":"gateway/reload"} config change detected; evaluating reload (channels.discord.accounts.dex.eventQueue)
01:32:26 info gateway/channels {"subsystem":"gateway/channels"} restarting discord channel
01:32:26 info gateway/reload {"subsystem":"gateway/reload"} config hot reload applied (channels.discord.accounts.dex.eventQueue)
01:32:26 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [clark] Discord Message Content Intent is limited; bots under 100 servers can use it without verification.
01:32:26 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [clark] starting provider (@Clark)
01:32:26 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [main] Discord Message Content Intent is limited; bots under 100 servers can use it without verification.
01:32:26 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [main] starting provider (@Max)
01:32:27 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [dex] Discord Message Content Intent is limited; bots under 100 servers can use it without verification.
01:32:27 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [dex] starting provider (@Dex)
01:32:27 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} discord channels resolved: 1468289360620163085/1468756222139174913→1468289360620163085/1468756222139174913 (guild:Stackd Labs; channel:dev), 1468289360620163085/1473867115588227083→1468289360620163085/1473867115588227083 (guild:Stackd Labs; channel:dex)
01:32:27 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} discord channels resolved: 1468289360620163085/1473860616983941192→1468289360620163085/1473860616983941192 (guild:Stackd Labs; channel:clark), 1468289360620163085/1468832551341920369→1468289360620163085/1468832551341920369 (guild:Stackd Labs; channel:research), 1468289360620163085/1476394873681744028→1468289360620163085/1476394873681744028 (guild:Stackd Labs; channel:fraud-news)
01:32:28 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} discord channels resolved: 1468289360620163085/1473867664219836499→1468289360620163085/1473867664219836499 (guild:Stackd Labs; channel:max), 1468289360620163085/1468832551341920369→1468289360620163085/1468832551341920369 (guild:Stackd Labs; channel:research), 1468289360620163085/1475637338695798784→1468289360620163085/1475637338695798784 (guild:Stackd Labs; channel:automation-operator), 1468289360620163085/1475637378545750230→1468289360620163085/1475637378545750230 (guild:Stackd Labs; channel:scout-synthesis) (+2)
01:32:28 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} logged in to discord as 1468757873453432875 (Dex)
01:32:28 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} logged in to discord as 1469183866530042050 (Clark)
01:32:28 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} logged in to discord as 1468288536162730066 (Max)
01:33:01 debug cron {"module":"cron","storePath":"/Users/max/.openclaw/cron/jobs.json"} {"nextAt":1772766000000,"delayMs":60000,"clamped":true} cron: timer armed
01:33:12 info gateway {"subsystem":"gateway"} signal SIGTERM received
01:33:12 info gateway {"subsystem":"gateway"} received SIGTERM; shutting down
01:33:12 info gmail-watcher {"subsystem":"gmail-watcher"} gmail watcher stopped
01:33:15 info gateway/canvas {"subsystem":"gateway/canvas"} canvas host mounted at http://127.0.0.1:18789/__openclaw__/canvas/ (root /Users/max/.openclaw/canvas)
01:33:15 info gateway/heartbeat {"subsystem":"gateway/heartbeat"} {"intervalMs":3600000} heartbeat: started
01:33:15 info gateway/health-monitor {"subsystem":"gateway/health-monitor"} started (interval: 300s, startup-grace: 60s, channel-connect-grace: 120s)

halcyon oarBOT
#

That snippet actually shows two different "restart-y" things mixed together:

  1. Config hot reload → "restarting discord channel"
  2. Then a full gateway SIGTERM + restart (LaunchAgent/supervisor restarting the whole process)

A) "config change detected ... restarting discord channel"

This is expected behavior: changing anything under channels.discord... will typically restart the whole Discord subsystem (all accounts), because the adapters share a lot of state.

So don't worry about that line by itself.

B) "health-monitor: stuck" moving from dex → clark

If clark is now the one getting health-monitor: restarting (reason: stuck), the usual causes are still:

  • a long-running handler/tool run on that bot (browser waits, cron tool timeouts, etc.)
  • or that bot sits in a higher-volume set of channels than the others (your log shows clark is in clark/research/fraud-news)

The simplest immediate thing to try: give clark (and honestly main too) the same eventQueue timeout treatment you just did for dex.

#

Example:

{
  channels: {
    discord: {
      accounts: {
        dex:   { eventQueue: { listenerTimeout: 300000 } },
        clark: { eventQueue: { listenerTimeout: 300000 } },
        main:  { eventQueue: { listenerTimeout: 300000 } },
      },
    },
  },
}

Pick a value that's comfortably above your longest real "work" run. Then:

#
openclaw gateway restart

(Doing one clean restart after edits is usually less confusing than relying on hot reload while debugging.)

C) About the SIGTERM in your log

This line:

01:33:12 ... signal SIGTERM received

means the gateway process was intentionally terminated (typically openclaw gateway restart, openclaw update, launchd kickstart, etc.). If you didn't run a restart around then, that's interesting, but it's separate from the Discord "stuck" issue.
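
If you want to check whether every SIGTERM lines up with a restart you ran yourself, a small sketch like this lists them with line numbers. The function name list_sigterms is hypothetical; the search string is copied from your log line.

```shell
# List every gateway SIGTERM line, prefixed with its line number in the log.
# Usage: list_sigterms "/tmp/openclaw/openclaw-$(date +%F).log"
list_sigterms() {
  grep -n -F 'signal SIGTERM received' "$1"
}
```

Any timestamp in that list that doesn't match a manual restart or update is worth a closer look.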

What I need next to nail it

Can you paste the actual health-monitor: restarting (reason: stuck) lines for clark plus ~30–50 lines before them?

Right now you pasted the startup/reload sequence, but not the moment where health-monitor decides clark is stuck. That preceding block is where we'll see the smoking gun (browser locator wait, tool timeout, Discord close code, etc.).

tribal hill
#

How do I get those lines?

halcyon oarBOT
#

Two easy ways: live (copy/paste) or after-the-fact (extract from the log file).

Option A (simplest): live tail + copy the block

Run this and wait until it happens again:

openclaw logs --follow

When you see health-monitor: restarting (reason: stuck), just scroll up in your terminal and copy ~50 lines above it.

Option B: extract the surrounding lines from the file log

Your gateway status showed the JSONL file log here:

/tmp/openclaw/openclaw-YYYY-MM-DD.log

So you can do:

#
LOG="/tmp/openclaw/openclaw-$(date +%F).log"

# find the last occurrence line number
N=$(grep -n 'health-monitor: restarting (reason: stuck)' "$LOG" | tail -n 1 | cut -d: -f1)

# print ~80 lines before + ~20 lines after (start clamped to line 1; skipped if nothing matched)
[ -n "$N" ] && sed -n "$(( N > 80 ? N - 80 : 1 )),$((N+20))p" "$LOG"

If you want only the clark ones:

N=$(grep -n '\[discord:clark\].*health-monitor: restarting (reason: stuck)' "$LOG" | tail -n 1 | cut -d: -f1)
[ -n "$N" ] && sed -n "$(( N > 80 ? N - 80 : 1 )),$((N+20))p" "$LOG"
#

(If the exact [discord:clark] prefix doesn't match, just use the first command and paste the context block; we can spot the account in the surrounding lines.)
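
If you end up doing this more than once, the same idea can be wrapped in a function. This is only a sketch: the name stuck_context is made up, the account filter assumes the [discord:<account>] prefix from your console output, and the 80/20 line window is the same arbitrary choice as above.

```shell
# Print context around the most recent "stuck" restart in a log file.
# $1 = log file, $2 = optional account name (e.g. clark)
stuck_context() {
  log_file="$1"
  account="${2:-}"
  pattern='health-monitor: restarting (reason: stuck)'
  if [ -n "$account" ]; then
    # Only restarts for the given account.
    n=$(grep -n -F "$pattern" "$log_file" | grep -F "[discord:$account]" | tail -n 1 | cut -d: -f1)
  else
    n=$(grep -n -F "$pattern" "$log_file" | tail -n 1 | cut -d: -f1)
  fi
  if [ -z "$n" ]; then
    echo "no stuck restart found in $log_file" >&2
    return 1
  fi
  # Clamp the start so sed never gets a line number below 1.
  start=$(( n > 80 ? n - 80 : 1 ))
  sed -n "${start},$(( n + 20 ))p" "$log_file"
}

# Usage:
#   stuck_context "/tmp/openclaw/openclaw-$(date +%F).log" clark
```

It prints nothing but a short message on stderr when there is no matching restart, so an empty result is easy to tell apart from a broken command.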

tribal hill
halcyon oarBOT
#

Yep, good find, and that workaround key is real.

gateway.channelHealthCheckMinutes controls the interval for the gateway's periodic channel health probing / status updates (and in practice it's what drives that health-monitor ... interval: 300s you saw when it was set to 5).

Workaround: slow down the health monitor checks

In ~/.openclaw/openclaw.json:

{
  gateway: {
    channelHealthCheckMinutes: 10,
  },
}

Then restart:

#
openclaw gateway restart
openclaw channels status --probe
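
To confirm the new value took effect, you can look for the health-monitor startup line again. A rough sketch, assuming the file log contains the same "started (interval: ...)" text as the console line you pasted earlier (the function name health_monitor_interval is hypothetical):

```shell
# Print the interval from the most recent health-monitor startup line.
# Usage: health_monitor_interval "/tmp/openclaw/openclaw-$(date +%F).log"
health_monitor_interval() {
  grep -F 'gateway/health-monitor' "$1" \
    | grep -o -E 'interval: [0-9]+s' \
    | tail -n 1
}
```

With the setting at 5 this showed 300s in your log, so with 10 I'd expect 600s, but treat that as an inference from the earlier line rather than documented behavior.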

Notes:

  • This doesn't "fix" the underlying stuck detection bug; it just reduces how often it can trigger (tradeoff: slower detection/recovery when a channel is actually unhealthy).
  • I'd keep your channels.discord.accounts.<id>.eventQueue.listenerTimeout tuning in place too, since long tool runs can legitimately make a channel look unresponsive.

If you can, dropping a short comment on that issue with your exact 2026.3.2 version + one "stuck" log block helps confirm it's the same failure mode.

tribal hill
#

I just dropped a comment. So no way to fix it for now?

halcyon oarBOT
#

If it's the same bug as that issue, then there isn't a "real" fix you can apply locally yet; it needs an upstream patch.

But you do have a couple of practical mitigations:

1) Reduce restart frequency (what you already found)

{
  gateway: {
    channelHealthCheckMinutes: 10, // or higher
  },
}

Tradeoff: slower detection/recovery when a channel is genuinely unhealthy.

2) Disable the channel health monitor entirely (temporary workaround)

This stops the automatic "stuck → restart" behavior:

#
{
  gateway: {
    channelHealthCheckMinutes: 0,
  },
}

Then:

openclaw gateway restart

Tradeoff: you lose automatic channel health probing/restarts, so you'll be relying on manual restarts / your own monitoring if a bot truly disconnects.
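
If you go this route, "your own monitoring" can be as simple as the sketch below. It assumes the probe output contains the word "disconnected" for a dropped account, as in the status you pasted earlier; the function name has_disconnected_channel is made up, and I have not verified the probe's exact output format.

```shell
# Return 0 if the status text read from stdin reports a disconnected channel.
has_disconnected_channel() {
  grep -q -i 'disconnected'
}

# Intended use from cron or a loop (kept as a comment so nothing runs by accident):
#   if openclaw channels status --probe | has_disconnected_channel; then
#     openclaw gateway restart
#   fi
```

Run it sparingly (every 10 to 15 minutes, say), since a restart bounces all three accounts at once.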

3) Still worth doing alongside the workaround

  • Keep channels.discord.accounts.<id>.eventQueue.listenerTimeout high enough for any long runs.