#gateway is crashing every ~5 minutes


digital dirge
#

my agent said this

Found the root cause. The gateway is crashing every ~6 minutes with status code 1 (FAILURE). The restart counter is at 131 as of this morning.

Root cause: Corrupted QMD sqlite database

SqliteError: database disk image is malformed
at qmd store.js:1878
at listCollections → updateCollections

Every time qmd (the memory system) tries to boot, it hits a corrupted sqlite database and crashes the gateway. It keeps restarting, hitting the corrupted DB, crashing again. The pattern is every ~6 minutes.

Impact on messages: Yes — if you sent Rush a message during the crash window (between 20:02 and 20:25), it was dropped. That's the exact same pattern.

Fix options:

  1. Rebuild qmd database — find which sqlite file(s) are corrupted and restore/rebuild them
  2. Wipe qmd data — delete the corrupted sqlite files and let qmd rebuild from scratch (loses memory)
  3. Fix the underlying cause — stop the corruption from happening in the first place

The qmd databases are at:

• /home/totaluser/.openclaw/agents/*/qmd/xdg-cache/qmd/index.sqlite (one per agent)

One of those is corrupt. Do you want me to identify which one and fix it?

i said yes, figure it out, and it said this

The WAL files are 4-5MB each - active uncommitted changes that can't be checkpointed because the gateway keeps crashing before finishing. This is the root cause: every ~6 minutes the gateway crashes, the WAL is left corrupted, then it restarts and tries to read the corrupted WAL.

Fix: Stop the gateway, delete the WAL and SHM files,

quaint sandalBOT
native moonBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

quaint sandalBOT
dense wigeonBOT
#
Plausible, yes, but I would not jump straight to deleting the main DB. database disk image is malformed can be a bad WAL, or a bad index.sqlite.

First, what does this show?

openclaw --version

Then do the least-destructive path, with the gateway fully stopped:

openclaw gateway stop

for db in ~/.openclaw/agents/*/qmd/xdg-cache/qmd/index.sqlite; do
  echo "== $db =="
  sqlite3 "$db" 'PRAGMA integrity_check;'
done

If one agent is bad, try this first:

cp -a ~/.openclaw/agents/<agent-id>/qmd ~/.openclaw/agents/<agent-id>/qmd.backup.$(date +%s)

mv ~/.openclaw/agents/<agent-id>/qmd/xdg-cache/qmd/index.sqlite-wal \
   ~/.openclaw/agents/<agent-id>/qmd/xdg-cache/qmd/index.sqlite-wal.bak 2>/dev/null || true

mv ~/.openclaw/agents/<agent-id>/qmd/xdg-cache/qmd/index.sqlite-shm \
   ~/.openclaw/agents/<agent-id>/qmd/xdg-cache/qmd/index.sqlite-shm.bak 2>/dev/null || true

openclaw gateway start
openclaw logs --follow

If it still crashes with the same malformed error, the main QMD index is probably toast. Then move aside that whole agent QMD dir so OpenClaw rebuilds it:

openclaw gateway stop
mv ~/.openclaw/agents/<agent-id>/qmd ~/.openclaw/agents/<agent-id>/qmd.bad.$(date +%s)
openclaw gateway start

That will lose QMD memory for that agent, but it is cleaner than deleting random SQLite files.

Also, the roughly 5 minute cadence matches QMD's default background update interval in the docs, so the timing fits: https://docs.openclaw.ai/concepts/memory-qmd
General triage ladder: https://docs.openclaw.ai/help/faq

#

Paste openclaw --version and the integrity_check output here, and I can help narrow it down.
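
If the raw integrity_check output gets too long to paste, a one-line-per-database summary may be easier to read. This is only a minimal sketch: `check_sqlite_dbs` is a hypothetical helper name, it assumes the `sqlite3` CLI is installed, and it should be run with the gateway stopped. It relies on the fact that `PRAGMA integrity_check;` prints a single `ok` row for a healthy database.

```shell
# Hypothetical helper (not part of OpenClaw): print OK or CORRUPT per database.
# Assumes the sqlite3 CLI is on PATH and the gateway is stopped.
check_sqlite_dbs() {
  csd_status=0
  for csd_db in "$@"; do
    # Skip arguments that are not regular files (e.g. an unmatched glob).
    [ -f "$csd_db" ] || continue
    # A healthy database answers with exactly one row: "ok".
    csd_first=$(sqlite3 "$csd_db" 'PRAGMA integrity_check;' 2>&1 | head -n 1)
    if [ "$csd_first" = "ok" ]; then
      echo "OK      $csd_db"
    else
      echo "CORRUPT $csd_db"
      csd_status=1
    fi
  done
  # Non-zero exit status if at least one database failed the check.
  return "$csd_status"
}
```

Usage would look like `check_sqlite_dbs ~/.openclaw/agents/*/qmd/xdg-cache/qmd/index.sqlite`. For any database reported as CORRUPT, the full `PRAGMA integrity_check;` output is still worth keeping for a bug report.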

digital dirge
#

i will split the output into multiple messages. start replying when i say "im done"

dense wigeonBOT
#

Got it, I’ll wait. Send the chunks, and I’ll hold off until you say “im done”.

digital dirge
#

totaluser@Total-Hardware:~$ openclaw gateway stop

for db in ~/.openclaw/agents/*/qmd/xdg-cache/qmd/index.sqlite; do
echo "== $db =="
sqlite3 "$db" 'PRAGMA integrity_check;'
done

🦞 OpenClaw 2026.4.11 (769908e)
Somewhere between 'hello world' and 'oh god what have I built.'

Stopped systemd service: openclaw-gateway.service
== /home/totaluser/.openclaw/agents/alfred/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/cipher/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/forge/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/main/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/rush/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/scout/qmd/xdg-cache/qmd/index.sqlite ==
*** in database main ***
Tree 12 page 10747: btreeInitPage() returns error code 11
Tree 12 page 10750: btreeInitPage() returns error code 11
Tree 12 page 10751: btreeInitPage() returns error code 11
Tree 12 page 9658: btreeInitPage() returns error code 11
Tree 12 page 10757: btreeInitPage() returns error code 11
Tree 12 page 6481: btreeInitPage() returns error code 11
Tree 12 page 12 cell 225: 2nd reference to page 560
Tree 12 page 12 cell 224: 2nd reference to page 563
Tree 12 page 12 cell 223: 2nd reference to page 718
Tree 12 page 12 cell 221: Rowid 5229 out of order
Tree 12 page 12 cell 221: 2nd reference to page 2846
Tree 12 page 10655: btreeInitPage() returns error code 11
Tree 12 page 10615: btreeInitPage() returns error code 11
Tree 12 page 10612: btreeInitPage() returns error code 11
Tree 12 page 9834: btreeInitPage() returns error code 11
Tree 12 page 12 cell 216: 2nd reference to page 2875
Tree 12 page 12 cell 215: 2nd reference to page 2858

#

Tree 12 page 7794: btreeInitPage() returns error code 11
Tree 12 page 5109: btreeInitPage() returns error code 11
Tree 12 page 1138: btreeInitPage() returns error code 11
Tree 12 page 9986: btreeInitPage() returns error code 11
Tree 12 page 9683: btreeInitPage() returns error code 11
Tree 12 page 9956: btreeInitPage() returns error code 11
Tree 12 page 9831: btreeInitPage() returns error code 11
Tree 12 page 7828: btreeInitPage() returns error code 11
Tree 12 page 7823: btreeInitPage() returns error code 11
Tree 12 page 7816: btreeInitPage() returns error code 11
Tree 12 page 7809: btreeInitPage() returns error code 11
Tree 12 page 12 cell 203: 2nd reference to page 553
Tree 12 page 12 cell 202: 2nd reference to page 552
Tree 12 page 12 cell 201: 2nd reference to page 545
Tree 12 page 12 cell 200: 2nd reference to page 544
Tree 12 page 12 cell 199: 2nd reference to page 482
Tree 12 page 12 cell 197: Rowid 4677 out of order
Tree 12 page 12 cell 196: Rowid 4654 out of order
Tree 12 page 436: btreeInitPage() returns error code 11
Tree 12 page 5125: btreeInitPage() returns error code 11
Tree 12 page 9616: btreeInitPage() returns error code 11
Tree 12 page 5130: btreeInitPage() returns error code 11
Tree 12 page 5147: btreeInitPage() returns error code 11
Tree 12 page 9615: btreeInitPage() returns error code 11
Tree 12 page 12 cell 190: 2nd reference to page 3785
Tree 12 page 9720: btreeInitPage() returns error code 11
Tree 12 page 12 cell 188: 2nd reference to page 2812
Tree 12 page 6573: btreeInitPage() returns error code 11
Tree 12 page 9591: btreeInitPage() returns error code 11
Tree 12 page 9589: btreeInitPage() returns error code 11
Tree 12 page 9584: btreeInitPage() returns error code 11
Tree 12 page 9583: btreeInitPage() returns error code 11

#

Tree 12 page 9580: btreeInitPage() returns error code 11
Tree 12 page 9578: btreeInitPage() returns error code 11
Tree 12 page 9573: btreeInitPage() returns error code 11
Tree 12 page 9572: btreeInitPage() returns error code 11
Tree 12 page 9568: btreeInitPage() returns error code 11
Tree 12 page 9566: btreeInitPage() returns error code 11
Tree 12 page 9564: btreeInitPage() returns error code 11
Tree 12 page 9561: btreeInitPage() returns error code 11
Tree 12 page 9558: btreeInitPage() returns error code 11
Tree 12 page 8785: btreeInitPage() returns error code 11
Tree 12 page 8784: btreeInitPage() returns error code 11
Tree 12 page 8780: btreeInitPage() returns error code 11
Tree 12 page 8779: btreeInitPage() returns error code 11
Tree 12 page 8775: btreeInitPage() returns error code 11
Tree 12 page 8773: btreeInitPage() returns error code 11
Tree 12 page 8770: btreeInitPage() returns error code 11
Tree 12 page 8769: btreeInitPage() returns error code 11
Tree 12 page 8764: btreeInitPage() returns error code 11
Tree 12 page 8762: btreeInitPage() returns error code 11
Tree 12 page 8759: btreeInitPage() returns error code 11
Tree 12 page 8758: btreeInitPage() returns error code 11
Tree 12 page 8755: btreeInitPage() returns error code 11
Tree 12 page 8754: btreeInitPage() returns error code 11
Tree 12 page 8751: btreeInitPage() returns error code 11
Tree 12 page 8750: btreeInitPage() returns error code 11
Tree 12 page 8748: btreeInitPage() returns error code 11
Tree 12 page 8745: btreeInitPage() returns error code 11
Tree 12 page 8742: btreeInitPage() returns error code 11
Tree 12 page 8736: btreeInitPage() returns error code 11
Tree 12 page 8735: btreeInitPage() returns error code 11
Tree 12 page 8732: btreeInitPage() returns error code 11
Tree 12 page 8731: btreeInitPage() returns error code 11

#

Tree 12 page 8727: btreeInitPage() returns error code 11
Tree 12 page 8725: btreeInitPage() returns error code 11
Tree 12 page 8721: btreeInitPage() returns error code 11
Tree 12 page 8717: btreeInitPage() returns error code 11
Tree 12 page 8715: btreeInitPage() returns error code 11
Tree 12 page 8713: btreeInitPage() returns error code 11
Tree 12 page 8711: btreeInitPage() returns error code 11
Tree 12 page 8709: btreeInitPage() returns error code 11
Tree 12 page 8708: btreeInitPage() returns error code 11
Tree 12 page 8704: btreeInitPage() returns error code 11
Tree 12 page 8702: btreeInitPage() returns error code 11
Tree 12 page 8699: btreeInitPage() returns error code 11
Tree 12 page 8698: btreeInitPage() returns error code 11
Tree 12 page 8696: btreeInitPage() returns error code 11
Tree 12 page 8694: btreeInitPage() returns error code 11
Tree 12 page 8690: btreeInitPage() returns error code 11
Tree 12 page 8687: btreeInitPage() returns error code 11
Tree 12 page 8684: btreeInitPage() returns error code 11
Tree 12 page 8682: btreeInitPage() returns error code 11
== /home/totaluser/.openclaw/agents/sentinel/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/vital/qmd/xdg-cache/qmd/index.sqlite ==
ok
totaluser@Total-Hardware:~$

#

im done

dense wigeonBOT
#

Nice, that narrows it down a lot.

It is scout, and this is main DB corruption, not just a bad WAL. PRAGMA integrity_check; failed against index.sqlite itself while the gateway was stopped, so I would skip the WAL/SHM-only fix.

Best next step is to move aside only Scout’s QMD home and let it rebuild:

openclaw gateway stop
mv ~/.openclaw/agents/scout/qmd ~/.openclaw/agents/scout/qmd.bad.$(date +%s)
openclaw gateway start
openclaw gateway status
openclaw logs --follow

That should:

  • preserve the broken data for later inspection
  • rebuild QMD fresh for Scout only
  • leave the other agents alone

If the gateway comes up clean after that, then the corruption was isolated to Scout’s QMD cache.

I’d also call this a bug, because one corrupt QMD index should not be able to take the whole gateway down. If it reproduces, please file an issue with:

  • openclaw --version → 2026.4.11
  • the SqliteError: database disk image is malformed stack
  • the PRAGMA integrity_check output for Scout

Docs: https://docs.openclaw.ai/concepts/memory-qmd
Issues: https://github.com/openclaw/openclaw/issues

If you run that and paste the new startup output, I can sanity check the next step.

digital dirge
#

totaluser@Total-Hardware:~$ openclaw gateway stop
mv ~/.openclaw/agents/scout/qmd ~/.openclaw/agents/scout/qmd.bad.$(date +%s)
openclaw gateway start
openclaw gateway status
openclaw logs --follow

🦞 OpenClaw 2026.4.11 (769908e)
Shell yeah—I'm here to pinch the toil and leave you the glory.

Stopped systemd service: openclaw-gateway.service

🦞 OpenClaw 2026.4.11 (769908e)
Like having a senior engineer on call, except I don't bill hourly or sigh audibly.

Restarted systemd service: openclaw-gateway.service

🦞 OpenClaw 2026.4.11 (769908e)
Finally, a use for that always-on Mac Mini under your desk.

│
Service: systemd (enabled)
File logs: /tmp/openclaw/openclaw-2026-04-13.log
Command: /home/linuxbrew/.linuxbrew/opt/node/bin/node /home/totaluser/.npm-global/lib/node_modules/openclaw/dist/entry.js gateway --port 18789
Service file: ~/.config/systemd/user/openclaw-gateway.service
Service env: OPENCLAW_GATEWAY_PORT=18789
Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json
Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.

Runtime: running (pid 2003948, state active, sub running, last exit 0, reason 0)
Warm-up: launch agents can take a few seconds. Try again shortly.
RPC probe: failed
RPC target: ws://127.0.0.1:18789
gateway closed (1006):

Gateway port 18789 is not listening (service appears running).
Logs: journalctl --user -u openclaw-gateway.service -n 200 --no-pager

Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting

🦞 OpenClaw 2026.4.11 (769908e)
I'm not saying your workflow is chaotic... I'm just bringing a linter and a helmet.

Gateway not reachable. Is it running and accessible?
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/totaluser/.openclaw/openclaw.json
Bind: loopback
Hint: run openclaw doctor.

dense wigeonBOT
#

Okay, Scout QMD was definitely one real problem, but it is not the only thing blocking startup.

The important bit now is:

  • Runtime: running
  • but Gateway port 18789 is not listening
  • and RPC closes with 1006

So the gateway is still failing before it finishes binding/listening. I would not move/delete anything else yet.

Next step, grab the actual service error:

journalctl --user -u openclaw-gateway.service -n 200 --no-pager

If you want one more useful view:

systemctl --user status openclaw-gateway.service --no-pager -l

Paste the section from the most recent restart, especially any lines with:

  • Error
  • Unhandled
  • EADDRINUSE
  • refusing to bind
  • SyntaxError
  • QMD

At this point logs matter more than guesses.

Docs: https://docs.openclaw.ai/help/troubleshooting
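
If 200 journal lines are too much to scan by eye, the keyword list above can be turned into a filter. A minimal sketch, assuming a hypothetical helper name `filter_gateway_errors`; the case-insensitive match is my choice so that lines like `SqliteError` and `qmd` are caught too.

```shell
# Hypothetical helper: keep only log lines containing one of the keywords
# listed above. Reads stdin, writes matching lines to stdout.
filter_gateway_errors() {
  grep -iE 'error|unhandled|EADDRINUSE|refusing to bind|SyntaxError|qmd'
}
```

Usage would be `journalctl --user -u openclaw-gateway.service -n 200 --no-pager | filter_gateway_errors`. It exits non-zero when nothing matches, which here just means none of the keywords appeared.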

digital dirge
#

there are two halves to the logs

here is one

Apr 13 13:48:53 Total-Hardware systemd[1760]: Started openclaw-gateway.service - OpenClaw Gateway (v2026.4.11).
Apr 13 13:48:56 Total-Hardware node[2003966]: 2026-04-13T13:48:56.946-07:00 [gateway] loading configuration…
Apr 13 13:48:57 Total-Hardware node[2003966]: 2026-04-13T13:48:57.009-07:00 [gateway] resolving authentication…
Apr 13 13:48:57 Total-Hardware node[2003966]: 2026-04-13T13:48:57.073-07:00 [gateway] starting...
Apr 13 13:49:03 Total-Hardware node[2003966]: 2026-04-13T13:49:03.984-07:00 [plugins] plugins.allow is empty; discovered non-bundled plugins may auto-load: lossless-claw (/home/totaluser/.openclaw/extensions/lossless-claw/dist/index.js). Set plugins.allow to explicit trusted ids.
Apr 13 13:49:04 Total-Hardware node[2003966]: 2026-04-13T13:49:04.812-07:00 [gateway] starting HTTP server...
Apr 13 13:49:04 Total-Hardware node[2003966]: 2026-04-13T13:49:04.893-07:00 [tasks/registry] Failed to restore task registry
Apr 13 13:49:05 Total-Hardware node[2003966]: 2026-04-13T13:49:05.008-07:00 [gateway] agent model: minimax-portal/MiniMax-M2.7
Apr 13 13:49:05 Total-Hardware node[2003966]: 2026-04-13T13:49:05.009-07:00 [gateway] ready (9 plugins: acpx, active-memory, browser, device-pair, discord, lossless-claw, phone-control, talk-voice, telegram; 7.9s)
Apr 13 13:49:05 Total-Hardware node[2003966]: 2026-04-13T13:49:05.098-07:00 [gateway] security warning: dangerous config flags enabled: gateway.controlUi.allowInsecureAuth=true. Run openclaw security audit.
Apr 13 13:49:05 Total-Hardware node[2003966]: 2026-04-13T13:49:05.777-07:00 [tailscale] serve failed: Command failed: /usr/bin/tailscale serve --bg --yes 18789
sending serve config: Access denied: serve config denied
Use 'sudo tailscale serve --bg --yes 18789'.
To not require root, use 'sudo tailscale set --operator=$USER' once.

#

here is the other

Apr 13 13:49:49 Total-Hardware node[2003966]: 2026-04-13T13:49:49.973-07:00 [gateway] qmd memory startup initialization armed for agent "main"
Apr 13 13:49:51 Total-Hardware node[2003966]: 2026-04-13T13:49:51.584-07:00 [gateway] qmd memory startup initialization armed for agent "forge"
Apr 13 13:49:56 Total-Hardware node[2003966]: 2026-04-13T13:49:56.083-07:00 [gateway] qmd memory startup initialization armed for agent "scout"
Apr 13 13:49:58 Total-Hardware node[2003966]: 2026-04-13T13:49:58.083-07:00 [gateway] qmd memory startup initialization armed for agent "vital"
Apr 13 13:49:59 Total-Hardware node[2003966]: 2026-04-13T13:49:59.986-07:00 [gateway] qmd memory startup initialization armed for agent "sentinel"
Apr 13 13:50:08 Total-Hardware node[2003966]: 2026-04-13T13:50:08.227-07:00 [gateway] qmd memory startup initialization armed for agent "alfred"
Apr 13 13:50:11 Total-Hardware node[2003966]: 2026-04-13T13:50:11.737-07:00 [gateway] qmd memory startup initialization armed for agent "rush"
Apr 13 13:50:13 Total-Hardware node[2003966]: 2026-04-13T13:50:13.568-07:00 [gateway] qmd memory startup initialization armed for agent "cipher"

dense wigeonBOT
#

This actually looks good now.

The important line is:

  • [gateway] ready ...

That means the gateway finished startup. So the hard crash loop from Scout’s corrupted QMD DB is likely fixed.

A couple notes on the remaining log lines:

  • qmd memory startup initialization armed for agent "scout" is expected after the rebuild. It means OpenClaw recreated Scout’s QMD setup.
  • tailscale serve failed ... Access denied is separate. That only affects Tailscale Serve, not the local loopback gateway itself.
  • Failed to restore task registry is also separate. Worth checking later if automations/tasks are acting weird, but it is not the crash you were chasing.

I think the earlier gateway status result was just caught during warm-up.

Next, verify it from the CLI now that startup has settled:

openclaw gateway status
openclaw channels status --probe

If those come back healthy, then the QMD corruption fix worked and you are into cleanup-only territory.

If you use Tailscale Serve for remote access, that permission problem is the next thing to fix, but it is a different issue from the crash loop.

Docs: https://docs.openclaw.ai/help/troubleshooting
QMD: https://docs.openclaw.ai/concepts/memory-qmd

If you paste those two status commands, I can tell you if you’re fully in the clear.

digital dirge
#

totaluser@Total-Hardware:~$ openclaw gateway status

🦞 OpenClaw 2026.4.11 (769908e)
We ship features faster than Apple ships calculator updates.

│
◇
Service: systemd (enabled)
File logs: /tmp/openclaw/openclaw-2026-04-13.log
Command: /home/linuxbrew/.linuxbrew/opt/node/bin/node /home/totaluser/.npm-global/lib/node_modules/openclaw/dist/entry.js gateway --port 18789
Service file: ~/.config/systemd/user/openclaw-gateway.service
Service env: OPENCLAW_GATEWAY_PORT=18789

Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json

Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.

Runtime: running (pid 2006307, state active, sub running, last exit 0, reason 0)
RPC probe: ok

Listening: 127.0.0.1:18789
Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting
totaluser@Total-Hardware:~$

dense wigeonBOT
#

Yep, that looks healthy now.

The key lines are:

  • Runtime: running
  • RPC probe: ok
  • Listening: 127.0.0.1:18789

So the gateway crash loop is fixed. The corrupt Scout QMD index was the blocker, and moving ~/.openclaw/agents/scout/qmd aside let OpenClaw rebuild it cleanly.

What is still separate:

  • Tailscale Serve permission error: only matters if you use remote Tailscale access
  • Task registry restore failure: only matters if tasks/automation are acting weird

If you want one final sanity check for message transport:

openclaw channels status --probe

I’d also keep ~/.openclaw/agents/scout/qmd.bad.<timestamp> around for a bit, just in case you want to inspect it later.

And honestly, I’d file this as a bug, because one corrupt QMD index should not be able to take the whole gateway down: https://github.com/openclaw/openclaw/issues

Glad we caught it. 🎉
If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it for the next person.

digital dirge
#

can you give me the right bug report info to submit

dense wigeonBOT
#

Yep, here’s a clean issue draft you can paste into GitHub:

Title: Corrupt QMD index for one agent can crash the whole gateway in a restart loop

## Version
- OpenClaw: `2026.4.11 (769908e)`

## Environment
- OS: Linux
- Service mode: user `systemd`
- Gateway bind: loopback (`127.0.0.1:18789`)
- Config path: `~/.openclaw/openclaw.json`

## Summary
A corrupted QMD SQLite index for a single agent caused the entire gateway to crash repeatedly instead of isolating the failure to that agent.

## What happened
The gateway was exiting roughly every 5 to 6 minutes with status code 1. During that period, messages sent to the affected agent were dropped.

Journal/logs showed QMD-related failures ending in:

```txt
SqliteError: database disk image is malformed
  at qmd store.js:1878
  at listCollections -> updateCollections
```

## Diagnostic steps
I stopped the gateway and ran:

```bash
for db in ~/.openclaw/agents/*/qmd/xdg-cache/qmd/index.sqlite; do
  echo "== $db =="
  sqlite3 "$db" 'PRAGMA integrity_check;'
done
```

All agents returned `ok` except `scout`, which returned many corruption errors from `PRAGMA integrity_check`.

Example:

```txt
== ~/.openclaw/agents/scout/qmd/xdg-cache/qmd/index.sqlite ==
*** in database main ***
Tree 12 page 10747: btreeInitPage() returns error code 11
...
Tree 12 page 12 cell 221: Rowid 5229 out of order
...
```

This suggests the main `index.sqlite` was corrupted, not just the WAL.

## Workaround that fixed it
I stopped the gateway, moved aside only the corrupt QMD home for that agent, then restarted:

```bash
openclaw gateway stop
mv ~/.openclaw/agents/scout/qmd ~/.openclaw/agents/scout/qmd.bad.$(date +%s)
openclaw gateway start
```

After that, the gateway recovered successfully:

- `Runtime: running`
- `RPC probe: ok`
- `Listening: 127.0.0.1:18789`

## Expected behavior
If one agent's QMD index is corrupt, OpenClaw should degrade gracefully, for example:
- isolate the failure to that agent
- warn loudly in logs
- optionally rebuild that agent's QMD cache
- but not crash the entire gateway repeatedly

## Actual behavior
A single corrupt QMD index appears to take down the whole gateway, causing a restart loop until manual cleanup.

## Notes
There were also non-fatal log lines after recovery:
- `tailscale serve failed ... Access denied`
- `Failed to restore task registry`

Those appear separate from the QMD corruption issue.

## Attachments
I can provide:
- `journalctl --user -u openclaw-gateway.service -n 200 --no-pager`
- `PRAGMA integrity_check` output for the corrupt DB
- `openclaw gateway status` after recovery

Issue link: https://github.com/openclaw/openclaw/issues

I’d use this title if you want a shorter one:


QMD corruption in one agent can crash the whole gateway instead of failing open

If you want, I can also turn this into a tighter, maintainer-friendly version before you submit it.
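
If it helps, the attachments listed in the draft can be gathered into one folder before you open the issue. This is a rough sketch only: `collect_bug_report` and the output file names are hypothetical, and it simply reruns commands already used in this thread. Point the second argument at the quarantined database (for the QMD case, the `index.sqlite` inside the `qmd.bad.<timestamp>` directory).

```shell
# Hypothetical helper: save the bug-report attachments into one directory.
# $1 = output directory, $2 = path to the quarantined SQLite database.
collect_bug_report() {
  cbr_out=$1
  cbr_db=$2
  mkdir -p "$cbr_out" || return 1
  # Each command may fail (e.g. on a corrupt DB); keep going and record output.
  openclaw --version > "$cbr_out/version.txt" 2>&1 || true
  journalctl --user -u openclaw-gateway.service -n 200 --no-pager \
    > "$cbr_out/journal.txt" 2>&1 || true
  sqlite3 "$cbr_db" 'PRAGMA integrity_check;' \
    > "$cbr_out/integrity_check.txt" 2>&1 || true
  openclaw gateway status > "$cbr_out/gateway_status.txt" 2>&1 || true
  # List what was written so it is easy to attach.
  ls "$cbr_out"
}
```

Skim the files before uploading, since journal output can include local paths and hostnames.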

digital dirge
#

how can i verify that nothing is corrupted again

dense wigeonBOT
#

You can’t prove it will never recur, but you can verify the current state pretty well.

I’d do this:

1. Make sure it stays up past the old failure window

Since it used to die every ~5 to 6 minutes, let it run for at least 15 to 20 minutes.

openclaw gateway status
openclaw logs --follow

Good signs:

  • RPC probe: ok
  • no new SqliteError
  • no database disk image is malformed
  • no restart loop
#

2. Re-run integrity checks after it has rebuilt

With the gateway stopped again:

openclaw gateway stop

for db in ~/.openclaw/agents/*/qmd/xdg-cache/qmd/index.sqlite; do
  echo "== $db =="
  sqlite3 "$db" 'PRAGMA integrity_check;'
done

openclaw gateway start

What you want:

  • every DB returns ok
  • especially scout, which was the bad one before
#

3. Verify transport still works

If you want to be extra sure messages are flowing:

openclaw channels status --probe

4. Keep an eye out for the exact bad signature

If it happens again, the giveaway will usually be one of these in logs:

  • database disk image is malformed
  • btreeInitPage() returns error code 11
  • repeated QMD startup failures
  • gateway exiting/restarting every few minutes

So the short version is: stable for 15 to 20 min + clean integrity_check + healthy gateway status.
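
If you don't want to babysit the terminal for step 1, a small polling loop can do the waiting. A minimal sketch, assuming a hypothetical helper name `watch_gateway`; it only relies on `openclaw gateway status` printing `RPC probe: ok` when healthy, as in the output earlier in this thread.

```shell
# Hypothetical helper: poll the gateway N times and stop at the first failure.
# $1 = number of polls, $2 = seconds to sleep between polls.
watch_gateway() {
  wg_checks=$1
  wg_interval=$2
  wg_i=1
  while [ "$wg_i" -le "$wg_checks" ]; do
    # Healthy status output contains the line "RPC probe: ok".
    if ! openclaw gateway status 2>&1 | grep -q 'RPC probe: ok'; then
      echo "check $wg_i/$wg_checks: RPC probe not ok" >&2
      return 1
    fi
    wg_i=$((wg_i + 1))
    # Sleep only if another poll is still to come.
    if [ "$wg_i" -le "$wg_checks" ]; then
      sleep "$wg_interval"
    fi
  done
  echo "gateway answered $wg_checks consecutive probes"
}
```

For example, `watch_gateway 20 60` would poll once a minute for roughly 20 minutes, which covers the old failure window several times over. A failed probe right after a restart can just be warm-up, so give the gateway a moment before starting the loop.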

digital dirge
#

totaluser@Total-Hardware:~$ openclaw gateway stop

for db in ~/.openclaw/agents/*/qmd/xdg-cache/qmd/index.sqlite; do
echo "== $db =="
sqlite3 "$db" 'PRAGMA integrity_check;'
done

openclaw gateway start

🦞 OpenClaw 2026.4.11 (769908e)
I'm not magic—I'm just extremely persistent with retries and coping strategies.

Stopped systemd service: openclaw-gateway.service
== /home/totaluser/.openclaw/agents/alfred/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/cipher/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/forge/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/main/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/rush/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/scout/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/sentinel/qmd/xdg-cache/qmd/index.sqlite ==
ok
== /home/totaluser/.openclaw/agents/vital/qmd/xdg-cache/qmd/index.sqlite ==
ok

🦞 OpenClaw 2026.4.11 (769908e)
I autocomplete your thoughts—just slower and with more API calls.

Restarted systemd service: openclaw-gateway.service
totaluser@Total-Hardware:~$

dense wigeonBOT
#

Perfect. That is the confirmation you wanted.

Every QMD DB now returns ok, including Scout, so the corrupted index has been rebuilt cleanly.

At this point I’d call the QMD corruption issue repaired. The only last sanity check is time:

  • if the gateway stays up for another 10 to 15 minutes
  • and openclaw gateway status still shows RPC probe: ok

then you’re basically in the clear.

I’d still submit the bug report, because the recovery is good, but the original behavior is still a real bug.

If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow indexes it for the next person. 🎉

digital dirge
#

ok im still getting crashes this is what my agent says now

[4/13/26 5:10 PM] Total: Found it. The crash is:

Error: database disk image is malformed
at migrateLegacyOwnerColumns (task-registry-C2FGQI1O.js:1044:5)
at ensureSchema (task-registry-C2FGQI1O.js:1099:2)
at openTaskRegistryDatabase
at withWriteTransaction
at upsertTaskWithDeliveryState
at updateTask
at markTaskLostById
at markTaskLost
at runTaskRegistryMaintenance

This is the task registry database (/home/totaluser/.openclaw/tasks/runs.sqlite) — not the QMD databases. It's crashing every ~6 minutes when the task registry maintenance job tries to mark a task as lost and hits a corrupted SQLite file.

Let me check it:
[4/13/26 5:10 PM] Total: The database file exists (6MB, modified 05:28 today). sqlite_master query returns 12, so the database opens. Let me check for WAL files:
[4/13/26 5:11 PM] Total: No WAL files. The database is internally corrupted — SELECT count(*) works but specific operations like migrateLegacyOwnerColumns fail.

Root cause: /home/totaluser/.openclaw/tasks/runs.sqlite is corrupted — the OpenClaw task registry database. Every ~6 minutes, the maintenance job runs and crashes the gateway.

Fix: Delete the corrupted database. OpenClaw will rebuild it fresh (task history is in Mission Control DB, not here).

dense wigeonBOT
#

Yep, this new diagnosis is plausible, and it is a different corruption from the QMD one.

A couple important corrections though:

  • ~/.openclaw/tasks/runs.sqlite is the task registry store, per docs.
  • So if you wipe it, you will lose task records/status history stored there. You should move it aside, not delete it.

The timing also fits pretty well with the task sweeper behavior in the docs: maintenance runs every 60s, and “lost task” reconciliation kicks in after about 5 minutes. That matches the delayed crash pattern.
Docs: https://docs.openclaw.ai/automation/tasks

I’d do this next:

openclaw gateway stop
sqlite3 ~/.openclaw/tasks/runs.sqlite 'PRAGMA integrity_check;'
#

If that does not return ok, or if logs keep pointing at task-registry corruption, then quarantine it:

mv ~/.openclaw/tasks/runs.sqlite ~/.openclaw/tasks/runs.sqlite.bad.$(date +%s)
mv ~/.openclaw/tasks/runs.sqlite-wal ~/.openclaw/tasks/runs.sqlite-wal.bad.$(date +%s) 2>/dev/null || true
mv ~/.openclaw/tasks/runs.sqlite-shm ~/.openclaw/tasks/runs.sqlite-shm.bad.$(date +%s) 2>/dev/null || true

openclaw gateway start
openclaw gateway status
openclaw logs --follow

So, short version:

  • QMD corruption was real and fixed
  • task registry corruption may be the second crash source
  • do not delete first
  • move runs.sqlite aside if integrity_check fails

If you paste the PRAGMA integrity_check; result for ~/.openclaw/tasks/runs.sqlite, I can tell you whether to pull the trigger on that move.
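
The three `mv` lines above each call `date +%s` separately, so the suffixes could differ by a second. If you would rather keep the database and its WAL/SHM side files under one shared timestamp, here is a minimal sketch; `quarantine_sqlite` is a hypothetical helper name, and it moves files aside without deleting anything.

```shell
# Hypothetical helper: move a SQLite file and its -wal/-shm side files aside
# under one shared timestamp. Prints the timestamp that was used.
quarantine_sqlite() {
  qs_db=$1
  qs_ts=$(date +%s)
  if [ ! -e "$qs_db" ]; then
    echo "not found: $qs_db" >&2
    return 1
  fi
  for qs_suffix in "" "-wal" "-shm"; do
    # Side files may not exist; only move what is actually there.
    if [ -e "$qs_db$qs_suffix" ]; then
      mv -- "$qs_db$qs_suffix" "$qs_db$qs_suffix.bad.$qs_ts"
    fi
  done
  echo "$qs_ts"
}
```

With the gateway stopped, usage would be `quarantine_sqlite ~/.openclaw/tasks/runs.sqlite`, followed by `openclaw gateway start`.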

digital dirge
#

totaluser@Total-Hardware:~$ openclaw gateway stop
sqlite3 ~/.openclaw/tasks/runs.sqlite 'PRAGMA integrity_check;'

🦞 OpenClaw 2026.4.11 (769908e)
Open source means you can see exactly how I judge your config.

Stopped systemd service: openclaw-gateway.service
*** in database main ***
Tree 2 page 1290 cell 455: 2nd reference to page 1002
Tree 2 page 1290 cell 454: 2nd reference to page 997
Tree 2 page 1290 cell 453: 2nd reference to page 995
Tree 2 page 1290 cell 452: 2nd reference to page 994
Tree 2 page 1290 cell 450: 2nd reference to page 1001
Tree 2 page 1290 cell 448: Rowid 4866 out of order
Tree 2 page 1290 cell 446: 2nd reference to page 991
Tree 2 page 696 cell 12: 2nd reference to page 1011
Tree 2 page 696 cell 11: 2nd reference to page 1010
Tree 2 page 696 cell 10: 2nd reference to page 1009
Tree 2 page 696 cell 8: 2nd reference to page 1006
Tree 2 page 696 cell 5: 2nd reference to page 1003
Tree 8 page 8 cell 15: 2nd reference to page 1411
Page 361: never used
wrong # of entries in index idx_task_runs_parent_flow_id
wrong # of entries in index idx_task_runs_child_session_key
wrong # of entries in index idx_task_runs_owner_key
wrong # of entries in index idx_task_runs_last_event_at
wrong # of entries in index idx_task_runs_cleanup_after
wrong # of entries in index idx_task_runs_runtime_status
wrong # of entries in index idx_task_runs_status
wrong # of entries in index idx_task_runs_run_id
wrong # of entries in index sqlite_autoindex_task_runs_1
row 15 missing from index idx_task_runs_child_session_key
row 15 missing from index idx_task_runs_owner_key
row 15 missing from index idx_task_runs_run_id
row 15 missing from index sqlite_autoindex_task_runs_1
this pattern continues all the way down to row 37

dense wigeonBOT
#

Yep, that is definitely corrupted. I’d move it aside now.

Since PRAGMA integrity_check; failed on ~/.openclaw/tasks/runs.sqlite, this is not a false alarm. The next step is:

mv ~/.openclaw/tasks/runs.sqlite ~/.openclaw/tasks/runs.sqlite.bad.$(date +%s)
mv ~/.openclaw/tasks/runs.sqlite-wal ~/.openclaw/tasks/runs.sqlite-wal.bad.$(date +%s) 2>/dev/null || true
mv ~/.openclaw/tasks/runs.sqlite-shm ~/.openclaw/tasks/runs.sqlite-shm.bad.$(date +%s) 2>/dev/null || true

openclaw gateway start
openclaw gateway status
openclaw logs --follow

Important caveat:

  • this should let OpenClaw rebuild the task registry cleanly
  • but you will lose the task records/status history stored in that DB

Docs for where task records live: https://docs.openclaw.ai/automation/tasks

Also, I’m a little worried now, because you’ve hit two separate SQLite corruptions (scout QMD and tasks/runs.sqlite). Once the gateway is stable, I would also check for an underlying disk / filesystem / unclean-shutdown problem, not just OpenClaw itself.

For now though, yes, I’d proceed with moving runs.sqlite aside. If you paste the new gateway status after restart, I’ll sanity check it.
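
One small caveat on the three mv lines above: each one calls date +%s separately, so the suffixes can differ by a second. A minimal sketch of a helper that uses one shared timestamp follows. quarantine_db is a hypothetical name, not an OpenClaw command, and it assumes the gateway is already stopped.

```bash
# Sketch: move a SQLite database and its -wal/-shm sidecars aside under ONE
# shared timestamp so the files stay recognisable as a set.
# quarantine_db is a hypothetical helper, not part of OpenClaw.
quarantine_db() {
  local db ts suffix f
  db="$1"
  ts="$(date +%s)"
  if [ ! -e "$db" ]; then
    echo "no such file: $db" >&2
    return 1
  fi
  for suffix in "" "-wal" "-shm"; do
    f="${db}${suffix}"
    # Sidecar files only exist in WAL mode, so skip any that are missing.
    if [ -e "$f" ]; then
      mv -- "$f" "${f}.bad.${ts}"
    fi
  done
  # Print the timestamp so the caller can find the quarantined files later.
  echo "$ts"
}
```

Usage would be quarantine_db ~/.openclaw/tasks/runs.sqlite, followed by the same openclaw gateway start as above.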

digital dirge
#

totaluser@Total-Hardware:~$ mv ~/.openclaw/tasks/runs.sqlite ~/.openclaw/tasks/runs.sqlite.bad.$(date +%s)
mv ~/.openclaw/tasks/runs.sqlite-wal ~/.openclaw/tasks/runs.sqlite-wal.bad.$(date +%s) 2>/dev/null || true
mv ~/.openclaw/tasks/runs.sqlite-shm ~/.openclaw/tasks/runs.sqlite-shm.bad.$(date +%s) 2>/dev/null || true

openclaw gateway start
openclaw gateway status

🦞 OpenClaw 2026.4.11 (769908e)
Your terminal just grew claws—type something and let the bot pinch the busywork.

Restarted systemd service: openclaw-gateway.service

🦞 OpenClaw 2026.4.11 (769908e)
I don't have opinions about tabs vs spaces. I have opinions about everything else.

│
◇
Service: systemd (enabled)
File logs: /tmp/openclaw/openclaw-2026-04-13.log
Command: /home/linuxbrew/.linuxbrew/opt/node/bin/node /home/totaluser/.npm-global/lib/node_modules/openclaw/dist/entry.js gateway --port 18789
Service file: ~/.config/systemd/user/openclaw-gateway.service
Service env: OPENCLAW_GATEWAY_PORT=18789

Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json

Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.

Runtime: running (pid 2083457, state active, sub running, last exit 0, reason 0)
Warm-up: launch agents can take a few seconds. Try again shortly.
RPC probe: failed
RPC target: ws://127.0.0.1:18789
gateway closed (1006):

Gateway port 18789 is not listening (service appears running).
Logs: journalctl --user -u openclaw-gateway.service -n 200 --no-pager

Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting
totaluser@Total-Hardware:~$

dense wigeonBOT
#

This one may just be another too-early status check.

The clue is right there:

  • Warm-up: launch agents can take a few seconds. Try again shortly.

You hit gateway status immediately after restart, and earlier we already saw this exact race.

I’d do this next:

sleep 15
openclaw gateway status

If it still says RPC probe: failed after that, then grab the newest service log again:

journalctl --user -u openclaw-gateway.service -n 120 --no-pager

What I’m looking for is whether it reaches a line like:

  • [gateway] ready

If yes, the task-registry move worked and the first probe was just early.
If no, then there is still a third startup issue and we need the fresh log block.

So, not panicking yet. Give it 15 seconds, rerun status, then paste the result.
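
If you would rather not guess the delay, the same idea can be written as a polling loop. This is only a sketch: wait_for_output is a hypothetical helper, and the marker text "RPC probe: ok" is assumed from the healthy status output seen earlier in this thread.

```bash
# Sketch: rerun a status command until its output contains a marker string.
# wait_for_output is a hypothetical helper, not an OpenClaw command.
# Arguments: marker, max attempts, delay in seconds, then the command to run.
wait_for_output() {
  local marker attempts delay i out
  marker="$1"
  attempts="$2"
  delay="$3"
  shift 3
  i=1
  while [ "$i" -le "$attempts" ]; do
    # A non-zero exit from the command just means "not ready yet".
    out="$("$@" 2>&1)" || true
    case "$out" in
      *"$marker"*) return 0 ;;
    esac
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, wait_for_output "RPC probe: ok" 6 5 openclaw gateway status would try up to six times, five seconds apart, and return non-zero if the probe never comes back ok.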

digital dirge
#

totaluser@Total-Hardware:~$ sleep 15
openclaw gateway status

🦞 OpenClaw 2026.4.11 (769908e)
I can't fix your code taste, but I can fix your build and your backlog.

│
◇
Service: systemd (enabled)
File logs: /tmp/openclaw/openclaw-2026-04-13.log
Command: /home/linuxbrew/.linuxbrew/opt/node/bin/node /home/totaluser/.npm-global/lib/node_modules/openclaw/dist/entry.js gateway --port 18789
Service file: ~/.config/systemd/user/openclaw-gateway.service
Service env: OPENCLAW_GATEWAY_PORT=18789

Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json

Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.

Runtime: running (pid 2083457, state active, sub running, last exit 0, reason 0)
RPC probe: ok

Listening: 127.0.0.1:18789
Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting
totaluser@Total-Hardware:~$

dense wigeonBOT
#

Nice, that looks healthy again.

So at this point:

  • Scout QMD corruption was real
  • task registry corruption was also real
  • after moving both bad SQLite stores aside, the gateway is back to:
    • RPC probe: ok
    • Listening: 127.0.0.1:18789

I’d still watch it for one more old crash window, about 10 to 15 minutes, but this is the best signal so far that the restart loop is actually gone.

One thing I’d strongly recommend after the immediate fire is out: check for an underlying disk / filesystem / unclean shutdown issue, because hitting two separate SQLite corruptions is a little suspicious.
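
Before digging into disk-level causes, it may also be worth confirming that no other SQLite store under ~/.openclaw is damaged. The sketch below assumes the sqlite3 command-line shell is installed and that every store ends in .sqlite. check_sqlite_tree and the SQLITE_BIN override are hypothetical names, and I would run it with the gateway stopped.

```bash
# Sketch: run PRAGMA integrity_check on every *.sqlite file under a directory
# and report the ones that do not answer "ok".
# check_sqlite_tree and SQLITE_BIN are hypothetical names for illustration.
check_sqlite_tree() {
  local root bin db result bad
  root="$1"
  bin="${SQLITE_BIN:-sqlite3}"
  bad=0
  while IFS= read -r db; do
    # find may return nothing, which leaves one empty line in the here-doc.
    [ -n "$db" ] || continue
    result="$("$bin" "$db" 'PRAGMA integrity_check;' 2>&1)" || true
    if [ "$result" != "ok" ]; then
      echo "CORRUPT: $db"
      bad=1
    fi
  done <<EOF
$(find "$root" -type f -name '*.sqlite')
EOF
  # Exit status 1 means at least one database failed the check.
  return "$bad"
}
```

Usage would be check_sqlite_tree ~/.openclaw. A healthy database prints the single line ok for the pragma, so anything else is treated as a failure here.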

And I’d update the GitHub bug report to mention both corrupt DBs:

  • ~/.openclaw/agents/scout/qmd/xdg-cache/qmd/index.sqlite
  • ~/.openclaw/tasks/runs.sqlite

Glad we finally pinned it down. 🎉
If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it.

digital dirge
#

how can i fix these doctor warnings

i will send you two messages wait for the second one

  • channels.telegram: Telegram is in first-time setup mode. DMs are open. Group messages stay blocked until you add allowed chats under channels.telegram.groups (and optional sender IDs under channels.telegram.groupAllowFrom), or set channels.telegram.groupPolicy to "open" if you want broad group access.
  • channels.telegram.accounts.default: Telegram is in first-time setup mode. DMs are open. Group messages stay blocked until you add allowed chats under channels.telegram.accounts.default.groups (and optional sender IDs under channels.telegram.accounts.default.groupAllowFrom), or set channels.telegram.accounts.default.groupPolicy to "open" if you want broad group access.
  • channels.telegram.accounts.forge: Telegram is in first-time setup mode. DMs are open. Group messages stay blocked until you add allowed chats under channels.telegram.accounts.forge.groups (and optional sender IDs under channels.telegram.accounts.forge.groupAllowFrom), or set channels.telegram.accounts.forge.groupPolicy to "open" if you want broad group access.
  • channels.discord.groupPolicy is "allowlist" but groupAllowFrom (and allowFrom) is empty - all group messages will be silently dropped. Add sender IDs to channels.discord.groupAllowFrom or channels.discord.allowFrom, or set groupPolicy to "open".
#

  • channels.discord.accounts.total.groupPolicy is "allowlist" but groupAllowFrom (and allowFrom) is empty - all group messages will be silently dropped. Add sender IDs to channels.discord.accounts.total.groupAllowFrom or channels.discord.accounts.total.allowFrom, or set groupPolicy to "open".
  • channels.discord.accounts.forge.groupPolicy is "allowlist" but groupAllowFrom (and allowFrom) is empty - all group messages will be silently dropped. Add sender IDs to channels.discord.accounts.forge.groupAllowFrom or channels.discord.accounts.forge.allowFrom, or set groupPolicy to "open".
  • channels.discord.accounts.rush.groupPolicy is "allowlist" but groupAllowFrom (and allowFrom) is empty - all group messages will be silently dropped. Add sender IDs to channels.discord.accounts.rush.groupAllowFrom or channels.discord.accounts.rush.allowFrom, or set groupPolicy to "open".
  • channels.discord.accounts.default.groupPolicy is "allowlist" but groupAllowFrom (and allowFrom) is empty - all group messages will be silently dropped. Add sender IDs to channels.discord.accounts.default.groupAllowFrom or channels.discord.accounts.default.allowFrom, or set groupPolicy to "open".

sorry, 3 parts, there's one more

#

  • channels.discord.accounts.cipher.groupPolicy is "allowlist" but groupAllowFrom (and allowFrom) is empty - all group messages will be silently dropped. Add sender IDs to channels.discord.accounts.cipher.groupAllowFrom or channels.discord.accounts.cipher.allowFrom, or set groupPolicy to "open".

dense wigeonBOT
#

Those warnings are mostly saying, “your group access policy is implicit / half-configured.”

The clean fix depends on what you actually want.

My recommendation, safest

If you only want DMs and do not want the bot active in groups, make that explicit:

{
  channels: {
    telegram: {
      groupPolicy: "disabled",
      accounts: {
        default: { groupPolicy: "disabled" },
        forge: { groupPolicy: "disabled" },
      },
    },
    discord: {
      groupPolicy: "disabled",
      accounts: {
        default: { groupPolicy: "disabled" },
        total: { groupPolicy: "disabled" },
        forge: { groupPolicy: "disabled" },
        rush: { groupPolicy: "disabled" },
        cipher: { groupPolicy: "disabled" },
      },
    },
  },
}

That should silence the “groups will be dropped” warnings, because now you are explicitly saying “yes, block groups.”

#

If you do want Telegram groups

Keep groupPolicy: "allowlist", then add:

  • which groups are allowed, under channels.telegram.groups
  • which users are allowed to trigger, via groupAllowFrom or allowFrom

Docs-backed pattern:

{
  channels: {
    telegram: {
      allowFrom: ["tg:123456789"],
      groupPolicy: "allowlist",
      groups: {
        "-1001234567890": { requireMention: true },
      },
      // optional, if omitted Telegram falls back to allowFrom
      groupAllowFrom: ["tg:123456789"],
    },
  },
}

Telegram docs: https://docs.openclaw.ai/channels/telegram

If you do want Discord guild/channel messages

For Discord, the allowlist is mainly the guild/channel config, not Telegram-style group IDs.

Docs-backed pattern:

{
  channels: {
    discord: {
      groupPolicy: "allowlist",
      guilds: {
        "123456789012345678": {
          channels: {
            help: { allow: true, requireMention: true },
            general: { allow: true },
          },
          // optional extra sender restriction
          users: ["987654321098765432"],
        },
      },
    },
  },
}

Discord docs: https://docs.openclaw.ai/channels/discord
Groups docs: https://docs.openclaw.ai/channels/groups

What I would not recommend

groupPolicy: "open" unless you really mean it. Docs are pretty clear that open groups are a last resort.

Short version

Pick one of these for each warned account/channel:

  1. DM only → groupPolicy: "disabled"
  2. Specific groups only → groupPolicy: "allowlist" + real allowlist config
  3. Anything goes → groupPolicy: "open" (not recommended)
#

If you want, paste the relevant channels.telegram and channels.discord blocks from your config, and I can tell you the smallest exact edit to make.

digital dirge
#

i want discord servers, groups, and channels to be allowed. i dont really use telegram groups but thats fine.

dense wigeonBOT
#

Got it.

Then I’d configure it like this:

Telegram, disable groups

Since you do not really use Telegram groups, make that explicit:

{
  channels: {
    telegram: {
      groupPolicy: "disabled",
      accounts: {
        default: { groupPolicy: "disabled" },
        forge: { groupPolicy: "disabled" },
      },
    },
  },
}

Discord, two ways

Recommended

Allow only the Discord servers you actually want, but allow all channels in them unless you restrict channels explicitly.

{
  channels: {
    discord: {
      groupPolicy: "allowlist",
      guilds: {
        "YOUR_GUILD_ID_1": {},
        "YOUR_GUILD_ID_2": {},
      },
      accounts: {
        default: { groupPolicy: "allowlist" },
        total: { groupPolicy: "allowlist" },
        forge: { groupPolicy: "allowlist" },
        rush: { groupPolicy: "allowlist" },
        cipher: { groupPolicy: "allowlist" },
      },
    },
  },
}
#

Per the Discord docs, if a guild is allowlisted and has no channels block, then all channels in that guild are allowed.

Broadest

If you really mean “any Discord server/channel this bot account can see,” set Discord groups open:

{
  channels: {
    discord: {
      groupPolicy: "open",
      accounts: {
        default: { groupPolicy: "open" },
        total: { groupPolicy: "open" },
        forge: { groupPolicy: "open" },
        rush: { groupPolicy: "open" },
        cipher: { groupPolicy: "open" },
      },
    },
  },
}

If you want, paste your current channels.discord block and I’ll turn it into the exact final JSON5 for your case.

digital dirge
#

give me all the commands to paste and give you output of, then build me config set commands to make it correct

dense wigeonBOT
#

Yep. Paste the output of these read-only commands, and I’ll turn that into exact openclaw config set commands for you.

Please redact token values before pasting.

openclaw --version
openclaw config file
openclaw config get channels.telegram --json
openclaw config get channels.discord --json

If one of those config get commands says the path is missing or null, paste that too.

Why these four:

  • --version makes sure I target the right config behavior
  • config file confirms which file is live
  • channels.telegram tells me whether to disable groups at root, per-account, or both
  • channels.discord tells me whether to set root policy, account overrides, guild config, or all three

Once you paste those, I’ll give you a copy-paste block of openclaw config set ... commands that matches your actual config.

digital dirge
#

guilds are redacted with x
OpenClaw 2026.4.12 (1c0672b)

🦞 OpenClaw 2026.4.12 (1c0672b)
I don't just autocomplete—I auto-commit (emotionally), then ask you to review (logically).

~/.openclaw/openclaw.json
{
  "enabled": true,
  "dmPolicy": "open",
  "allowFrom": [
    "*"
  ],
  "groupPolicy": "allowlist",
  "streaming": {
    "mode": "off"
  },
  "accounts": {
    "default": {
      "dmPolicy": "open",
      "botToken": "OPENCLAW_REDACTED",
      "allowFrom": [
        "*"
      ],
      "groupPolicy": "allowlist",
      "streaming": {
        "mode": "partial"
      }
    },
    "forge": {
      "dmPolicy": "open",
      "botToken": "OPENCLAW_REDACTED",
      "allowFrom": [
        "*"
      ],
      "groupPolicy": "allowlist",
      "streaming": {
        "mode": "partial"
      }
    }
  }
}
{
  "enabled": true,
  "allowBots": true,
  "groupPolicy": "allowlist",
  "streaming": {
    "mode": "off"
  },
  "accounts": {
    "total": {
      "token": "OPENCLAW_REDACTED",
      "groupPolicy": "allowlist",
      "streaming": {
        "mode": "off"
      },
      "guilds": {
        "x": {
          "requireMention": true
        }
      }
    },
    "forge": {
      "token": "OPENCLAW_REDACTED",
      "groupPolicy": "allowlist",
      "streaming": {
        "mode": "off"
      },
      "guilds": {
        "x": {
          "requireMention": true
        }
      }
    },
    "rush": {
      "token": "OPENCLAW_REDACTED",
      "groupPolicy": "allowlist",
      "streaming": {
        "mode": "off"
      },
      "guilds": {
        "x": {
          "requireMention": true
        }
      }
    },
    "default": {
      "groupPolicy": "allowlist",
      "streaming": {
        "mode": "off"
      }
    },
    "cipher": {
      "token": "OPENCLAW_REDACTED",
      "groupPolicy": "allowlist",
      "streaming": {
        "mode": "off"
      },
      "guilds": {
        "x": {
          "requireMention": true
        }
      }
    }
  }
}

dense wigeonBOT
#

Yep. Based on what you said, the cleanest config is:

  • Telegram groups: explicitly disabled
  • Discord guild/group/channel access: explicitly open

That matches your preference and should clear those warnings without needing guild IDs.

Paste this exactly:

openclaw config set channels.telegram.groupPolicy disabled
openclaw config set channels.telegram.accounts.default.groupPolicy disabled
openclaw config set channels.telegram.accounts.forge.groupPolicy disabled

openclaw config set channels.discord.groupPolicy open
openclaw config set channels.discord.accounts.default.groupPolicy open
openclaw config set channels.discord.accounts.total.groupPolicy open
openclaw config set channels.discord.accounts.forge.groupPolicy open
openclaw config set channels.discord.accounts.rush.groupPolicy open
openclaw config set channels.discord.accounts.cipher.groupPolicy open

openclaw config validate
openclaw gateway restart
openclaw doctor

What this does:

  • Telegram will stop pretending groups might be configured later
  • Discord will allow guild/server/channel traffic broadly
  • your existing Discord guilds blocks can stay there, but they are no longer the gate once groupPolicy is open

digital dirge
#

can you enable telegram groups

dense wigeonBOT
#

Yes. If you want Telegram groups broadly enabled, set the Telegram group policy to open:

openclaw config set channels.telegram.groupPolicy open
openclaw config set channels.telegram.accounts.default.groupPolicy open
openclaw config set channels.telegram.accounts.forge.groupPolicy open

openclaw config validate
openclaw gateway restart
openclaw doctor

That is the simplest way to enable Telegram groups and clear those setup warnings.

One caution, because I don’t want to oversell it: groupPolicy: "open" is broad. If the bot is in a Telegram group, that group can trigger it. If you want the safer version later, we can switch to:

  • groupPolicy: "allowlist"

digital dirge
#

sorry can you give me the entire config set for what we talked about

dense wigeonBOT
#

Yep, here’s the final full copy-paste block for the version we ended up on:

  • Telegram groups enabled
  • Discord guilds/groups/channels enabled
  • no token changes, just policy cleanup

openclaw config set channels.telegram.groupPolicy open
openclaw config set channels.telegram.accounts.default.groupPolicy open
openclaw config set channels.telegram.accounts.forge.groupPolicy open

openclaw config set channels.discord.groupPolicy open
openclaw config set channels.discord.accounts.default.groupPolicy open
openclaw config set channels.discord.accounts.total.groupPolicy open
openclaw config set channels.discord.accounts.forge.groupPolicy open
openclaw config set channels.discord.accounts.rush.groupPolicy open
openclaw config set channels.discord.accounts.cipher.groupPolicy open

openclaw config validate
openclaw gateway restart
openclaw doctor

That should make the config match what you asked for.

Small warning, just so it’s explicit: groupPolicy: "open" is broad. If those bot accounts are in a Discord server or Telegram group, they can respond there. If you want, we can tighten it later to allowlist mode once the fire drill is over.

If you paste the openclaw doctor output after running this, I’ll confirm whether the warnings are cleaned up.
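
The per-account lines above all follow one pattern, so if you add accounts later they could be generated instead of typed. A rough sketch, assuming the same config paths used in the block above. print_group_policy_cmds is a hypothetical helper that only prints the commands so you can review them before pasting.

```bash
# Sketch: print (not run) the groupPolicy commands for one channel and its accounts.
# print_group_policy_cmds is a hypothetical helper, not an OpenClaw command.
# Arguments: channel name, policy value, then the account names.
print_group_policy_cmds() {
  local channel policy account
  channel="$1"
  policy="$2"
  shift 2
  # Root-level policy for the channel first, then one line per account.
  echo "openclaw config set channels.${channel}.groupPolicy ${policy}"
  for account in "$@"; do
    echo "openclaw config set channels.${channel}.accounts.${account}.groupPolicy ${policy}"
  done
}

print_group_policy_cmds telegram open default forge
print_group_policy_cmds discord open default total forge rush cipher
```

The two calls at the bottom reproduce the nine config set lines from the block above. You would still run config validate, gateway restart, and doctor yourself afterwards.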

digital dirge
#

how do i fix this doctor message

  • Found 3 agent directories on disk without a matching agents.list entry. These agents can still have sessions/auth state on disk, but config-driven routing, identity, and model selection will ignore them.
    Examples: concierge, jonathan, total
    Restore the missing agents.list entries or remove stale dirs after confirming they are no longer needed: ~/.openclaw/agents
  • 2/5 recent sessions are missing transcripts.
    Verify sessions in store: openclaw sessions --store "/home/totaluser/.openclaw/agents/main/sessions/sessions.json"
    Preview cleanup impact: openclaw sessions cleanup --store "/home/totaluser/.openclaw/agents/main/sessions/sessions.json" --dry-run
    Prune missing entries: openclaw sessions cleanup --store "/home/totaluser/.openclaw/agents/main/sessions/sessions.json" --enforce --fix-missing

dense wigeonBOT
#

Those are mostly state cleanup warnings, not “gateway is broken” warnings.

1. Orphaned agent dirs

You have on-disk agent folders that are not in agents.list:

  • concierge
  • jonathan
  • total

Fix path A, keep them

If you still want those agents, add them back to agents.list.

First inspect what is configured vs on disk:

openclaw config get agents.list --json
ls -1 ~/.openclaw/agents

Fix path B, archive them

If they are stale, I would move them aside, not delete them:

backup="$HOME/.openclaw/agents-orphans-$(date +%s)"
mkdir -p "$backup"
mv ~/.openclaw/agents/concierge "$backup"/
mv ~/.openclaw/agents/jonathan "$backup"/
mv ~/.openclaw/agents/total "$backup"/

Then rerun:

openclaw doctor

2. Sessions missing transcripts

This one has a docs-backed cleanup path already.

Verify

openclaw sessions --store "/home/totaluser/.openclaw/agents/main/sessions/sessions.json"

Preview cleanup

openclaw sessions cleanup --store "/home/totaluser/.openclaw/agents/main/sessions/sessions.json" --dry-run
#

Fix it

openclaw sessions cleanup --store "/home/totaluser/.openclaw/agents/main/sessions/sessions.json" --enforce --fix-missing

Then rerun:

openclaw doctor

My recommendation

  • If concierge, jonathan, and total are old, archive them
  • For the missing transcripts warning, run the sessions cleanup --enforce --fix-missing command exactly as doctor suggested

Docs: https://docs.openclaw.ai/cli/sessions

#

If you want, paste the output of:

openclaw config get agents.list --json
ls -1 ~/.openclaw/agents

and I can tell you whether those three agent dirs look safe to archive.

digital dirge
#

those 3 agents shouldn't exist

#

ok help me fix this

  • Heartbeat agent "alfred": heartbeat delivery is configured while heartbeat.directPolicy for agent "alfred" is unset. Heartbeat now allows direct/DM targets by default. Set it explicitly to "allow" or "block" to pin upgrade behavior.
  • Run: openclaw security audit --deep

dense wigeonBOT
#

Then I’d archive them and clean the missing session entries.

Paste this:

backup="$HOME/.openclaw/agents-orphans-$(date +%s)"
mkdir -p "$backup"

mv ~/.openclaw/agents/concierge "$backup"/
mv ~/.openclaw/agents/jonathan "$backup"/
mv ~/.openclaw/agents/total "$backup"/

openclaw sessions cleanup --store "/home/totaluser/.openclaw/agents/main/sessions/sessions.json" --enforce --fix-missing

openclaw doctor
#

I’m recommending move, not delete, so you have an easy rollback if anything weird shows up.

If doctor comes back clean after that, you’re good. If not, paste the remaining warnings and I’ll help trim the rest.
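
If more orphaned dirs turn up later, the same move-aside step could be wrapped in a small function. This is a sketch only. archive_agent_dirs is a hypothetical helper, and it assumes the agents root is passed without a trailing slash so the backup lands next to it, as in the commands above.

```bash
# Sketch: move named agent directories into one timestamped backup folder.
# archive_agent_dirs is a hypothetical helper, not an OpenClaw command.
# Arguments: agents root (no trailing slash), then the directory names to archive.
archive_agent_dirs() {
  local agents_root backup name
  agents_root="$1"
  shift
  backup="${agents_root}-orphans-$(date +%s)"
  mkdir -p "$backup"
  for name in "$@"; do
    if [ -d "$agents_root/$name" ]; then
      mv -- "$agents_root/$name" "$backup/"
    else
      # Missing directories are reported and skipped rather than treated as fatal.
      echo "skip (not found): $name" >&2
    fi
  done
  # Print the backup location so it is easy to roll back.
  echo "$backup"
}
```

Usage would be archive_agent_dirs "$HOME/.openclaw/agents" concierge jonathan total. Rolling back is just moving a directory from the printed backup path into ~/.openclaw/agents again.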

#

That one is easy.

If Alfred is supposed to send heartbeat messages to a DM / direct target, pin it explicitly to allow:

openclaw config set agents.defaults.heartbeat.directPolicy allow
openclaw config validate
openclaw gateway restart
openclaw security audit --deep

If instead you want heartbeats to run but never send to direct/DM targets, use block instead:

openclaw config set agents.defaults.heartbeat.directPolicy block
openclaw config validate
openclaw gateway restart
openclaw security audit --deep

My recommendation

Because the warning says heartbeat delivery is configured for Alfred, I’d use:

openclaw config set agents.defaults.heartbeat.directPolicy allow

Docs: https://docs.openclaw.ai/gateway/heartbeat

If you want, paste the openclaw security audit --deep output after that and I’ll help clean up anything left.

digital dirge
#

what about this

21:00:55 [skills] Skipping escaped skill path outside its configured root: source=openclaw-managed root=~/.openclaw/skills reason=symlink-escape requested=~/.openclaw/skills/clawdcursor resolved=~/clawd-cursor
21:00:55 [skills] Skipping escaped skill path outside its configured root: source=openclaw-workspace root=~/.openclaw/workspace/skills reason=symlink-escape requested=~/.openclaw/workspace/skills/clawdcursor resolved=~/clawd-cursor

dense wigeonBOT
#

That one is not a crash, it is a safety guard.

OpenClaw found a skill entry here:

  • ~/.openclaw/skills/clawdcursor
  • ~/.openclaw/workspace/skills/clawdcursor

but those are symlinks that resolve to:

  • ~/clawd-cursor

which is outside the allowed skill root, so OpenClaw skips them on purpose.

If you want to keep the skill

Put the real skill directory inside an allowed skills folder, not as an escaping symlink.

For example:

mkdir -p ~/.openclaw/workspace/skills
cp -a ~/clawd-cursor ~/.openclaw/workspace/skills/clawdcursor
openclaw gateway restart
openclaw skills list

If you do not need it

Just move the symlinks aside so the warning stops:

mv ~/.openclaw/skills/clawdcursor ~/.openclaw/skills/clawdcursor.bak 2>/dev/null || true
mv ~/.openclaw/workspace/skills/clawdcursor ~/.openclaw/workspace/skills/clawdcursor.bak 2>/dev/null || true
openclaw gateway restart

Short version

  • safe and intentional
  • not fatal
  • fix by copying the skill into ~/.openclaw/workspace/skills/...
  • or archiving the symlink

Docs: https://docs.openclaw.ai/tools/creating-skills

If you want, I can give you the exact command for the “keep the skill” path or the “remove the warning” path.
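
To see which entries in a skills folder are symlinks pointing outside it, something like the following may help. It is only my approximation of the symlink-escape idea from the log line, not OpenClaw's actual check. list_escaping_symlinks is a hypothetical helper, it only looks at top-level entries, and it assumes a readlink that supports -f (GNU coreutils on Linux does).

```bash
# Sketch: list top-level symlinks in a directory whose resolved target
# lies outside that directory.
# list_escaping_symlinks is a hypothetical helper, not an OpenClaw command.
list_escaping_symlinks() {
  local root real_root link target
  root="$1"
  # Resolve the root itself so the comparison uses canonical paths on both sides.
  real_root="$(readlink -f -- "$root")"
  for link in "$root"/*; do
    [ -L "$link" ] || continue
    target="$(readlink -f -- "$link")"
    case "$target" in
      "$real_root"/*) ;;                 # resolves inside the root
      *) echo "$link -> $target" ;;      # resolves somewhere else
    esac
  done
}
```

Usage would be list_escaping_symlinks ~/.openclaw/skills and list_escaping_symlinks ~/.openclaw/workspace/skills. Given the log lines above, I would expect the clawdcursor entry to show up in both.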

digital dirge
#

how about this one

21:01:03 [plugins] plugins.allow is empty; discovered non-bundled plugins may auto-load: lossless-claw (/home/totaluser/.openclaw/extensions/lossless-claw/dist/index.js). Set plugins.allow to explicit trusted ids.

#

?