My agent said this:
Found the root cause. The gateway is crashing every ~6 minutes with status code 1 (FAILURE). The restart counter is at 131 as of this morning.
Root cause: Corrupted QMD sqlite database
SqliteError: database disk image is malformed
at qmd store.js:1878
at listCollections → updateCollections
Every time qmd (the memory system) tries to boot, it hits a corrupted sqlite database and crashes the gateway. It keeps restarting, hitting the corrupted DB, crashing again. The pattern is every ~6 minutes.
Impact on messages: Yes — if you sent Rush a message during the crash window (between 20:02 and 20:25), it was dropped. That's the exact same pattern.
Fix options:
- Rebuild qmd database — find which sqlite file(s) are corrupted and restore/rebuild them
- Wipe qmd data — delete the corrupted sqlite files and let qmd rebuild from scratch (loses memory)
- Fix the underlying cause — stop the corruption from happening in the first place
The qmd databases are at:
• /home/totaluser/.openclaw/agents/*/qmd/xdg-cache/qmd/index.sqlite (one per agent)
One of those is corrupt. Do you want me to identify which one and fix it?
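One way to find out which file is corrupt is to run SQLite's built-in `PRAGMA integrity_check` against every database matching the path above. This is only a rough sketch using Python's standard `sqlite3` module. It assumes the gateway is stopped while it runs and that the glob pattern from the agent's message matches your layout. The helper name `check_db` is made up for illustration and is not part of qmd.

```python
import glob
import sqlite3
from pathlib import Path

# Glob pattern copied from the agent's message; adjust it if your layout differs.
PATTERN = "/home/totaluser/.openclaw/agents/*/qmd/xdg-cache/qmd/index.sqlite"


def check_db(path):
    """Return a list of problems found in the database ([] means it looks healthy).

    Hypothetical helper for illustration, not part of qmd.
    """
    # Open read-only so the check itself does not modify the main database file.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        # Errors such as "database disk image is malformed" end up here.
        return [str(exc)]
    messages = [row[0] for row in rows]
    return [] if messages == ["ok"] else messages


if __name__ == "__main__":
    for db in sorted(glob.glob(PATTERN)):
        problems = check_db(db)
        status = "OK" if not problems else "CORRUPT: " + "; ".join(problems)
        print(db, "->", status)
```

Any path reported as something other than OK is a candidate for the restore or rebuild options listed above.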
I said yes, figure it out, and he said this:
The WAL files are 4-5MB each - active uncommitted changes that can't be checkpointed because the gateway keeps crashing before finishing. This is the root cause: every ~6 minutes the gateway crashes, the WAL is left corrupted, then it restarts and tries to read the corrupted WAL.
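Before touching the WAL files it may be worth recording their sizes and copying everything somewhere safe. In SQLite, a `-wal` file can hold committed transactions that have not been checkpointed into the main database yet, so removing it can discard data. The sketch below assumes the usual SQLite naming (`index.sqlite-wal` and `index.sqlite-shm` next to `index.sqlite`) and that the gateway is stopped. The function names and the backup directory are made up for illustration.

```python
import os
import shutil


def sidecars(db_path):
    """Return {suffix: size_in_bytes} for the -wal and -shm files that exist next to db_path."""
    found = {}
    for suffix in ("-wal", "-shm"):
        candidate = db_path + suffix
        if os.path.exists(candidate):
            found[suffix] = os.path.getsize(candidate)
    return found


def backup_with_sidecars(db_path, backup_dir):
    """Copy the database plus any -wal/-shm files into backup_dir and return the new paths."""
    os.makedirs(backup_dir, exist_ok=True)
    sources = [db_path] + [db_path + suffix for suffix in sidecars(db_path)]
    # copy2 preserves timestamps, which helps when comparing files later.
    return [shutil.copy2(src, backup_dir) for src in sources]
```

For example, `backup_with_sidecars(path_to_index_sqlite, "/tmp/qmd-backup")` (a hypothetical destination) would leave a copy of the database and its sidecar files that can be restored if the cleanup makes things worse.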
Fix: Stop the gateway, delete the WAL and SHM files,