#multiple state directories. how to resolve via terminal?


digital sundial
#

ran openclaw doctor --fix and saw this:

```
◇  State integrity ────────────────────────────────────────────────────────╮
│                                                                          │
│  - Multiple state directories detected. This can split session history.  │
│    - ~/.openclaw                                                         │
│    Active state dir: /data/.openclaw                                     │
│                                                                          │
├──────────────────────────────────────────────────────────────────────────╯
│
```
pallid pagodaBOT
sweet bobcatBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

worthy gobletBOT
#

That warning means OpenClaw sees two state dirs and only one is active. You want one canonical dir, not both.

Since yours says:

  • extra: ~/.openclaw
  • active: /data/.openclaw

the usual fix is to keep /data/.openclaw and archive the old ~/.openclaw.

openclaw --version
du -sh ~/.openclaw /data/.openclaw 2>/dev/null

openclaw gateway stop

ts=$(date +%Y%m%d-%H%M%S)
[ -d ~/.openclaw ] && mv ~/.openclaw ~/.openclaw.backup-$ts

# if you use an installed gateway service, re-install it from the same env/shell
export OPENCLAW_STATE_DIR=/data/.openclaw
openclaw gateway install --force

openclaw doctor
openclaw gateway start
openclaw status

A couple notes:

  • Do not delete first, just rename/archive.
  • I would not blindly rsync/merge the two state dirs unless you know both contain needed data.
  • If you actually want ~/.openclaw to be the real one, do the reverse: unset/change OPENCLAW_STATE_DIR, then reinstall the gateway service from that same shell.
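
The "rename, don't delete" advice can be wrapped in a small helper. This is only a sketch, not an OpenClaw command: `archive_dir` is a hypothetical name, and it uses nothing beyond `date` and `mv`. It prints the backup path so you can confirm what the rename produced.

```shell
#!/bin/sh
# Hypothetical helper (not part of the OpenClaw CLI): rename a directory to
# <dir>.backup-<timestamp> instead of deleting it.
archive_dir() {
  dir=$1
  # Nothing to do if the directory does not exist.
  [ -d "$dir" ] || { echo "skip: $dir not found" >&2; return 1; }
  ts=$(date +%Y%m%d-%H%M%S)
  # Refuse to continue with an empty timestamp, which would give a bad name.
  [ -n "$ts" ] || { echo "error: empty timestamp" >&2; return 1; }
  mv -- "$dir" "$dir.backup-$ts" && echo "$dir.backup-$ts"
}

# Example: archive_dir "$HOME/.openclaw"
```

Stop the gateway before calling it, same as with the plain `mv`.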
digital sundial
#

ok want to double check before running

this:

```
◇  State integrity ────────────────────────────────────────────────────────╮
│                                                                          │
│  - Multiple state directories detected. This can split session history.  │
│    - ~/.openclaw                                                         │
│    Active state dir: /data/.openclaw                                     │
│                                                                          │
├──────────────────────────────────────────────────────────────────────────╯
│
```


was AFTER running:

```openclaw sessions cleanup --store "/data/.openclaw/agents/main/sessions sessions.json" --enforce --fix-missing```

because i got:

```
│
◇  State integrity ────────────────────────────────────────────────────────╮
│                                                                          │
│  - Multiple state directories detected. This can split session history.  │
│    - ~/.openclaw                                                         │
│    Active state dir: /data/.openclaw                                     │
│  - 2/5 recent sessions are missing transcripts.                          │
│    Verify sessions in store: openclaw sessions --store                   │
│    "/data/.openclaw/agents/main/sessions/sessions.json"                  │
│    Preview cleanup impact: openclaw sessions cleanup --store             │
│    "/data/.openclaw/agents/main/sessions/sessions.json" --dry-run        │
│    Prune missing entries: openclaw sessions cleanup --store              │
│    "/data/.openclaw/agents/main/sessions/sessions.json" --enforce        │
│    --fix-missing                                                         │
│                                                                          │
├──────────────────────────────────────────────────────────────────────────╯
│
```

BUT 
now i get  ^same again so i ran  cleanup --store --enforce --fix-missing again
worthy gobletBOT
#

Yep, those are 2 separate issues.

  • sessions cleanup ... --fix-missing only fixes missing transcript entries
  • it does not fix multiple state directories detected

So if the missing transcripts warning disappeared and only this remains:

- Multiple state directories detected.
  - ~/.openclaw
  Active state dir: /data/.openclaw

then the cleanup probably worked, and you do not need to keep rerunning it.

Also, tiny typo check: your command should use:

/data/.openclaw/agents/main/sessions/sessions.json

not:

/data/.openclaw/agents/main/sessions sessions.json
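
A quick way to catch a path typo like this before running cleanup is to test that the store file actually exists. A minimal sketch using only the shell's `[ -f ]` test; `check_store` is a made-up helper name, not an OpenClaw command.

```shell
# Hypothetical pre-flight check: confirm the session store path points at a
# real file before passing it to `openclaw sessions cleanup --store`.
check_store() {
  if [ -f "$1" ]; then
    echo "ok: $1"
  else
    echo "not a file: $1" >&2
    return 1
  fi
}

# Example: check_store "/data/.openclaw/agents/main/sessions/sessions.json"
```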

If you want to verify cleanly:

openclaw sessions cleanup --store "/data/.openclaw/agents/main/sessions/sessions.json" --dry-run --fix-missing
openclaw doctor

If doctor now only shows the state-dir warning, next step is the separate fix:

openclaw gateway stop
ts=$(date +%Y%m%d-%H%M%S)
[ -d ~/.openclaw ] && mv ~/.openclaw ~/.openclaw.backup-$ts
export OPENCLAW_STATE_DIR=/data/.openclaw
openclaw gateway install --force
openclaw gateway start
openclaw doctor

If you want, paste these 3 outputs before you run the rename and I’ll sanity-check them:

openclaw --version
digital sundial
#

openclaw sessions cleanup --store "/data/.openclaw/agents/main/sessions/sessions.json" --enforce --fix-missing

🦞 OpenClaw 2026.4.9 (0512059)
I'm not AI-powered, I'm AI-possessed. Big difference.

Session store: /data/.openclaw/agents/main/sessions/sessions.json
Applied maintenance. Current entries: 41

now about to run doctor

worthy gobletBOT
#

Nice, that part looks good.

Applied maintenance. Current entries: 41 means the session-store cleanup ran successfully.

Next:

openclaw doctor

What to expect:

  • If the missing transcripts warning is gone, cleanup worked.
  • If the only thing left is multiple state directories detected, then the next fix is just the stale ~/.openclaw dir, not sessions cleanup again.

If doctor comes back with only this:

- Multiple state directories detected.
  - ~/.openclaw
  Active state dir: /data/.openclaw

then run:

openclaw gateway stop
ts=$(date +%Y%m%d-%H%M%S)
[ -d ~/.openclaw ] && mv ~/.openclaw ~/.openclaw.backup-$ts
export OPENCLAW_STATE_DIR=/data/.openclaw
openclaw gateway install --force
openclaw gateway start
openclaw doctor

Paste the doctor output if you want me to sanity-check the next step before you rename anything.

digital sundial
#

```
│                                                                          │
│  - Multiple state directories detected. This can split session history.  │
│    - ~/.openclaw                                                         │
│    Active state dir: /data/.openclaw                                     │
│  - 2/5 recent sessions are missing transcripts.                          │
│    Verify sessions in store: openclaw sessions --store                   │
│    "/data/.openclaw/agents/main/sessions/sessions.json"                  │
│    Preview cleanup impact: openclaw sessions cleanup --store             │
│    "/data/.openclaw/agents/main/sessions/sessions.json" --dry-run        │
│    Prune missing entries: openclaw sessions cleanup --store              │
│    "/data/.openclaw/agents/main/sessions/sessions.json" --enforce        │
│    --fix-missing                                                         │
│  - Found 1 orphan transcript file in                                     │
│    /data/.openclaw/agents/main/sessions.                                 │
│    These .jsonl files are no longer referenced by sessions.json, so      │
│    they are not part of any active session history.                      │
│    Doctor can archive them safely by renaming each file to               │
│    *.deleted.<timestamp>.                                                │
│    Examples: 9523fcfc-c75e-4f56-b1a2-5dd2e6a42295.jsonl                  │
│                                                                          │
├──────────────────────────────────────────────────────────────────────────╯
│
```
worthy gobletBOT
#

Ah, okay, that changes it a bit.

If doctor still shows:

  • 2/5 recent sessions are missing transcripts
  • Found 1 orphan transcript file

after cleanup, I would not do the ~/.openclaw rename yet.

I’m a little suspicious the running gateway is rewriting the session store while you clean it.

Try this order instead:

openclaw gateway stop
openclaw sessions cleanup \
  --store "/data/.openclaw/agents/main/sessions/sessions.json" \
  --dry-run --fix-missing --json

openclaw sessions cleanup \
  --store "/data/.openclaw/agents/main/sessions/sessions.json" \
  --enforce --fix-missing

openclaw doctor --fix

A few notes:

  • The orphan transcript is separate, and doctor --fix should be able to archive it safely.
  • The multiple state dirs warning is also separate, handle that after the missing-transcript warning is gone.
  • If the dry-run JSON still says missing: 2 even with the gateway stopped, paste that output. Then we know it’s not just a live-write race.

So, short version: stop gateway first, then re-run cleanup + doctor. If that clears the missing-transcript warning and leaves only the /data/.openclaw vs ~/.openclaw warning, then we do the state-dir cleanup next.

digital sundial
#

wait i ran doctor --fix again and got:

```
│                                                                          │
│  - Multiple state directories detected. This can split session history.  │
│    - ~/.openclaw                                                         │
│    Active state dir: /data/.openclaw                                     │
│                                                                          │
├──────────────────────────────────────────────────────────────────────────╯
│
```
worthy gobletBOT
#

Perfect, that’s the good outcome.

That means:

  • the missing transcript issue is gone
  • the orphan transcript issue is gone
  • the only thing left is the separate multiple state dirs warning

So yes, now the next step is just to archive the extra ~/.openclaw dir:

openclaw gateway stop
ts=$(date +%Y%m%d-%H%M%S)
[ -d ~/.openclaw ] && mv ~/.openclaw ~/.openclaw.backup-$ts

export OPENCLAW_STATE_DIR=/data/.openclaw
openclaw gateway install --force
openclaw gateway start
openclaw doctor

If doctor comes back clean after that, you’re done 🎉

If you want to be extra cautious before the move, this is a nice quick check:

du -sh ~/.openclaw /data/.openclaw 2>/dev/null

But at this point, based on the output you pasted, /data/.openclaw is clearly the active one.

digital sundial
#

tell me exactly step by step and also double check the default according to docs too

worthy gobletBOT
#

Yep. I double-checked the docs.

Docs say:

  • OpenClaw stores data under $OPENCLAW_STATE_DIR
  • the default is ~/.openclaw
  • if OPENCLAW_STATE_DIR is set, that path becomes the real state dir

From the docs:

  • FAQ: “Everything lives under $OPENCLAW_STATE_DIR (default: ~/.openclaw)”
  • Environment docs: OPENCLAW_STATE_DIR = “Override the state directory (default ~/.openclaw)”

So in your case, /data/.openclaw is active because something is overriding the default, and ~/.openclaw is just the extra stale dir doctor is warning about.
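
In shell terms, the documented rule (use `$OPENCLAW_STATE_DIR` if it is set, otherwise `~/.openclaw`) looks roughly like the sketch below. It only mirrors the wording in the docs so you can see which directory your current shell would point at; it is not OpenClaw's actual resolution code, and treating an empty value the same as unset is an assumption.

```shell
# Mirror of the documented default: OPENCLAW_STATE_DIR if set and non-empty,
# otherwise ~/.openclaw. Illustrative only, not OpenClaw's own logic.
effective_state_dir() {
  echo "${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"
}

echo "This shell resolves the state dir to: $(effective_state_dir)"
```

If this prints `~/.openclaw` expanded to your home directory while doctor reports `/data/.openclaw` as active, the override is set somewhere other than your current shell (for example in the service or container env).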

Exact steps


1) Optional sanity check

openclaw --version
du -sh ~/.openclaw /data/.openclaw 2>/dev/null
openclaw status --all

2) Stop the gateway

openclaw gateway stop

3) Backup the extra default dir instead of deleting it

ts=$(date +%Y%m%d-%H%M%S)
[ -d ~/.openclaw ] && mv ~/.openclaw ~/.openclaw.backup-$ts

4) Make sure your shell uses the active state dir

export OPENCLAW_STATE_DIR=/data/.openclaw

5) If you use the installed gateway service, reinstall it from this same shell

openclaw gateway install --force

6) Start the gateway again

openclaw gateway start

7) Verify

openclaw doctor
openclaw gateway status

What success looks like

You should no longer see:

- Multiple state directories detected.
  - ~/.openclaw
  Active state dir: /data/.openclaw

Important note

digital sundial
#

btw this is running in a container via coolify, does this change any steps?

worthy gobletBOT
#

Yep, that does change it a bit.

For Coolify / container:

  • keep /data/.openclaw
  • skip openclaw gateway install --force
    (that step is for systemd/launchd-style installed services, not containers)
  • make sure OPENCLAW_STATE_DIR=/data/.openclaw is set in Coolify env, not just in a temporary shell

Updated steps for Coolify

1) Confirm the env in Coolify
Make sure your app/container has:

OPENCLAW_STATE_DIR=/data/.openclaw

If you already see Active state dir: /data/.openclaw, that part is probably already correct.

2) Exec into the container and back up the extra dir

ts=$(date +%Y%m%d-%H%M%S)
[ -d ~/.openclaw ] && mv ~/.openclaw ~/.openclaw.backup-$ts

3) Restart the app/container from Coolify
Use Coolify’s restart/redeploy, so the container comes back with the same persistent /data mount and env.

4) Exec back in and verify

openclaw doctor
openclaw status --all

What changes vs my previous steps

  • remove this step:
    openclaw gateway install --force
    
  • don’t rely on:
    export OPENCLAW_STATE_DIR=/data/.openclaw
    
    unless you’re only testing temporarily inside the current shell. In Coolify, the real source of truth should be the app env config.

Short version

In Coolify, this is basically:

  1. confirm env points to /data/.openclaw
  2. rename ~/.openclaw
  3. restart container
  4. run openclaw doctor

So yes, the container part mostly means no service reinstall step.

digital sundial
#

openclaw state dir is /data/.openclaw i just checked now

now what?

worthy gobletBOT
#

Nice, then you’re at the last step.

Since /data/.openclaw is confirmed active, just archive the extra ~/.openclaw dir and restart the container.

In the container shell

ts=$(date +%Y%m%d-%H%M%S)
[ -d ~/.openclaw ] && mv ~/.openclaw ~/.openclaw.backup-$ts

Then restart the app/container in Coolify

After restart, verify

openclaw doctor
openclaw status --all

If everything is fixed, the multiple state directories warning should be gone.

So the short answer is:

  1. rename ~/.openclaw
  2. restart container
  3. run openclaw doctor

If you want, paste the post-restart doctor output and I’ll confirm you’re clean.

digital sundial
#

ts=$(date +%Y%m%d-%H%M%S)
[ -d ~/.openclaw ] && mv ~/.openclaw ~/.openclaw.backup-$ts

now what? also, no output after that, normal?

worthy gobletBOT
#

Yep, no output is normal. That command is silent on success.

If you want to verify the rename happened:

ls -ld ~/.openclaw*

You should see something like:

~/.openclaw.backup-YYYYMMDD-HHMMSS

Next step

Since you’re on Coolify:

  1. restart the app/container
  2. exec back in
  3. run:
openclaw doctor
openclaw status --all

What you want to see: the multiple state directories warning is gone.

So yes, you’re basically at the finish line now.

digital sundial
#

ls -ld ~/.openclaw*
drwxr-xr-x 3 root root 4096 Apr 14 12:03 '/root/.openclaw.backup-$'

worthy gobletBOT
#

That backup name is a little weird, but it’s fine.

What matters is: did ~/.openclaw get moved out of the way?

Check that with:

[ -d ~/.openclaw ] && echo "still exists" || echo "~/.openclaw is gone"

If it says:

~/.openclaw is gone

then you’re good, and the weird .backup-$ name is just cosmetic. You can leave it.


Then do this

  1. restart the Coolify container/app
  2. exec back in
  3. run:
openclaw doctor
openclaw status --all

If you want, you can also rename the backup to something cleaner later, but don’t bother yet. First confirm the warning is gone after restart.

digital sundial
#

[ -d ~/.openclaw ] && echo "still exists" || echo "~/.openclaw is gone"
~/.openclaw is gone

worthy gobletBOT
#

Perfect. That means the stale ~/.openclaw dir is out of the way. The # is just your shell prompt, totally normal.

Now:

  1. restart the Coolify app/container
  2. exec back in
  3. run:
openclaw doctor
openclaw status --all

At this point, the multiple state directories warning should be gone.

If you paste the doctor output after restart, I’ll confirm you’re fully clean.

digital sundial
#

should i change that weird back up name so it doesnt get lost? if yes, send cmd

worthy gobletBOT
#

Totally optional, but yeah, if you want a cleaner name:

mv '/root/.openclaw.backup-$' "/root/.openclaw.backup.$(date +%Y%m%d-%H%M%S)"

Then verify:

ls -ld /root/.openclaw.backup.*

A couple notes:

  • the single quotes around the old path matter, so the literal $ is matched correctly
  • this is just housekeeping, not part of the actual fix

You can do it now or after the restart, either is fine.

digital sundial
#

mv '/root/.openclaw.backup-$' "/root/.openclaw.backup.$(date +%Y%m%d-%H%M%S)"
ls -ld /root/.openclaw.backup.*
drwxr-xr-x 3 root root 4096 Apr 14 12:03 /root/.openclaw.backup.20260414-122357

worthy gobletBOT
#

Nice, that looks clean now.

You’ve got:

  • active state dir: /data/.openclaw
  • stale ~/.openclaw moved out of the way
  • backup renamed properly

Last step

Restart the Coolify app/container, then run:

openclaw doctor
openclaw status --all

If the warning is gone, you’re done 🎉


Glad that helped!
If this fixes it, please right-click the solution message → Apps → ✅ Mark Solution.
This helps Answer Overflow index the solution for future users.

digital sundial
#

hmm doctor shows:

```
◇  State integrity ────────────────────────────────────────────────────────╮
│                                                                          │
│  - Multiple state directories detected. This can split session history.  │
│    - ~/.openclaw                                                         │
│    Active state dir: /data/.openclaw                                     │
│                                                                          │
├───
```
worthy gobletBOT
#

Ah, got it, that means something recreated ~/.openclaw after the restart.

So the move worked, but the container/app startup created a new default-state dir again.

Before changing anything else, run these exact checks and paste the output:

echo "HOME=$HOME"
echo "OPENCLAW_STATE_DIR=$OPENCLAW_STATE_DIR"
ls -ld ~/.openclaw /data/.openclaw
readlink -f ~/.openclaw
readlink -f /data/.openclaw
find ~/.openclaw -maxdepth 2 -mindepth 1 | head -50

Quick read on what this means


In Coolify, this usually means one of:

  • the startup command is creating ~/.openclaw
  • there’s a second volume/mount involved
  • the container is running as a different user/home than expected

One small shortcut

If find ~/.openclaw ... shows nothing, then it’s just an empty recreated dir. In that case:

rmdir ~/.openclaw
openclaw doctor

But if it has files in it, paste the output first, because then I want to see what is recreating it.
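
If you would rather not eyeball the `find` output, the empty-or-not decision can be scripted. This is a sketch under the assumption that "empty" means no entries at all, dotfiles included; `dir_is_empty` is a hypothetical helper, and it uses the same `-mindepth`/`-maxdepth` options as the `find` command above.

```shell
# Succeeds only if the directory exists and contains no entries (dotfiles
# included). Hypothetical helper for deciding between rmdir and investigating.
dir_is_empty() {
  [ -d "$1" ] && [ -z "$(find "$1" -mindepth 1 -maxdepth 1 | head -n 1)" ]
}

if dir_is_empty "$HOME/.openclaw"; then
  echo "empty: safe to rmdir"
else
  echo "missing or has contents: look before removing"
fi
```

`rmdir` itself refuses to remove a non-empty directory, so the check is mainly there to tell you which case you are in.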

digital sundial
#

echo "HOME=$HOME"
HOME=/root
echo "OPENCLAW_STATE_DIR=$OPENCLAW_STATE_DIR"
OPENCLAW_STATE_DIR=/data/.openclaw
ls -ld ~/.openclaw /data/.openclaw
drwx------ 23 root root 4096 Apr 14 11:48 /data/.openclaw
drwxr-xr-x 3 root root 4096 Apr 14 12:30 /root/.openclaw
readlink -f ~/.openclaw
/root/.openclaw
readlink -f /data/.openclaw
/data/.openclaw
find ~/.openclaw -maxdepth 2 -mindepth 1 | head -50
/root/.openclaw/lcm.db
/root/.openclaw/lcm.db-wal
/root/.openclaw/lcm.db-shm
/root/.openclaw/qqbot
/root/.openclaw/qqbot/data

worthy gobletBOT
#

Ah, that explains it. You didn’t miss a step.

I checked the installed code, and the QQ Bot extension writes to ~/.openclaw/qqbot under HOME directly. That matches exactly what you pasted:

  • /root/.openclaw/qqbot/...
  • plus lcm.db in /root/.openclaw

So in your Coolify container:

  • your real OpenClaw state is still /data/.openclaw
  • but something, at least QQ Bot, is recreating /root/.openclaw
  • that’s why doctor keeps warning about multiple state dirs

What I recommend

If you use QQ Bot:
You’re probably fine to stop here. The gateway is using /data/.openclaw, and this remaining warning looks like a plugin/path behavior, not your main state being wrong.


If you do not use QQ Bot:
Then I’d disable/remove that plugin/config, restart, and check doctor again.

Practical takeaway

At this point, I would not keep fighting the warning blindly. Your core state-dir problem is basically resolved, and the remaining warning is likely caused by QQ Bot writing under /root/.openclaw.

If you want, I can help you do one of these next:

  1. confirm whether QQ Bot is enabled
  2. cleanly disable/remove it if unused
  3. write you a short GitHub issue summary you can file here: https://github.com/openclaw/openclaw/issues


digital sundial
worthy gobletBOT
#

Yep, that’s separate from QQ Bot.

I checked the Lossless Claw README, and it does indeed default to:

LCM_DATABASE_PATH=~/.openclaw/lcm.db

So in your container:

  • Lossless Claw is writing lcm.db under /root/.openclaw
  • QQ Bot is writing /root/.openclaw/qqbot/...
  • both contribute to the same doctor warning
  • they are not the same thing

Is something wrong that you need to fix?


Not “broken”, but suboptimal in a container.
If you want LCM data persisted under your real state volume, then yes, I would fix Lossless Claw by setting:

LCM_DATABASE_PATH=/data/.openclaw/lcm.db

in Coolify env, then restart.

Important distinction

  • Lossless Claw: likely fixable by changing LCM_DATABASE_PATH
  • QQ Bot: looks like it writes to ~/.openclaw/qqbot directly, so that’s a different issue

So yes, LCM is another source of the warning, but not the same as QQ Bot.

If you want, I can give you the exact safe migration steps for moving lcm.db into /data/.openclaw without losing the existing LCM history.

digital sundial
#

should i move the files first and then change the env var or what?

worthy gobletBOT
#

Yep, safest order is:

  1. stop the app/container
  2. copy the LCM DB files into /data/.openclaw
  3. set LCM_DATABASE_PATH=/data/.openclaw/lcm.db in Coolify
  4. start/redeploy
  5. verify it works
  6. only then delete the old /root/.openclaw/lcm.db* files if you want

I’d copy first, not move first, so you keep a fallback.

Safe sequence

With the app stopped:

mkdir -p /data/.openclaw

for f in /root/.openclaw/lcm.db /root/.openclaw/lcm.db-wal /root/.openclaw/lcm.db-shm; do
  [ -e "$f" ] && cp -av "$f" /data/.openclaw/
done

Then in Coolify env, set:

LCM_DATABASE_PATH=/data/.openclaw/lcm.db

Then restart/redeploy.

After restart, verify:

ls -l /data/.openclaw/lcm.db*

One important note: moving LCM fixes the LCM part, but if QQ Bot is still writing to ~/.openclaw/qqbot, the doctor warning may still remain. So this is worth doing for correct/persistent LCM storage, but it may not fully clear the warning by itself.

digital sundial
#

also, does coolify have a different default path to data than the vanilla openclaw? is that why lcm has this?

also whats the problem with just leaving as is?

worthy gobletBOT
#

Yep, basically.

Does Coolify change the default?

Not by itself.
Vanilla OpenClaw’s default is still:

~/.openclaw

What changes in container setups is that people usually set:

OPENCLAW_STATE_DIR=/data/.openclaw

because /data is the persistent mounted volume.

So the mismatch is:

  • OpenClaw core is using /data/.openclaw
  • Lossless Claw still defaults to ~/.openclaw/lcm.db
  • QQ Bot appears to use ~/.openclaw/qqbot

That’s why this shows up in Coolify more obviously.

What’s the problem with just leaving it?

If you leave it as-is, the main risks are:

  1. LCM / QQ Bot data may not be persistent
    /root/.openclaw is often just container filesystem, so a rebuild/recreate can wipe it.

  2. State is split across 2 places
    harder backups, harder debugging, confusing doctor output.

  3. You may lose LCM history
    if lcm.db stays outside the persistent volume and the container gets replaced.

Is it fatal?

No.
If everything works and you don’t care about that extra data surviving redeploys, you can leave it.
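
To see whether the persistence risk applies to your container, you can check which filesystem each directory lives on. This is a sketch using `df -P`; the interpretation (a path whose mount point is `/` is probably container-local, a path on a separate mount such as `/data` is probably the volume) is an assumption about a typical Coolify/Docker setup, not something OpenClaw reports.

```shell
# Print the mount point that holds a given path (last column of POSIX df output).
# Assumes the mount point itself contains no spaces.
mount_point_of() {
  df -P "$1" | awk 'NR==2 {print $NF}'
}

for d in "$HOME/.openclaw" /data/.openclaw; do
  if [ -e "$d" ]; then
    echo "$d is on mount: $(mount_point_of "$d")"
  fi
done
```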

My honest recommendation

  • Main OpenClaw state: already fixed enough
  • Lossless Claw: worth fixing if you care about keeping LCM history
  • QQ Bot: separate issue, may still recreate ~/.openclaw

So yes, you can leave it, but the tradeoff is mostly persistence + cleaner state management, not immediate breakage.

digital sundial
#

if i stop the app in coolify doesn't that now allow me to use the terminal?

worthy gobletBOT
#

Usually no.

In Coolify, the terminal is normally an exec into the running container. If the app/container is stopped, there’s often nothing to exec into.

So the practical options are:

Easiest

If you don’t care about preserving old LCM history, just:

  1. set
    LCM_DATABASE_PATH=/data/.openclaw/lcm.db
    
    in Coolify
  2. redeploy/restart

That makes future LCM data go to /data, which is the important part.


If you want to preserve the existing LCM DB

You usually need one of these:

  • a host shell on the machine running Coolify
  • a temporary/debug container with the same volume mounted
  • or another way to access files while the app itself is not running

I would not love doing a raw SQLite file copy from a live-running app unless you’re okay with a little risk.

So honestly, unless the old LCM history is important, I’d take the simple route:

set the env var, redeploy, move on.

If you want, I can give you the lowest-risk “good enough” Coolify plan in 3 steps.

digital sundial
#

if its semi important, how does a live copy sound to you? like if it doesn work its cool but if it does then great? and if its good give me the plan

worthy gobletBOT
#

Yeah, for “semi important” data, I think a best-effort live copy is reasonable.

I’d do it this way, in order of preference:

Option A, better live copy if sqlite3 exists, recommended

In the running container shell:

mkdir -p /data/.openclaw
command -v sqlite3

If that prints a path, run:

sqlite3 /root/.openclaw/lcm.db ".backup '/data/.openclaw/lcm.db'"
ls -lh /data/.openclaw/lcm.db

Then in Coolify env set:

LCM_DATABASE_PATH=/data/.openclaw/lcm.db

Then redeploy/restart.

Option B, fallback raw live copy if sqlite3 is not installed

In the running container shell:

mkdir -p /data/.openclaw

for f in /root/.openclaw/lcm.db /root/.openclaw/lcm.db-wal /root/.openclaw/lcm.db-shm; do
  [ -e "$f" ] && cp -av "$f" /data/.openclaw/
done

ls -lh /data/.openclaw/lcm.db*

Then set in Coolify env:

LCM_DATABASE_PATH=/data/.openclaw/lcm.db

Then redeploy/restart.


Important notes

  • Do not delete the old /root/.openclaw/lcm.db* yet.
  • This is best effort, not perfect, if the app is live.
  • Even if LCM migration works, QQ Bot may still keep the doctor warning alive because that looks separate.

My recommendation

Try Option A first.
If sqlite3 isn’t there, Option B is fine for a “would be nice if it works” migration.
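
Options A and B can also be folded into one best-effort helper that picks whichever path is available. A sketch only: `copy_sqlite_db` is a hypothetical name, it relies on `sqlite3`'s `.backup` dot-command when the tool is present and on a plain `cp` otherwise, and it assumes the paths contain no single quotes.

```shell
# Best-effort copy of an SQLite DB: prefer sqlite3's .backup, else copy the
# main file plus any -wal/-shm sidecars. copy_sqlite_db is a hypothetical
# helper name; paths must not contain single quotes.
copy_sqlite_db() {
  src=$1
  dest_dir=$2
  mkdir -p "$dest_dir"
  if command -v sqlite3 >/dev/null 2>&1; then
    sqlite3 "$src" ".backup '$dest_dir/$(basename "$src")'"
  else
    for f in "$src" "$src-wal" "$src-shm"; do
      if [ -e "$f" ]; then cp -a "$f" "$dest_dir/"; fi
    done
  fi
}

# Example: copy_sqlite_db /root/.openclaw/lcm.db /data/.openclaw
```

Either way, the old files under /root/.openclaw stay in place as the fallback.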


digital sundial
#

why mkdir? its already there, no?

worthy gobletBOT
#

Yep, probably. mkdir -p is just a safe no-op if it already exists.

It does not overwrite anything, it just makes sure the target dir is there before the copy/backup step.

So:

  • if /data/.openclaw already exists, mkdir -p /data/.openclaw does basically nothing
  • if you want, you can skip it

I included it just to make the sequence copy-paste safe.

digital sundial
#

what does this do? command -v sqlite3

actually can you give me the step by step but also explains why

worthy gobletBOT
#

Yep. command -v sqlite3 just checks whether the sqlite3 tool is installed.

  • if it prints a path like /usr/bin/sqlite3, you have it
  • if it prints nothing, you don’t

We check that because SQLite’s built-in .backup is safer than raw file-copying a live DB.

Step by step, with why

1) Check if sqlite3 exists

command -v sqlite3

Why:
If it exists, we can do a safer live backup of lcm.db.


2A) If sqlite3 exists, use this

sqlite3 /root/.openclaw/lcm.db ".backup '/data/.openclaw/lcm.db'"

Why:
This asks SQLite itself to make a consistent copy of the database while it’s live.

Then verify:

ls -lh /data/.openclaw/lcm.db

Why:
Just confirms the copied DB is there.


2B) If sqlite3 does NOT exist, use raw copy fallback

for f in /root/.openclaw/lcm.db /root/.openclaw/lcm.db-wal /root/.openclaw/lcm.db-shm; do
  [ -e "$f" ] && cp -av "$f" /data/.openclaw/
done

Why:
SQLite may also be using -wal and -shm sidecar files. If we do a plain copy, we want all of them, not just lcm.db.

Then verify:

ls -lh /data/.openclaw/lcm.db*

3) In Coolify, set this env var

LCM_DATABASE_PATH=/data/.openclaw/lcm.db

Why:
Lossless Claw currently defaults to ~/.openclaw/lcm.db. This tells it to use the persistent /data path instead.


4) Restart / redeploy the app in Coolify


Why:
The new env var only takes effect after restart.


5) Do NOT delete the old DB yet

Why:
You want a fallback in case the new path doesn’t work or the live copy was incomplete.


Important caveat

Even if this works perfectly for Lossless Claw, the doctor warning may still remain because QQ Bot appears to write to ~/.openclaw/qqbot separately.

If you want, I can give you the exact copy-paste sequence for the sqlite3 exists case and the no-sqlite3 fallback case as two tiny blocks.

digital sundial
#

yes its there /home/linuxbrew/.linuxbrew/bin/sqlite3

worthy gobletBOT
#

Nice, then use the safer SQLite backup path.

Step 1, copy the LCM DB into /data

sqlite3 /root/.openclaw/lcm.db ".backup '/data/.openclaw/lcm.db'"
ls -lh /data/.openclaw/lcm.db

Step 2, set this in Coolify env

LCM_DATABASE_PATH=/data/.openclaw/lcm.db

Step 3, restart / redeploy the app

Step 4, after restart, verify the file is still there

ls -lh /data/.openclaw/lcm.db

Important

  • Do not delete /root/.openclaw/lcm.db yet
  • this fixes the Lossless Claw DB location
  • QQ Bot may still recreate ~/.openclaw, so the doctor warning may still remain even if this part works

If you want, after the redeploy paste:

ls -ld ~/.openclaw /data/.openclaw
find ~/.openclaw -maxdepth 2 -mindepth 1 | head -20

and I’ll tell you whether the remaining warning is now just QQ Bot.

digital sundial
#

what the heck is qq bot? built in?

worthy gobletBOT
#

Yep, QQ Bot is the Tencent QQ channel integration.

And yes, per the docs it’s a bundled plugin / built-in channel integration, not some random thing you installed manually by default.

So:

  • QQ Bot = OpenClaw channel support for QQ
  • it’s separate from Lossless Claw
  • if you never meant to use QQ, then seeing /root/.openclaw/qqbot/... is worth checking later

Docs: https://docs.openclaw.ai/channels/qqbot

Short version:

  • Lossless Claw → your lcm.db
  • QQ Bot → that ~/.openclaw/qqbot folder
#

If you want, next I can help you figure out whether QQ Bot is actually enabled in your config or whether it’s just leaving behind files.

digital sundial
#

should this also be moved?

worthy gobletBOT
#

If you mean the QQ Bot folder, I would say:

  • if you do not use QQ Bot: no, I would not bother moving it. Better to ignore it or later figure out why it exists.
  • if you do use QQ Bot and care about persistence: ideally yes, but it’s not the same easy fix as LCM.

Why:

  • LCM has a clean env var, LCM_DATABASE_PATH
  • QQ Bot looks like it writes to ~/.openclaw/qqbot directly, so a manual move by itself probably won’t “stick”

So my honest recommendation is:

  • move/fix LCM if you care about that history
  • leave QQ Bot alone for now unless you actually use it

If you do actively use QQ Bot, that’s probably worth a GitHub issue because it looks like a path-handling problem rather than just normal config: https://github.com/openclaw/openclaw/issues

digital sundial
#

sqlite3 /root/.openclaw/lcm.db ".backup '/data/.openclaw/lcm.db'"
ls -lh /data/.openclaw/lcm.db
-rw-r--r-- 1 root root 184K Apr 14 14:08 /data/.openclaw/lcm.db

worthy gobletBOT
#

Nice, that worked. 🎉

You now have an LCM DB copy in the persistent state dir:

/data/.openclaw/lcm.db

Next steps

1) In Coolify env, set:

LCM_DATABASE_PATH=/data/.openclaw/lcm.db

2) Restart / redeploy the app

#

3) After restart, verify:

ls -lh /data/.openclaw/lcm.db

Important

  • Do not delete /root/.openclaw/lcm.db yet
  • this fixes the Lossless Claw DB location
  • QQ Bot may still keep the doctor warning alive, so don’t be surprised if doctor still mentions ~/.openclaw

If you want, after the restart paste:

find ~/.openclaw -maxdepth 2 -mindepth 1 | head -20
#

and I can tell you whether what’s left is basically just QQ Bot.

digital sundial
#

got this after restart and adding the env var

ls -lh /data/.openclaw/lcm.db
-rw-r--r-- 1 root root 184K Apr 14 14:08 /data/.openclaw/lcm.db

worthy gobletBOT
#

Nice, that’s a good sign.

That means:

  • the file exists in /data/.openclaw
  • it survived the restart
  • your LCM path change probably worked
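
To firm up that “probably”, one rough check is whether the variable is actually visible in the shell you are using. This is only a sketch: report_lcm_path is a hypothetical helper, and it assumes the shell you open in Coolify sees the same environment as the app process, which may not hold in every setup.

```shell
# Hypothetical helper: reports whether LCM_DATABASE_PATH points under the
# expected state dir. The default prefix is the active state dir from this thread.
report_lcm_path() {
  expected_prefix=${1:-/data/.openclaw}
  if [ -z "${LCM_DATABASE_PATH:-}" ]; then
    echo "LCM_DATABASE_PATH is not set in this shell"
    return 1
  fi
  case "$LCM_DATABASE_PATH" in
    # Path sits somewhere under the expected state dir.
    "$expected_prefix"/*) echo "ok: $LCM_DATABASE_PATH" ;;
    # Set, but pointing somewhere else (for example the home dir).
    *) echo "unexpected location: $LCM_DATABASE_PATH"; return 1 ;;
  esac
}
```

Running report_lcm_path with no arguments checks against /data/.openclaw; pass a different prefix as the first argument if your state dir lives elsewhere.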

Next check

Run:

find ~/.openclaw -maxdepth 2 -mindepth 1 | head -20

Why:

#

We want to see whether ~/.openclaw still contains:

  • lcm.db stuff
  • qqbot stuff
  • or only QQ Bot leftovers

If doctor still warns now, my guess is the remaining culprit is QQ Bot, not LCM.

So you’re pretty close. Paste that find ~/.openclaw ... output and I’ll tell you exactly what’s left.
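
If you would rather sort the entries yourself, here is a small sketch of the same triage. The mapping is only what we have seen in this thread (lcm.db belongs to Lossless Claw, qqbot belongs to QQ Bot); classify_stray_entries is a made-up helper name, and anything it labels "unknown" would need a closer look.

```shell
# Hypothetical helper: labels each top-level entry of a stray state dir
# with the component that, in this thread, was seen writing it.
classify_stray_entries() {
  dir=${1:-$HOME/.openclaw}
  [ -d "$dir" ] || { echo "no $dir"; return 0; }
  for entry in "$dir"/*; do
    # An empty directory leaves the glob unexpanded; skip that case.
    [ -e "$entry" ] || continue
    name=$(basename "$entry")
    case "$name" in
      # lcm.db plus SQLite sidecar files such as lcm.db-wal / lcm.db-shm
      lcm.db*) echo "$name: Lossless Claw" ;;
      qqbot)   echo "$name: QQ Bot" ;;
      *)       echo "$name: unknown" ;;
    esac
  done
}
```

Called with no argument it looks at ~/.openclaw. It ignores hidden files and only looks one level deep, so treat it as a first pass rather than a full audit.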

digital sundial
#

find ~/.openclaw -maxdepth 2 -mindepth 1 | head -20
find: '/root/.openclaw': No such file or directory
find ~/.openclaw -maxdepth 2 -mindepth 1 | head -20
find: '/root/.openclaw': No such file or directory

worthy gobletBOT
#

Oh, that’s actually great.

That means ~/.openclaw does not exist right now, so both of these likely happened:

  • Lossless Claw stopped recreating it after the env change
  • QQ Bot is not currently recreating it either

Next step

Run:

openclaw doctor

If it still warns about ~/.openclaw, then I’d want to see that exact fresh output, because at that point something odd is happening.

#

But based on what you just pasted, I’d expect the warning to be gone now.

digital sundial
#

doctor fix:

│                                                                          │
│  - Multiple state directories detected. This can split session history.  │
│    - ~/.openclaw                                                         │
│    Active state dir: /data/.openclaw                                     │
│  - 2/5 recent sessions are missing transcripts.                          │
│    Verify sessions in store: openclaw sessions --store                   │
│    "/data/.openclaw/agents/main/sessions/sessions.json"                  │
│    Preview cleanup impact: openclaw sessions cleanup --store             │
│    "/data/.openclaw/agents/main/sessions/sessions.json" --dry-run        │
│    Prune missing entries: openclaw sessions cleanup --store              │
│    "/data/.openclaw/agents/main/sessions/sessions.json" --enforce        │
│    --fix-missing                                                         │
│  - Found 1 orphan transcript file in                                     │
│    /data/.openclaw/agents/main/sessions.                                 │
│    These .jsonl files are no longer referenced by sessions.json, so      │
│    they are not part of any active session history.                      │
│    Doctor can archive them safely by renaming each file to               │
│    *.deleted.<timestamp>.                                                │
│    Examples: 9523fcfc-c75e-4f56-b1a2-5dd2e6a42295.jsonl                  │
│                                                                          │
├──────────────────────────────────────────────────────────────────────────╯
│
◇  Doctor changes ─────────────────────────────────────────────────╮
│                                                                  │
│  - Archived 1 orphan transcript file in                          │
│    /data/.openclaw/agents/main/sessions as .deleted timestamped  │
│    backups
pallid pagodaBOT
worthy gobletBOT
#

Okay, a couple things happened there:

  • good: doctor archived the orphan transcript file
  • not good: it still sees
    • ~/.openclaw
    • 2/5 recent sessions are missing transcripts

So I’d split this into two separate checks again.

1) Check what exists in ~/.openclaw immediately after doctor

Run this right now, after that doctor run:

ls -ld ~/.openclaw 2>/dev/null || echo "no ~/.openclaw after doctor"
find ~/.openclaw -maxdepth 2 -mindepth 1 2>/dev/null | head -20
#

Why:
We need to see what recreated ~/.openclaw during/after doctor.


2) Re-run the session cleanup one more time

openclaw sessions cleanup --store "/data/.openclaw/agents/main/sessions/sessions.json" --enforce --fix-missing

Why:
The orphan file got archived, but the missing-transcript store entries are still there.


What I’m thinking

#

At this point, it looks like:

  • doctor or a plugin is recreating ~/.openclaw during startup/runtime
  • the session store issue is separate and needs another cleanup pass

Paste the output of step 1 first. That will tell us whether this is still QQ Bot / LCM / something else recreating the home dir.

digital sundial
#

ls -ld ~/.openclaw 2>/dev/null || echo "no ~/.openclaw after doctor"
drwxr-xr-x 3 root root 4096 Apr 14 14:13 /root/.openclaw
find ~/.openclaw -maxdepth 2 -mindepth 1 | head -20 2>/dev/null
/root/.openclaw/qqbot
/root/.openclaw/qqbot/data

worthy gobletBOT
#

Perfect, that confirms it.

LCM is no longer the culprit.
What’s recreating ~/.openclaw now is just:

/root/.openclaw/qqbot

So the situation is now:

  • main OpenClaw state → /data/.openclaw
  • Lossless Claw DB → moved/fixed
  • remaining extra dir → QQ Bot only

What that means

#

If you use QQ Bot, this warning is basically expected for now.

If you do not use QQ Bot, the next thing I’d check is whether it’s actually configured:

openclaw config get channels.qqbot --json

If that returns a real config object and you do not use QQ, you can remove it with:

openclaw config unset channels.qqbot

Then restart the app and run:

#
openclaw doctor

My recommendation

  • If you don’t use QQ Bot: remove its config
  • If you do use QQ Bot: you’re probably done enough, and the remaining warning is a QQ Bot path behavior thing

So yeah, we found the last piece. ~/.openclaw is being recreated by QQ Bot, not by core OpenClaw or LCM anymore.

digital sundial
#

also did the clean up

openclaw sessions cleanup --store "/data/.openclaw/agents/main/sessions/sessions.json" --enforce --fix-missing

🦞 OpenClaw 2026.4.9 (0512059)
Your .env is showing; don't worry, I'll pretend I didn't see it.

Session store: /data/.openclaw/agents/main/sessions/sessions.json
Applied maintenance. Current entries: 139

worthy gobletBOT
#

Nice, that cleanup ran successfully.

So at this point you’ve done both meaningful fixes:

  • LCM DB moved into /data/.openclaw
  • session cleanup applied against the real session store

Next check

Run:

openclaw doctor

What I expect now

#

Best case: only the QQ Bot-caused ~/.openclaw warning remains.

If that’s all that’s left, then:

  • if you use QQ Bot: you’re basically done
  • if you don’t use QQ Bot: run
    openclaw config get channels.qqbot --json
    
    and we can remove that config next

You’re in much better shape now. The remaining issue looks pretty narrow.

digital sundial
#

openclaw config get channels.qqbot --json
Config path not found: channels.qqbot

worthy gobletBOT
#

That’s actually pretty telling.

Config path not found: channels.qqbot means QQ Bot is not explicitly configured in your config, so this is likely:

  • bundled/plugin-side behavior
  • creating ~/.openclaw/qqbot anyway
  • enough for doctor to keep warning

So my read now is:

  • core OpenClaw state: fixed
  • LCM path: fixed
  • sessions cleanup: done
  • remaining warning: likely a QQ Bot / bundled plugin path quirk, not a config mistake on your side

My honest recommendation

#

At this point, I would stop chasing it unless it’s causing an actual runtime problem.

If you want maintainer attention, this is now in GitHub issue territory:
https://github.com/openclaw/openclaw/issues

You could summarize it like this:

In a Coolify/container setup with OPENCLAW_STATE_DIR=/data/.openclaw, openclaw doctor still reports multiple state directories because /root/.openclaw/qqbot is recreated even though channels.qqbot is not configured. Lossless Claw was also writing to ~/.openclaw/lcm.db by default until LCM_DATABASE_PATH was set.

So yeah, I think you’ve fixed the real issues. What’s left looks like a product bug/quirk, not you doing something wrong.

If this got you unstuck, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it.

digital sundial
#

all good thank you that makes sense

are any of these important?

[lcm] Compaction summarization model: openai/gpt-4.1 (default)
│
◇  Plugins ──────╮
│                │
│  Loaded: 48    │
│  Imported: 48  │
│  Disabled: 50  │
│  Errors: 0     │
│                │
├────────────────╯
│
◇  
│
◇  Gateway ──────────────╮
│                        │
│  Gateway not running.  │
│                        │
├────────────────────────╯
│
◇  Gateway connection ────────────────────╮
│                                         │
│  Gateway target: ws://127.0.0.1:18789   │
│  Source: local loopback                 │
│  Config: /data/.openclaw/openclaw.json  │
│  Bind: loopback                         │
│                                         │
├─────────────────────────────────────────╯
│
◇  Gateway ────────────────────────────────────────────────────────────────╮
│                                                                          │
│  systemd user services are unavailable; install/enable systemd or run    │
│  the gateway under your supervisor.                                      │
│  If you're in a container, run the gateway in the foreground instead of  │
│  `openclaw gateway`.                                                     │
│                                                                          │
├──────────────────────────────────────────────────────────────────────────╯
│
└  Doctor complete.
worthy gobletBOT
#

Mostly:

  • Good / expected

    • [lcm] Plugin loaded ... db=/data/.openclaw/lcm.db
      This is good, it means LCM is now using the right path.
    • Plugins ... Errors: 0
      Also good.
  • Normal in a container

    • systemd user services are unavailable...
      Totally normal in Coolify/container land.
  • Only important if you expected the app to be live

    • Gateway not running.

That last one is the only thing I’d pay attention to.

#

In plain English

If you were just running doctor in a shell and the actual app isn’t supposed to be up right now, no problem.

If the app is supposed to be running, then your container/start command should be running the gateway in the foreground, and you’d want to fix that in Coolify.

So my summary:

  • LCM path fixed: yes ✅
  • plugin errors: none ✅
  • systemd warning in container: ignore ✅
  • gateway not running: only matters if you expected it to be up ⚠️

If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it.

digital sundial
#

◇ Gateway ──────────────╮
│ │
│ Gateway not running. │
│ │
├────────────────────────╯
│
◇ Gateway connection ────────────────────╮
│ │
│ Gateway target: ws://127.0.0.1:18789 │
│ Source: local loopback │
│ Config: /data/.openclaw/openclaw.json │
│ Bind: loopback │
│ │
├─────────────────────────────────────────╯
│
◇ Gateway ────────────────────────────────────────────────────────────────╮
│ │
│ systemd user services are unavailable; install/enable systemd or run │
│ the gateway under your supervisor. │
│ If you're in a container, run the gateway in the foreground instead of │
│ openclaw gateway. │
│ │
├──────────────────────────────────────────────────────────────────────────╯
│
└ Doctor complete.

why gateway not working?

worthy gobletBOT
#

Because right now nothing is listening on 127.0.0.1:18789 inside the container.

In a Coolify/container setup, that usually means one of these:

  1. the app/container is stopped
  2. the app is crashing on startup
  3. the Coolify start command / entrypoint is not actually running the gateway
  4. less likely, it’s running on a different port/bind

What’s normal vs not

  • systemd user services are unavailable → normal in containers
  • Gateway not running → not normal if the app is supposed to be up

What to check next

In Coolify logs, you want to see something like:

#
[gateway] listening on ws://0.0.0.0:18789

If you don’t see a listening line like that, the gateway never came up.

Fast next checks

If you have shell access in the running container, run:

ss -ltnp | grep 18789 || true
ps -ef | grep -E 'openclaw|node' | grep -v grep

My best guess

#

Since you’re in Coolify, I think the most likely issue is:

  • the container is not currently running the gateway as its main foreground process

If you paste the Coolify start command or the container logs from startup, I can tell you exactly what’s wrong.
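
One caveat on the grep above: a plain grep for 18789 can also match a PID or a longer port number. A slightly stricter sketch is below. It assumes the usual column layout of ss -ltn, where the fourth column is "Local Address:Port", and listening_on_port is a hypothetical helper name, not an OpenClaw command.

```shell
# Hypothetical helper: reads `ss -ltn` output on stdin and exits 0 only if
# some listening socket's local port equals the given port exactly.
listening_on_port() {
  awk -v port="$1" '
    NR > 1 {
      # Column 4 is "addr:port"; the port is whatever follows the last colon,
      # which also works for IPv6 forms like [::]:18789.
      n = split($4, parts, ":")
      if (parts[n] == port) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Example:
#   ss -ltn | listening_on_port 18789 && echo "something is listening on 18789"
```

Like the other checks, this only tells you what is listening in the namespace of the shell you run it in.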

digital sundial
#

the controlui dashboard shows all ok though

worthy gobletBOT
#

Ah, that changes the read quite a bit.

If the Control UI dashboard shows everything OK, then the gateway is probably fine and the doctor result is likely a probe-context mismatch.

What’s probably happening

openclaw doctor is probing:

ws://127.0.0.1:18789

from the shell/container you ran it in.

In Coolify, that shell is sometimes not the same network/process context as the actual running gateway process, so:

  • Control UI can reach the real gateway
#
  • but doctor in that shell probes local loopback and sees nothing

So is the gateway actually broken?

Probably no, if the dashboard is healthy.

Practical takeaway

At this point I would interpret it like this:

  • Dashboard healthy → trust that
  • doctor says gateway not running from shell → likely shell/loopback mismatch in Coolify

So I would not chase the gateway warning further unless the app itself is actually failing.

You basically ended up with two Coolify/container quirks:

#
  1. state-dir warnings from things touching ~/.openclaw
  2. local loopback health checks from a shell that may not be the real gateway runtime

If this fixed the real problem, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it.

digital sundial
#

can you check github and see if known issue?

worthy gobletBOT
#

I checked GitHub. I didn’t find your exact Coolify + qqbot + doctor combo as a single issue, but I did find a few very related ones.

Related issues

#

My read

So, yes, this looks like a known class of issue, even if I didn’t find the exact same report.

Your case seems to combine two known patterns:

  1. parts of OpenClaw / plugins still touching ~/.openclaw despite OPENCLAW_STATE_DIR
  2. doctor/gateway probes being misleading in some environments

If you want maintainer attention, I think a new GitHub issue is justified. A good title would be:

Doctor reports multiple state directories in Coolify because qqbot recreates ~/.openclaw even when OPENCLAW_STATE_DIR=/data/.openclaw and channels.qqbot is unset

If you want, I can help you write the body in a clean copy-paste format.

digital sundial
#

i feel like i didnt have this before but cant remember

worthy gobletBOT
#

Yeah, that’s very plausible.

I wouldn’t assume you’re misremembering or that you suddenly broke something.

A few reasons this can show up “later” even if your setup felt the same:

  • doctor got stricter / noisier about state-dir checks
  • bundled/plugin behavior changed over time
  • Lossless Claw was definitely one source until you moved lcm.db
  • QQ Bot path behavior may have become visible only after an update or plugin load change

So “I didn’t used to see this” actually fits the evidence pretty well.

My honest read is:

  • you had a real but mostly hidden path split
  • newer OpenClaw/plugin behavior made it visible
#
  • you already fixed the important part, which was main state + LCM persistence

What’s left looks more like a current product quirk/bug than user error.

digital sundial
#

thanks can you summarise the problem and solution now

also btw do you think my bot spamming me with "agent aborted" even when i didn't send anything, is related?

worthy gobletBOT
#

Yep, short version:

Problem

Your bot is running in Coolify/container mode with the real state dir at:

/data/.openclaw

But some things were still touching:

~/.openclaw

So doctor warned about multiple state directories.

#

The two things we found were:

  • Lossless Claw defaulting to ~/.openclaw/lcm.db
  • QQ Bot recreating ~/.openclaw/qqbot even though channels.qqbot wasn’t set

You also had some session-store cleanup issues:

  • missing transcripts
  • one orphan transcript file

What we fixed

  • confirmed the real OpenClaw state is /data/.openclaw
  • moved LCM to the persistent path by setting:
#
LCM_DATABASE_PATH=/data/.openclaw/lcm.db
  • backed up / cleaned the old ~/.openclaw
  • ran:
openclaw sessions cleanup --store "/data/.openclaw/agents/main/sessions/sessions.json" --enforce --fix-missing
  • doctor also archived the orphan transcript file

What’s left

#

The remaining ~/.openclaw warning looks like a QQ Bot / plugin path quirk, not your main state being wrong.

So the important part is fixed. What’s left looks more like a product bug/quirk than a broken setup on your side.


About the “agent aborted” spam

I would not assume that’s related.

This state-dir issue was mainly about:

  • where files are stored
  • persistence
  • doctor warnings

“agent aborted” spam sounds more like a separate runtime / session / automation issue.

#

Since that’s a different problem, I’d suggest opening a new thread for it so it doesn’t get mixed into this one. Include:

openclaw --version
openclaw status --all

and a small log snippet around one of the aborts.

Glad we got this one untangled. If this fixed the original issue, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it.

digital sundial
#

oh, you don't think it's related?

worthy gobletBOT
#

Not likely, no.

I can’t rule it out 100%, but I wouldn’t treat it as the same root cause.

Why:

  • this issue was mostly about where state files live
  • “agent aborted” spam is usually more about runtime failures, retries, background jobs, or automation/session issues

So there could be a loose connection if the bot is unstable overall, but I would still debug it as a separate problem.

My honest answer is:

  • possible in a broad sense
  • unlikely as the direct cause

If you want, open a fresh thread for the abort spam and include:

#
  • openclaw --version
  • openclaw status --all
  • one example of the abort message
  • whether you have heartbeat / cron / tasks / automations enabled

That’ll make it much easier to tell fast.

digital sundial
#

getting this now after upgrading to 2026.4.14

│
◇ State integrity ────────────────────────────────────────────────────────╮
│ │
│ - Multiple state directories detected. This can split session history. │
│ - ~/.openclaw │
│ Active state dir: /data/.openclaw │
│ - Found 1 agent directory on disk without a matching agents.list │
│ entry. │
│ These agents can still have sessions/auth state on disk, but │
│ config-driven routing, identity, and model selection will ignore │
│ them. │
│ Examples: codex │
│ Restore the missing agents.list entries or remove stale dirs after │
│ confirming they are no longer needed: /data/.openclaw/agents │
│ │
├─

worthy gobletBOT
#

Yep, this looks like a new doctor check, not necessarily a new breakage.

It means:

  • there is a directory on disk at roughly
    /data/.openclaw/agents/codex
    
  • but your current config does not list an agent named codex in agents.list

So OpenClaw is basically saying:

“I found old/stored agent state for codex, but your config no longer declares that agent.”

Is it bad?

Usually not urgent.

#

It just means one of these is true:

  1. you used to have an agent called codex and removed it from config
  2. something created that agent dir experimentally
  3. you still want it, but forgot to keep its config entry

Best next step

Check whether you still care about that agent:

ls -la /data/.openclaw/agents/codex
openclaw config get agents.list --json

If you do NOT need codex

#

Safest cleanup is to move it aside, not delete it:

mv /data/.openclaw/agents/codex /data/.openclaw/agents/codex.backup-$(date +%Y%m%d-%H%M%S)

Then rerun:

openclaw doctor

If you DO need codex

Then the fix is to add it back to agents.list in config.

So short version: this warning is basically “stale agent folder on disk”.

#

Very plausible after an upgrade if doctor just got better at spotting it.

digital sundial
#

i dont need it now but if i want later can add it back

worthy gobletBOT
#

Yep, exactly.

If you move it aside instead of deleting it, you can bring it back later.

Safe move now

mv /data/.openclaw/agents/codex /data/.openclaw/agents/codex.backup-$(date +%Y%m%d-%H%M%S)

Later, if you want it back

You can move it back to:

/data/.openclaw/agents/codex

and then add the matching agents.list config entry again.

#

So yes, backup-and-move is the right play if you don’t need it now but might later.

digital sundial
#

is this correct? mv /data/.openclaw/agents/codex /data/.openclaw/agents/codex.backup-$(date +%Y%m%d-%H%M%S) theres a space between codex and /

worthy gobletBOT
#

Yep, that space is correct.

mv syntax is:

mv <source> <destination>

So in your command:

mv /data/.openclaw/agents/codex /data/.openclaw/agents/codex.backup-$(date +%Y%m%d-%H%M%S)
  • first path = the current codex folder
  • second path = the new backup name
#

So yes, that’s the right command.

digital sundial
#

i did that and now get this:

◇ State integrity ────────────────────────────────────────────────────────╮
│ │
│ - Multiple state directories detected. This can split session history. │
│ - ~/.openclaw │
│ Active state dir: /data/.openclaw │
│ - Found 1 agent directory on disk without a matching agents.list │
│ entry. │
│ These agents can still have sessions/auth state on disk, but │
│ config-driven routing, identity, and model selection will ignore │
│ them. │
│ Examples: codex.backup-20260414-201436 (id │
│ codex-backup-20260414-201436) │
│ Restore the missing agents.list entries or remove stale dirs after │
│ confirming they are no longer needed: /data/.openclaw/agents │
│ │
├──────────────────────────────────────────────────────────────────────────╯
│

worthy gobletBOT
#

Ah, yep, that’s why.

You moved it to a new name, but it’s still inside /data/.openclaw/agents/, so doctor still sees it as an agent dir.

Fix

Move it out of the agents folder entirely, for example:

mv /data/.openclaw/agents/codex.backup-20260414-201436 /data/.openclaw/codex.backup-20260414-201436

Then rerun:

openclaw doctor
#

Why

Doctor is basically scanning:

/data/.openclaw/agents/*

So anything left inside that folder still looks like an agent, even if you renamed it.
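
If you end up doing this more than once, the move can be wrapped so the backup never lands inside agents/ in the first place. This is just a sketch: archive_agent_dir and the archived-agents folder name are both invented for the example, and I am assuming doctor only scans agents/ for agent dirs, which matches what we saw here but is not something I have confirmed in the docs.

```shell
# Hypothetical helper: moves <state_dir>/agents/<agent> to a timestamped
# folder outside agents/, and prints the new location.
archive_agent_dir() {
  state_dir=$1
  agent=$2
  src="$state_dir/agents/$agent"
  # "archived-agents" is an invented name; any folder outside agents/ would do.
  archive_root="$state_dir/archived-agents"
  [ -d "$src" ] || { echo "no such agent dir: $src" >&2; return 1; }
  mkdir -p "$archive_root" || return 1
  dest="$archive_root/$agent.$(date +%Y%m%d-%H%M%S)"
  # Rename rather than delete, so the agent can be restored later.
  mv "$src" "$dest" && echo "$dest"
}

# Example:
#   archive_agent_dir /data/.openclaw codex
```

To bring the agent back later, move the printed folder back to agents/<name> and restore its agents.list entry.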

digital sundial
#

is that the correct cmd?

worthy gobletBOT
#

Yep, if that exact folder name matches what you have, then yes.

mv /data/.openclaw/agents/codex.backup-20260414-201436 /data/.openclaw/codex.backup-20260414-201436

That moves it:

  • from inside agents/
  • to the top level of /data/.openclaw/

If you want to double-check the exact name first:

ls -1 /data/.openclaw/agents
digital sundial
#

done now

│
◇ State integrity ────────────────────────────────────────────────────────╮
│ │
│ - Multiple state directories detected. This can split session history. │
│ - ~/.openclaw │
│ Active state dir: /data/.openclaw │
│ │
├──────────────────────────────────────────────────────────────────────────╯
│

worthy gobletBOT
#

Nice, that means the codex stale-agent warning is gone.

So now you’re back to the one remaining thing we already narrowed down:

  • ~/.openclaw is being recreated
  • likely because of the QQ Bot / plugin path quirk
  • your main state, LCM DB, and stale agent dir are handled

So honestly, this is a pretty good stopping point.

Final state

Fixed:

  • main state dir confirmed at /data/.openclaw
  • LCM moved to /data/.openclaw/lcm.db
  • missing/orphan session cleanup handled
  • stale codex agent dir handled
#

Remaining:

  • doctor still warns about ~/.openclaw
  • likely a bundled/plugin behavior, not your core setup being wrong

If this solved the real issue, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it.

digital sundial
#

is there a memory/session fix command?

worthy gobletBOT
#

Yep, there are, but they’re different commands for different layers.

Sessions

For broken/missing session transcripts:

openclaw sessions cleanup --store "/data/.openclaw/agents/main/sessions/sessions.json" --enforce --fix-missing

That’s the main session repair command.

General state / doctor fixes

For broader state cleanup:

openclaw doctor --fix
#

That handles general repairs and things like orphan transcript archiving.

Memory

If you’re actually using the memory plugin, there are memory-side repair commands too:

openclaw memory status --fix
openclaw memory index --force

What they do

  • memory status --fix → repairs stale recall locks / normalizes memory metadata
  • memory index --force → rebuilds the memory index

Short version

  • session fix command: yes
#
  • memory fix command: yes, if memory is enabled
  • general fix command: yes, doctor --fix

If you want, I can give you a tiny “which fix command should I run for which symptom?” cheat sheet.

digital sundial
#

is the total token cost available via terminal?

worthy gobletBOT
#

Yep, via terminal the main one is:

openclaw gateway usage-cost

Useful variants:

openclaw gateway usage-cost --days 7
openclaw gateway usage-cost --json

That pulls usage/cost summaries from session logs.

Two related commands:

#
  • openclaw status --usage โ†’ provider usage/quota snapshot
  • openclaw status --all โ†’ session/token info, but not the same as total cost summary

So if you specifically want total token cost, gateway usage-cost is the right one.

digital sundial
#

openclaw gateway usage-cost

🦞 OpenClaw 2026.4.14 (323493f) - Greetings, Professor Falken

│
◇
Gateway usage cost failed: Error: gateway timeout after 10000ms
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /data/.openclaw/openclaw.json
Bind: loopback
openclaw gateway usage-cost --days 30

🦞 OpenClaw 2026.4.14 (323493f) - Hot reload for config, cold sweat for deploys.

│
◇
Gateway usage cost failed: Error: gateway timeout after 10000ms
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /data/.openclaw/openclaw.json
Bind: loopback
openclaw gateway usage-cost --days 2

🦞 OpenClaw 2026.4.14 (323493f) - Ah, the fruit tree company! 🍎

│
◇
Gateway usage cost failed: Error: gateway timeout after 10000ms
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /data/.openclaw/openclaw.json
Bind: loopback

worthy gobletBOT
#

Yep, that’s the same Coolify loopback mismatch again, not the usage-cost command itself.

What’s happening:

  • openclaw gateway usage-cost is trying to talk to
    ws://127.0.0.1:18789
    
  • from the shell you’re in
  • but in your setup that shell apparently cannot reach the real gateway process

So:

  • the command is valid
  • the timeout is environmental

What would make it work

#

Either:

  1. run it from the actual gateway runtime/container namespace
  2. or give it the real gateway URL + auth explicitly:
openclaw gateway usage-cost --url <real-ws-url> --token <token>

Important

This is the same reason you were seeing:

  • dashboard says things are okay
  • CLI loopback commands say gateway is not reachable

So I wouldn’t read this as “usage-cost is broken”.

#

I’d read it as “this shell can’t reach the gateway over local loopback”.

If you want, I can help you figure out the correct --url / --token form for your Coolify setup.
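
A quick way to confirm whether this shell can reach the loopback port at all, independent of OpenClaw, is a plain TCP probe. This is only a minimal sketch, assuming bash (for its /dev/tcp redirection) and coreutils timeout exist in the container; port_open is a hypothetical helper name, not an OpenClaw command.

```shell
#!/usr/bin/env bash
# port_open HOST PORT: succeed if a TCP connection opens within 2 seconds.
# Uses bash's built-in /dev/tcp redirection; this is not an OpenClaw feature.
port_open() {
  local host="$1" port="$2"
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# 18789 is the gateway port shown in the error output above.
if port_open 127.0.0.1 18789; then
  echo "port 18789 is reachable from this shell"
else
  echo "port 18789 is NOT reachable from this shell"
fi
```

An open port only shows that something is listening. It does not prove the gateway answers WebSocket requests, so a timeout can still happen even when this check passes.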

digital sundial
#

also how to list workspace files?

worthy gobletBOT
#

Yep, easiest is:

1) Find the workspace path

openclaw config get agents.defaults.workspace

If you’re using an env override, also check:

echo "$OPENCLAW_WORKSPACE_DIR"

2) List files in it

Once you know the path, for example /data/workspace:

#
ls -la /data/workspace

or a more useful recursive view:

find /data/workspace -maxdepth 2 | sort

If you want just files:

find /data/workspace -maxdepth 2 -type f | sort

So short version:

#
openclaw config get agents.defaults.workspace
find <that-path> -maxdepth 2 | sort
digital sundial
#

did that and got:

Gateway usage cost failed: Error: gateway closed (1006 abnormal closure (no close frame)): no close reason
Gateway target: wss://
Source: cli --url
Config: /data/.openclaw/openclaw.json

worthy gobletBOT
#

That’s useful, it means you did reach the host, but the WebSocket session did not establish cleanly.

So this is different from the earlier timeout.

What 1006 abnormal closure usually means here

Usually one of these:

  1. wrong WebSocket URL
  2. missing token/password auth
  3. reverse proxy is not forwarding WebSockets correctly

Next thing I’d try

Run it with the same URL plus the gateway auth.

If your gateway uses a token:

#
openclaw gateway usage-cost --url wss://django.personalslave.xyz --token "<your-gateway-token>"

If it uses a password instead:

openclaw gateway usage-cost --url wss://django.personalslave.xyz --password "<your-gateway-password>"

Do not paste the token/password here.

Better test first

Before retrying usage-cost, test the connection itself:

#
openclaw gateway status --url wss://django.personalslave.xyz --token "<your-gateway-token>"

If that still gives 1006, then I’d suspect:

  • the URL is not the actual gateway WS endpoint
  • or the reverse proxy/domain is not handling WS upgrade correctly

So short version:
progress, but you’re not authenticated or not hitting the exact WebSocket endpoint cleanly yet.

digital sundial
#

what happened here?

chat logs:

A A:
test

Django:
⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.

A A:
hey where you from?

Django:
⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.

A A:
/new

Django:
⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.

A A:
/new

Django:
⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.

A A:
hey where you from?

Django:
⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.

worthy gobletBOT
#

That error means the bot accepted the message but the request failed during processing, so it fell back to the generic:

“Something went wrong while processing your request…”

A couple important clues:

  • it happens on normal messages
  • it also happens on /new
  • so this is probably not just one bad chat/session

So this looks much more like a runtime failure than the state-dir warning.
Honestly, this is closer to your “agent aborted” problem than to the ~/.openclaw issue.

Likely categories

Usually it’s one of:

#
  • model/provider/auth failure
  • plugin/runtime crash
  • gateway/session handler error
  • background agent abort / request pipeline failure

Best next step

This is now a separate issue, so I’d open a new thread for it.

Include:

openclaw --version
openclaw status --all

and a log snippet from the exact failure time.

#

Since you’re on Coolify, the most useful thing is probably the app logs around one of those failed messages.

Short answer

What happened?

The bot crashed/aborted while handling the request, and /new failed too, so it’s not just a broken conversation thread.

If you want, I can help you phrase the new thread cleanly so it’s easier to debug fast.

digital sundial
#

openclaw status --all

it's too long so coming one at a time, don't reply till i say pikachu

#

Overview
┌─────────────────┬────────────────────────────────────────────────────────────┐
│ Item            │ Value                                                      │
├─────────────────┼────────────────────────────────────────────────────────────┤
│ Version         │ 2026.4.14                                                  │
│ OS              │ linux 6.14.0-37-generic (x64)                              │
│ Node            │ 25.8.1                                                     │
│ Config          │ /data/.openclaw/openclaw.json                              │
│ Dashboard       │ http://127.0.0.1:18789/                                    │
│ Tailscale       │ off                                                        │
│ Channel         │ stable (v2026.4.14)                                        │
│ Git             │ detached · tag v2026.4.14 · @ 323493fa                     │
│ Update          │ git HEAD · npm latest 2026.4.14 · deps ok                  │
│ Gateway         │ local · ws://127.0.0.1:18789 (local loopback) · reachable  │
│                 │ 304ms · auth token · 590a50ec9ef2 (172.19.0.3) app 2026.4. │
│                 │ 14 linux 6.14.0-37-generic                                 │
│ Security        │ Run: openclaw security audit --deep                        │
│ Gateway self    │ unknown                                                    │
│ Gateway service │ systemd not installed                                      │
│ Node service    │ systemd not installed                                      │
│ Agents          │ 1 total · 1 bootstrapping · 1 active · 140 sessions        │
│ Secrets         │ none                                                       │

#

Channels
┌──────────┬─────────┬───────┬───────────────────────────────────────────┐
│ Channel  │ Enabled │ State │ Detail                                    │
├──────────┼─────────┼───────┼───────────────────────────────────────────┤
│ Telegram │ ON      │ OK    │ token config (sha256:b88d722c · len 46) · │
│          │         │       │ accounts 1/1                              │
└──────────┴─────────┴───────┴───────────────────────────────────────────┘

Telegram accounts
┌─────────┬────────┬──────────────────────────────────────────┐
│ Account │ Status │ Notes                                    │
├─────────┼────────┼──────────────────────────────────────────┤
│ default │ OK     │ token:config · allow:367989403,421931654 │
└─────────┴────────┴──────────────────────────────────────────┘

Agents
┌───────┬────────────────┬──────────┬────────┬─────────────────────────┐
│ Agent │ Bootstrap file │ Sessions │ Active │ Store                   │
├───────┼────────────────┼──────────┼────────┼─────────────────────────┤
│ main  │ PRESENT        │ 140      │ 4m ago │ /data/.openclaw/agents/ │
│       │                │          │        │ main/sessions/sessions. │
│       │                │          │        │ json                    │
└───────┴────────────────┴──────────┴────────┴─────────────────────────┘

#

Diagnosis (read-only)

Gateway connection details:
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /data/.openclaw/openclaw.json
Bind: loopback

✓ Config: /data/.openclaw/openclaw.json
✓ Secret diagnostics (0)
✓ Restart sentinel: none
✓ Port 18789
Detected dual-stack loopback listeners (127.0.0.1 + ::1) for one gateway process.
! Tailscale: off · unknown
✓ Skills: 28 eligible · 0 missing · /data/workspace
✓ Plugin compatibility (none)
✓ Channel issues (none)

Pasteable debug report. Auth tokens redacted.
Troubleshooting: https://docs.openclaw.ai/troubleshooting

#

ok done, pikachu

worthy gobletBOT
#

Aha, yep, there’s a real clue in that status dump.

The standout lines are:

  • Agents: 1 total · 1 bootstrapping
  • Bootstrap file: PRESENT

That means your main agent is still in bootstrap mode.

And that matters because there’s a known gotcha where a leftover BOOTSTRAP.md in the workspace can keep an agent bootstrapping way longer than intended. That can absolutely cause weird behavior like generic failures / aborted runs.

What I think is happening

Your logs do not look like:

  • Telegram broken
  • gateway broken
#
  • config unreadable

Those all look fine.

The suspicious part is really:

  • gateway reachable ✅
  • Telegram OK ✅
  • agent still bootstrapping ⚠️

So this is much more likely to be related to the “agent aborted / something went wrong” behavior than the state-dir warning.

What I’d do next

If you do not intentionally need bootstrap mode anymore, check for this file:

#
ls -la /data/workspace/BOOTSTRAP.md

If it exists, move it aside:

mv /data/workspace/BOOTSTRAP.md /data/workspace/BOOTSTRAP.md.backup-$(date +%Y%m%d-%H%M%S)

Then restart the app/container and test again.

Why

That keeps the file as a backup, but stops the agent from treating every run like first-run bootstrap setup.

So yes, from everything you pasted, this is the biggest red flag I see right now.
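
If you end up doing this kind of "move it aside, keep a backup" step more than once, it can be wrapped in a tiny helper. This is just a sketch of the same rename pattern shown above; archive_aside is a hypothetical function name, and the demo runs in a throwaway temp directory so nothing real is touched.

```shell
#!/usr/bin/env bash
# archive_aside PATH: rename PATH to PATH.backup-<timestamp> if it exists.
# Prints the new name on success; returns 1 and does nothing if PATH is absent.
archive_aside() {
  local path="$1"
  [ -e "$path" ] || return 1
  local dest
  dest="${path}.backup-$(date +%Y%m%d-%H%M%S)"
  mv -- "$path" "$dest" && printf '%s\n' "$dest"
}

# Demo on a temp directory (not the real workspace).
demo_dir="$(mktemp -d)"
touch "$demo_dir/BOOTSTRAP.md"
archive_aside "$demo_dir/BOOTSTRAP.md"
ls "$demo_dir"
```

On the real box the call would be archive_aside /data/workspace/BOOTSTRAP.md, followed by the app/container restart.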

digital sundial
#

will that file name be a problem like codex earlier?

#

ls -la /data/workspace/BOOTSTRAP.md
-rw-r--r-- 1 root root 1470 Mar 15 15:15 /data/workspace/BOOTSTRAP.md

worthy gobletBOT
#

Nope, this one is different.

The codex problem happened because it was still a directory inside:

/data/.openclaw/agents/

So doctor still treated it like an agent.

For BOOTSTRAP.md, the trigger is the exact bootstrap file name/location.
If you move it to:

/data/workspace/BOOTSTRAP.md.backup-<timestamp>
#

that should not be treated like an active bootstrap file.

So yes, renaming it like that is the normal safe move.

#

Yep, that’s exactly the kind of stale bootstrap file I was worried about.

The date is a good hint too, March 15 is old enough that it’s very plausibly just lingering.

I’d move it aside now

mv /data/workspace/BOOTSTRAP.md /data/workspace/BOOTSTRAP.md.backup-$(date +%Y%m%d-%H%M%S)

Then restart the app/container and test again.

Why I think this matters

Your status showed:

  • 1 bootstrapping
#
  • Bootstrap file PRESENT

and now we’ve confirmed the file is really there.

So this is a strong candidate for the weird aborted / generic failure behavior.

digital sundial
#

after i moved it to the backup location, give me the cmd to see whats inside

worthy gobletBOT
#

Yep, easiest is:

ls -1 /data/workspace/BOOTSTRAP.md.backup-*

Then to print it:

cat /data/workspace/BOOTSTRAP.md.backup-*

If you want a nicer paged view:

less /data/workspace/BOOTSTRAP.md.backup-*
#

Or just the first 200 lines:

sed -n '1,200p' /data/workspace/BOOTSTRAP.md.backup-*
digital sundial
#

moved and restarted

hey where you from?
⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.

worthy gobletBOT
#

Okay, that’s actually useful. It means:

  • moving BOOTSTRAP.md was still worth doing
  • but it wasn’t the only cause

So now we’re firmly in runtime error territory, not just bootstrap/state-dir weirdness.

Next step, get the real error

If openclaw logs works in your setup, do this:

openclaw logs --follow

Then send one test message to the bot and copy the error lines that appear.

#

If openclaw logs doesn’t work

Use the Coolify app logs and grab the lines at the exact time the message fails.

What we need

We want the actual stack trace / error message, not the user-facing:

“Something went wrong while processing your request…”

That message is just the fallback.

Also check this once

After moving BOOTSTRAP.md, run:

#
openclaw status --all

and confirm whether it still says:

  • 1 bootstrapping
  • Bootstrap file PRESENT

If those are gone, then we successfully removed one possible cause, and the remaining failure is something else.

At this point, I’d really open a new thread for this runtime failure and paste:

  • openclaw --version
  • openclaw status --all
  • the error lines from openclaw logs --follow or Coolify logs

That’ll make it much easier to debug cleanly.

digital sundial
#

2026-04-14T21:11:00.759+00:00 [heartbeat] started
2026-Apr-14 21:11:00
2026-04-14T21:11:00.828+00:00 [ws] webchat connected conn=9d827710-05e9-475e-a0a6-a9b6c1101bcd remote=127.0.0.1 client=openclaw-control-ui webchat v2026.4.14
2026-Apr-14 21:11:00
2026-04-14T21:11:00.874+00:00 [telegram] [default] starting provider (@AutoDjangoBot)
2026-Apr-14 21:11:01
2026-04-14T21:11:01.256+00:00 [telegram] menu text exceeded the conservative 5700-character payload budget; shortening descriptions to keep 68 commands visible.
2026-Apr-14 21:11:01
2026-04-14T21:11:01.349+00:00 [ws] ⇄ res ✓ sessions.usage 56ms conn=9d827710…1bcd id=5487170a…c68c
2026-Apr-14 21:11:02
2026-04-14T21:11:02.036+00:00 [plugins] embedded acpx runtime backend ready
2026-Apr-14 21:11:27
2026-04-14T21:11:27.811+00:00 [diagnostic] lane task error: lane=main durationMs=1630 error="Error: Context engine "lossless-claw" factory returned an invalid ContextEngine: info.id must match registered id "lossless-claw"."
2026-Apr-14 21:11:27
2026-04-14T21:11:27.813+00:00 [diagnostic] lane task error: lane=session:agent:main:telegram:direct:367989403 durationMs=1634 error="Error: Context engine "lossless-claw" factory returned an invalid ContextEngine: info.id must match registered id "lossless-claw"."
2026-Apr-14 21:11:27
2026-04-14T21:11:27.823+00:00 Embedded agent failed before reply: Context engine "lossless-claw" factory returned an invalid ContextEngine: info.id must match registered id "lossless-claw".

#

21:14:26+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=main durationMs=341 error="Error: Context engine "lossless-claw" factory returned an invalid ContextEngine: info.id must match registered id "lossless-claw"."
21:14:26+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=session:agent:main:telegram:direct:367989403 durationMs=343 error="Error: Context engine "lossless-claw" factory returned an invalid ContextEngine: info.id must match registered id "lossless-claw"."
21:14:26+00:00 error Embedded agent failed before reply: Context engine "lossless-claw" factory returned an invalid ContextEngine: info.id must match registered id "lossless-claw".

worthy gobletBOT
#

Yep, we found it.

This is not your state-dir issue.
This is a known regression in 2026.4.14, and the error you pasted is an exact match:

Context engine "lossless-claw" factory returned an invalid ContextEngine:
info.id must match registered id "lossless-claw"

That’s why the bot is replying with:

#

⚠️ Something went wrong while processing your request...

Short answer

2026.4.14 broke lossless-claw.

Workaround right now

Temporarily disable lossless-claw and revert to the default context engine.

openclaw plugins disable lossless-claw
openclaw config unset plugins.slots.contextEngine

Then restart the app/container.

#

Important

  • this should restore normal replies
  • it does not delete your LCM DB at /data/.openclaw/lcm.db
  • you can re-enable lossless-claw later after the regression is fixed

So yes, this one is real, known, and directly explains the broken bot behavior.

#

Yep, that’s an exact match.

This log confirms the known 2026.4.14 lossless-claw regression.
Nothing ambiguous there.

Fix right now

Run:

openclaw plugins disable lossless-claw
openclaw config unset plugins.slots.contextEngine

Then restart the app/container.

Why both?

#
  • plugins disable lossless-claw stops the plugin
  • config unset plugins.slots.contextEngine stops OpenClaw from trying to use it as the active context engine

After restart, test the bot again.

So at this point, the broken replies are explained:
lossless-claw is crashing the agent before it can answer.

digital sundial
#

openclaw config unset plugins.slots.contextEngine

whats this for?

and how do you know it broke? checked github?

worthy gobletBOT
#

Yep.

What this does

openclaw config unset plugins.slots.contextEngine

removes the config value that says:

“use lossless-claw as my active context engine”

So after you unset it, OpenClaw stops trying to route every conversation through Lossless Claw and falls back to the normal built-in context handling.

That’s why I suggested it together with:

#
openclaw plugins disable lossless-claw

One disables the plugin, the other removes the config pointer to it.

How I know it broke

Because:

  1. your logs exactly say:

    Context engine "lossless-claw" factory returned an invalid ContextEngine: info.id must match registered id "lossless-claw"
    
  2. I checked GitHub and found an exact matching new issue:

    • #66591
digital sundial
#

the lossless claw github was updated

after i update, can you give me the terminal commands to re-enable?

worthy gobletBOT
#

Yep. After you’ve updated Lossless Claw, the re-enable steps should be:

openclaw plugins enable lossless-claw
openclaw config set plugins.slots.contextEngine lossless-claw

Then restart the app/container.

After restart, sanity-check

openclaw config get plugins.slots.contextEngine

You want it to print:

#
lossless-claw

Important

Do not use --strict-json on that config set line.
For string values like lossless-claw, plain:

openclaw config set plugins.slots.contextEngine lossless-claw

is the easy version.

If you want, after you update and restart, paste the first few LCM/plugin log lines and I’ll sanity-check that it came back cleanly.

digital sundial
#

dms work but the group just gets a reaction, typing but no reply

worthy gobletBOT
#

If DMs work but the group only gets a reaction / typing and no final reply, that sounds like a different issue again, and probably not the Lossless Claw breakage.

Why:

  • DM success means the core agent/model path is alive
  • group-only failure usually points more toward:
    • group routing/policy
    • group permissions
    • group-specific delivery failure
    • Telegram group handling quirks

So I would treat that as a separate group-delivery bug, not the same root cause as the earlier state-dir / LCM issue.

Best move

Please open a new thread for that one and include:

#
openclaw --version
openclaw status --all
openclaw config get channels.telegram --json

plus one log snippet from a failed group message.

That’ll make it much easier to debug cleanly, and keep this thread from turning into five different issues glued together.

digital sundial
#

openclaw config get channels.telegram --json
{
  "capabilities": {
    "inlineButtons": "all"
  },
  "execApprovals": {
    "enabled": true,
    "approvers": [
      "367989403",
      "421931654"
    ],
    "agentFilter": [
      "main"
    ],
    "target": "dm"
  },
  "enabled": true,
  "dmPolicy": "allowlist",
  "botToken": "OPENCLAW_REDACTED",
  "groups": {
    "*": {
      "requireMention": true,
      "allowFrom": [
        "367989403",
        "421931654"
      ]
    },
    "-1003705610703": {
      "topics": {
        "1124": {
          "requireMention": false
        },
        "1686": {
          "systemPrompt": "This topic is for travel planning. Use the trip-planning protocol at protocols/trip-planning/README.md by default for travel requests here. Plan trips, compare flights/hotels, and return concise curated options. Do not autonomously book. If the conversation stops being mainly about travel, suggest moving it back to General."
        },
        "2247": {
          "systemPrompt": "This topic is for lead generation. Use the lead-gen protocol at protocols/lead-gen/README.md by default for lead-gen requests here. Handle sourcing, verification, enrichment, qualification, dedupe, and outreach-ready CSV/list building. Do not enrich people from name-only guesses. If the conversation stops being mainly about lead gen, suggest moving it back to General."
        },
        "2965": {
          "systemPrompt": "redacted because too long"
        }
      }
    }
  },
  "allowFrom": [
    "367989403",
    "421931654"
  ],
  "groupAllowFrom": [
    "367989403",
    "421931654"
  ],
  "groupPolicy": "allowlist",
  "streaming": {
    "mode": "partial"
  },
  "threadBindings": {
    "spawnAcpSessions": true
  },
  "reactionLevel": "extensive"
}

worthy gobletBOT
#

This config actually looks mostly fine.

The big thing that jumps out is not allowlists or Telegram auth, it’s this:

"groups": {
  "-1003705610703": {
    "topics": {
      "1686": { "systemPrompt": "This topic is for travel planning. Use the trip-planning protocol at protocols/trip-planning/README.md ..." },
      "2247": { "systemPrompt": "This topic is for lead generation. Use the lead-gen protocol at protocols/lead-gen/README.md ..." },
      "2965": { "systemPrompt": "..." }
    }
  }
}

Why that matters

#

DMs do not use those topic-specific prompts.

So if:

  • DMs work
  • group/topic messages react + type
  • but no final reply

then I’d be suspicious of topic-specific prompt / file / protocol issues.

What I’d check next

If the failing group messages are happening in one of those topics, check whether those files actually exist:

ls -la /data/workspace/protocols/trip-planning/README.md
#
ls -la /data/workspace/protocols/lead-gen/README.md

And maybe:

find /data/workspace/protocols -maxdepth 3 | sort

My read

  • Telegram config itself: looks okay
  • group allowlist: looks okay
  • most suspicious part: those topic-specific system prompts / referenced files

So if the no-reply behavior is happening in one of those special topics, I think that’s the first place to look.
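
To check every referenced protocol file in one go instead of one ls per path, something like this could work. It is a rough sketch under a few assumptions: the Telegram config has been saved to a file first, the prompts reference files as protocols/.../something.md relative to the workspace, and check_protocol_paths is a made-up helper name (it needs bash for the process substitution).

```shell
#!/usr/bin/env bash
# check_protocol_paths CONFIG_JSON WORKSPACE
# Pulls every "protocols/.../*.md" path out of the config text and reports
# whether each one exists under WORKSPACE. Returns 1 if any file is missing.
check_protocol_paths() {
  local config="$1" workspace="$2" rc=0 rel
  while IFS= read -r rel; do
    if [ -f "$workspace/$rel" ]; then
      echo "ok      $rel"
    else
      echo "MISSING $rel"
      rc=1
    fi
  done < <(grep -oE 'protocols/[A-Za-z0-9._/-]+\.md' "$config" | sort -u)
  return "$rc"
}

# Typical use (both paths are assumptions based on this thread):
#   openclaw config get channels.telegram --json > /tmp/telegram.json
#   check_protocol_paths /tmp/telegram.json /data/workspace
```

This only proves the files exist. It says nothing about whether the agent can read them or whether their contents are what is breaking the replies.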

digital sundial
#

ls -la /data/workspace/protocols/trip-planning/README.md
-rw-r--r-- 1 root root 13848 Mar 24 11:48 /data/workspace/protocols/trip-planning/README.md

find /data/workspace/protocols -maxdepth 3 | sort
/data/workspace/protocols
/data/workspace/protocols/conference-radar
/data/workspace/protocols/conference-radar/README.md
/data/workspace/protocols/conference-radar/feedback.md
/data/workspace/protocols/conference-radar/sources.csv
/data/workspace/protocols/email-drafting
/data/workspace/protocols/email-drafting/README.md
/data/workspace/protocols/email-drafting/context.template.json
/data/workspace/protocols/email-drafting/csv-schema.md
/data/workspace/protocols/email-drafting/examples.md
/data/workspace/protocols/lead-gen
/data/workspace/protocols/lead-gen/README.md
/data/workspace/protocols/lead-gen/csv-schema.md
/data/workspace/protocols/lead-gen/examples.md
/data/workspace/protocols/lead-gen/review-rules.md
/data/workspace/protocols/trip-planning
/data/workspace/protocols/trip-planning/README.md
/data/workspace/protocols/trip-planning/examples.md
/data/workspace/protocols/trip-planning/implementation-spec.md
/data/workspace/protocols/trip-planning/trip-schema.json

#

failing behaviour happens in all topics

#

@worthy goblet did you forget me?

worthy gobletBOT
#

Nope, still here.

I think the most useful clue from what you pasted is still this:

  • DMs work
  • group replies fail
  • your Telegram config has topic-specific system prompts pointing at files under protocols/...

So I don’t think I forgot you, I think we narrowed it to a group/topic-specific config or prompt path issue.

If you want to keep going right here, send me either:

ls -la /data/workspace/protocols/trip-planning/README.md
ls -la /data/workspace/protocols/lead-gen/README.md
#

or tell me which Telegram topic ID / group is failing.

That’s the next thing I’d check.

digital sundial
#

all the topic ids are failing even the main one

worthy gobletBOT
#

That’s actually really helpful.

If all topic IDs fail, including the main one, then we can probably rule out the topic-specific protocol files.

So the thing that stands out next in your Telegram config is this:

"threadBindings": {
  "spawnAcpSessions": true
}

That is a group/topic-specific behavior, and DMs don’t depend on it the same way.

My next suspicion

A problem with topic/thread session spawning, not Telegram auth and not the per-topic prompts.

#

Good A/B test

Temporarily turn that off:

openclaw config set channels.telegram.threadBindings.spawnAcpSessions false --strict-json

Then restart the app/container and test a group topic again.

Why this is a good test

  • DMs work
  • all group topics fail
  • that points to something shared by group/topic handling
  • spawnAcpSessions is exactly one of those shared group/topic features
#

If disabling that makes group replies work again, we found the culprit.

If you want, after you test it, send me just:

  • “groups work now”
    or
  • “still broken”

and I’ll tell you the next step.

digital sundial
#

ls -la /data/workspace/protocols/trip-planning/README.md
-rw-r--r-- 1 root root 13848 Mar 24 11:48 /data/workspace/protocols/trip-planning/README.md

if you still need it

what does spawnacpsessions do?

worthy gobletBOT
#

Yep, that file exists, so the trip-planning path is probably not the problem.

What spawnAcpSessions does

At a high level, it makes topic/thread chats spin up their own ACP-backed persistent session instead of just using the normal main embedded session flow.

So, roughly:

  • DMs → normal/direct agent flow
  • group topics/threads → can get their own separate spawned session

That can be useful because each topic gets cleaner history/state, but it also means group topics use a different path than DMs.

Why I care about it here

Since:

#
  • DMs work
  • all group topics fail

the ACP thread/session path is one of the most suspicious group-only differences in your config.

So the test still makes sense

Temporarily turn it off:

openclaw config set channels.telegram.threadBindings.spawnAcpSessions false --strict-json

restart, test a group topic, and see what happens.

If you want to turn it back on later:

#
openclaw config set channels.telegram.threadBindings.spawnAcpSessions true --strict-json

So short version:
spawnAcpSessions = “use separate spawned sessions for topics/threads.”
And that’s why it’s a good thing to A/B test here.

digital sundial
#

and how do i turn it back on? will disabling it break anything? check github also

worthy gobletBOT
#

Yep, I checked GitHub, and there are related Telegram topic + ACP issues, so this is a pretty reasonable thing to test.

Related issues

Not a perfect 1:1 match, but definitely the same neighborhood.

What happens if you turn it off?

This setting is mainly for ACP thread/topic session spawning.

So if you disable it:

#
  • DMs should still work
  • normal Telegram handling should still work
  • you mainly lose the special ACP-per-topic/session behavior

That’s why it’s a good diagnostic toggle. It should not “break the whole bot”, it just disables that extra thread/topic ACP layer.

Turn it off

openclaw config set channels.telegram.threadBindings.spawnAcpSessions false --strict-json

Turn it back on

openclaw config set channels.telegram.threadBindings.spawnAcpSessions true --strict-json

Then restart the app/container after changing it.

#

So short version:

  • yes, you can safely toggle it for testing
  • yes, there are GitHub issues that make this a believable suspect
  • re-enable command is the same line with true instead of false
digital sundial
#

is there no more shared context when its disabled? but yeah ill try and get back to you

worthy gobletBOT
#

You still get context, just through the normal session flow instead of a separate ACP-bound topic session.

digital sundial
#

disabled, same thing in the group. dms still work of course

should i re-enable now?

should i disable requiring mentions in the group chat?

check github too

worthy gobletBOT
#

Yep, since disabling spawnAcpSessions didn’t change anything, I’d turn it back on.

Re-enable

openclaw config set channels.telegram.threadBindings.spawnAcpSessions true --strict-json

Then restart the app/container.

I would not change requireMention yet

Why:

  • if mention gating were the problem, the bot usually wouldn’t start processing
  • but yours is already reacting + typing
  • that means the message is getting through, and the failure is happening later, during reply delivery/runtime
#

So I don’t think requireMention is the right lever here.

GitHub check

I checked, and there are some very related Telegram topic/group delivery issues:

My read now

At this point this looks much more like a Telegram topic/group reply delivery bug than a config mistake.

#

So yes:

  • re-enable spawnAcpSessions
  • don’t bother changing requireMention yet
  • and I’d treat this as likely being in the same family as those Telegram topic delivery issues
digital sundial
#

can we just try the require mention? i mean if its easy to temp disable and enable

worthy gobletBOT
#

Yeah, that’s easy enough to A/B test.

I still think it’s low probability, since the bot is already reacting + typing, but it’s a harmless toggle.

Temporarily disable requireMention

openclaw config set 'channels.telegram.groups["*"].requireMention' false --strict-json

Then restart the app/container and test.

Turn it back on

openclaw config set 'channels.telegram.groups["*"].requireMention' true --strict-json
#

Then restart again.

One small note

Topic 1124 already had requireMention: false, so this mainly changes the default group behavior for the other topics/groups.

So yes, easy to try.

digital sundial
#

is there a benefit to openclaw config set channels.telegram.threadBindings.spawnAcpSessions true --strict-json or better keep it off?

ok also wait the dms are also very slow and my last one didn't get a reply but maybe its because i asked it to check main topic?

worthy gobletBOT
#

spawnAcpSessions=true only really helps if you want separate ACP-backed topic sessions.

So:

  • ON = cleaner per-topic/session separation, more advanced behavior
  • OFF = simpler, fewer moving parts

Since it didn’t obviously fix anything, I would not keep it on just for the sake of it unless you actively want that ACP-per-topic behavior.

More important part

If DMs are now also slow / not replying, then this is probably not just a Telegram topic issue anymore.

That means I’d stop tuning requireMention / spawnAcpSessions for a moment and get a fresh log snippet from one failed DM right now, because the failure may have shifted back to a broader runtime issue.

So short answer:

#
  • if you don’t care about ACP topic sessions, keeping spawnAcpSessions off is fine
  • if you do want that feature later, turn it back on
  • but DMs also slowing/failing is the bigger clue now
digital sundial
#

is the acp topic thing on by default? if yes its gotta be important right?

and here are logs do these mean anything? or the error isnt captured here?

26-04-14T21:51:10.941+00:00 [browser/server] Browser control listening on http://127.0.0.1:18791/ (auth=token)
2026-Apr-14 21:51:11
2026-04-14T21:51:11.327+00:00 [heartbeat] started
2026-Apr-14 21:51:11
2026-04-14T21:51:11.396+00:00 [ws] webchat connected conn=e22947e6-e6be-4619-97a9-eb65ed99789b remote=127.0.0.1 client=openclaw-control-ui webchat v2026.4.14
2026-Apr-14 21:51:11
2026-04-14T21:51:11.466+00:00 [telegram] [default] starting provider (@AutoDjangoBot)
2026-Apr-14 21:51:11
2026-04-14T21:51:11.753+00:00 [telegram] menu text exceeded the conservative 5700-character payload budget; shortening descriptions to keep 66 commands visible.
2026-Apr-14 21:51:12
2026-04-14T21:51:12.214+00:00 [plugins] embedded acpx runtime backend ready
2026-Apr-14 21:51:37
2026-04-14T21:51:37.359+00:00 [ws] ⇄ res ✓ usage.cost 25552ms conn=e22947e6…789b id=21c3cd75…1ca1
2026-Apr-14 21:51:37
2026-04-14T21:51:37.555+00:00 [ws] ⇄ res ✓ sessions.usage 25782ms conn=e22947e6…789b id=e4c82aae…fb8f

worthy gobletBOT
#

I wouldn’t treat spawnAcpSessions as “must be on because it exists”.

From the docs, it’s specifically needed for ACP thread-bound spawn in Telegram topics like:

/acp spawn <agent> --thread here|auto

So for normal Telegram replies, it’s not the thing that makes the bot fundamentally work.

On your logs

Those lines mostly show:

  • startup is happening
  • Telegram provider started
  • embedded ACP runtime is ready
  • the web UI asked for usage.cost / sessions.usage
  • those web UI requests took ~25s

So:

  • yes, they show some slowness
  • no, they do not show the actual reply failure

The actual failure would be something more like:

  • Embedded agent failed before reply
  • a Telegram send/delivery error
  • a lane/task error tied to the message time

So I think the real error just isn’t captured in this snippet.
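
If you save a longer stretch of logs to a file, a rough filter like this can pull out the lines worth reading. It is only a sketch: scan_failures is a made-up helper name, the patterns are just the failure markers mentioned in this thread plus a generic FATAL ERROR, and the log path in the usage comment is an assumption.

```shell
# Hypothetical helper: print (with line numbers) log lines that match
# failure markers seen or mentioned in this thread. Extend as needed.
scan_failures() {
  local logfile="$1"
  grep -nE 'FATAL ERROR|Embedded agent failed before reply|\[telegram\].*failed' "$logfile"
}

# Usage (the path is an assumption; use wherever you saved the logs):
#   scan_failures /tmp/openclaw.log
```

No output means none of those markers appeared in that file, which is not proof that nothing failed.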

Short answer

  • spawnAcpSessions = optional ACP topic feature, not “core Telegram must-have”
  • these logs = mostly normal startup + slow web queries
  • the real message failure = probably not in this pasted chunk
digital sundial
#

should i just back up and reinstall? will everything just work with the bot and group?

worthy gobletBOT
#

I’d back up, yes. I would not assume reinstall will make everything “just work”, no.

Because from everything you found:

  • lossless-claw on 2026.4.14 was a real regression
  • the ~/.openclaw warning looks like a plugin/path quirk
  • the Telegram group/topic reply problem looks like a separate delivery/routing bug family

A reinstall can give you a cleaner baseline, but it does not guarantee those go away.

My recommendation

Best option

  • Back up /data/.openclaw and /data/workspace
  • only reinstall if you want a clean baseline
  • but do not expect reinstall alone to fix the Telegram group behavior

Better than blind reinstall

If you know a version where it worked, I’d trust a rollback to last known-good more than a generic reinstall.

If you do reinstall

Preserve your state/workspace so you don’t lose:

  • bot config
  • sessions
  • auth
  • LCM DB
  • workspace files

Honest answer

  • backup: yes
  • reinstall: okay if you want
  • guaranteed fix for bot + groups: no, I would not promise that
digital sundial
#

ok maybe i can just go back to 2026.4.6

worthy gobletBOT
#

Yeah, if 2026.4.6 was known-good for you, that’s a pretty reasonable move.

I’d just keep expectations realistic:

  • it will likely avoid the 2026.4.14 lossless-claw regression
  • it may or may not fix the Telegram group/topic issue, since some of those reports are older

If you do it

Back up first:

  • /data/.openclaw
  • /data/workspace

Then pin/deploy v2026.4.6.

So my honest take:

  • rollback to 2026.4.6: reasonable
  • guaranteed full fix: no
  • good “get me back to a known baseline” move: yes
digital sundial
#

how to back up? and how to restore?

worthy gobletBOT
#

For your Coolify /data setup, I’d honestly use a plain tar backup, because restore is much simpler than the first-class openclaw backup archive.

Simple backup

tar -czf /root/openclaw-backup-$(date +%Y%m%d-%H%M%S).tgz /data/.openclaw /data/workspace

Then verify it exists:

ls -lh /root/openclaw-backup-*.tgz

Simple restore

First, move the current data aside instead of deleting it:

ts=$(date +%Y%m%d-%H%M%S)
mv /data/.openclaw /data/.openclaw.before-restore-$ts
mv /data/workspace /data/workspace.before-restore-$ts

Then restore the backup:

tar -xzf /root/openclaw-backup-YYYYMMDD-HHMMSS.tgz -C /

After that, restart the app/container.
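
Before relying on a backup, it is worth checking that the archive really contains what you expect. Here is a small sketch of the same tar approach wrapped in two helpers. The function names are made up for illustration, and it assumes GNU tar, which strips the leading / from member names (so /data/.openclaw is stored as data/.openclaw).

```shell
# Hypothetical helpers around the plain tar backup described above.

# backup_paths <archive> <path>...
# Creates the archive, then lists it once to confirm it is readable.
backup_paths() {
  local archive="$1"; shift
  tar -czf "$archive" "$@" && tar -tzf "$archive" > /dev/null
}

# archive_contains <archive> <member-prefix>
# Succeeds if at least one archive member starts with the given prefix.
archive_contains() {
  tar -tzf "$1" | grep "^$2" > /dev/null
}

# Usage (archive name is an example):
#   backup_paths /root/openclaw-backup-test.tgz /data/.openclaw /data/workspace
#   archive_contains /root/openclaw-backup-test.tgz data/.openclaw && echo "state dir is in the backup"
```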

Why I recommend this

  • your real state is already under /data
  • tar makes backup/restore very obvious
digital sundial
#

2026-Apr-14 22:04:29
2026-04-14T22:04:29.774+00:00 [telegram] setMessageReaction failed: Call to 'setMessageReaction' failed! (400: Bad Request: message to react not found)
2026-Apr-14 22:04:29
2026-04-14T22:04:29.782+00:00 [plugins] embedded acpx runtime backend ready
2026-Apr-14 22:05:04
2026-Apr-14 22:05:04
<--- Last few GCs --->
2026-Apr-14 22:05:04
2026-Apr-14 22:05:04
[103:0x56a8dac10000] 115122 ms: Mark-Compact 2045.2 (2050.2) -> 2045.1 (2050.5) MB, pooled: 0.0 MB, 411.00 / 0.00 ms (average mu = 0.190, current mu = 0.029) allocation failure; scavenge might not succeed
2026-Apr-14 22:05:04
2026-Apr-14 22:05:04
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-Apr-14 22:05:04
----- Native stack trace -----
2026-Apr-14 22:05:04
2026-Apr-14 22:05:04
1: 0x7eb1e54015f8 node::OOMErrorHandler(char const*, v8::OOMDetails const&) [/data/linuxbrew/Cellar/node/25.8.1_1/bin/../lib/libnode.so.141]
2026-Apr-14 22:05:04
2: 0x7eb1e60da7c4 [/data/linuxbrew/Cellar/node/25.8.1_1/bin/../lib/libnode.so.141]
2026-Apr-14 22:05:04
3: 0x7eb1e60da8ad [/data/linuxbrew/Cellar/node/25.8.1_1/bin/../lib/libnode.so.141]
2026-Apr-14 22:05:04
4: 0x7eb1e634bebb [/data/linuxbrew/Cellar/node/25.8.1_1/bin/../lib/libnode.so.141]
2026-Apr-14 22:05:04
5: 0x7eb1e634c18d [/data/linuxbrew/Cellar/node/25.8.1_1/bin/../lib/libnode.so.141]
2026-Apr-14 22:05:04
6: 0x7eb1e635b2b8 [/data/linuxbrew/Cellar/node/25.8.1_1/bin/../lib/libnode.so.141]
2026-Apr-14 22:05:04
7: 0x7eb1e6334cb2 [/data/linuxbrew/Cellar/node/25.8.1_1/bin/../lib/libnode.so.141]
2026-Apr-14 22:05:04
8: 0x7eb1e6334e4e [/data/linuxbrew/Cellar/node/25.8.1_1/bin/../lib/libnode.so.141]
2026-Apr-14 22:05:04
9: 0x7eb1e630fa0f [/data/linuxbrew/Cellar/node/25.8.1_1/bin/../lib/libnode.so.141]
2026-Apr-14 22:05:04
10: 0x7eb1e630b383 [/data/linuxbrew/Cellar/node/25.8.1_1/bin/../lib/libnode.so.141]
2026-Apr-14 22:05:04

worthy gobletBOT
#

There it is, this is the real problem.

What happened

Your process ran out of memory and crashed:

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

The key lines are:

Mark-Compact 2045 MB -> 2045 MB
FATAL ERROR: Reached heap limit

So the bot isn’t just “being weird”, Node is hitting its heap limit at about 2 GB and dying.

What that explains

This absolutely explains:

  • slow replies
  • no replies
  • typing / reaction but no final message
  • random “something went wrong” behavior

The earlier Telegram reaction error is not the main issue.
That’s minor compared to the OOM crash.

What I’d do next

Option 1, best immediate fix

If your host has enough RAM, set in Coolify:

NODE_OPTIONS=--max-old-space-size=4096

Then redeploy/restart.

That gives Node a 4 GB heap instead of dying around 2 GB.

Option 2

Give the container more RAM in Coolify.

Important caution

Only set --max-old-space-size=4096 if the machine/container actually has enough memory for it. Otherwise the process can still be killed later, this time by the container or OS memory limit instead of Node’s own heap cap.

My honest read

This is now much more important than the ~/.openclaw warning.

You found the actual runtime killer:
the process is OOMing.

If you want, I can help you choose a sane memory setting based on how much RAM your Coolify server has.
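
As a starting point, here is a rough way to derive a value from the machine's RAM. The "half of total memory" rule is only my assumption as a conservative rule of thumb, not an OpenClaw or Node recommendation, and both function names are made up.

```shell
# Hypothetical helpers for picking a --max-old-space-size value.

# suggest_heap_mb <total-memory-in-MB>
# Assumed rule of thumb: give the Node heap about half of total memory.
suggest_heap_mb() {
  echo $(( $1 / 2 ))
}

# total_mem_mb: Linux only, reads MemTotal (reported in kB) from /proc/meminfo.
total_mem_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Usage:
#   echo "NODE_OPTIONS=--max-old-space-size=$(suggest_heap_mb "$(total_mem_mb)")"
```

On an 8 GB server this lands close to 4096, which leaves room for the OS and anything else running next to the bot.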

digital sundial
#

wait why is this? the server connected to coolify is 8gb? isnt that all available to coolify?

worthy gobletBOT
#

Right, and that’s the confusing part.

8 GB on the server does not mean Node will automatically use 8 GB of heap.

Two different limits

1) Machine / server RAM

Your VPS/server might have 8 GB total.

2) Node.js heap limit

Node still has its own heap ceiling, and by default it often dies around ~2 GB-ish unless you raise it with NODE_OPTIONS.

That’s exactly what your log showed:

Mark-Compact 2045 MB ...
FATAL ERROR: Reached heap limit

So Node hit its own heap cap, not necessarily the full machine limit.

Also possible

On top of that, Coolify / Docker may also have:

  • container-level memory limits
  • other services using RAM
  • OS/cache overhead

So “server has 8 GB” ≠ “this one Node process can freely use all 8 GB”.

Short version

  • server RAM: 8 GB
  • Node default heap: much lower
  • your crash: Node heap limit, not proof the whole server ran out of RAM

That’s why setting:

NODE_OPTIONS=--max-old-space-size=4096

can help, if the machine/container really has room.
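
To see whether the container itself has a memory cap, you can read the cgroup limit from inside it. This is a sketch under assumptions: the function name is made up, the two paths are the usual cgroup v2 and v1 locations, and the optional argument only exists so the function can be pointed at a different file.

```shell
# Hypothetical helper: print the container memory limit in MB,
# "unlimited" if cgroup v2 reports no cap, or "unknown" if neither file exists.
container_mem_limit_mb() {
  local f="${1:-}"
  if [ -z "$f" ]; then
    if [ -r /sys/fs/cgroup/memory.max ]; then
      f=/sys/fs/cgroup/memory.max                      # cgroup v2
    elif [ -r /sys/fs/cgroup/memory/memory.limit_in_bytes ]; then
      f=/sys/fs/cgroup/memory/memory.limit_in_bytes    # cgroup v1
    else
      echo "unknown"; return 0
    fi
  fi
  local raw
  raw=$(cat "$f")
  if [ "$raw" = "max" ]; then
    echo "unlimited"
  else
    echo $(( raw / 1024 / 1024 ))
  fi
}

# Usage, inside the container:
#   container_mem_limit_mb
# Node's effective heap limit in MB can be printed with:
#   node -p "require('node:v8').getHeapStatistics().heap_size_limit / 1048576"
```

On cgroup v1 an uncapped container shows up as a very large number rather than "unlimited".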

digital sundial
#

whats taking up this space? garbage? how to clean?

worthy gobletBOT
#

This is RAM / Node heap, not disk space.

And based on your log, it’s not mostly garbage.

The giveaway is this line:

2045.2 MB -> 2045.1 MB

after a full Mark-Compact GC.

That means GC ran hard and freed almost nothing, so most of that ~2 GB was still live/reachable JS memory, not just junk waiting to be cleaned.

So what’s using it?

From the log alone, we can’t name the exact object, but it’s likely one or more of:

  • large in-memory session/context state
  • plugin/runtime memory
  • browser/runtime state
  • big usage/session summaries in memory
  • something leaking / accumulating over time

How to “clean” it

Immediate cleanup

Restart the app/container.
That fully clears the heap.

Prevent it happening again

  • raise Node heap (NODE_OPTIONS=--max-old-space-size=4096) if the box has room
  • reduce memory-heavy features / dashboards / long-lived state
  • keep sessions cleaned up, though that helps disk more than immediate heap
  • if it keeps happening, treat it as a memory leak / high-memory bug, not something GC can solve by itself

Important distinction

  • disk cleanup ≠ this problem
  • GC cleanup already tried and basically failed
  • so there is no simple “clean the garbage” command for this specific crash

The real fix is either:

  1. give Node more heap
  2. find what feature/workload is keeping so much memory alive
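
For option 2, a cheap first step is to watch how the process's memory grows over time and note what you were doing when it jumps. A minimal sketch, assuming Linux with /proc available; the function names are made up, and RSS is the whole process footprint, not only the JS heap.

```shell
# Hypothetical helpers for watching process memory on Linux.

# rss_mb <pid>: resident memory of a process in MB (VmRSS is reported in kB).
rss_mb() {
  awk '/^VmRSS:/ { print int($2 / 1024) }' "/proc/$1/status"
}

# watch_rss <pid> [interval-seconds]: print a timestamped reading until the process exits.
watch_rss() {
  local pid="$1" interval="${2:-60}"
  while kill -0 "$pid" 2>/dev/null; do
    echo "$(date +%H:%M:%S) rss=$(rss_mb "$pid")MB"
    sleep "$interval"
  done
}

# Usage (you need to look up the PID of the Node process yourself):
#   watch_rss 103 30
```

If the number climbs steadily while the bot is idle, that points toward a leak rather than one heavy request. Node also has a --heapsnapshot-near-heap-limit option for capturing a snapshot shortly before it runs out of heap, which may help if you want to dig deeper.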
digital sundial
#

also, is this related or no?

2026-Apr-14 22:08:44
┌ OpenClaw doctor
2026-Apr-14 22:08:44
│
2026-Apr-14 22:08:44
◇ Startup optimization ─────────────────────────────────────────────────╮
2026-Apr-14 22:08:44
│ │
2026-Apr-14 22:08:44
│ - NODE_COMPILE_CACHE is not set; repeated CLI runs can be slower on │
2026-Apr-14 22:08:44
│ small hosts (Pi/VM). │
2026-Apr-14 22:08:44
│ - OPENCLAW_NO_RESPAWN is not set to 1; set it to avoid extra startup │
2026-Apr-14 22:08:44
│ overhead from self-respawn. │
2026-Apr-14 22:08:44
│ - Suggested env for low-power hosts: │
2026-Apr-14 22:08:44
│ export NODE_COMPILE_CACHE=/var/tmp/openclaw-compile-cache │
2026-Apr-14 22:08:44
│ mkdir -p /var/tmp/openclaw-compile-cache │
2026-Apr-14 22:08:44
│ export OPENCLAW_NO_RESPAWN=1 │
2026-Apr-14 22:08:44
│ │
2026-Apr-14 22:08:44
├────────

worthy gobletBOT
#

Not really, no.

Those are startup performance hints, not the cause of the heap crash.

They mean:

  • CLI launches can be slower than necessary
  • OpenClaw is suggesting a compile cache + no-respawn mode for small hosts

So they may help with:

  • faster CLI startup
  • less startup overhead
  • a bit less churn on weaker machines

But they do not explain this:

FATAL ERROR: Reached heap limit

So:

  • related to slowness in a broad sense? a little
  • related to the OOM / broken replies? not the main cause

You can set them if you want, but I would not expect them to fix the crashing behavior.

digital sundial
#

ok tell me again the env vars to set to fix

worthy gobletBOT
#

For the actual crash / OOM, the main one is:

NODE_OPTIONS=--max-old-space-size=4096

If you also want the startup optimization ones doctor suggested:

OPENCLAW_NO_RESPAWN=1
NODE_COMPILE_CACHE=/var/tmp/openclaw-compile-cache

and make sure that cache dir exists:

mkdir -p /var/tmp/openclaw-compile-cache

Short version

Required-ish for the memory problem

NODE_OPTIONS=--max-old-space-size=4096

Optional perf tweaks

OPENCLAW_NO_RESPAWN=1
NODE_COMPILE_CACHE=/var/tmp/openclaw-compile-cache

If you want, I can format them exactly as Coolify env entries next.

digital sundial
#

important? If you also want the startup optimization ones doctor suggested:

OPENCLAW_NO_RESPAWN=1
NODE_COMPILE_CACHE=/var/tmp/openclaw-compile-cache

worthy gobletBOT
#

Not very important for the crash, no.

Priority

  1. Important

    NODE_OPTIONS=--max-old-space-size=4096
    

    This is the one that actually targets the OOM.

  2. Optional

    OPENCLAW_NO_RESPAWN=1
    NODE_COMPILE_CACHE=/var/tmp/openclaw-compile-cache
    

    These are just startup/perf tweaks.

So if you want the minimum useful change, do only:

NODE_OPTIONS=--max-old-space-size=4096

The other two are nice-to-have, not the main fix.
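
If you want to confirm the variables actually reached the container after a redeploy, a small check like this can be run in its terminal. It is just a sketch: check_env is a made-up name, and it only looks at the environment of the shell you run it in, which I am assuming matches what the app sees.

```shell
# Hypothetical preflight check for the env vars discussed above.
check_env() {
  local status=0
  # The important one: NODE_OPTIONS must raise the heap limit.
  case "${NODE_OPTIONS:-}" in
    *--max-old-space-size=*) ;;
    *) echo "NODE_OPTIONS has no --max-old-space-size" >&2; status=1 ;;
  esac
  # Optional: if a compile cache dir is configured, make sure it exists.
  if [ -n "${NODE_COMPILE_CACHE:-}" ]; then
    mkdir -p "$NODE_COMPILE_CACHE" || status=1
  fi
  return "$status"
}

# Usage:
#   check_env && echo "env looks fine"
```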

digital sundial
#

that node options thing worked!!! write all the problems and solutions as a summary pls

unless you see other problems in these logs

2026-Apr-14 22:17:23
2026-04-14T22:17:23.099+00:00 [telegram] [default] starting provider (@AutoDjangoBot)
2026-Apr-14 22:17:23
2026-04-14T22:17:23.629+00:00 [telegram] menu text exceeded the conservative 5700-character payload budget; shortening descriptions to keep 66 commands visible.
2026-Apr-14 22:17:23
2026-04-14T22:17:23.714+00:00 [ws] โ‡„ res โœ“ node.list 64ms conn=94b27baaโ€ฆ285e id=3bf2113dโ€ฆb9e4
2026-Apr-14 22:17:23
2026-04-14T22:17:23.717+00:00 [ws] โ‡„ res โœ“ device.pair.list 66ms conn=94b27baaโ€ฆ285e id=62161a09โ€ฆ08a8
2026-Apr-14 22:17:24
2026-04-14T22:17:24.112+00:00 [plugins] embedded acpx runtime backend ready
2026-Apr-14 22:18:30
2026-04-14T22:18:30.182+00:00 [compaction-safeguard] Compaction safeguard: no real conversation messages to summarize; writing compaction boundary to suppress re-trigger loop.
2026-Apr-14 22:18:30
2026-04-14T22:18:30.424+00:00 [ws] โ‡„ res โœ“ node.list 99ms conn=94b27baaโ€ฆ285e id=e40d656fโ€ฆ3786
2026-Apr-14 22:18:32
2026-04-14T22:18:32.528+00:00 [ws] โ‡„ res โœ“ usage.cost 68828ms conn=94b27baaโ€ฆ285e id=37213ce5โ€ฆ0a7c
2026-Apr-14 22:18:33
2026-04-14T22:18:33.210+00:00 [ws] โ‡„ res โœ“ sessions.usage 69559ms conn=94b27baaโ€ฆ285e id=3ab0a5dcโ€ฆe23d
2026-Apr-14 22:18:47
2026-04-14T22:18:47.544+00:00 [ws] โ‡„ res โœ“ sessions.list 67ms conn=94b27baaโ€ฆ285e id=bb072619โ€ฆc322
2026-Apr-14 22:19:02
2026-04-14T22:19:02.055+00:00 [ws] โ‡„ res โœ“ sessions.list 57ms conn=94b27baaโ€ฆ285e id=3c4474afโ€ฆ5d85
2026-Apr-14 22:19:07
2026-04-14T22:19:07.908+00:00 [ws] โ‡„ res โœ“ sessions.list 68ms conn=94b27baaโ€ฆ285e id=1da6f141โ€ฆ875e

worthy gobletBOT
#

Nice, I’m glad we caught it.

Summary: problems and fixes

1) Multiple state directories

Problem: OpenClaw’s real state was /data/.openclaw, but parts of the system were still touching ~/.openclaw, so doctor warned about split state.

What we found:

  • core state was correctly using /data/.openclaw
  • Lossless Claw was defaulting to ~/.openclaw/lcm.db
  • QQ Bot was recreating ~/.openclaw/qqbot

Fix / outcome:

  • kept main state at /data/.openclaw
  • moved LCM DB to /data/.openclaw/lcm.db
  • set:
  LCM_DATABASE_PATH=/data/.openclaw/lcm.db
  • backed up / moved stale ~/.openclaw
  • remaining ~/.openclaw warning looks like a QQ Bot/plugin quirk, not a broken core setup

2) Session store cleanup

Problem: doctor reported missing transcripts and an orphan transcript file.

Fix:

openclaw sessions cleanup --store "/data/.openclaw/agents/main/sessions/sessions.json" --enforce --fix-missing

And doctor archived the orphan transcript file.

#

3) Stale agent dir (codex)

Problem: doctor found an agent dir on disk with no matching agents.list entry.

Fix:

  • moved the stale agent dir out of /data/.openclaw/agents/

That warning is gone.


4) Lingering BOOTSTRAP.md

Problem: the agent was still showing as bootstrapping, which was suspicious.

Fix:

  • moved /data/workspace/BOOTSTRAP.md aside as a backup

That was worth doing, though it turned out not to be the main cause of the reply failures.


5) lossless-claw regression on 2026.4.14

Problem: the bot was failing with:

Context engine "lossless-claw" factory returned an invalid ContextEngine...

Cause: known regression in 2026.4.14.

Fix / workaround:

openclaw plugins disable lossless-claw
openclaw config unset plugins.slots.contextEngine

Then later re-enable after the plugin update.


6) Real root cause of the broken / slow replies

Problem: the process was crashing with:

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

This was the big one.

#

Fix:

NODE_OPTIONS=--max-old-space-size=4096

That increased the Node heap and stopped it from dying around ~2 GB.


Current log notes

From the latest logs:

Harmless / informational

  • menu text exceeded ... shortening descriptions
    → harmless
  • compaction safeguard ... no real conversation messages to summarize
    → informational / not scary

Mildly notable

  • usage.cost and sessions.usage took ~69 seconds
    → slow, but they completed, so not fatal
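
To spot slow control-UI calls like these in a saved log, a small awk filter is enough. Treat it as a sketch: slow_ws_calls is a made-up name, the 5000 ms default threshold is arbitrary, and it assumes the [ws] line layout seen in the logs above, where the duration field comes right after the method name.

```shell
# Hypothetical helper: list [ws] responses slower than a threshold.
# slow_ws_calls <logfile> [threshold-ms]
slow_ws_calls() {
  local logfile="$1" threshold_ms="${2:-5000}"
  awk -v limit="$threshold_ms" '
    /\[ws\]/ {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^[0-9]+ms$/) {          # duration field, e.g. 68828ms
          ms = $i
          sub(/ms$/, "", ms)
          if (ms + 0 > limit) print $(i - 1), ms "ms"   # method name, duration
        }
      }
    }' "$logfile"
}

# Usage (the path is an assumption):
#   slow_ws_calls /tmp/openclaw.log 10000
```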

Most important

  • no fresh fatal OOM in the snippet you pasted after setting NODE_OPTIONS
    → that’s a good sign ✅

What Iโ€™d call the current state

Fixed / improved

  • main state path confusion mostly sorted
  • LCM path fixed
  • session cleanup done
  • stale agent dir cleaned up
  • OOM crash addressed
  • lossless-claw regression identified and worked around

Possibly still open

  • ~/.openclaw warning from QQ Bot/plugin behavior
  • Telegram group/topic reply behavior may still be a separate Telegram delivery/routing issue

If this fixed the main problem, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it for future users.

digital sundial
#

re-enable the topic acp thing? what is even acp and what are the pros and cons?

worthy gobletBOT
#

Yeah, you can re-enable it, but only if you actually want what it does.

What ACP is

ACP = Agent Client Protocol

In plain English, it’s the “persistent agent session” / harness layer, where OpenClaw can bind a conversation to its own longer-lived agent session instead of treating every exchange as just the default main flow.

For Telegram topics, spawnAcpSessions basically means:

“let each topic spin up / bind to its own dedicated session”

Pros of turning it on

  • better per-topic separation
  • cleaner topic-specific memory/history
  • useful for longer-running topic conversations
  • better if you intentionally want each topic to behave like its own “mini agent thread”

Cons

  • more moving parts
  • more routing/session complexity
  • more chances for topic/thread-specific bugs
  • harder to debug when Telegram topic delivery is already acting weird

My recommendation

Since disabling it didn’t fix the group problem:

  • if you want the feature, re-enable it
  • if you don’t care about per-topic ACP behavior, leaving it off is simpler

So I’d say:

  • want cleaner per-topic persistent sessions? turn it back on
  • just want stability and simplicity? leave it off for now

Re-enable command

openclaw config set channels.telegram.threadBindings.spawnAcpSessions true --strict-json

Then restart the app/container.

My honest gut: if you’re not sure why you need it, it’s fine to leave it off until everything else feels stable.

digital sundial
#

is it off or on by default?

worthy gobletBOT
#

Effectively off / opt-in.

The docs phrase it as:

requires channels.telegram.threadBindings.spawnAcpSessions=true

which is basically “this feature only turns on when you explicitly set it to true.”

So I’d treat the default as:

  • off / unset unless you deliberately enabled it.
digital sundial
#

how did it turn on then?

and are you SURE its not on by default for TELEGRAM topics??

worthy gobletBOT
#

You’re right to push on that, I shouldn’t overstate it.

I’m sure it is explicitly set to true in your current config.

I am not 100% sure whether Telegram treats it as default-on in some setup path, because the Telegram docs snippet is less explicit than the Discord one.

So the honest answer is:

  • current fact: your config has it turned on
  • how it got there: could have been
    • onboarding/setup
    • a copied config/example
    • a previous manual change
    • some earlier config generation step
  • default for Telegram topics specifically: I’m not certain enough to swear it’s off by default

What I can say confidently is that it is not just magically active without the config value, because your config literally contains:

"threadBindings": {
  "spawnAcpSessions": true
}

So something set it at some point.

digital sundial
#

will it use more or less tokens overall if its off or on?

worthy gobletBOT
#

There isn’t a guaranteed always-lower answer, but roughly:

  • OFF can be a bit simpler, less session machinery
  • ON can sometimes be more efficient if it keeps each topic’s context cleaner and more isolated

So the tradeoff is:

spawnAcpSessions = true

  • may use more overhead because it’s doing extra session binding/spawning
  • but can use fewer tokens per topic over time if each topic stays nicely separated instead of dragging unrelated context around

spawnAcpSessions = false

  • simpler
  • fewer moving parts
  • but topic conversations may share more of the normal/default session flow, which can be less clean

Short version

  • not a huge “on = expensive / off = cheap” switch
  • ON may cost a little more operationally
  • OFF may be simpler
  • if you have lots of long topic-specific conversations, ON can actually help token efficiency

So I’d choose it based more on stability + behavior than raw token cost.

digital sundial
#

when its off, is each topic chat always in context even when talking in another chat?

worthy gobletBOT
#

No, not like one giant shared soup.

Turning spawnAcpSessions off does not mean every topic is automatically always in context everywhere else.

It mainly means:

  • topics use the normal/default session flow
  • instead of getting a separate ACP-bound spawned session

So the important distinction is:

  • off ≠ “all topic chats are merged into one universal context”
  • on = stronger per-topic dedicated ACP session behavior

So no, I would not assume that talking in Topic A makes Topic A always fully in-context when you talk in Topic B.

digital sundial
#

what is acp?

worthy gobletBOT
#

ACP = Agent Client Protocol.

In plain English, it’s the layer OpenClaw uses to talk to persistent agent sessions / external agent runtimes, instead of only doing everything through the basic built-in reply flow.

Think of it like:

  • normal mode = “answer this message”
  • ACP mode = “bind this chat/thread/topic to its own ongoing agent session”

That’s why ACP shows up around things like:

  • thread/topic-bound sessions
  • persistent agent conversations
  • Codex / Claude Code / similar harness-style flows
digital sundial
#

what do i lose not having this turned on?

#

@worthy goblet did you forget me?

#

can topics still send requests and "talk" to each other with this off?

worthy gobletBOT
#

Mostly, you lose the dedicated per-topic ACP session behavior.

So with it off, you’re giving up things like:

  • a topic automatically getting its own ACP-bound persistent session
  • cleaner one-topic = one agent session behavior
  • easier long-running topic-specific workflows
  • ACP-style thread/topic continuity

What you do not lose

For normal “bot reply to messages” behavior, you don’t lose much.
The bot can still reply, keep context, and work normally.

Short version

If you’re not intentionally using topics as separate long-lived agent threads, you’re probably not losing anything important.

So for most people:

  • on = nicer advanced topic/session behavior
  • off = simpler, still fine for normal chatting
#

Nope, I’m here.

I think your earlier message got effectively duplicated in the thread and I answered the first copy, not this ping. Sorry, that’s on me.

Short answer again: with ACP topic spawn off, you mainly lose dedicated per-topic persistent ACP sessions, not normal basic bot replies.

#

Not by default, no.

Turning ACP topic spawn off does not make topics “talk to each other”.

So:

  • Topic A does not normally send requests into Topic B
  • topics are not doing cross-topic agent handoffs automatically
  • there isn’t some built-in “all topics collaborate together” behavior just because ACP is off

If you wanted one topic/agent/session to explicitly message another, that would need a deliberate cross-session/tool flow, not normal topic chat behavior.

digital sundial
#

so topics talking to each other can happen with or without this acp business?

worthy gobletBOT
#

Yes, that’s a separate concern.

Whether topics can “talk to each other” is not really determined by spawnAcpSessions.

That setting is about:

  • how a topic gets bound to its own session

not:

  • whether topics are allowed to cross-message each other

So, in principle:

  • cross-topic interaction could exist with ACP on
  • or with ACP off
  • but it would require some explicit routing/agent behavior, not just this toggle alone
digital sundial
#

what would you recommend?

worthy gobletBOT
#

I’d recommend leaving it off for now.

Why:

  • you’ve already had enough moving parts
  • it did not fix the actual reply problem
  • your biggest real issue was the Node OOM
  • if your goal is just “Telegram bot works reliably in topics”, ACP topic spawn is extra complexity you probably don’t need yet

My practical recommendation

For now

  • keep channels.telegram.threadBindings.spawnAcpSessions=false

Later, only if you specifically want it

Turn it on if you want:

  • each topic to behave like its own dedicated persistent agent thread
  • cleaner per-topic long-running workflows
  • more advanced ACP-style session behavior

So my honest answer is:

  • stability-first: leave it off
  • advanced topic-agent behavior later: turn it on once everything is boring and stable
digital sundial
#

also can you tell me again how to re-enable lcm?

worthy gobletBOT
#

Yep, to re-enable LCM / Lossless Claw:

openclaw plugins enable lossless-claw
openclaw config set plugins.slots.contextEngine lossless-claw

Then restart/redeploy the app.

Also keep this env var set

LCM_DATABASE_PATH=/data/.openclaw/lcm.db

One warning

I’d only re-enable it if:

  • you’ve rolled back to a version where it works, or
  • the upstream 2026.4.14 regression is fixed

Because otherwise you can end up right back at the same context-engine error.
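
Since the original split-state warning came from the LCM DB living under ~/.openclaw, it is also worth checking that LCM_DATABASE_PATH still points inside the active state dir before you re-enable the plugin. A minimal sketch, with a made-up function name:

```shell
# Hypothetical check: is LCM_DATABASE_PATH set and located under the given state dir?
# lcm_path_ok <state-dir>
lcm_path_ok() {
  local state_dir="$1" db="${LCM_DATABASE_PATH:-}"
  if [ -z "$db" ]; then
    echo "LCM_DATABASE_PATH is not set" >&2
    return 1
  fi
  case "$db" in
    "$state_dir"/*) return 0 ;;
    *) echo "LCM DB ($db) is outside $state_dir" >&2; return 1 ;;
  esac
}

# Usage:
#   lcm_path_ok /data/.openclaw && echo "LCM DB path is inside the active state dir"
```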

digital sundial
#

How to enable and disable strict mode for openai agents?

worthy gobletBOT
#

Yep, the knob is:

agents.defaults.embeddedPi.executionContract

Enable strict mode for OpenAI GPT-5-family agents

openclaw config set agents.defaults.embeddedPi.executionContract '"strict-agentic"' --strict-json

Disable it

openclaw config set agents.defaults.embeddedPi.executionContract '"default"' --strict-json
digital sundial
#

And what about enabling and disabling the codex harness? Any gotchas when enabling disabling? Around memory or sessions?

Check GitHub too

Also, should strict mode AND codex harness be enabled together or just one? What are pros and cons?

worthy gobletBOT
#

Yeah, a few important distinctions here.

1) Codex harness on/off

Enable native Codex harness

This means: use the bundled Codex app-server harness, not PI.

openclaw plugins enable codex
openclaw config set agents.defaults.model '"codex/gpt-5.4"' --strict-json
openclaw config set agents.defaults.embeddedHarness.runtime '"codex"' --strict-json
openclaw config set agents.defaults.embeddedHarness.fallback '"none"' --strict-json

Then restart/redeploy.

Disable native Codex harness

#

Go back to PI:

openclaw config set agents.defaults.model '"openai/gpt-5.4"' --strict-json
openclaw config set agents.defaults.embeddedHarness.runtime '"pi"' --strict-json
openclaw config set agents.defaults.embeddedHarness.fallback '"pi"' --strict-json

Then restart/redeploy.

If you want Codex OAuth, but NOT the Codex harness

Use the OpenAI Codex OAuth provider through PI:

openclaw config set agents.defaults.model '"openai-codex/gpt-5.4"' --strict-json
openclaw config set agents.defaults.embeddedHarness.runtime '"pi"' --strict-json
openclaw config set agents.defaults.embeddedHarness.fallback '"pi"' --strict-json
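
All three variants above set the same three keys, so if you expect to switch back and forth, a tiny wrapper keeps them in sync. This is only a convenience sketch: set_harness and DRY_RUN are names I made up, and it just replays the same openclaw config set commands shown above.

```shell
# Hypothetical wrapper: set model, harness runtime, and fallback together.
# set_harness <model> <runtime> <fallback>
# With DRY_RUN=1 the commands are printed instead of executed.
set_harness() {
  local run=""
  [ "${DRY_RUN:-0}" = "1" ] && run="echo"
  $run openclaw config set agents.defaults.model "\"$1\"" --strict-json &&
  $run openclaw config set agents.defaults.embeddedHarness.runtime "\"$2\"" --strict-json &&
  $run openclaw config set agents.defaults.embeddedHarness.fallback "\"$3\"" --strict-json
}

# Usage:
#   DRY_RUN=1 set_harness codex/gpt-5.4 codex none      # preview: native Codex harness
#   set_harness openai/gpt-5.4 pi pi                     # back to PI
#   set_harness openai-codex/gpt-5.4 pi pi               # Codex OAuth through PI
```

It does not run openclaw plugins enable codex for you, and you still need to restart/redeploy afterwards.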

2) Gotchas when enabling/disabling Codex harness

Sessions

This is the big one.

The docs say Codex harness uses a native Codex sidecar thread, while OpenClaw still keeps its own transcript mirror.

So when switching around:

  • /new gives you a fresh OpenClaw session
  • /reset clears the OpenClaw binding for that thread
  • switching model/harness does not mean old native thread state magically never existed

So if behavior feels weird after switching, I’d do a fresh session.

Memory

I would not assume Codex harness itself was the cause of your earlier OOM.

But it is another moving part, and GitHub currently has ongoing Codex harness stability work.


3) What GitHub currently suggests

From docs + GitHub search:

  • #66251 tracks Codex harness stability work
    Search snippet mentions:
    • sessions failing to start through the expected harness
    • stale native context after reset
    • misleading context usage after compaction

  • #65568 is about a persistent Codex ACP session resume failure
    That’s ACP-specific, not the normal embedded harness path.

  • #65983 is about background PTY exec orphan trees consuming memory after restarts
    That is more about background Codex/CLI trees, not the embedded Codex harness itself.

  • #66708 is about Active Memory slowdown/terminated runs on Codex/OpenAI-Codex setups
    Separate from Codex harness.

So yes, there are some real session-ish gotchas around Codex right now.


#

4) Strict mode vs Codex harness

These are different knobs.

Strict mode

agents.defaults.embeddedPi.executionContract = "strict-agentic"

This is the GPT-5/OpenAI/OpenAI-Codex PI execution behavior knob.

Pros:

  • pushes GPT-5 to act instead of just plan
  • retries plan-only stalls
  • auto-enables update_plan
  • explicit blocked state instead of fake progress

Cons:

  • can feel stricter / more forceful
  • may spend a bit more effort/tokens on retries
  • it is documented for openai/* and openai-codex/* GPT-5-family runs

Codex harness

This switches the low-level executor to the native Codex app-server.

Pros:

  • native Codex execution
  • native thread resume
  • native compaction
  • Codex-managed model discovery/session behavior

Cons:

  • more moving parts
  • extra session complexity
  • current stability work / known rough edges

5) Should you enable both?

I would not think of them as “both or one” for the same job.

My recommendation:

If you want stability and simple behavior

Use:

  • openai/gpt-5.4 or openai-codex/gpt-5.4
  • strict-agentic ON
  • Codex harness OFF

If you specifically want native Codex app-server behavior

Use:

  • codex/gpt-5.4
  • Codex harness ON
  • don’t rely on strict-agentic as the main differentiator there

My honest recommendation for you

Given your recent memory/session weirdness, I’d choose:

Best default

  • strict-agentic ON
  • Codex harness OFF

That gives you better GPT-5 behavior without adding the Codex native-thread machinery.

If later you specifically want native Codex threads/resume/compaction, then test Codex harness separately.


digital sundial
#

openclaw config set agents.defaults.embeddedPi.executionContract '"strict-agentic"' --strict-json wont break anything for other models right? if im using openai default and using glm for other things? also what about codex harness?

worthy gobletBOT
#

Right, this one is pretty safe.

Short answer

Yes, that global setting should be safe for your other models.

Why:

  • strict-agentic is a GPT-5 OpenAI / OpenAI-Codex embedded-agent contract
  • on unsupported lanes, it is supposed to effectively collapse back to default behavior
  • it does not change your image/video/TTS/PDF model settings

So if:

  • your main/default agent is openai/gpt-5.4
  • but you use GLM for other stuff
#

that should be fine.

Important nuance

It affects the embedded chat agent runtime, not every model use everywhere.

So:

  • OpenAI GPT-5 agent turns → strict-agentic applies
  • GLM or other unsupported providers/models → should just behave normally
  • media tools / separate generation models → not what this knob is for
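
As a rough mental model only (this is not OpenClaw's actual code; the function name and the "gpt-5" prefix test are assumptions made for illustration), the lane check described above could be sketched like this:

```python
# Toy model of which model refs the strict-agentic contract is described as applying to.
# The provider set comes from the discussion above; the "gpt-5" prefix test is an assumption.
STRICT_PROVIDERS = {"openai", "openai-codex"}


def strict_agentic_applies(model_ref: str) -> bool:
    """Return True if the ref looks like an OpenAI / OpenAI-Codex GPT-5-family model."""
    provider, _, model = model_ref.partition("/")
    return provider in STRICT_PROVIDERS and model.startswith("gpt-5")


for ref in ("openai/gpt-5.4", "openai-codex/gpt-5.4", "zai/glm-5.1"):
    print(ref, strict_agentic_applies(ref))
```

Anything that returns False here is the "should just behave normally" case.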

What about Codex harness?

That’s separate.

Setting:

#
openclaw config set agents.defaults.embeddedPi.executionContract '"strict-agentic"' --strict-json

does not enable Codex harness.

And if you later switch to native codex/* harness mode, I would not rely on strict-agentic as the important knob there. Codex harness is its own low-level executor path.

My recommendation

For your setup, this is the sane combo:

  • strict-agentic ON
  • Codex harness OFF
  • keep using openai/gpt-5.4 as main if that’s your stable lane

That gives you the better GPT-5 behavior without dragging in Codex native harness complexity.

digital sundial
#

i mean what if using codex harness and glm? problems?

worthy gobletBOT
#

Ah, got it. Then the answer is:

No inherent problem, if configured correctly

You can absolutely have:

  • Codex harness for codex/*
  • GLM for glm/*

Safe config shape

Use:

  • embeddedHarness.runtime: "auto"
  • embeddedHarness.fallback: "pi"

That way:

#
  • codex/* → uses Codex harness
  • glm/* → falls back to normal PI path

Where it breaks

If you force:

embeddedHarness: {
  runtime: "codex",
  fallback: "none",
}

globally, then GLM is a problem, because Codex harness only claims codex/*.

So in that setup:

#
  • codex/* works
  • glm/* can fail instead of running normally

Practical rule

Good

  • Codex harness = auto
  • fallback = pi

Risky / codex-only

  • runtime = codex
  • fallback = none

So if you want both Codex and GLM in one deployment, I’d use auto + pi fallback.
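
To make the difference concrete, here is a small sketch of the routing rule as described in this thread. It is a toy model, not OpenClaw's implementation: `pick_executor` is a hypothetical name, and the behavior for combinations not discussed above (for example runtime "codex" with fallback "pi") is an assumption.

```python
# Toy model of embedded harness selection; not OpenClaw's real implementation.
def pick_executor(model_ref: str, runtime: str, fallback: str) -> str:
    """Return which executor would handle model_ref under the rules described above."""
    codex_claims = model_ref.startswith("codex/")  # Codex harness only claims codex/*
    if runtime in ("auto", "codex") and codex_claims:
        return "codex"
    if fallback == "pi":
        return "pi"  # unclaimed models fall back to the normal PI path
    raise RuntimeError(
        f"no executor for {model_ref} (runtime={runtime}, fallback={fallback})"
    )


print(pick_executor("codex/gpt-5.4", "auto", "pi"))
print(pick_executor("zai/glm-5.1", "auto", "pi"))
try:
    pick_executor("zai/glm-5.1", "codex", "none")
except RuntimeError as err:
    print("fails:", err)
```

The last call is the "codex-only" setup: nothing claims the GLM ref and there is no fallback, so the run fails.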

digital sundial
#

how to add z.ai + gml 5.1 as a model? i think most of my stuff will use this and only some things will use openai but should i still leave openai as the default? and tell me how to add glm 5.1 model and api keys

worthy gobletBOT
#

Yep. Use zai as the provider, and the model ref is:

zai/glm-5.1

Not z.ai/glm-5.1 in config, even though aliases normalize.

Easiest setup

1) Add the Z.AI key

In Coolify env:

ZAI_API_KEY=your_zai_key_here
#

If you still want OpenAI available too, also keep:

OPENAI_API_KEY=your_openai_key_here

2) Set GLM 5.1 as default

openclaw config set agents.defaults.model.primary "zai/glm-5.1"

3) Optional, keep OpenAI as fallback

openclaw config set agents.defaults.model.fallbacks '["openai/gpt-5.4"]' --strict-json

4) Verify

#
openclaw models list --provider zai

Then restart/redeploy.


Should GLM be the default?

Given what you said, yes, probably:

  • if most of your stuff will use GLM 5.1
  • and only some things need OpenAI

then I’d set:

#
  • default: zai/glm-5.1
  • fallback or manual switch: openai/gpt-5.4

That’s the cleanest setup.

When I would keep OpenAI as default instead

I’d keep OpenAI default only if:

  • your most important chats depend on the stronger GPT-5 behavior
  • you specifically want strict-agentic to be active on your main lane
  • you trust OpenAI more for your primary agent tasks

Because if GLM becomes the default, your strict-agentic OpenAI setting mostly stops mattering unless you switch back to OpenAI.


#

Best practical recommendation for you

If your goal is “mostly GLM, sometimes OpenAI”, I’d do:

ZAI_API_KEY=...
OPENAI_API_KEY=...
openclaw config set agents.defaults.model.primary "zai/glm-5.1"
openclaw config set agents.defaults.model.fallbacks '["openai/gpt-5.4"]' --strict-json


digital sundial
#

When I would keep OpenAI as default instead
I’d keep OpenAI default only if:

your most important chats depend on the stronger GPT-5 behavior
you specifically want strict-agentic to be active on your main lane
you trust OpenAI more for your primary agent tasks

^ i think yes this for now

worthy gobletBOT
#

Yeah, then I’d keep OpenAI as the default for now.

That sounds like the right call for your setup.

Recommended setup

  • default: openai/gpt-5.4
  • also configured: zai/glm-5.1
  • use GLM when you want it, but don’t make it the main lane yet

Practical config

Keep these env vars in Coolify:

OPENAI_API_KEY=...
#
ZAI_API_KEY=...

Keep default model as OpenAI:

openclaw config set agents.defaults.model.primary "openai/gpt-5.4"

If you want, add GLM as an easy alias:

openclaw config set 'agents.defaults.models["zai/glm-5.1"].alias' '"glm"' --strict-json

Then you can switch to it more easily when needed.

#

My recommendation

For now:

  • OpenAI default
  • strict-agentic on
  • GLM available as secondary
  • Codex harness off unless you specifically need it

That’s a nice stable setup.

digital sundial
#

wait is the only thing i need to do add the zai key and tell openclaw to use the gml-5.1 model for {task}?? i don't have to do any terminal stuff??

worthy gobletBOT
#

Pretty much, yes.

And small typo, it’s glm-5.1, not gml-5.1.

If you only want to use GLM occasionally

You basically just need:

  1. add ZAI_API_KEY in Coolify
  2. restart/redeploy
  3. tell OpenClaw to use zai/glm-5.1 for the session/task

So no, there is not some huge extra terminal setup beyond that.

The only reason to do terminal config

Only if you want one of these:

#
  • make zai/glm-5.1 your default
  • add a permanent alias like glm
  • add OpenAI fallback config

If you just want “have GLM available when I want it”, then yes, it’s basically:

  • add key
  • restart
  • use zai/glm-5.1

So the short answer is:

yes, mostly just add the ZAI key and use the model when you want it.

digital sundial
#

does glm5.1 have thinking?

worthy gobletBOT
#

Yes, but basically as binary thinking.

For Z.AI / GLM, OpenClaw docs say:

  • thinking: off
  • anything else = on

So unlike OpenAI/Anthropic, itโ€™s not really a rich multi-level low/medium/high setup.

Practical meaning

With zai/glm-5.1:

  • thinking off → off
  • thinking on → on
  • thinking low/medium/high → effectively treated as on
#

So, yes, GLM 5.1 has thinking, just in a simpler on/off form.
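
If it helps, the binary mapping can be written down as a tiny function. This is only an illustration of the rule summarized above; `zai_effective_thinking` is a made-up name, not something from OpenClaw.

```python
def zai_effective_thinking(level: str) -> str:
    """Collapse a thinking level to the on/off behavior described for zai/* models."""
    # "off" stays off; any other level is treated as on.
    return "off" if level.strip().lower() == "off" else "on"


for level in ("off", "on", "low", "medium", "high"):
    print(level, "->", zai_effective_thinking(level))
```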

digital sundial
#

do you think i should just make glm-5.1 default for now? and will everything auto update? and if it sucks, i can move back to openai and only selectively use glm?

worthy gobletBOT
#

You can, and yes, you can always move back.

But based on everything you said, I’d not make GLM 5.1 the default just yet.

My recommendation

Safer path

  • keep OpenAI as default
  • add ZAI / GLM 5.1
  • try GLM selectively for a bit

That gives you a clean A/B test without changing your main lane.

If you do make GLM default

Yes, you can switch it with one config change, and yes, you can switch back later.

#

But I would not think of it as “everything instantly auto-updates everywhere forever”.

More like:

  • new/future runs will use the new default
  • existing sessions may still carry their current selected model/session behavior
  • safest way to feel the new default cleanly is with a new session / fresh chat after the change

If GLM sucks

Yep, totally. You can just move back:

openclaw config set agents.defaults.model.primary "openai/gpt-5.4"

and then use GLM only when you want.

#

Honest take

Since you already said your important chats still lean OpenAI, I’d do this:

  • today: keep OpenAI default
  • test GLM selectively
  • later: if GLM feels consistently good, then flip the default

That’s the lowest-risk move.

digital sundial
#

if i just change the default model, how long before the sessions update?

worthy gobletBOT
#

There usually isn’t a long timer or delayed rollout.

Practical answer

New sessions

They’ll use the new default right away after the config change and restart/redeploy.

Existing sessions

That depends:

  • if the session is just using the agent default, it should pick up the new default on the next turn
  • if the session already has a session-scoped /model selection, that override sticks until you change it

Safest mental model

  • new session/chat: definitely gets the new default
  • existing session: may keep its own selected model if it was explicitly switched before
#

So if you want a clean guaranteed switch, do one of these:

  • start a new session
  • or explicitly send /model <provider/model> in the existing session

So the short answer is:

new sessions: immediately
old sessions: immediately only if they weren’t already pinned to another model
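
Put as a sketch (a simplified model of the behavior described here, with hypothetical names, not OpenClaw's session code): a session-scoped override wins if one exists, otherwise the current default is used on the next turn.

```python
from typing import Optional


def effective_model(default_primary: str, session_override: Optional[str]) -> str:
    """A pinned /model selection wins; otherwise the session follows the default."""
    return session_override or default_primary


# A session with no override follows the default as soon as it changes.
print(effective_model("openai/gpt-5.4", None))
print(effective_model("zai/glm-5.1", None))

# A session pinned earlier via /model keeps its model until you change it.
print(effective_model("zai/glm-5.1", "openai/gpt-5.4"))
```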

digital sundial
#

it just seems easier to set the new default model or no? so i can see how it does

worthy gobletBOT
#

Yeah, honestly, if your goal is just “live with it for a bit and see how it feels”, then setting the new default is the easiest way.

Two good options

Easiest broad test

Set zai/glm-5.1 as the default.

Pros:

  • you actually feel it everywhere
  • no need to keep remembering /model
  • fastest way to judge whether you like it

Cons:

  • it affects your general day-to-day behavior more

Safer A/B test

Keep OpenAI default and switch specific sessions with /model.

#

Pros:

  • less disruption
  • safer for important chats

Cons:

  • annoying
  • easy to forget
  • slower to get a real feel for it

My honest take

If you’re okay with a little churn, yes, setting GLM as the default for a while is the easiest test.

And since you can switch back fast, it’s not that scary.

So in your shoes, I’d:

#
  • flip default to GLM for a day or two
  • if it sucks, switch right back to OpenAI

That’s a perfectly reasonable way to test it.

digital sundial
#

how to make it always use thinking?

worthy gobletBOT
#

If you want it to default to thinking all the time, set a global thinking default.

Global default

openclaw config set agents.defaults.thinkingDefault '"high"' --strict-json

Then restart/redeploy.

Important note for GLM / Z.AI

For zai/*, thinking is basically binary:

  • off = off
  • low, medium, high, etc. = effectively on
#

So for GLM, these are basically equivalent:

"low"
"medium"
"high"

They all mean “thinking on”.

If you want a simpler version

You could also just use:

openclaw config set agents.defaults.thinkingDefault '"low"' --strict-json
#

That’s a nice sane default.

One nuance

Session-level /think ... can still override it.
So this is “always by default”, not “impossible to change.”

Docs: https://docs.openclaw.ai/tools/thinking

digital sundial
#

hey i tried asking the zai/glm-5.1 to create a sub agent and ask a question using openai/5.4 but it said its blocked, model not allowed and override rejected

worthy gobletBOT
#

Yep, that usually means one of two things:

  1. you asked for the wrong ref, like openai/5.4 instead of
    openai/gpt-5.4
  2. agents.defaults.models is set, so OpenClaw is treating it as a model allowlist, and openai/gpt-5.4 is not in it

Most likely fix

Make sure both models are allowed:

openclaw config set 'agents.defaults.models["zai/glm-5.1"]' '{}' --strict-json
openclaw config set 'agents.defaults.models["openai/gpt-5.4"]' '{}' --strict-json

If you want aliases too:

#
openclaw config set 'agents.defaults.models["zai/glm-5.1"]' '{"alias":"glm"}' --strict-json
openclaw config set 'agents.defaults.models["openai/gpt-5.4"]' '{"alias":"gpt"}' --strict-json

Then restart/redeploy and test again.

Important

Use this exact model ref:

openai/gpt-5.4

not:

#
openai/5.4

Why it happened

Docs say once agents.defaults.models exists, it becomes the allowlist for:

  • /model
  • session overrides
  • model overrides used by sub-agents / spawned runs

So if OpenAI isn’t listed there, override gets rejected.

If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so it gets indexed on Answer Overflow.

digital sundial
#

What I got:

I just retried right now, and it is still blocked at runtime.

• model: "openai/gpt-5.4" → model not allowed
• model: "gpt-5.4" → resolves to zai/gpt-5.4, also model not allowed

So I’m not refusing, the gateway policy is still rejecting the override in practice.
If you want, I can keep retrying immediately once you flip the allowlist.

worthy gobletBOT
#

Yep, that confirms it, the allowlist is still the blocker.

And this part is also telling:

"gpt-5.4" → resolves to zai/gpt-5.4

That means bare gpt-5.4 is not a safe ref in your current config. Since provider is omitted, OpenClaw is falling back to your current/default provider path.

Do this

1) Inspect the current allowlist

openclaw config get agents.defaults.models --json
#

2) Add the exact OpenAI model ref to the allowlist

openclaw config set 'agents.defaults.models["openai/gpt-5.4"]' '{"alias":"gpt"}' --strict-json

3) Make sure GLM is also present

openclaw config set 'agents.defaults.models["zai/glm-5.1"]' '{"alias":"glm"}' --strict-json

4) Retry using the full explicit ref

Use:

openai/gpt-5.4
#

not:

gpt-5.4

Important nuance

  • gpt-5.4 without a provider is ambiguous in your setup
  • openai/gpt-5.4 is the correct explicit override
  • if agents.defaults.models exists, that model must be listed there or it gets blocked
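
One way to picture why the bare ref went wrong: the sketch below reproduces the observed result by borrowing the provider from the current default. This is a guess that happens to match the output in this thread, not OpenClaw's documented resolution logic, and `resolve_ref` is a hypothetical name.

```python
def resolve_ref(ref: str, default_primary: str) -> str:
    """Guess at bare-ref resolution: borrow the provider from the current default."""
    if "/" in ref:
        return ref  # already explicit, nothing to guess
    provider = default_primary.split("/", 1)[0]
    return f"{provider}/{ref}"


print(resolve_ref("gpt-5.4", "zai/glm-5.1"))
print(resolve_ref("openai/gpt-5.4", "zai/glm-5.1"))
```

With zai/glm-5.1 as the default, the bare ref picks up the zai provider, which is why the explicit openai/gpt-5.4 form is the safe one.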

If you want, paste the output of:

openclaw config get agents.defaults.models --json
#

and I’ll tell you exactly what’s wrong in one shot.

digital sundial
#

how possible? openai/gpt-5.3 was the default model before this so why not allowed now?

worthy gobletBOT
#

Because those are two separate things:

  1. what used to be the default
  2. what is allowed right now in agents.defaults.models

Once agents.defaults.models exists, it becomes the active allowlist.

So it’s totally possible that:

  • openai/gpt-5.3 used to be your default earlier
  • then later you changed config / onboarded Z.AI / rewrote model config
  • and now the allowlist no longer includes OpenAI models

Also, exact model refs matter:

  • openai/gpt-5.3
  • openai/gpt-5.4
#

are different entries. Allowing one does not automatically allow the other.

Most likely explanation

At some point your config ended up with an allowlist that effectively only contains your current Z.AI entries, so runtime overrides to OpenAI are now blocked.

Best way to prove it

Run:

openclaw config get agents.defaults.models --json

That will show the actual current allowlist, which is what matters now, not what used to be the default.

digital sundial
#

give me the cmd to paste that will get you the part of the config that you need to verify

worthy gobletBOT
#

Yep, paste this:

echo '--- agents.defaults.model ---'; openclaw config get agents.defaults.model --json; echo; echo '--- agents.defaults.models ---'; openclaw config get agents.defaults.models --json

That gives me the two pieces I care about:

  • your current default model
  • the current model allowlist

Paste the output here and I’ll tell you exactly what’s wrong.

digital sundial
#

openclaw config get agents.defaults.models --json
{
"openai/gpt-5.2": {
"alias": "gpt"
},
"openai/gpt-5.1-codex": {
"alias": "GPT"
},
"openai/gpt-5.3-codex": {}
}
echo '--- agents.defaults.model ---'; openclaw config get agents.defaults.model --json; echo; echo '--- agents.defaults.models ---'; openclaw config get agents.defaults.models --json
--- agents.defaults.model ---
{
"primary": "zai/glm-5.1"
}

--- agents.defaults.models ---
{
"openai/gpt-5.2": {
"alias": "gpt"
},
"openai/gpt-5.1-codex": {
"alias": "GPT"
},
"openai/gpt-5.3-codex": {}
}

worthy gobletBOT
#

Ah, there it is. Your config is stale/inconsistent.

What’s wrong

Your current default is:

"primary": "zai/glm-5.1"

But your allowlist is only:

{
  "openai/gpt-5.2": { "alias": "gpt" },
  "openai/gpt-5.1-codex": { "alias": "GPT" },
#
  "openai/gpt-5.3-codex": {}
}

So:

  • zai/glm-5.1 is not in the allowlist
  • openai/gpt-5.4 is not in the allowlist
  • those gpt-5.1-codex / gpt-5.3-codex refs look like old stale entries

Quick fix, just add the models you actually want

Paste these:

#
openclaw config set 'agents.defaults.models["zai/glm-5.1"]' '{"alias":"glm"}' --strict-json
openclaw config set 'agents.defaults.models["openai/gpt-5.4"]' '{"alias":"gpt"}' --strict-json

Then retry the override using:

openai/gpt-5.4

Cleaner fix, replace the stale allowlist entirely

If you only care about GLM + OpenAI right now, I’d honestly do this instead:

#
openclaw config set agents.defaults.models '{"zai/glm-5.1":{"alias":"glm"},"openai/gpt-5.4":{"alias":"gpt"}}' --strict-json

That wipes the old weird entries and leaves you with a clean allowlist.

My recommendation

I’d use the cleaner fix.

Your current agents.defaults.models is clearly leftover old stuff, and it’s exactly why the runtime override is being blocked.
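
Here is a quick sketch of the allowlist check against the config you pasted. It is a simplified model of the documented behavior, not OpenClaw's code; `is_allowed` is a hypothetical helper, and treating a missing allowlist as "no restriction" is an assumption.

```python
import json

# agents.defaults.models exactly as pasted above
allowlist = json.loads("""
{
  "openai/gpt-5.2": {"alias": "gpt"},
  "openai/gpt-5.1-codex": {"alias": "GPT"},
  "openai/gpt-5.3-codex": {}
}
""")


def is_allowed(model_ref, models):
    """Exact-match check of a model ref against the allowlist keys."""
    # Assumption: no agents.defaults.models at all (None) means no restriction.
    if models is None:
        return True
    return model_ref in models


for ref in ("zai/glm-5.1", "openai/gpt-5.4", "openai/gpt-5.2"):
    print(ref, "allowed" if is_allowed(ref, allowlist) else "blocked")
```

Both refs you were trying to use come out blocked, because matching is by exact ref and neither one is a key in that object.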

digital sundial
#

wait you recommend two defaults? i dont understand how the quick fix is different from the cleaner fix

i plan to change default models every now and then for experiments btw like i mentioned earlier so if the glm sucks ill move back to using openai as default but i still want to be able to say "hey use openai/glm for {task}"

worthy gobletBOT
#

Ah, yeah, sorry, I blurred default and allowlist. You only have one default.

The two different things

1) Default model

This is just:

agents.defaults.model.primary

That is the one model OpenClaw uses by default.

Examples:

#
openclaw config set agents.defaults.model.primary "zai/glm-5.1"

or later:

openclaw config set agents.defaults.model.primary "openai/gpt-5.4"

2) Allowed models

This is:

agents.defaults.models
#

That is not the default.
It is the list of models OpenClaw is allowed to switch/override to.

So think of it like:

  • primary = current default
  • models = menu of allowed choices

Quick fix vs cleaner fix

Quick fix

Add the missing models to the current menu, but keep the old stale junk too.

Cleaner fix

Replace the whole menu with just the models you actually want.

#

What you want

You said:

  • sometimes GLM default
  • sometimes OpenAI default
  • but always be able to say “use GLM for this” or “use OpenAI for this”

So the right setup is:

Allowed menu

Keep both in agents.defaults.models

Current default

Flip only agents.defaults.model.primary when you want to experiment

#

What I would set

One-time cleanup of the allowlist/menu

openclaw config set agents.defaults.models '{"zai/glm-5.1":{"alias":"glm"},"openai/gpt-5.4":{"alias":"gpt"}}' --strict-json

Then change default whenever you want

Make GLM default

openclaw config set agents.defaults.model.primary "zai/glm-5.1"

Make OpenAI default

#
openclaw config set agents.defaults.model.primary "openai/gpt-5.4"

That way you do not keep editing the allowlist every time. You set it once, then just flip the default back and forth.

One correction

There is no openai/glm.

Use:

  • zai/glm-5.1
  • openai/gpt-5.4

That’s the clean setup I’d use.

digital sundial
#

and i also want to keep openai/gpt-5.2 and openai/gpt-5.3-codex

worthy gobletBOT
#

Perfect, then just make the allowlist include all five models you want to keep around.

Paste this:

openclaw config set agents.defaults.models '{"zai/glm-5.1":{"alias":"glm"},"openai/gpt-5.4":{"alias":"gpt"},"openai/gpt-5.2":{"alias":"gpt52"},"openai/gpt-5.3-codex":{"alias":"codex53"},"openai/gpt-5.1-codex":{"alias":"codex51"}}' --strict-json

Then your defaults are easy to flip

GLM default

openclaw config set agents.defaults.model.primary "zai/glm-5.1"

OpenAI GPT-5.4 default

#
openclaw config set agents.defaults.model.primary "openai/gpt-5.4"

Result

You’ll be able to:

  • keep experimenting with the default
  • still explicitly use:
    • zai/glm-5.1
    • openai/gpt-5.4
    • openai/gpt-5.2
    • openai/gpt-5.3-codex
    • openai/gpt-5.1-codex

That’s the clean setup for the way you want to work.
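
If hand-quoting that JSON gets error-prone, one option is to generate the command line instead of typing it. The sketch below only builds and prints the same `openclaw config set ... --strict-json` command used above; it does not run anything, and the aliases are just the ones suggested in this thread.

```python
import json
import shlex

# The allowlist you want, as a normal Python dict.
models = {
    "zai/glm-5.1": {"alias": "glm"},
    "openai/gpt-5.4": {"alias": "gpt"},
    "openai/gpt-5.2": {"alias": "gpt52"},
    "openai/gpt-5.3-codex": {"alias": "codex53"},
    "openai/gpt-5.1-codex": {"alias": "codex51"},
}

# Compact JSON, then shell-quote it so the braces and double quotes survive the shell.
payload = json.dumps(models, separators=(",", ":"))
cmd = f"openclaw config set agents.defaults.models {shlex.quote(payload)} --strict-json"
print(cmd)
```

Copy the printed line into your terminal. Editing the dict is less fiddly than editing a long single-quoted JSON string by hand.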

digital sundial
#

without the acp thing for each telegram topic, can i have different topics be different models?

worthy gobletBOT
#

Yes, usually.

ACP-per-topic is not required just to have different models in different Telegram topics.

Practical version

If topics are being treated as separate normal sessions, you can do:

  • in Topic A: /model zai/glm-5.1
  • in Topic B: /model openai/gpt-5.4

and those model choices should stick per topic/session.

What ACP would add

ACP is more about:

#
  • dedicated persistent agent sessions
  • cleaner topic-bound session behavior
  • more advanced thread/topic binding

not basic “can Topic A use one model and Topic B use another”.

One caveat

If you ever notice model choice bleeding across topics, then that means the sessions aren’t being separated the way you expect, and ACP-per-topic can help make that cleaner.

But in principle:

yes, different Telegram topics can use different models even with ACP topic spawn off.