#Gateway not starting after updating to OpenClaw 2026.4.26

dusty schooner
#

I just updated to the latest version. I'm running OC in a Docker container with LM Studio on the host. The gateway seems to have broken; I've tried many different things, but it won't connect.

openclaw gateway probe

🦞 OpenClaw 2026.4.26 (unknown) — Welcome to the command line: where dreams compile and confidence segfaults.

│
◇
Gateway Status
Reachable: no
Capability: unknown
Probe budget: 3000ms

Discovery (this machine)
Found 0 gateways via Bonjour (local.)
Tip: if the gateway is remote, mDNS won’t cross networks; use Wide-Area Bonjour (split DNS) or SSH tunnels.

Targets
Local loopback ws://127.0.0.1:18789
Connect: failed - connect ECONNREFUSED 127.0.0.1:18789 · Capability: unknown

mental emberBOT
#

ECONNREFUSED 127.0.0.1:18789 means the host CLI can’t reach any gateway listener: usually the container is not running, crashed during startup, or the port is not published. LM Studio config may be related, but first we need the container logs.

From the OpenClaw repo / compose directory, can you paste:

docker compose ps
docker compose logs --tail=200 openclaw-gateway
curl -fsS http://127.0.0.1:18789/healthz || true

If you’re using custom Compose, double-check these too:

ports:
  - "18789:18789"
extra_hosts:
  - "host.docker.internal:host-gateway" # important on Linux
environment:
  OPENCLAW_DISABLE_BONJOUR: "1"

And for LM Studio from Docker, don’t use container loopback:

baseUrl: "http://host.docker.internal:1234/v1"

LM Studio also needs to listen beyond host-only loopback:

lms server start --port 1234 --bind 0.0.0.0
dusty schooner
#

I am on a Mac

mental emberBOT
#

On Mac, the LM Studio bit is simpler: Docker Desktop already gives you host.docker.internal, so you usually don’t need the Linux extra_hosts line.

But your current error is earlier than LM Studio: ECONNREFUSED on :18789 means the gateway container isn’t listening, isn’t published, or has crashed.

Please paste these from the folder with your docker-compose.yml:

docker compose ps
docker compose logs --tail=200 openclaw-gateway
curl -i http://127.0.0.1:18789/healthz

Mac-specific checks:

docker compose port openclaw-gateway 18789
#

Expected: it should show something like 0.0.0.0:18789 or 127.0.0.1:18789.

For LM Studio inside Docker config, use:

http://host.docker.internal:1234/v1

not 127.0.0.1, because inside the container that means the container itself.

Docs: https://docs.openclaw.ai/install/docker

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose ps
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS
openclaw-openclaw-cli-1 openclaw:local "node dist/index.js" openclaw-cli 21 seconds ago Up 19 seconds (health: starting)
openclaw-openclaw-gateway-1 openclaw:local "docker-entrypoint.s…" openclaw-gateway 21 seconds ago Up 20 seconds (health: starting) 127.0.0.1:18789-18790->18789-18790/tcp
soham@Sohams-MBP openclaw % docker compose logs --tail=200 openclaw-gateway
openclaw-gateway-1 | 2026-04-27T16:07:09.119+05:30 [gateway] loading configuration…
openclaw-gateway-1 | 2026-04-27T16:07:09.831+05:30 [gateway] resolving authentication…
openclaw-gateway-1 | 2026-04-27T16:07:09.834+05:30 [gateway] starting...
openclaw-gateway-1 | 2026-04-27T16:07:28.165+05:30 [gateway] starting HTTP server...
openclaw-gateway-1 | 2026-04-27T16:07:28.169+05:30 [canvas] host mounted at http://0.0.0.0:18789/__openclaw__/canvas/ (root /home/node/.openclaw/canvas)
openclaw-gateway-1 | 2026-04-27T16:07:28.171+05:30 [gateway] ⚠️ Gateway is binding to a non-loopback address. Ensure authentication is configured before exposing to public networks.
openclaw-gateway-1 | 2026-04-27T16:07:28.188+05:30 [health-monitor] started (interval: 300s, startup-grace: 60s, channel-connect-grace: 120s)
soham@Sohams-MBP openclaw % curl -i http://127.0.0.1:18789/healthz
HTTP/1.1 200 OK
X-Content-Type-Options: nosniff
Referrer-Policy: no-referrer
Permissions-Policy: camera=(), microphone=(self), geolocation=()
Content-Type: application/json; charset=utf-8
Cache-Control: no-store
Date: Mon, 27 Apr 2026 10:38:04 GMT
Connection: keep-alive
Keep-Alive: timeout=5
Content-Length: 27

{"ok":true,"status":"live"}%

mental emberBOT
#

Good, this is progress: the gateway is running now. healthz returning {"ok":true,"status":"live"} means the earlier ECONNREFUSED was likely from probing during the ~19s startup window, before the gateway bound the port.

Try these now:

curl -i http://127.0.0.1:18789/readyz
openclaw gateway probe

If the host CLI still says unreachable, compare with the container CLI:

docker compose run --rm openclaw-cli gateway probe
docker compose run --rm openclaw-cli status --all

Also give Docker another minute: your ps shows (health: starting), which may just mean the container healthcheck hasn’t flipped yet.
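
To avoid probing during the startup window, one option is to wait for healthz to answer before running the probe. A minimal sketch, assuming a POSIX shell; `wait_until` is a hypothetical helper (not an OpenClaw or Docker command), and the attempt count and delay are arbitrary:

```shell
# wait_until: hypothetical helper that re-runs a command until it succeeds
# or the attempts run out. Not an OpenClaw or Docker command.
# Usage: wait_until <attempts> <delay_seconds> <command...>
wait_until() {
  wu_attempts=$1
  wu_delay=$2
  shift 2
  wu_i=1
  while [ "$wu_i" -le "$wu_attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$wu_i" -lt "$wu_attempts" ]; then
      sleep "$wu_delay"
    fi
    wu_i=$((wu_i + 1))
  done
  return 1
}

# Example (commented out because it needs a running gateway):
# wait_until 30 2 curl -fsS http://127.0.0.1:18789/healthz \
#   && docker compose run --rm openclaw-cli gateway probe
```

If the helper returns non-zero, the gateway never answered within the window, and the container logs are the next thing to check.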

#

For the browser/control UI, open:

http://127.0.0.1:18789/

and use the gateway token from your .env if prompted.

If readyz fails while healthz passes, paste that output plus:

docker compose logs --tail=300 openclaw-gateway
dusty schooner
#

node@0680945fa423:/app$ curl -i http://127.0.0.1:18789/readyz
HTTP/1.1 200 OK
X-Content-Type-Options: nosniff
Referrer-Policy: no-referrer
Permissions-Policy: camera=(), microphone=(self), geolocation=()
Content-Type: application/json; charset=utf-8
Cache-Control: no-store
Date: Mon, 27 Apr 2026 10:41:30 GMT
Connection: keep-alive
Keep-Alive: timeout=5
Content-Length: 45

{"ready":true,"failing":[],"uptimeMs":241341}node@0680945fa423:/app$ openclaw gateway probe

🦞 OpenClaw 2026.4.26 (unknown) — Powered by open source, sustained by spite and good documentation.

│
◇
Gateway Status
Reachable: no
Capability: unknown
Probe budget: 3000ms

Discovery (this machine)
Found 0 gateways via Bonjour (local.)
Tip: if the gateway is remote, mDNS won’t cross networks; use Wide-Area Bonjour (split DNS) or SSH tunnels.

Targets
Local loopback ws://127.0.0.1:18789
Connect: ok (49ms) · Capability: admin-capable · Read probe: failed - timeout

mental emberBOT
#

That output is actually a different failure now: WebSocket connect/auth works:

Connect: ok · Capability: admin-capable
Read probe: failed - timeout

So the gateway is alive, but one of the detail RPCs that gateway probe runs is taking longer than the default 3s probe budget. Try first:

openclaw gateway probe --timeout 30000
openclaw status --all --timeout 30000

If that still times out, isolate which RPC is hanging:

#
openclaw gateway call health --timeout 30000 --json
openclaw gateway call status --timeout 30000 --json
openclaw gateway call system-presence --timeout 30000 --json
openclaw gateway call config.get --params '{}' --timeout 30000 --json

Don’t paste the full JSON publicly if it includes config-ish stuff; just say which command times out and paste the gateway log lines around it:

docker compose logs --tail=300 openclaw-gateway

My hunch: this is not “gateway won’t start” anymore; it’s either a slow/hung health/status/config read path in 2026.4.26, or something in model/provider/plugin discovery blocking that read probe. The readyz result is good.

dusty schooner
#

The read probe is sometimes OK, sometimes fails...

node@0680945fa423:/app$ openclaw gateway probe --timeout 30000

🦞 OpenClaw 2026.4.26 (unknown) — Open source means you can see exactly how I judge your config.

│
◇
Gateway Status
Reachable: yes
Capability: admin-capable
Probe budget: 30000ms

Discovery (this machine)
Found 0 gateways via Bonjour (local.)
Tip: if the gateway is remote, mDNS won’t cross networks; use Wide-Area Bonjour (split DNS) or SSH tunnels.

Targets
Local loopback ws://127.0.0.1:18789
Connect: ok (27ms) · Capability: admin-capable · Read probe: ok
Gateway: 0680945fa423 (172.18.0.2) · linux 6.12.76-linuxkit · app 2026.4.26
Wide-area discovery: unknown

node@0680945fa423:/app$ openclaw status --all --timeout 30000

🦞 OpenClaw 2026.4.26 (unknown) — Automation with claws: minimal fuss, maximal pinch.

│
◐ Checking local state….
What's next:
Debug this Compose error with Gordon → docker ai "help me fix this compose error"
soham@Sohams-MBP openclaw % openclaw gateway probe --timeout 30000
zsh: command not found: openclaw
soham@Sohams-MBP openclaw % enteropenclaw
node@0680945fa423:/app$ openclaw gateway probe --timeout 30000

🦞 OpenClaw 2026.4.26 (unknown) — I autocomplete your thoughts—just slower and with more API calls.

│
◇
Gateway Status
Reachable: no
Capability: unknown
Probe budget: 30000ms

Discovery (this machine)
Found 0 gateways via Bonjour (local.)
Tip: if the gateway is remote, mDNS won’t cross networks; use Wide-Area Bonjour (split DNS) or SSH tunnels.

Targets
Local loopback ws://127.0.0.1:18789
Connect: ok (2604ms) · Capability: admin-capable · Read probe: failed - timeout

mental emberBOT
#

Yep. If the read probe is sometimes OK and sometimes timing out even with 30s, this is no longer a port/Docker bind issue. The gateway accepts the WS connection and auth, then the detail RPCs intermittently stall.

Also: zsh: command not found: openclaw on the Mac host is fine for Docker installs. Use the compose CLI instead of host openclaw, e.g.:

docker compose run --rm openclaw-cli gateway probe --timeout 30000

Next isolate the hanging RPC from inside the container / compose CLI:

for m in health status system-presence config.get; do
  echo "=== $m ==="
  openclaw gateway call "$m" --params '{}' --timeout 30000 --json >/tmp/openclaw-$m.json \
    && echo OK \
    || echo FAILED
done

Then paste only which one says FAILED, plus:

docker compose logs --tail=300 openclaw-gateway

If it’s health or status, that points at a runtime/diagnostics/status path hang. If it’s config.get, likely plugin/config schema loading. If all four pass individually but gateway probe flakes, that’s useful too: probably a probe concurrency/regression in 2026.4.26.

Docs for the Docker flow: https://docs.openclaw.ai/install/docker

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose run --rm openclaw-cli gateway probe --timeout 30000
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Running 0.0s
Container openclaw-openclaw-cli-run-f124d3e931ea Creating
Container openclaw-openclaw-cli-run-f124d3e931ea Created

🦞 OpenClaw 2026.4.26 (unknown) — Hot reload for config, cold sweat for deploys.

│
◇
Gateway Status
Reachable: no
Capability: unknown
Probe budget: 30000ms

Discovery (this machine)
Found 0 gateways via Bonjour (local.)
Tip: if the gateway is remote, mDNS won’t cross networks; use Wide-Area Bonjour (split DNS) or SSH tunnels.

Targets
Local loopback ws://127.0.0.1:18789
Connect: failed - timeout · Capability: unknown

node@0680945fa423:/app$ for m in health status system-presence config.get; do
  echo "=== $m ==="
  openclaw gateway call "$m" --params '{}' --timeout 30000 --json >/tmp/openclaw-$m.json \
    && echo OK \
    || echo FAILED
done
=== health ===
Gateway call failed: Error: gateway timeout after 30000ms
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/node/.openclaw/openclaw.json
Bind: lan
FAILED
=== status ===
Gateway call failed: Error: gateway timeout after 30000ms
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/node/.openclaw/openclaw.json
Bind: lan
FAILED
=== system-presence ===
Gateway call failed: Error: gateway timeout after 30000ms
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/node/.openclaw/openclaw.json
Bind: lan
FAILED
=== config.get ===
Gateway call failed: Error: gateway timeout after 30000ms
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/node/.openclaw/openclaw.json
Bind: lan
FAILED

mental emberBOT
#

That’s the important clue: even system-presence and config.get time out. Those should be cheap. So this is very likely the gateway process getting wedged, not LM Studio and not Docker port publishing.

Do this while it’s in the bad state, before restarting:

curl -i http://127.0.0.1:18789/healthz
curl -i http://127.0.0.1:18789/readyz
docker stats --no-stream openclaw-openclaw-gateway-1
docker compose logs --tail=500 openclaw-gateway
docker compose exec openclaw-gateway sh -lc 'tail -n 300 "$(ls -t /tmp/openclaw/openclaw-*.log | head -1)"'

Then safe recovery:

docker compose restart openclaw-gateway
sleep 10
docker compose run -T --rm openclaw-cli gateway probe --timeout 30000

If it comes back and then wedges again, I’d treat this as a 2026.4.26 regression and file it with the outputs above: https://github.com/openclaw/openclaw/issues

One small hygiene thing: use run -T --rm for one-off Docker CLI commands:

docker compose run -T --rm openclaw-cli status --all --timeout 30000

The -T avoids weird interactive TTY behavior while debugging.

dusty schooner
#

openclaw-gateway-1 | 2026-04-27T16:35:04.454+05:30 [telegram] deleteWebhook failed: Network request for 'deleteWebhook' failed!
openclaw-gateway-1 | 2026-04-27T16:35:04.455+05:30 [telegram] webhook cleanup failed: Network request for 'deleteWebhook' failed!; retrying in 2.18s.
openclaw-gateway-1 | 2026-04-27T16:35:05.199+05:30 [ws] ⇄ res ✓ commands.list 740ms conn=b7faddf6…763b id=c545630c…7600
openclaw-gateway-1 | 2026-04-27T16:35:05.202+05:30 WhatsApp session logged out. Run: openclaw channels login
openclaw-gateway-1 | 2026-04-27T16:35:05.203+05:30 [whatsapp] [default] channel exited: {"error":{"data":{"reason":"401","location":"lla"},"isBoom":true,"isServer":false,"output":{"statusCode":401,"payload":{"statusCode":401,"error":"Unauthorized","message":"Connection Failure"},"headers":{}}},"date":"2026-04-27T11:05:05.201Z"}
openclaw-gateway-1 | 2026-04-27T16:35:05.204+05:30 [whatsapp] [default] auto-restart attempt 1/10 in 5s

openclaw-gateway-1 | 2026-04-27T16:35:26.863+05:30 [telegram] deleteWebhook failed: Network request for 'deleteWebhook' failed!
openclaw-gateway-1 | 2026-04-27T16:35:26.864+05:30 [telegram] webhook cleanup failed: Network request for 'deleteWebhook' failed!; retrying in 4.14s.
openclaw-gateway-1 | 2026-04-27T16:35:26.870+05:30 [ws] closed before connect conn=4e576ba2-782b-4713-9634-7efab7079f92 peer=127.0.0.1:51782->127.0.0.1:18789 remote=127.0.0.1 fwd=n/a origin=n/a host=127.0.0.1:18789 ua=n/a code=1005 reason=n/a

mental emberBOT
#

Those logs point at channel startup loops now:

  • Telegram can’t reach deleteWebhook
  • WhatsApp is logged out / 401 and auto-restarting

That shouldn’t wedge all gateway RPC, but it’s a strong isolation target.

Do a clean “gateway only, no channels” test:

cat > docker-compose.skip-channels.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
YAML
#

docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml up -d --force-recreate openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If that’s stable, disable the broken channels persistently for now:

docker compose run -T --rm openclaw-cli config set channels.telegram.enabled false --strict-json
docker compose run -T --rm openclaw-cli config set channels.whatsapp.enabled false --strict-json
docker compose up -d --force-recreate openclaw-gateway

Then later fix/re-enable one at a time. WhatsApp needs relogin:

#
docker compose run -T --rm openclaw-cli channels login

For Telegram, first check the container can reach Telegram at all:

docker compose exec openclaw-gateway node -e "fetch('https://api.telegram.org').then(r=>console.log(r.status)).catch(e=>console.error(e.message))"

If skip-channels makes RPC stable, please file this as a 2026.4.26 bug: channel failures shouldn’t take down or time out config.get and system-presence. https://github.com/openclaw/openclaw/issues

dusty schooner
#

openclaw-gateway-1 | 2026-04-27T16:38:28.481+05:30 [whatsapp] [default] auto-restart attempt 1/10 in 5s
openclaw-gateway-1 | 2026-04-27T16:38:34.272+05:30 [whatsapp] [default] starting provider ()
openclaw-gateway-1 | 2026-04-27T16:38:39.595+05:30 WhatsApp session logged out. Run: openclaw channels login
openclaw-gateway-1 | 2026-04-27T16:38:39.599+05:30 [whatsapp] [default] channel exited: {"error":{"data":{"reason":"401","location":"rva"},"isBoom":true,"isServer":false,"output":{"statusCode":401,"payload":{"statusCode":401,"error":"Unauthorized","message":"Connection Failure"},"headers":{}}},"date":"2026-04-27T11:08:39.594Z"}
openclaw-gateway-1 | 2026-04-27T16:38:39.600+05:30 [whatsapp] [default] auto-restart attempt 2/10 in 11s
openclaw-gateway-1 | 2026-04-27T16:38:50.802+05:30 [whatsapp] [default] starting provider ()
openclaw-gateway-1 | 2026-04-27T16:38:55.079+05:30 WhatsApp session logged out. Run: openclaw channels login
openclaw-gateway-1 | 2026-04-27T16:38:55.083+05:30 [whatsapp] [default] channel exited: {"error":{"data":{"reason":"401","location":"vll"},"isBoom":true,"isServer":false,"output":{"statusCode":401,"payload":{"statusCode":401,"error":"Unauthorized","message":"Connection Failure"},"headers":{}}},"date":"2026-04-27T11:08:55.079Z"}
openclaw-gateway-1 | 2026-04-27T16:38:55.084+05:30 [whatsapp] [default] auto-restart attempt 3/10 in 21s
openclaw-gateway-1 | 2026-04-27T16:39:16.880+05:30 [whatsapp] [default] starting provider ()
openclaw-gateway-1 | 2026-04-27T16:39:18.623+05:30 WhatsApp session logged out. Run: openclaw channels login
openclaw-gateway-1 | 2026-04-27T16:39:18.625+05:30 [whatsapp] [default] channel exited: {"error":{"data":{"reason":"401","location":"cco"},"isBoom":true,"isServer":false,"output":{"statusCode":401,"payload":{"statusCode":401,"error":"Unauthorized","message":"Connection Failure"},"headers":{}}},"date":"2026-04-27T11:09:18.623Z"}
openclaw-gateway-1 | [whatsapp] [default] auto-restart attempt 4/10 in 44s

mental emberBOT
#

Yep, that’s a clear bad WhatsApp auth loop:

WhatsApp session logged out
401 Unauthorized
auto-restart attempt ...

Do one of these:

Fast recovery / isolate:

docker compose run -T --rm openclaw-cli config set channels.whatsapp.enabled false --strict-json
docker compose run -T --rm openclaw-cli config set channels.telegram.enabled false --strict-json
docker compose restart openclaw-gateway
docker compose run -T --rm openclaw-cli gateway probe --timeout 30000

If that stabilizes, re-enable channels one at a time.

For WhatsApp, relogin after re-enabling:

docker compose run -T --rm openclaw-cli config set channels.whatsapp.enabled true --strict-json
docker compose run -T --rm openclaw-cli channels login

For Telegram, first verify container network:

#
docker compose exec openclaw-gateway node -e "fetch('https://api.telegram.org').then(r=>console.log(r.status)).catch(e=>console.error(e.message))"

Important bit: WhatsApp being logged out is normal and fixable, but it shouldn’t make basic gateway RPCs time out. If disabling channels makes it stable, that’s good evidence for a 2026.4.26 bug/regression.

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose run -T --rm openclaw-cli config set channels.whatsapp.enabled false --strict-json
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Running 0.0s
Container openclaw-openclaw-cli-run-d785b2b677bb Creating
Container openclaw-openclaw-cli-run-d785b2b677bb Created
Config overwrite: /home/node/.openclaw/openclaw.json (sha256 2edf8aabb21ccfcc18408baf18472fecade32f8c4aa086805ba305f1032680a6 -> ff8fea9b02ac3389a6c1d7eba5b8017b72cceaf10d83a39f9ba327723165a29c, backup=/home/node/.openclaw/openclaw.json.bak)
Updated channels.whatsapp.enabled. Restart the gateway to apply.
soham@Sohams-MBP openclaw % docker compose run -T --rm openclaw-cli gateway probe --timeout 30000
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Running 0.0s
Container openclaw-openclaw-cli-run-8e39d27e524d Creating
Container openclaw-openclaw-cli-run-8e39d27e524d Created
Gateway Status
Reachable: no
Capability: unknown
Probe budget: 30000ms

Discovery (this machine)
Found 0 gateways via Bonjour (local.)
Tip: if the gateway is remote, mDNS won’t cross networks; use Wide-Area Bonjour (split DNS) or SSH tunnels.

Targets
Local loopback ws://127.0.0.1:18789
Connect: failed - timeout · Capability: unknown

mental emberBOT
#

You’re one step short: that config change does not apply until restart. The output says:

Updated channels.whatsapp.enabled. Restart the gateway to apply.

Do this now:

docker compose run -T --rm openclaw-cli config set channels.telegram.enabled false --strict-json
docker compose restart openclaw-gateway
sleep 20
docker compose run -T --rm openclaw-cli gateway probe --timeout 30000

If restart doesn’t clear the wedged process:

#
docker compose up -d --force-recreate openclaw-gateway
sleep 20
docker compose run -T --rm openclaw-cli gateway probe --timeout 30000
docker compose logs --tail=200 openclaw-gateway

The key is: don’t probe the currently running container after changing config; it’s still the old process with WhatsApp/Telegram active until restarted.

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose restart openclaw-gateway
sleep 20
docker compose run -T --rm openclaw-cli gateway probe --timeout 30000
[+] restart 0/1
⠼ Container openclaw-openclaw-gateway-1 Restarting 1.4s
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Running 0.0s
Container openclaw-openclaw-cli-run-8826ec18dfa6 Creating
Container openclaw-openclaw-cli-run-8826ec18dfa6 Created
Gateway Status
Reachable: no
Capability: unknown
Probe budget: 30000ms

Discovery (this machine)
Found 0 gateways via Bonjour (local.)
Tip: if the gateway is remote, mDNS won’t cross networks; use Wide-Area Bonjour (split DNS) or SSH tunnels.

Targets
Local loopback ws://127.0.0.1:18789
Connect: failed - connect ECONNREFUSED 127.0.0.1:18789 · Capability: unknown

mental emberBOT
#

Now we’re back to the gateway not listening at all after restart (ECONNREFUSED). We need the fresh startup logs; they’ll tell us where it stops.

Please run:

docker compose ps
docker compose logs --tail=250 openclaw-gateway
curl -i http://127.0.0.1:18789/healthz || true

Also, since only WhatsApp was disabled, try the stronger isolation with env-level channel skip:

cat > docker-compose.skip-channels.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
YAML

docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml logs --tail=200 openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If that still gives ECONNREFUSED, then this is not WhatsApp/Telegram anymore; it’s hanging/crashing before HTTP bind, and the logs --tail=250 output is the thing we need.

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose ps
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS
openclaw-openclaw-cli-1 openclaw:local "node dist/index.js" openclaw-cli 20 minutes ago Up 20 minutes (healthy)
openclaw-openclaw-gateway-1 openclaw:local "docker-entrypoint.s…" openclaw-gateway 20 minutes ago Up About a minute (unhealthy) 127.0.0.1:18789-18790->18789-18790/tcp

soham@Sohams-MBP openclaw % docker compose logs --tail=250 openclaw-gateway
openclaw-gateway-1 | 2026-04-27T16:35:04.454+05:30 [telegram] deleteWebhook failed: Network request for 'deleteWebhook' failed!
openclaw-gateway-1 | 2026-04-27T16:35:04.455+05:30 [telegram] webhook cleanup failed: Network request for 'deleteWebhook' failed!; retrying in 2.18s.
openclaw-gateway-1 | 2026-04-27T16:35:05.199+05:30 [ws] ⇄ res ✓ commands.list 740ms conn=b7faddf6…763b id=c545630c…7600
openclaw-gateway-1 | 2026-04-27T16:35:05.202+05:30 WhatsApp session logged out. Run: openclaw channels login
openclaw-gateway-1 | 2026-04-27T16:35:05.203+05:30 [whatsapp] [default] channel exited: {"error":{"data":{"reason":"401","location":"lla"},"isBoom":true,"isServer":false,"output":{"statusCode":401,"payload":{"statusCode":401,"error":"Unauthorized","message":"Connection Failure"},"headers":{}}},"date":"2026-04-27T11:05:05.201Z"}

mental emberBOT
#

That ps is the key now:

openclaw-gateway ... Up About a minute (unhealthy)

and the logs are old: there is no fresh loading configuration… line after the restart. So the container is up, but the gateway is not reaching HTTP bind.

Please do the skip-channels force-recreate exactly now:

cat > docker-compose.skip-channels.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
YAML

docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml ps
docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml logs --tail=200 openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If it’s still unhealthy/ECONNREFUSED, grab Docker’s healthcheck failure:

docker inspect openclaw-openclaw-gateway-1 \
  --format '{{json .State.Health}}'
#

At that point I’d also stop trying to nurse this container and either roll back to 2026.4.25/2026.4.24 or file the issue with these logs. This smells like a .26 startup wedge, especially since basic RPCs were timing out before.

dusty schooner
mental emberBOT
#

Quick privacy note: those logs include your WhatsApp phone number. If this is public, I’d delete/edit that message and redact the number before filing an issue.
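
For redacting before sharing, a rough filter can mask number-like strings in saved logs. This is only a heuristic sketch: `redact_numbers` is a hypothetical helper, and it assumes the number shows up as an optional "+" followed by 10 to 15 digits, which may not match how it actually appears in your logs, so check the result by eye:

```shell
# redact_numbers: hypothetical filter that masks anything resembling a phone
# number (an optional "+" followed by 10 to 15 digits) in text read on stdin.
# Heuristic only: it can miss other formats and may mask unrelated long digit runs.
redact_numbers() {
  sed -E 's/\+?[0-9]{10,15}/[redacted-number]/g'
}

# Example (commented out because it needs the compose project):
# docker compose logs --tail=300 openclaw-gateway | redact_numbers > gateway-redacted.log
```

Timestamps in the log format shown in this thread are left alone, since none of their digit runs reaches 10 digits.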

Also those lines are still WhatsApp-startup logs, so the channel disable/skip either didn’t apply yet or these are older logs. Let’s force a clean no-channels boot and only look at fresh logs:

cat > docker-compose.skip-channels.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
YAML

docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml down
docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml ps
docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.skip-channels.yml run -T --rm openclaw-cli gateway probe --timeout 30000

Expected fresh log should include:

skipping channel start (OPENCLAW_SKIP_CHANNELS=1 ...)

If WhatsApp still starts after that, the override isn’t being used. If no channels start but gateway is still unhealthy/ECONNREFUSED, then it’s a .26 startup wedge before channel startup.

#

The new important bit is this fresh restart:

[gateway] loading configuration…
[gateway] resolving authentication…
[gateway] starting...

…and then no:

[gateway] starting HTTP server...

So it’s now wedged before HTTP bind. The older WhatsApp lines are from the previous process; after this restart it hasn’t even reached channel startup yet.
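
The same check can be scripted instead of eyeballed. A small sketch, assuming the HTTP start line keeps the wording seen in the earlier healthy boot; `reached_http_bind` is a hypothetical helper name, not an OpenClaw command:

```shell
# reached_http_bind: hypothetical helper. Reads gateway log text on stdin and
# succeeds only if the HTTP-server start line seen earlier in this thread is present.
reached_http_bind() {
  # -F treats the pattern as a fixed string, so the brackets are literal.
  grep -qF '[gateway] starting HTTP server'
}

# Example (commented out because it needs the compose project):
# if docker compose logs --since=2m openclaw-gateway | reached_http_bind; then
#   echo "reached HTTP bind"
# else
#   echo "stalled before HTTP bind"
# fi
```

Using --since keeps lines from the previous process out of the check, which is what made the earlier logs misleading.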

Run with startup tracing enabled:

#
cat > docker-compose.debug.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
YAML

docker compose -f docker-compose.yml -f docker-compose.debug.yml down
docker compose -f docker-compose.yml -f docker-compose.debug.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.debug.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.debug.yml ps
#

We’re looking for the last line like:

startup trace: ...

That will show which startup phase is hanging. If it still only reaches starting... with no trace after that, this is very likely a 2026.4.26 pre-bind startup regression.

Also: please redact/delete the log message with your phone number before filing/sharing publicly.

#

Empty reply from server means something accepted the TCP connection but didn’t produce a valid HTTP response. That fits the “process is wedged / half-started” state.

At this point don’t keep probing the normal container; get the startup trace:

cat > docker-compose.debug.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
YAML

docker compose -f docker-compose.yml -f docker-compose.debug.yml down
docker compose -f docker-compose.yml -f docker-compose.debug.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.debug.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.debug.yml ps

If that still stops after:

[gateway] starting...

then I’d roll back and file this as a .26 startup wedge. The strongest bug report is: Docker on macOS, 2026.4.26, hangs before HTTP bind even with OPENCLAW_SKIP_CHANNELS=1, healthz empty reply / probe timeout.

dusty schooner
#

one thing to note
soham@Sohams-MBP openclaw % docker compose down
[+] down 3/3
✔ Container openclaw-openclaw-cli-1 Removed 0.1s
✔ Container openclaw-openclaw-gateway-1 Removed 1.5s
✔ Network openclaw_default Removed

There used to be just 2 of these before... after updating, there’s a 3rd one, called Network openclaw_default.

mental emberBOT
#

That network line is normal Docker Compose behavior. Compose creates a project network like:

<folder>_default

So openclaw_default just means “the default network for this compose project.” It may have been hidden/not shown before depending on Compose version or whether the network was already reused, but it’s not the cause.

The weird part remains the gateway hanging before/around HTTP bind. Since you already did down, now bring it up with debug env:

cat > docker-compose.debug.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
YAML

docker compose -f docker-compose.yml -f docker-compose.debug.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.debug.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.debug.yml ps

That should tell us what phase .26 is hanging in.
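
To pull out just the last traced phase, something like this can help. It is a sketch only: `last_trace_phase` is a hypothetical helper, and it assumes trace lines look like `[gateway] startup trace: <phase> <duration> total=<total>`:

```shell
# last_trace_phase: hypothetical helper. Reads gateway logs on stdin and prints
# the phase name from the last "startup trace:" line, or nothing if there is none.
# Assumes the line shape "... startup trace: <phase> <duration> total=<total>".
last_trace_phase() {
  sed -n 's/.*startup trace: \([^ ]*\) .*/\1/p' | tail -n 1
}

# Example (commented out because it needs the compose project):
# docker compose -f docker-compose.yml -f docker-compose.debug.yml \
#   logs --since=2m openclaw-gateway | last_trace_phase
```

If each trace line is written when its phase finishes (an assumption, not something confirmed here), the hang is in whatever comes after the phase this prints.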

dusty schooner
#

soham@Sohams-MBP openclaw % cat > docker-compose.debug.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
YAML
soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.debug.yml up -d --force-recreate openclaw-gateway
[+] up 2/2
✔ Network openclaw_default Created 0.0s
✔ Container openclaw-openclaw-gateway-1 Started 0.2s
soham@Sohams-MBP openclaw % sleep 30

#

soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.debug.yml logs --since=2m --timestamps openclaw-gateway
openclaw-gateway-1 | 2026-04-27T11:31:37.164307004Z [gateway] startup trace: cli.server-import 0.6ms total=0.6ms
openclaw-gateway-1 | 2026-04-27T11:31:37.165927671Z 2026-04-27T17:01:37.165+05:30 [gateway] loading configuration…
openclaw-gateway-1 | 2026-04-27T11:31:37.539743463Z 2026-04-27T17:01:37.539+05:30 [gateway] startup trace: cli.config-load 372.4ms total=387.6ms
openclaw-gateway-1 | 2026-04-27T11:31:37.842577671Z 2026-04-27T17:01:37.842+05:30 [gateway] startup trace: cli.config-snapshot 301.5ms total=690.4ms
openclaw-gateway-1 | 2026-04-27T11:31:37.844543505Z 2026-04-27T17:01:37.844+05:30 [gateway] resolving authentication…
openclaw-gateway-1 | 2026-04-27T11:31:37.845684755Z 2026-04-27T17:01:37.845+05:30 [gateway] startup trace: cli.auth-resolve 0.3ms total=694.1ms
openclaw-gateway-1 | 2026-04-27T11:31:37.846295630Z 2026-04-27T17:01:37.846+05:30 [gateway] starting...
openclaw-gateway-1 | 2026-04-27T11:31:37.846974588Z 2026-04-27T17:01:37.846+05:30 [gateway] startup trace: cli.gateway-loop 1.4ms total=695.5ms
openclaw-gateway-1 | 2026-04-27T11:31:39.442121630Z 2026-04-27T17:01:39.441+05:30 [gateway] startup trace: config.snapshot 1296.8ms total=1296.9ms
soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.debug.yml ps
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS
openclaw-openclaw-gateway-1 openclaw:local "docker-entrypoint.s…" openclaw-gateway 52 seconds ago Up 52 seconds (health: starting) 127.0.0.1:18789-18790->18789-18790/tcp

mental emberBOT
#

Good — now run the two inspection commands:

docker compose -f docker-compose.yml -f docker-compose.debug.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.debug.yml ps

And optionally:

curl -i http://127.0.0.1:18789/healthz || true

Paste that output. The startup trace logs are the thing we need now.
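
If the fixed sleep 30 feels arbitrary, a small polling loop can stand in for it. This is only a sketch: wait_for_healthz is a hypothetical helper name, the attempt count and delay are arbitrary defaults, and the /healthz URL is the one from the curl command above.

```shell
# Hypothetical helper: poll a health URL until it answers or attempts run out.
# Usage: wait_for_healthz URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_healthz() {
  url=$1
  attempts=${2:-30}
  delay=${3:-1}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failure; --max-time: cap each single try
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no healthy response after $attempts attempt(s)" >&2
  return 1
}

# Example call (not executed here):
# wait_for_healthz http://127.0.0.1:18789/healthz 30 1
```

A non-zero return after all attempts is the same signal as the ECONNREFUSED probe: nothing is listening yet, so the logs are the next place to look.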

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.debug.yml logs --since=2m --timestamps openclaw-gateway
openclaw-gateway-1 | 2026-04-27T11:31:37.164307004Z [gateway] startup trace: cli.server-import 0.6ms total=0.6ms
openclaw-gateway-1 | 2026-04-27T11:31:37.165927671Z 2026-04-27T17:01:37.165+05:30 [gateway] loading configuration…
openclaw-gateway-1 | 2026-04-27T11:31:37.539743463Z 2026-04-27T17:01:37.539+05:30 [gateway] startup trace: cli.config-load 372.4ms total=387.6ms
openclaw-gateway-1 | 2026-04-27T11:31:37.842577671Z 2026-04-27T17:01:37.842+05:30 [gateway] startup trace: cli.config-snapshot 301.5ms total=690.4ms
openclaw-gateway-1 | 2026-04-27T11:31:37.844543505Z 2026-04-27T17:01:37.844+05:30 [gateway] resolving authentication…
openclaw-gateway-1 | 2026-04-27T11:31:37.845684755Z 2026-04-27T17:01:37.845+05:30 [gateway] startup trace: cli.auth-resolve 0.3ms total=694.1ms
openclaw-gateway-1 | 2026-04-27T11:31:37.846295630Z 2026-04-27T17:01:37.846+05:30 [gateway] starting...
openclaw-gateway-1 | 2026-04-27T11:31:37.846974588Z 2026-04-27T17:01:37.846+05:30 [gateway] startup trace: cli.gateway-loop 1.4ms total=695.5ms
openclaw-gateway-1 | 2026-04-27T11:31:39.442121630Z 2026-04-27T17:01:39.441+05:30 [gateway] startup trace: config.snapshot 1296.8ms total=1296.9ms
soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.debug.yml ps
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS
openclaw-openclaw-gateway-1 openclaw:local "docker-entrypoint.s…" openclaw-gateway 52 seconds ago Up 52 seconds (health: starting) 127.0.0.1:18789-18790->18789-18790/tcp

mental emberBOT
#

Nice, the trace isolates it:

startup trace: config.snapshot ...

…and then nothing. The next expected phase is config.auth, so this is hanging in gateway auth / runtime secret activation, before HTTP bind. With OPENCLAW_SKIP_CHANNELS=1, WhatsApp/Telegram are not the current blocker.
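
To pull the last phase out of a long log without eyeballing it, something like this should work. It is a rough sketch: last_trace_phase is a made-up helper name, and it assumes every trace line has the shape "startup trace: <phase> <duration> total=<duration>", as in the log you pasted.

```shell
# Print the last phase named in "startup trace:" lines read from stdin.
last_trace_phase() {
  sed -n 's/.*startup trace: \([^ ]*\) .*/\1/p' | tail -n 1
}

# Two sample lines copied from the log in this thread.
sample_log='openclaw-gateway-1 | 2026-04-27T11:31:37.846974588Z 2026-04-27T17:01:37.846+05:30 [gateway] startup trace: cli.gateway-loop 1.4ms total=695.5ms
openclaw-gateway-1 | 2026-04-27T11:31:39.442121630Z 2026-04-27T17:01:39.441+05:30 [gateway] startup trace: config.snapshot 1296.8ms total=1296.9ms'

printf '%s\n' "$sample_log" | last_trace_phase
# prints: config.snapshot
```

With real logs, pipe the docker compose logs output into last_trace_phase instead of the sample text.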

Please don’t paste config values, but run this to list only SecretRef paths:

docker compose -f docker-compose.yml -f docker-compose.debug.yml run -T --rm --entrypoint sh openclaw-cli -lc '
node --input-type=module <<'"'"'NODE'"'"'
import fs from "node:fs";
import JSON5 from "json5";
for (const file of ["/home/node/.openclaw/openclaw.json", "/home/node/.openclaw/auth-profiles.json"]) {
  if (!fs.existsSync(file)) continue;
  const data = JSON5.parse(fs.readFileSync(file, "utf8"));
  function walk(v, path=[]) {
    if (!v || typeof v !== "object") return;
    if (["env","file","exec"].includes(v.source) && typeof v.id === "string") {
      console.log(`${file}: ${path.join(".")} -> ${v.source}:${v.provider ?? "default"}:${v.id}`);
    }
    for (const [k,val] of Object.entries(v)) walk(val, [...path,k]);
  }
  walk(data);
}
NODE
'

If any line says exec:..., that’s my top suspect for a startup hang.

#

For recovery, I’d roll back to the last working image/version now, then file this as a .26 bug with the startup trace. Also please redact the phone number from earlier logs before sharing publicly.

#

Yep, same trace confirms it: hang is between config.snapshot and the next phase config.auth.

Please run the SecretRef path scanner I posted. It prints paths only, not secret values:

docker compose -f docker-compose.yml -f docker-compose.debug.yml run -T --rm --entrypoint sh openclaw-cli -lc '
node --input-type=module <<'"'"'NODE'"'"'
import fs from "node:fs";
import JSON5 from "json5";
for (const file of ["/home/node/.openclaw/openclaw.json", "/home/node/.openclaw/auth-profiles.json"]) {
  if (!fs.existsSync(file)) continue;
  const data = JSON5.parse(fs.readFileSync(file, "utf8"));
  function walk(v, path=[]) {
    if (!v || typeof v !== "object") return;
    if (["env","file","exec"].includes(v.source) && typeof v.id === "string") {
      console.log(`${file}: ${path.join(".")} -> ${v.source}:${v.provider ?? "default"}:${v.id}`);
    }
    for (const [k,val] of Object.entries(v)) walk(val, [...path,k]);
  }
  walk(data);
}
NODE
'

If that shows any exec: refs, temporarily move/replace those refs or roll back. If it shows only env:/file: refs, I’d stop debugging and file this with the trace: .26 hangs in startup config.auth even with OPENCLAW_SKIP_CHANNELS=1.

For getting unstuck fastest: use the previous working Docker image/tag and keep the config volume.

dusty schooner
#

got stuck saying quote>

soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.debug.yml run -T --rm --entrypoint sh openclaw-cli -lc '
node --input-type=module <<'"'"'NODE'"'"'
import fs from "node:fs";
import JSON5 from "json5";
for (const file of ["/home/node/.openclaw/openclaw.json", "/home/node/.openclaw/auth-profiles.json"]) {
quote>

mental emberBOT
#

Ah yep, zsh is waiting for the closing quote. Cancel that with Ctrl+C.

Use this simpler one-liner instead — less quoting hell:

docker compose -f docker-compose.yml -f docker-compose.debug.yml run -T --rm --entrypoint node openclaw-cli --input-type=module -e 'import fs from "node:fs"; import JSON5 from "json5"; for (const file of ["/home/node/.openclaw/openclaw.json","/home/node/.openclaw/auth-profiles.json"]) { if (!fs.existsSync(file)) continue; const data = JSON5.parse(fs.readFileSync(file,"utf8")); function walk(v,path=[]) { if (!v || typeof v !== "object") return; if (["env","file","exec"].includes(v.source) && typeof v.id === "string") console.log(`${file}: ${path.join(".")} -> ${v.source}:${v.provider ?? "default"}:${v.id}`); for (const [k,val] of Object.entries(v)) walk(val,[...path,k]); } walk(data); }'

If that still gets mangled, skip it — the startup trace is already enough to file/rollback.
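
For context on the quote> prompt: the multi-line version wraps the whole script in single quotes, so pasting only part of it leaves the quote open and zsh keeps waiting. The odd-looking '"'"' sequence in that command is the usual way to put a literal single quote inside a single-quoted string. A tiny demo, assuming nothing beyond a POSIX-style shell (the variable name inner is just for illustration):

```shell
# '"'"' = close the single-quoted string, add a double-quoted ', reopen it.
inner='<<'"'"'NODE'"'"''
printf '%s\n' "$inner"
# prints: <<'NODE'
```

So the inner shell ends up seeing a quoted heredoc marker, but only if the outer closing quote on the last line makes it into the paste.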

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.debug.yml run -T --rm --entrypoint node openclaw-cli --input-type=module -e 'import fs from "node:fs"; import JSON5 from "json5"; for (const file of ["/home/node/.openclaw/openclaw.json","/home/node/.openclaw/auth-profiles.json"]) { if (!fs.existsSync(file)) continue; const data = JSON5.parse(fs.readFileSync(file,"utf8")); function walk(v,path=[]) { if (!v || typeof v !== "object") return; if (["env","file","exec"].includes(v.source) && typeof v.id === "string") console.log(`${file}: ${path.join(".")} -> ${v.source}:${v.provider ?? "default"}:${v.id}`); for (const [k,val] of Object.entries(v)) walk(val,[...path,k]); } walk(data); }'
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Running 0.0s
Container openclaw-openclaw-cli-run-2fa1a8c749c6 Creating
Container openclaw-openclaw-cli-run-2fa1a8c749c6 Created

mental emberBOT
#

If it returned to the prompt with no output, that means the scanner found no SecretRefs in openclaw.json / auth-profiles.json.

If it’s still sitting after Created, hit Ctrl+C and use exec instead of run:

docker compose -f docker-compose.yml -f docker-compose.debug.yml exec -T openclaw-gateway node --input-type=module -e 'import fs from "node:fs"; import JSON5 from "json5"; for (const file of ["/home/node/.openclaw/openclaw.json","/home/node/.openclaw/auth-profiles.json"]) { if (!fs.existsSync(file)) continue; const data = JSON5.parse(fs.readFileSync(file,"utf8")); function walk(v,path=[]) { if (!v || typeof v !== "object") return; if (["env","file","exec"].includes(v.source) && typeof v.id === "string") console.log(`${file}: ${path.join(".")} -> ${v.source}:${v.provider ?? "default"}:${v.id}`); for (const [k,val] of Object.entries(v)) walk(val,[...path,k]); } walk(data); }'

But honestly, we already have enough signal:

  • Docker macOS
  • OpenClaw 2026.4.26
  • OPENCLAW_SKIP_CHANNELS=1
  • startup trace stops after config.snapshot
  • never reaches config.auth / HTTP bind

That’s a strong startup regression report. I’d roll back to the last working image/version now and file this with the trace.

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.debug.yml exec -T openclaw-gateway node --input-type=module -e 'import fs from "node:fs"; import JSON5 from "json5"; for (const file of ["/home/node/.openclaw/openclaw.json","/home/node/.openclaw/auth-profiles.json"]) { if (!fs.existsSync(file)) continue; const data = JSON5.parse(fs.readFileSync(file,"utf8")); function walk(v,path=[]) { if (!v || typeof v !== "object") return; if (["env","file","exec"].includes(v.source) && typeof v.id === "string") console.log(`${file}: ${path.join(".")} -> ${v.source}:${v.provider ?? "default"}:${v.id}`); for (const [k,val] of Object.entries(v)) walk(val,[...path,k]); } walk(data); }'
soham@Sohams-MBP openclaw %

mental emberBOT
#

Perfect — no output means no SecretRefs, so this is not an exec/env/file secret resolver hang.

At this point I’d call it:

Likely OpenClaw 2026.4.26 Docker/macOS startup regression.

Evidence:

OPENCLAW_SKIP_CHANNELS=1
OPENCLAW_NO_AUTO_UPDATE=1
startup trace reaches: config.snapshot
then never reaches: config.auth
never binds HTTP / healthz

Fastest recovery: roll back the image tag to the previous version you had working. In .env or shell:

#
export OPENCLAW_IMAGE=ghcr.io/openclaw/openclaw:<previous-working-version>
docker compose -f docker-compose.yml up -d --force-recreate openclaw-gateway

If you don’t remember, try the most recent prior tag you used successfully.

Please file this with the trace here: https://github.com/openclaw/openclaw/issues

Useful title:

2026.4.26 Docker on macOS hangs during gateway startup after config.snapshot

Include:

  • Docker Desktop on macOS
  • LM Studio on host
  • OPENCLAW_SKIP_CHANNELS=1
  • OPENCLAW_NO_AUTO_UPDATE=1
  • no SecretRefs found
  • startup trace stopping after config.snapshot

Also redact the earlier WhatsApp phone number from logs before posting.

dusty schooner
#

do you want to see my dockerfile and docker-compose once?

mental emberBOT
#

Yes — docker-compose first is useful. Dockerfile only if Compose doesn’t explain it.

Please paste redacted output of:

cat docker-compose.yml
cat docker-compose.debug.yml
cat .env

Redact tokens/API keys/cookies/phone numbers. Keep these visible if present:

  • OPENCLAW_IMAGE
  • OPENCLAW_CONFIG_DIR
  • OPENCLAW_WORKSPACE_DIR
  • ports
  • volumes
  • command:
  • entrypoint:
  • environment: keys, with secret values replaced by REDACTED

Also run:

docker compose -f docker-compose.yml -f docker-compose.debug.yml config

That merged config is often the most useful, because it shows what Docker is actually launching.

dusty schooner
#

this is impossible to send in one go.. sending in parts

mental emberBOT
#

Yep, send in parts 👍

Best order:

  1. docker compose -f docker-compose.yml -f docker-compose.debug.yml config — just the services: section if it’s huge
  2. .env redacted
  3. Dockerfile only if needed

Please redact secret values, but leave variable names visible.
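
If hand-redacting is error-prone, a filter along these lines can do a first pass over .env. Treat it as a sketch: redact_env is a hypothetical helper, the TOKEN/KEY/SECRET/PASSWORD name patterns are my guess at what counts as secret, and the sample values are invented. It will not catch phone numbers or secrets stored under other names, so still read the result before posting.

```shell
# Replace values of secret-looking variables with REDACTED, keeping names visible.
# Also handles commented-out assignments such as "# FOO_TOKEN=...".
redact_env() {
  sed -E 's/^(#?[[:space:]]*[A-Za-z0-9_]*(TOKEN|KEY|SECRET|PASSWORD)[A-Za-z0-9_]*)=.*/\1=REDACTED/'
}

# Invented sample input; only the variable names come from this thread.
sample_env='OPENCLAW_IMAGE=openclaw:local
OPENCLAW_GATEWAY_TOKEN=example-not-a-real-token
# OPENCLAW_GATEWAY_TOKEN=older-example-value
OPENCLAW_CONFIG_DIR=/Users/soham/.openclaw'

printf '%s\n' "$sample_env" | redact_env
```

On a real file that would be redact_env < .env, with the output pasted instead of the file itself.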

dusty schooner
#

pasting it into online txt and showing you. faster

mental emberBOT
#

That works — just make sure secrets are redacted before uploading.

If you send a link, I’ll look at:

  • merged Compose (docker compose ... config)
  • env vars / volumes / command / entrypoint
  • whether the debug overlay actually applies
  • any custom Dockerfile changes that could affect node, npm, dist/, or runtime deps

Don’t include raw tokens, cookies, API keys, or phone numbers.

dusty schooner
mental emberBOT
#

I looked — the big thing is: you’re not running the stock release image. You have:

OPENCLAW_IMAGE=openclaw:local

plus a heavily customized Dockerfile: Homebrew, npm/build-essential, custom PATH, NODE_COMPILE_CACHE volume, etc. So first split the problem:

Test official image, same config

Create this override:

cat > docker-compose.official-debug.yml <<'YAML'
services:
  openclaw-gateway:
    image: ghcr.io/openclaw/openclaw:2026.4.26
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      NODE_COMPILE_CACHE: ""
      PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
  openclaw-cli:
    image: ghcr.io/openclaw/openclaw:2026.4.26
    environment:
      NODE_COMPILE_CACHE: ""
      PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
YAML

docker compose -f docker-compose.yml -f docker-compose.official-debug.yml down
docker compose -f docker-compose.yml -f docker-compose.official-debug.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.official-debug.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.official-debug.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If this works: the issue is your custom local image / Dockerfile / compile cache, not core .26.

If this also hangs after config.snapshot: it’s likely a real .26 regression with your config volume.

Two hygiene notes:

  • Your pasted .env had a commented gateway token-looking value. If that was real, rotate it.
  • Remove/redact the earlier WhatsApp number before posting an issue.
dusty schooner
#

soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.official-debug.yml down
[+] down 2/2
✔ Container openclaw-openclaw-gateway-1 Removed 1.4s
✔ Network openclaw_default Removed 0.2s
soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.official-debug.yml up -d --force-recreate openclaw-gateway
[+] up 1/1
✘ Image ghcr.io/openclaw/openclaw:2026.4.26 Error failed to resolve reference "ghcr.io/openclaw/openclaw:2026.4.26": ghcr.io/openclaw/openclaw:2026.4... 1.0s
Error response from daemon: failed to resolve reference "ghcr.io/openclaw/openclaw:2026.4.26": ghcr.io/openclaw/openclaw:2026.4.26: not found

What's next:
Debug this Compose error with Gordon → docker ai "help me fix this compose error"

mental emberBOT
#

Ah, that explains the pull error: there is no official ghcr.io/openclaw/openclaw:2026.4.26 image tag published right now. GHCR is showing latest around 2026.4.25-beta.10, not .26. (github.com)

So your 2026.4.26 came from your locally built openclaw:local image, not an official GHCR image. Test against an actually published tag:

cat > docker-compose.official-debug.yml <<'YAML'
services:
  openclaw-gateway:
    image: ghcr.io/openclaw/openclaw:2026.4.25-beta.10
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      NODE_COMPILE_CACHE: ""
  openclaw-cli:
    image: ghcr.io/openclaw/openclaw:2026.4.25-beta.10
    environment:
      NODE_COMPILE_CACHE: ""
YAML

Then:

docker compose -f docker-compose.yml -f docker-compose.official-debug.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.official-debug.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.official-debug.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If that works, your local .26 build/custom Dockerfile is the culprit. If you want stable recovery, use a known published tag instead of openclaw:local.
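
To avoid another "not found" surprise, you can ask the registry whether a tag resolves before putting it in an override. A minimal sketch, assuming your Docker CLI has the manifest subcommand available; image_tag_exists is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the registry can resolve the image reference.
image_tag_exists() {
  docker manifest inspect "$1" >/dev/null 2>&1
}

# Example calls (not executed here), using the tags mentioned in this thread:
# image_tag_exists ghcr.io/openclaw/openclaw:2026.4.25-beta.10 && echo "published"
# image_tag_exists ghcr.io/openclaw/openclaw:2026.4.26 || echo "tag not published"
```

A failure can also mean a network or auth problem rather than a missing tag, so treat it as a hint, not proof.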

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.official-debug.yml run -T --rm openclaw-cli gateway probe --timeout 30000
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Running 0.0s
Container openclaw-openclaw-cli-run-4aa3850a8f30 Creating
Container openclaw-openclaw-cli-run-4aa3850a8f30 Created
Config was last written by a newer OpenClaw (2026.4.26); current version is 2026.4.25-beta.10.
Config was last written by a newer OpenClaw (2026.4.26); current version is 2026.4.25-beta.10.
Gateway Status
Reachable: no
Capability: unknown
Probe budget: 30000ms

Discovery (this machine)
Found 0 gateways via Bonjour (local.)
Tip: if the gateway is remote, mDNS won’t cross networks; use Wide-Area Bonjour (split DNS) or SSH tunnels.

Targets
Local loopback ws://127.0.0.1:18789
Connect: failed - timeout · Capability: unknown

mental emberBOT
#

That warning is expected and means this isn’t a clean rollback test yet:

Config was last written by a newer OpenClaw (2026.4.26); current version is 2026.4.25-beta.10

Older OpenClaw is refusing/struggling with a config stamped by .26.

For an emergency rollback test, add this env var to the older-image override:

OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"

So:

#
cat > docker-compose.official-debug.yml <<'YAML'
services:
  openclaw-gateway:
    image: ghcr.io/openclaw/openclaw:2026.4.25-beta.10
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      NODE_COMPILE_CACHE: ""
  openclaw-cli:
    image: ghcr.io/openclaw/openclaw:2026.4.25-beta.10
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      NODE_COMPILE_CACHE: ""
YAML

docker compose -f docker-compose.yml -f docker-compose.official-debug.yml down
docker compose -f docker-compose.yml -f docker-compose.official-debug.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.official-debug.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.official-debug.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If that starts, the culprit is almost certainly your local .26 build/custom image. If it still hangs, paste the fresh startup trace from the older image.

dusty schooner
#

can you take me to 4.24 with this code for safety?

mental emberBOT
#

Yes — safest path is: don’t overwrite openclaw:local, build a separate 4.24 image and run it with a downgrade override.

# from your OpenClaw repo
git fetch --tags
git worktree add ../openclaw-2026.4.24 v2026.4.24

cd ../openclaw-2026.4.24
docker build -t openclaw:2026.4.24-local -f Dockerfile .

Then in your normal compose folder:

cat > docker-compose.4.24.yml <<'YAML'
services:
  openclaw-gateway:
    image: openclaw:2026.4.24-local
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      NODE_COMPILE_CACHE: ""
  openclaw-cli:
    image: openclaw:2026.4.24-local
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      NODE_COMPILE_CACHE: ""
YAML
docker compose -f docker-compose.yml -f docker-compose.4.24.yml down
docker compose -f docker-compose.yml -f docker-compose.4.24.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.4.24.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.4.24.yml run -T --rm openclaw-cli gateway probe --timeout 30000

Keep OPENCLAW_SKIP_CHANNELS=1 for the first boot. If gateway is stable, re-enable channels one at a time after.

dusty schooner
#

actually I’ll stick to 4.25... it has some good features I think

mental emberBOT
#

Yep, good call. Use the published 2026.4.25-beta.10 image and keep the downgrade override because your config is stamped by .26:

cat > docker-compose.4.25.yml <<'YAML'
services:
  openclaw-gateway:
    image: ghcr.io/openclaw/openclaw:2026.4.25-beta.10
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      NODE_COMPILE_CACHE: ""
  openclaw-cli:
    image: ghcr.io/openclaw/openclaw:2026.4.25-beta.10
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      NODE_COMPILE_CACHE: ""
YAML

docker compose -f docker-compose.yml -f docker-compose.4.25.yml down
docker compose -f docker-compose.yml -f docker-compose.4.25.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.4.25.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.4.25.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If stable, remove OPENCLAW_SKIP_CHANNELS and restart, or re-enable channels one at a time. Keep OPENCLAW_NO_AUTO_UPDATE=1 for now so it doesn’t jump back to the broken local .26.

#

I just restarted into a fresh session and rebuilt what I could from the workspace: there wasn’t a resume file or any highlights for today, so I only have the thread context here.

Based on what you pasted, sticking to 2026.4.25 is a totally reasonable workaround for now.

The important clue is this:

connect ECONNREFUSED 127.0.0.1:18789

That means the gateway itself is not listening inside the container. So this looks more like a gateway startup/service/config issue than the LM Studio-on-host connection.

If 2026.4.25 works and 2026.4.26 breaks, I’d mainly suspect one of these:

  • service/config mismatch after upgrade
  • gateway startup blocked by stricter config/auth validation
  • old/new binary split-brain after updating
#

If you want to narrow it down later, paste these from the 2026.4.26 container:

openclaw --version
openclaw gateway status --deep
openclaw status --all
openclaw logs --follow

And if you’re using Docker, the container startup log too.

Docs:
https://docs.openclaw.ai/help/faq
https://docs.openclaw.ai/gateway/troubleshooting

If there’s one critical detail I’m missing from before the reset, tell me — otherwise drop the 2026.4.26 startup log and I’ll help pin the exact failure.

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.4.25.yml run -T --rm openclaw-cli gateway probe --timeout 30000
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Running 0.0s
Container openclaw-openclaw-cli-run-2e3269854ac5 Creating
Container openclaw-openclaw-cli-run-2e3269854ac5 Created
Config was last written by a newer OpenClaw (2026.4.26); current version is 2026.4.25-beta.10.

mental emberBOT
#

Looks like the CLI still isn’t getting the downgrade override. Your docker-compose.yml hardcodes env blocks, so let’s inspect the merged config:

docker compose -f docker-compose.yml -f docker-compose.4.25.yml config | grep -A35 -E 'openclaw-(gateway|cli):'

You should see under both services:

OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"

If it’s missing, recreate the override file exactly:

cat > docker-compose.4.25.yml <<'YAML'
services:
  openclaw-gateway:
    image: ghcr.io/openclaw/openclaw:2026.4.25-beta.10
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      NODE_COMPILE_CACHE: ""
  openclaw-cli:
    image: ghcr.io/openclaw/openclaw:2026.4.25-beta.10
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      NODE_COMPILE_CACHE: ""
YAML
#

Then fully recreate, including the CLI service:

docker compose -f docker-compose.yml -f docker-compose.4.25.yml down
docker compose -f docker-compose.yml -f docker-compose.4.25.yml up -d --force-recreate openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.4.25.yml run -T --rm openclaw-cli env | grep OPENCLAW_ALLOW
docker compose -f docker-compose.yml -f docker-compose.4.25.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If the env | grep prints nothing, Compose isn’t reading the override you think it is.
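
If grepping the merged config by eye is awkward, a small awk check can answer "does this service carry this variable?" directly. This is a sketch under assumptions: service_has_env is a hypothetical helper, and it relies on the merged output using two-space indentation for service names with environment keys written as a "KEY: value" mapping. The sample text is invented to show both outcomes.

```shell
# Succeed if SERVICE has an environment entry named KEY in merged Compose YAML on stdin.
service_has_env() {
  awk -v svc="$1" -v key="$2" '
    /^  [^ ]/ { current = $1; sub(/:$/, "", current) }   # two-space indent: a service name
    current == svc && $1 == (key ":") { found = 1 }      # KEY: line inside that service
    END { exit (found ? 0 : 1) }
  '
}

# Invented sample in the shape of "docker compose ... config" output.
sample_config='services:
  openclaw-cli:
    environment:
      NODE_COMPILE_CACHE: ""
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
  openclaw-gateway:
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"'

for svc in openclaw-gateway openclaw-cli; do
  if printf '%s\n' "$sample_config" | service_has_env "$svc" OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS; then
    echo "$svc: override present"
  else
    echo "$svc: override missing"
  fi
done
```

Against your real stack, pipe the docker compose ... config output into service_has_env instead of the sample.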

#

I’m in a fresh session with limited memory here — there wasn’t a resume file or any highlights for today — but this new output is enough to move forward.

The new clue is actually the key:

Config was last written by a newer OpenClaw (2026.4.26); current version is 2026.4.25-beta.10.

So there are two separate issues:

  1. On 2026.4.26, the gateway wasn’t listening (ECONNREFUSED 127.0.0.1:18789).
  2. When you downgraded to 2026.4.25-beta.10, OpenClaw hit the newer-config guard because your config/state had already been touched by 2026.4.26.

That guard is intentional — older binaries are not supposed to mutate/start services against newer-written config.
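
To make the idea concrete, here is roughly what a "config is newer than the binary" comparison looks like. This is an illustration only, not OpenClaw's actual implementation: is_older and version_core are hypothetical helpers, and the sketch compares just the numeric core, ignoring suffixes like -beta.10.

```shell
# Strip any pre-release suffix: 2026.4.25-beta.10 -> 2026.4.25
version_core() { printf '%s\n' "${1%%-*}"; }

# Succeed if the first version's numeric core is strictly older than the second's.
is_older() {
  a=$(version_core "$1")
  b=$(version_core "$2")
  [ "$a" != "$b" ] &&
    [ "$(printf '%s\n%s\n' "$a" "$b" | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)" = "$a" ]
}

if is_older 2026.4.25-beta.10 2026.4.26; then
  echo "binary is older than the version that last wrote the config"
fi
```

In that situation a guard would refuse to proceed unless explicitly overridden, which matches the warning you saw.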

Simplest next step

Best path: temporarily go back to the 2026.4.26 CLI/container, then grab the actual gateway failure log there.

#

Please paste these from the 2026.4.26 container:

openclaw --version
openclaw gateway status --deep
openclaw status --all
openclaw logs --follow

If Docker is easier, the gateway container logs from the 2026.4.26 stack are perfect too.

If you really want to stay on 4.25

Then you’ll usually need one of these:

  • restore a pre-4.26 config/state volume, or
  • use a fresh config/state, or
  • for intentional recovery only, use the one-command override from the docs:
#

OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS=1

I’d avoid forcing the downgrade until we see why 4.26 stopped starting, because the downgrade symptom may just be secondary.

Docs:
https://docs.openclaw.ai/help/faq
https://docs.openclaw.ai/gateway/troubleshooting

If there’s anything critical from before the reset that I’m missing, tell me — otherwise send the 2026.4.26 gateway log, that’s the next clean step.

dusty schooner
mental emberBOT
#

The override is applying — I can see:

OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"

But you also pasted your real gateway token in that ctxt link. Rotate it after recovery.

The reason you still saw the warning may be because the command exited before the probe, or the running gateway was still from the previous create. Do a full recreate and check the runtime env + version:

docker compose -f docker-compose.yml -f docker-compose.4.25.yml down
docker compose -f docker-compose.yml -f docker-compose.4.25.yml up -d --force-recreate openclaw-gateway
sleep 30

docker compose -f docker-compose.yml -f docker-compose.4.25.yml exec -T openclaw-gateway node dist/index.js --version
docker compose -f docker-compose.yml -f docker-compose.4.25.yml exec -T openclaw-gateway sh -lc 'echo $OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS'
docker compose -f docker-compose.yml -f docker-compose.4.25.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.4.25.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If it still warns, the guard may not allow gateway probe; but the gateway startup should be the important part. The logs will tell us whether 4.25 reaches starting HTTP server....

Also: rotate OPENCLAW_GATEWAY_TOKEN — it’s public now.
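
For the rotation itself, any long random string will do. One way to generate it with standard tools is sketched below; the 32-byte length is my choice, and where the value goes afterwards (for example the OPENCLAW_GATEWAY_TOKEN entry in .env) is an assumption based on the variable name above, so check the docs for your setup.

```shell
# Generate 32 random bytes and print them as 64 lowercase hex characters.
new_token=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')

# Report the length only, so the token itself does not end up in scrollback or logs.
printf 'generated a %s-character token\n' "${#new_token}"
```

Restart the gateway after updating the value so the old token stops being accepted.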

dusty schooner
#

tell me one thing... how to start from scratch... delete everything and build again?.. (but not delete .openclaw folder which has all data)

mental emberBOT
#

Yes. To start Docker/build fresh without deleting ~/.openclaw, remove containers/images/cache, but keep the host config dir.

First verify the data path:

echo "$OPENCLAW_CONFIG_DIR"
# should be /Users/soham/.openclaw

Then from the repo/compose folder:

# stop/remove containers + project network only
docker compose down --remove-orphans

# remove named compile cache volume only; does NOT touch /Users/soham/.openclaw
docker volume rm openclaw_openclaw_cache 2>/dev/null || true

# remove local images
docker image rm openclaw:local 2>/dev/null || true
docker image rm ghcr.io/openclaw/openclaw:2026.4.25-beta.10 2>/dev/null || true

Then rebuild clean:

git fetch --tags
git checkout v2026.4.25-beta.10   # or the tag/branch you want
docker build --no-cache -t openclaw:local -f Dockerfile .

Start with channels skipped first:

#
cat > docker-compose.clean-debug.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      NODE_COMPILE_CACHE: ""
  openclaw-cli:
    environment:
      NODE_COMPILE_CACHE: ""
YAML

docker compose -f docker-compose.yml -f docker-compose.clean-debug.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.clean-debug.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.clean-debug.yml run -T --rm openclaw-cli gateway probe --timeout 30000

Also: rotate your gateway token — it was pasted publicly in the ctxt link.

#

I just restarted into a fresh session, and there wasn’t a resume file or any highlights for today, so I’m reconstructing this from the thread and the paste you linked.

That paste gives two solid clues:

  • your gateway token is publicly exposed in the paste, so please rotate it immediately before doing anything else; OpenClaw’s security docs explicitly say to rotate gateway.auth.token / OPENCLAW_GATEWAY_PASSWORD if it’s been exposed. (ctxt.io)
  • the paste also shows openclaw-cli is using network_mode: service:openclaw-gateway, and the Docker docs say that setup lets the CLI reach the gateway over 127.0.0.1; gateway probe also always checks local loopback. So an ECONNREFUSED 127.0.0.1:18789 from that CLI path means the gateway likely was not listening inside the container, not that LM Studio host networking was the primary failure. (ctxt.io)

Also, your rollback test is still running into the newer-config guard: the docs say newer-written config is stamped, and older binaries refuse certain service/process mutations against it. That matches the Config was last written by a newer OpenClaw message you pasted earlier. (docs.openclaw.ai)

One more small gotcha: the paste is pinned to ghcr.io/openclaw/openclaw:2026.4.25-beta.10, so that is not the plain 2026.4.25 stable image. (ctxt.io)

Simplest next step: after rotating the token, paste the actual gateway startup log from the 2026.4.26 container:

docker compose logs --tail=200 openclaw-gateway
#

If you want one extra check too:

docker compose ps

I’m pretty confident the next answer is in the gateway log, not the LM Studio config. If there’s one critical detail from before the reset that I’m missing, tell me, and I’ll fold it in.

#

I just restarted into a fresh session — there wasn’t a resume file or any highlights for today — so I’m rebuilding this from the thread.

Short answer: yes, but there are 2 very different “start from scratch” paths.

1) Rebuild Docker, keep ~/.openclaw

This is the safer first try.

That means:

  • remove/recreate containers
  • optionally pull/rebuild images
  • do not delete the host path / volume that maps to /home/node/.openclaw

In the Docker docs, that state mount is exactly where config, auth, sessions, and .env live. So container replacement is fine as long as you keep that mount.

Use this shape:

#
docker compose down
docker compose pull
docker compose up -d --force-recreate openclaw-gateway
docker compose ps
docker compose logs --tail=200 openclaw-gateway

Important: do not use docker compose down -v if your data is in Docker volumes. That can wipe the persisted volume too.


2) True clean slate, but keep old data as backup

If you want a real fresh start, keeping the old ~/.openclaw untouched:

  • leave the current ~/.openclaw alone
  • point Docker at a different empty state directory
  • onboard again from scratch

That's the cleanest test for "is my current state/config broken?"

It also matters because your rollback to 2026.4.25-beta.10 is still blocked by the newer-config guard from the 2026.4.26-touched state. Rebuilding containers alone won't fix that if you keep using the same old state.


What not to use

If you want to preserve data, avoid:

openclaw reset

and avoid:

#
docker compose down -v

My recommendation

Try this order:

  1. Rebuild containers but keep the same state
  2. If that still fails, start with a new empty state dir while preserving the old one as backup
  3. If you want to stay on 4.25, use a fresh/pre-4.26 state, not the 4.26-touched one

Also: the token in that public paste should be rotated.

Docs:
https://docs.openclaw.ai/install/docker

dusty schooner
#

I want to fetch the latest stable

mental emberBOT
#

Yep. Latest non-beta stable appears to be v2026.4.24; 4.25 is beta-line right now. Pin that explicitly instead of latest. (newreleases.io)

From your OpenClaw repo:

git fetch --tags origin
git checkout v2026.4.24

docker compose down --remove-orphans
docker volume rm openclaw_openclaw_cache 2>/dev/null || true
docker image rm openclaw:local 2>/dev/null || true

docker build --no-cache -t openclaw:local -f Dockerfile .

Then start safely, keeping ~/.openclaw:

#
cat > docker-compose.stable.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      NODE_COMPILE_CACHE: ""
  openclaw-cli:
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      NODE_COMPILE_CACHE: ""
YAML
#
docker compose -f docker-compose.yml -f docker-compose.stable.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.stable.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.stable.yml run -T --rm openclaw-cli gateway probe --timeout 30000

Also rotate your gateway token after this; it got pasted publicly earlier.

dusty schooner
#

soham@Sohams-MBP openclaw % git checkout v2026.4.24
error: Your local changes to the following files would be overwritten by checkout:
Dockerfile
docker-compose.yml
Please commit your changes or stash them before you switch branches.
Aborting

mental emberBOT
#

Don't overwrite your edits. Use a separate worktree for stable:

git fetch --tags origin
git worktree add ../openclaw-2026.4.24 v2026.4.24
cd ../openclaw-2026.4.24

docker build --no-cache -t openclaw:2026.4.24-local -f Dockerfile .

Then go back to your original compose folder and point Compose at that image:

cd ../openclaw

cat > docker-compose.stable.yml <<'YAML'
services:
  openclaw-gateway:
    image: openclaw:2026.4.24-local
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      NODE_COMPILE_CACHE: ""
  openclaw-cli:
    image: openclaw:2026.4.24-local
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      NODE_COMPILE_CACHE: ""
YAML
#

docker compose -f docker-compose.yml -f docker-compose.stable.yml down
docker compose -f docker-compose.yml -f docker-compose.stable.yml up -d --force-recreate openclaw-gateway

This keeps your current repo changes and ~/.openclaw intact.

dusty schooner
#

I have taken a backup of those files. I want them gone

mental emberBOT
#

Got it. Since you backed them up, reset just those two files:

git restore Dockerfile docker-compose.yml

If your Git is older:

git checkout -- Dockerfile docker-compose.yml

Then switch to stable and rebuild:

git fetch --tags origin
#
git checkout v2026.4.24

docker compose down --remove-orphans
docker volume rm openclaw_openclaw_cache 2>/dev/null || true
docker image rm openclaw:local 2>/dev/null || true

docker build --no-cache -t openclaw:local -f Dockerfile .

Then start with channels skipped:

cat > docker-compose.stable.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      NODE_COMPILE_CACHE: ""
  openclaw-cli:
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      NODE_COMPILE_CACHE: ""
YAML

docker compose -f docker-compose.yml -f docker-compose.stable.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.stable.yml run -T --rm openclaw-cli gateway probe --timeout 30000
#

This won't delete ~/.openclaw.

dusty schooner
#

docker build --no-cache -t openclaw:local -f Dockerfile . i want cache though

mental emberBOT
#

Totally fine, use cache:

docker build -t openclaw:local -f Dockerfile .

If you want a mostly fresh build but still keep dependency caches, you can prune only dangling build junk first:

docker builder prune
docker build -t openclaw:local -f Dockerfile .

Avoid --no-cache unless you suspect the image layers themselves are corrupted. For your case, cached rebuild is reasonable.

dusty schooner
#

what cache is this btw?

mental emberBOT
#

Docker build cache = saved intermediate layers from previous builds.

In this Dockerfile, the useful cached bits are mostly:

  • base image layers (node:24-bookworm, bookworm-slim)
  • pnpm install layer (the expensive dependency install)
  • BuildKit cache mounts, e.g. pnpm store / apt cache:
    --mount=type=cache,id=openclaw-pnpm-store,...
    --mount=type=cache,id=openclaw-bookworm-apt-cache,...
    
  • compiled build layers if inputs haven't changed

It is not your OpenClaw data. Your data is the bind mount:

#
/Users/soham/.openclaw -> /home/node/.openclaw

So:

docker build -t openclaw:local -f Dockerfile .

reuses build cache.

docker build --no-cache ...

ignores layer cache and rebuilds everything slower.

dusty schooner
#

I won't skip channels or anything yet... I'll run the clean version first and see if the error even persists

mental emberBOT
#

That's reasonable. Clean stock files + rebuilt image first is a good sanity check.

Run normal first:

docker compose down --remove-orphans
docker build -t openclaw:local -f Dockerfile .
docker compose up -d --force-recreate openclaw-gateway
sleep 30
docker compose logs --since=2m --timestamps openclaw-gateway
docker compose run -T --rm openclaw-cli gateway probe --timeout 30000

If it wedges again, then immediately retry with only this tiny override:

#
OPENCLAW_SKIP_CHANNELS: "1"
OPENCLAW_GATEWAY_STARTUP_TRACE: "1"

That gives us a clean A/B test.

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose run -T --rm openclaw-cli gateway probe --timeout 30000
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Running 0.0s
Container openclaw-openclaw-cli-run-ce92a2b77bda Creating
Container openclaw-openclaw-cli-run-ce92a2b77bda Created
Config was last written by a newer OpenClaw (2026.4.26); current version is 2026.4.24.
Config was last written by a newer OpenClaw (2026.4.26); current version is 2026.4.24.

mental emberBOT
#

That means the 4.24 binary is protecting you from a config stamped by 4.26. For rollback, add the downgrade override to your normal compose env, at least temporarily.

Create this:

cat > docker-compose.rollback.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
  openclaw-cli:
    environment:
      OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS: "1"
YAML
#

Then recreate and probe:

docker compose -f docker-compose.yml -f docker-compose.rollback.yml down
docker compose -f docker-compose.yml -f docker-compose.rollback.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.rollback.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.rollback.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If it still warns, run:

docker compose -f docker-compose.yml -f docker-compose.rollback.yml run -T --rm openclaw-cli env | grep OPENCLAW_ALLOW

It should print OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS=1.

dusty schooner
#

still same..

running this

docker compose -f docker-compose.yml -f docker-compose.rollback.yml run -T --rm openclaw-cli env | grep OPENCLAW_ALLOW

mental emberBOT
#

Good. If that prints nothing, the override file isn't being applied.

Also I want to correct something: that env var may not silence read-only commands like gateway probe; it mainly allows destructive/service mutations. So the cleaner rollback fix is to remove the .26 stamp from config.

First check the env output. Then run this to inspect the stamp:

docker compose -f docker-compose.yml -f docker-compose.rollback.yml run -T --rm openclaw-cli config get meta.lastTouchedVersion

If it says 2026.4.26, set it down to 4.24:

docker compose -f docker-compose.yml -f docker-compose.rollback.yml run -T --rm openclaw-cli config set meta.lastTouchedVersion '"2026.4.24"' --strict-json

If config set refuses because of the same guard, use a direct JSON edit inside the container:

#
docker compose -f docker-compose.yml -f docker-compose.rollback.yml run -T --rm --entrypoint node openclaw-cli --input-type=module -e 'import fs from "node:fs"; import JSON5 from "json5"; const p="/home/node/.openclaw/openclaw.json"; const cfg=JSON5.parse(fs.readFileSync(p,"utf8")); cfg.meta ??= {}; cfg.meta.lastTouchedVersion="2026.4.24"; fs.copyFileSync(p,p+".bak-before-rollback-stamp"); fs.writeFileSync(p, JSON.stringify(cfg,null,2)); console.log("updated", p);'

Then:

docker compose -f docker-compose.yml -f docker-compose.rollback.yml down
docker compose -f docker-compose.yml -f docker-compose.rollback.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.rollback.yml run -T --rm openclaw-cli gateway probe --timeout 30000

You already backed up config, but this also creates openclaw.json.bak-before-rollback-stamp.

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.rollback.yml run -T --rm openclaw-cli env | grep OPENCLAW_ALLOW
Container openclaw-openclaw-gateway-1 Running
Container openclaw-openclaw-cli-run-90dd3e2d7b1a Creating
Container openclaw-openclaw-cli-run-90dd3e2d7b1a Created
Config was last written by a newer OpenClaw (2026.4.26); current version is 2026.4.24.
Config was last written by a newer OpenClaw (2026.4.26); current version is 2026.4.24.

mental emberBOT
#

That happened because openclaw-cli has this entrypoint:

entrypoint: ["node", "dist/index.js"]

So "run openclaw-cli env" invokes the OpenClaw CLI with env as its argument, not the shell's env command.

Use --entrypoint env:

docker compose -f docker-compose.yml -f docker-compose.rollback.yml run -T --rm --entrypoint env openclaw-cli | grep OPENCLAW_ALLOW

But the better fix now is to remove the .26 config stamp directly:

#
cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak-before-rollback-stamp

docker compose -f docker-compose.yml -f docker-compose.rollback.yml exec -T openclaw-gateway node --input-type=module -e 'import fs from "node:fs"; import JSON5 from "json5"; const p="/home/node/.openclaw/openclaw.json"; const cfg=JSON5.parse(fs.readFileSync(p,"utf8")); cfg.meta ??= {}; cfg.meta.lastTouchedVersion="2026.4.24"; fs.writeFileSync(p, JSON.stringify(cfg,null,2)); console.log("updated meta.lastTouchedVersion");'

Then restart:

docker compose -f docker-compose.yml -f docker-compose.rollback.yml down
docker compose -f docker-compose.yml -f docker-compose.rollback.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.rollback.yml run -T --rm openclaw-cli gateway probe --timeout 30000

And rotate the gateway token after you're back up; it was pasted publicly.

dusty schooner
#

ok. I'm gonna manually write an older version to the config file

mental emberBOT
#

Yep, set only this field:

"meta": {
  "lastTouchedVersion": "2026.4.24"
}

If meta already exists, just change lastTouchedVersion. Don't remove the rest of the config.

After saving:

docker compose -f docker-compose.yml -f docker-compose.rollback.yml down
docker compose -f docker-compose.yml -f docker-compose.rollback.yml up -d --force-recreate openclaw-gateway
sleep 30
#
docker compose -f docker-compose.yml -f docker-compose.rollback.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.rollback.yml run -T --rm openclaw-cli gateway probe --timeout 30000

Also worth validating JSON before restart if you edit manually:

python3 -m json.tool ~/.openclaw/openclaw.json >/dev/null && echo OK
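
If you edited the file by hand, a quick way to confirm the stamp without starting a container could look like this. It's only a rough sketch: stamp_of is a made-up helper name, and it assumes the key and value are double-quoted and sit on one line (plain JSON), so a JSON5-style unquoted key would not match.

```shell
# stamp_of FILE: print the first "lastTouchedVersion" value found in FILE.
# Assumption: the key and its value are double-quoted and on the same line.
stamp_of() {
  sed -n 's/.*"lastTouchedVersion"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}
```

Usage: stamp_of ~/.openclaw/openclaw.json. If the manual edit took effect, the printed value should be the version you wrote.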
dusty schooner
#

no need for rollback now as I have updated config files.. tell me next

mental emberBOT
#

Great. Next do a clean restart/probe on your 4.24 build:

python3 -m json.tool ~/.openclaw/openclaw.json >/dev/null && echo "config JSON OK"

docker compose down --remove-orphans
docker compose up -d --force-recreate openclaw-gateway
sleep 30

docker compose ps
docker compose logs --since=2m --timestamps openclaw-gateway
docker compose run -T --rm openclaw-cli gateway probe --timeout 30000

If it starts, rotate the leaked gateway token:

#
NEW_TOKEN="$(openssl rand -hex 32)"
docker compose run -T --rm openclaw-cli config set gateway.auth.token "\"$NEW_TOKEN\"" --strict-json
docker compose restart openclaw-gateway

If probe still fails, paste the fresh logs --since=2m output.

dusty schooner
mental emberBOT
#

That confirms it's still hanging before HTTP bind:

[gateway] loading configuration…
[gateway] resolving authentication…
[gateway] starting...
# no "starting HTTP server..."

So this is no longer a channel issue; it never gets that far.

Run one more boot with trace enabled, using the clean compose:

cat > docker-compose.trace.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      OPENCLAW_SKIP_CHANNELS: "1"
  openclaw-cli:
    environment:
      OPENCLAW_NO_AUTO_UPDATE: "1"
YAML

docker compose -f docker-compose.yml -f docker-compose.trace.yml down
docker compose -f docker-compose.yml -f docker-compose.trace.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.trace.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.trace.yml run -T --rm openclaw-cli --version
#

If the trace again stops after config.snapshot, then the remaining suspect is something in config/auth processing, not Docker networking.

Also check config validity:

python3 -m json.tool ~/.openclaw/openclaw.json >/dev/null && echo "JSON OK"
dusty schooner
mental emberBOT
#

Good trace — this moved forward. It now passes:

config.snapshot
config.auth
control-ui.seed

and then hangs before the next completed phase, which is almost certainly plugins.bootstrap.

In 2026.4.24, the main startup maintenance path that can run before plugin bootstrap is Matrix migration maintenance. Quick check:

docker compose run -T --rm openclaw-cli config get channels.matrix --json
docker compose run -T --rm openclaw-cli config get plugins.entries.matrix --json
#

If you don't use Matrix, disable it and retry:

docker compose run -T --rm openclaw-cli config set channels.matrix.enabled false --strict-json
docker compose run -T --rm openclaw-cli config set plugins.entries.matrix.enabled false --strict-json

docker compose -f docker-compose.yml -f docker-compose.trace.yml down
docker compose -f docker-compose.yml -f docker-compose.trace.yml up -d --force-recreate openclaw-gateway
sleep 30
docker compose -f docker-compose.yml -f docker-compose.trace.yml logs --since=2m --timestamps openclaw-gateway

Expected next healthy line would be:

startup trace: plugins.bootstrap ...
#
[gateway] starting HTTP server...

If Matrix isn't configured, then next suspect is plugin runtime-deps/bootstrap generally, and we can list enabled plugins next.

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose run -T --rm openclaw-cli config get channels.matrix --json
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Started 1.8s
Container openclaw-openclaw-cli-run-f847b66e3afd Creating
Container openclaw-openclaw-cli-run-f847b66e3afd Created
Config path not found: channels.matrix

What's next:
Debug this Compose error with Gordon → docker ai "help me fix this compose error"
soham@Sohams-MBP openclaw % docker compose run -T --rm openclaw-cli config get plugins.entries.matrix --json
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Running 0.0s
Container openclaw-openclaw-cli-run-ea5e5776131c Creating
Container openclaw-openclaw-cli-run-ea5e5776131c Created
Config path not found: plugins.entries.matrix

What's next:
Debug this Compose error with Gordon → docker ai "help me fix this compose error"

mental emberBOT
#

Good, Matrix isn't configured. Next suspect is plugin bootstrap / runtime deps generally.

Let's list what plugins are enabled/configured without needing the gateway:

docker compose run -T --rm --entrypoint node openclaw-cli --input-type=module -e 'import fs from "node:fs"; import JSON5 from "json5"; const p="/home/node/.openclaw/openclaw.json"; const cfg=JSON5.parse(fs.readFileSync(p,"utf8")); console.log("plugins.enabled =", cfg.plugins?.enabled); console.log("plugins.allow =", JSON.stringify(cfg.plugins?.allow ?? null)); console.log("plugins.deny =", JSON.stringify(cfg.plugins?.deny ?? null)); console.log("plugins.entries =", JSON.stringify(cfg.plugins?.entries ?? {}, null, 2)); console.log("channels keys =", Object.keys(cfg.channels ?? {}));'

Also run the trace again but wait longer, because plugin bootstrap can stage deps:

docker compose -f docker-compose.yml -f docker-compose.trace.yml down
docker compose -f docker-compose.yml -f docker-compose.trace.yml up -d --force-recreate openclaw-gateway
sleep 120
docker compose -f docker-compose.yml -f docker-compose.trace.yml logs --since=3m --timestamps openclaw-gateway
#

If it's staging deps, logs should eventually show plugin/runtime-deps lines. If it stays after control-ui.seed for 2 minutes with no logs, it's hard-hanging inside plugin bootstrap.

mental emberBOT
#

That confirms a hard hang inside plugins.bootstrap.

Also: your pasted config includes a real Tavily API key. Rotate that key ASAP.

For isolation, disable only the enabled plugin entries while preserving their config, then boot again:

docker compose run -T --rm openclaw-cli config set --batch-json '[
  {"path":"plugins.entries.memory-wiki.enabled","value":false},
  {"path":"plugins.entries.tavily.enabled","value":false},
  {"path":"plugins.entries.openrouter.enabled","value":false},
  {"path":"plugins.entries.openai.enabled","value":false},
  {"path":"plugins.entries.lmstudio.enabled","value":false}
]'

Then:

#
docker compose -f docker-compose.yml -f docker-compose.trace.yml down
docker compose -f docker-compose.yml -f docker-compose.trace.yml up -d --force-recreate openclaw-gateway
sleep 60
docker compose -f docker-compose.yml -f docker-compose.trace.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.trace.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If gateway starts, re-enable one at a time. My first suspects are memory-wiki or runtime-deps around provider plugins, not channels. Start with:

docker compose run -T --rm openclaw-cli config set plugins.entries.lmstudio.enabled true --strict-json
docker compose restart openclaw-gateway

Then test; repeat for others.
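
The one-at-a-time loop can be scripted if you like. A minimal sketch that only prints the commands instead of running them, so you can restart and test between each step; plugin_reenable_cmds is a hypothetical helper name, and the plugin names you pass are the ones from the batch above.

```shell
# plugin_reenable_cmds NAME...: print (do not run) the re-enable command
# for each plugin entry, in the order given.
plugin_reenable_cmds() {
  for name in "$@"; do
    printf 'docker compose run -T --rm openclaw-cli config set plugins.entries.%s.enabled true --strict-json\n' "$name"
  done
}
```

Usage: plugin_reenable_cmds lmstudio openai openrouter tavily memory-wiki, then run the printed lines one by one, restarting the gateway and probing after each.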

#

Good data. Since disabling those plugin entries did not move the trace past control-ui.seed, it's probably not memory-wiki / Tavily / LM Studio themselves.

The hang is inside the early part of plugins.bootstrap, before the trace can print that phase. In 4.24 that includes startup maintenance + session migration. Let's test the session migration directly with a timeout:

docker compose run -T --rm --entrypoint node openclaw-cli --input-type=module -e 'import fs from "node:fs"; import JSON5 from "json5"; const cfg=JSON5.parse(fs.readFileSync("/home/node/.openclaw/openclaw.json","utf8")); const t=Date.now(); console.log("importing migration..."); const {migrateOrphanedSessionKeys}=await import("./dist/infra/state-migrations.js"); console.log("running migration..."); const result=await Promise.race([migrateOrphanedSessionKeys({cfg,env:process.env}), new Promise((_,rej)=>setTimeout(()=>rej(new Error("migration timeout after 30s")),30000))]); console.log("done", Date.now()-t, result);'

Also check state size:

du -sh ~/.openclaw ~/.openclaw/agents ~/.openclaw/sessions 2>/dev/null || true
find ~/.openclaw -name 'sessions.json' -o -name '*.jsonl' | wc -l

If the migration command times out, the workaround is to temporarily move the session store aside (not delete) and boot fresh:

#
mkdir -p ~/.openclaw/rollback-hold
find ~/.openclaw -maxdepth 5 -name 'sessions.json' -print

Don't move anything until we see where it is. Also rotate that Tavily key; it was public in the link.

final loom
#

.26 or .25? I just updated to .25 and have the same issue

mental emberBOT
#

Could be both. In Soham's case .26 was a local build, but after rolling to .24 the trace still hangs inside startup, so this may be a broader .25-ish / config-migration / plugin-bootstrap issue rather than just .26.

If you're seeing the same thing, the most useful quick diagnostic is:

OPENCLAW_GATEWAY_STARTUP_TRACE=1 OPENCLAW_NO_AUTO_UPDATE=1 openclaw gateway --port 18789

or Docker equivalent:

OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
OPENCLAW_NO_AUTO_UPDATE: "1"

Then check the last startup trace line. Soham's currently stops after:

#
startup trace: control-ui.seed ...

meaning it's hanging in the next phase, likely plugins.bootstrap / startup migration.

Also: if you can paste your exact openclaw --version and the last ~30 startup log lines, we can compare whether it's the same failure.
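
To pull out the last trace phase without eyeballing the log, something like this may help. It's a sketch under one assumption: trace lines contain the literal text "startup trace: " followed by the phase name, as in the lines quoted in this thread. last_trace_phase is a made-up helper name, not an OpenClaw command.

```shell
# last_trace_phase: read gateway log lines on stdin and print the name of
# the last phase that emitted a "startup trace:" line.
last_trace_phase() {
  sed -n 's/.*startup trace: \([^ ]*\).*/\1/p' | tail -n 1
}
```

Usage: docker compose logs --since=2m openclaw-gateway | last_trace_phase. The hang is then somewhere after the phase it prints.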

final loom
#

schergr@theclaw2:~$ OPENCLAW_GATEWAY_STARTUP_TRACE=1 OPENCLAW_NO_AUTO_UPDATE=1 openclaw gateway --port 18789

🦞 OpenClaw 2026.4.25 (aa36ee6) - Shell yeah - I'm here to pinch the toil and leave you the glory.

│
◇
08:55:56 [gateway] startup trace: cli.server-import 2.5ms total=2.5ms
08:55:56 [gateway] loading configuration…
08:55:57 [gateway] startup trace: cli.config-load 397.7ms total=412.9ms
08:55:57 [gateway] startup trace: cli.config-snapshot 397.5ms total=811.7ms
08:55:57 [gateway] resolving authentication…
08:55:57 [gateway] startup trace: cli.auth-resolve 0.3ms total=815.1ms
08:55:57 [gateway] starting...
08:55:57 [gateway] startup trace: cli.gateway-loop 1.6ms total=816.7ms
08:56:02 Gateway failed to start: gateway already running (pid 13295); lock timeout after 5000ms
If the gateway is supervised, stop it with: openclaw gateway stop
08:56:02 Port 18789 is already in use.
08:56:02 - pid 13295 schergr: openclaw-gateway (127.0.0.1:18789)
08:56:02 - Gateway already running locally. Stop it (openclaw gateway stop) or use a different port.
08:56:02 Gateway service appears enabled. Stop it first.
08:56:02 Tip: openclaw gateway stop
08:56:02 Or: systemctl --user stop openclaw-gateway.service

mental emberBOT
#

Greg, yours is different: the trace didn't hang; it found an already-running gateway on the port:

gateway already running (pid 13295)
Port 18789 is already in use
Gateway service appears enabled

So stop the supervised service first, then run the trace manually:

openclaw gateway stop
# or if needed:
systemctl --user stop openclaw-gateway.service

OPENCLAW_GATEWAY_STARTUP_TRACE=1 OPENCLAW_NO_AUTO_UPDATE=1 openclaw gateway --port 18789
#

In another terminal, check:

openclaw gateway probe --timeout 30000
openclaw gateway status --deep

If you want to debug the already-running service instead, don't start a second gateway. Use:

openclaw gateway status --deep
openclaw logs --follow
#

That import failed because 4.24's built dist layout doesn't have that exact path. No big deal, we got the useful bit:

sessions.json = /Users/soham/.openclaw/agents/main/sessions/sessions.json

Do the reversible isolation now:

mkdir -p ~/.openclaw/rollback-hold/sessions
cp ~/.openclaw/agents/main/sessions/sessions.json \
  ~/.openclaw/rollback-hold/sessions/sessions.json.bak

mv ~/.openclaw/agents/main/sessions/sessions.json \
  ~/.openclaw/rollback-hold/sessions/sessions.json
#

Then restart with trace:

docker compose -f docker-compose.yml -f docker-compose.trace.yml down
docker compose -f docker-compose.yml -f docker-compose.trace.yml up -d --force-recreate openclaw-gateway
sleep 60
docker compose -f docker-compose.yml -f docker-compose.trace.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.trace.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If it boots, the blocker is that session store / startup migration path. Your transcripts/data are still there; we only moved the session index file aside.

Also: rotate the Tavily key you pasted publicly.

#

Okay, moving sessions.json didn't change it, so session migration is probably not the blocker.

Next isolation: disable the whole plugin layer temporarily. This is reversible and should tell us if plugins.bootstrap is the culprit.

docker compose run -T --rm openclaw-cli config set plugins.enabled false --strict-json

Then restart with trace:

docker compose -f docker-compose.yml -f docker-compose.trace.yml down
docker compose -f docker-compose.yml -f docker-compose.trace.yml up -d --force-recreate openclaw-gateway
sleep 60
docker compose -f docker-compose.yml -f docker-compose.trace.yml logs --since=2m --timestamps openclaw-gateway
docker compose -f docker-compose.yml -f docker-compose.trace.yml run -T --rm openclaw-cli gateway probe --timeout 30000
#

If that boots, we've proven it's plugin bootstrap/registry/runtime-deps. Then re-enable with:

docker compose run -T --rm openclaw-cli config set plugins.enabled true --strict-json

and we'll isolate the specific plugin/registry issue.

If even plugins.enabled=false still hangs after control-ui.seed, that's very strong evidence for a core startup regression in 4.24/4.25 with this config volume.

Also: rotate that Tavily key; it was visible in the ctxt paste.

#

Good news: plugins.enabled=false proves the startup hang was plugin bootstrap. It reached:

[gateway] ready (0 plugins, 2.0s)

So the original blocker is definitely plugin-related.

Now there's a second issue: the gateway says ready, but gateway probe times out. Let's verify HTTP health directly:

curl -i http://127.0.0.1:18789/healthz
curl -i http://127.0.0.1:18789/readyz

docker compose -f docker-compose.yml -f docker-compose.trace.yml ps
docker compose -f docker-compose.yml -f docker-compose.trace.yml exec -T openclaw-gateway node -e "fetch('http://127.0.0.1:18789/healthz').then(r=>r.text()).then(console.log).catch(e=>console.error(e.message))"
#

If healthz works but gateway probe times out, try explicit URL:

docker compose -f docker-compose.yml -f docker-compose.trace.yml run -T --rm openclaw-cli gateway probe --url ws://127.0.0.1:18789 --timeout 30000

Don't re-enable all plugins yet. Once HTTP/WS probe is stable with plugins.enabled=false, we'll turn plugins back on one at a time. First suspect is still one of:

  • memory-wiki
  • tavily
  • openrouter/openai/lmstudio provider plugin runtime deps

Also rotate the Tavily key from that paste.

dusty schooner
#

this should be on the host mac or inside the container?

mental emberBOT
#

Run the first two curl commands on the host Mac:

curl -i http://127.0.0.1:18789/healthz
curl -i http://127.0.0.1:18789/readyz

Run these from the host Mac too; Docker will execute inside the container where needed:

docker compose -f docker-compose.yml -f docker-compose.trace.yml ps

docker compose -f docker-compose.yml -f docker-compose.trace.yml exec -T openclaw-gateway node -e "fetch('http://127.0.0.1:18789/healthz').then(r=>r.text()).then(console.log).catch(e=>console.error(e.message))"

docker compose -f docker-compose.yml -f docker-compose.trace.yml run -T --rm openclaw-cli gateway probe --url ws://127.0.0.1:18789 --timeout 30000
#

So: type everything in your Mac terminal, not an interactive shell inside the container.

dusty schooner
mental emberBOT
#

Interesting: Gateway logged "ready", but host HTTP hangs. That suggests the Node process/event loop is wedged after ready, or Docker port forwarding is stuck.

Cancel curl with Ctrl+C, then run:

docker compose -f docker-compose.yml -f docker-compose.trace.yml ps

docker compose -f docker-compose.yml -f docker-compose.trace.yml exec -T openclaw-gateway node -e "fetch('http://127.0.0.1:18789/healthz', {signal: AbortSignal.timeout(5000)}).then(r=>r.text()).then(console.log).catch(e=>console.error(e.name, e.message))"

docker compose -f docker-compose.yml -f docker-compose.trace.yml logs --tail=80 --timestamps openclaw-gateway

If the inside-container fetch also times out, the gateway's event loop/server is wedged after ready.

If inside-container fetch works but host curl hangs, it's Docker port publishing/networking. In that case try:

#
docker compose -f docker-compose.yml -f docker-compose.trace.yml down
docker compose -f docker-compose.yml -f docker-compose.trace.yml up -d --force-recreate

(note: no service name, so it recreates the CLI sharing network namespace too).
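
The two-way check above boils down to a small decision table. A sketch of it as a shell function, purely illustrative: diagnose is a hypothetical name, and you fill in "yes" or "no" by hand from the host curl and the inside-container fetch results.

```shell
# diagnose HOST_OK CONTAINER_OK: each argument is "yes" or "no".
# Mirrors the reasoning above: the inside-container result is checked first.
diagnose() {
  host_ok=$1
  container_ok=$2
  if [ "$container_ok" != "yes" ]; then
    echo "gateway event loop/server wedged after ready"
  elif [ "$host_ok" != "yes" ]; then
    echo "Docker port publishing/networking"
  else
    echo "healthy"
  fi
}
```

Usage: diagnose no yes points at Docker port publishing, while diagnose no no points at the gateway process itself.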

dusty schooner
mental emberBOT
#

That nails it: the gateway reaches ready, then the Node process stops answering even /healthz. Since channels are skipped and plugins are globally disabled, the remaining smoking gun is this line:

[gateway] agent model: lmstudio/openai/gpt-oss-20b
...
[heartbeat] started

It may be the heartbeat/main agent path immediately hitting LM Studio/local model and wedging the process.

Disable heartbeats next:

docker compose run -T --rm openclaw-cli config set agents.defaults.heartbeat.enabled false --strict-json

If that path doesn’t exist, use:

#
docker compose run -T --rm openclaw-cli config set heartbeat.enabled false --strict-json

Then restart and test healthz:

docker compose -f docker-compose.yml -f docker-compose.trace.yml down
docker compose -f docker-compose.yml -f docker-compose.trace.yml up -d --force-recreate openclaw-gateway
sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz
docker compose -f docker-compose.yml -f docker-compose.trace.yml logs --since=1m --timestamps openclaw-gateway

If disabling heartbeat fixes /healthz, then the model/heartbeat startup path is wedging against LM Studio. Then we can fix LM Studio config separately.
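
If the fixed sleep gets tedious, polling is an option. A minimal sketch, assuming one attempt per second is fine for this check; wait_for is a made-up helper, not part of Docker or OpenClaw.

```shell
# wait_for ATTEMPTS CMD...: run CMD up to ATTEMPTS times, one second apart.
# Returns 0 as soon as CMD succeeds, 1 if every attempt fails.
wait_for() {
  attempts=$1
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

Usage: wait_for 30 curl -fsS --max-time 5 http://127.0.0.1:18789/healthz && echo up. A failure after all attempts means the same thing as the curl timeout above.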

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose run -T --rm openclaw-cli config set agents.defaults.heartbeat.enabled false --strict-json
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Started 1.5s
Container openclaw-openclaw-cli-run-cf5ff93ae2ab Creating
Container openclaw-openclaw-cli-run-cf5ff93ae2ab Created
│
◇ Config warnings ───────────────────────────────────────────────────────╮
│ │
│ - plugins.entries.memory-core: plugin disabled (plugins disabled) but │
│ config is present │
│ - plugins.entries.memory-wiki: plugin disabled (plugins disabled) but │
│ config is present │
│ - plugins.entries.tavily: plugin disabled (plugins disabled) but │
│ config is present │
│ │
├─────────────────────────────────────────────────────────────────────────╯
Error: Config validation failed: agents.defaults.heartbeat: Unrecognized key: "enabled"

mental emberBOT
#

Good catch: wrong key. Use the actual heartbeat interval knob instead and set it to off/zero.

First inspect current value:

docker compose run -T --rm openclaw-cli config get agents.defaults.heartbeat --json

Then set every to its disabled value:

docker compose run -T --rm openclaw-cli config set agents.defaults.heartbeat.every '"0m"' --strict-json

If validation rejects 0m, use:

#
docker compose run -T --rm openclaw-cli config set agents.defaults.heartbeat.every '"off"' --strict-json

Then restart/test:

docker compose -f docker-compose.yml -f docker-compose.trace.yml down
docker compose -f docker-compose.yml -f docker-compose.trace.yml up -d --force-recreate openclaw-gateway
sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz
docker compose -f docker-compose.yml -f docker-compose.trace.yml logs --since=1m --timestamps openclaw-gateway
dusty schooner
mental emberBOT
#

I checked the docs: correct disable value is 0m for agents.defaults.heartbeat.every.

Run:

docker compose run -T --rm openclaw-cli config set agents.defaults.heartbeat.every '"0m"' --strict-json

Then restart/test:

docker compose -f docker-compose.yml -f docker-compose.trace.yml down
docker compose -f docker-compose.yml -f docker-compose.trace.yml up -d --force-recreate openclaw-gateway
sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz
docker compose -f docker-compose.yml -f docker-compose.trace.yml logs --since=1m --timestamps openclaw-gateway
#

If healthz works after this, we found the post-ready wedge: heartbeat/model path.

dusty schooner
mental emberBOT
#

Heartbeat is ruled out: the logs show it disabled, but /healthz still times out. We need to catch what the Node process is doing after it says ready.

Run this from the Mac host:

docker compose -f docker-compose.yml -f docker-compose.trace.yml exec -T openclaw-gateway sh -lc '
echo "== processes ==";
ps -eo pid,ppid,stat,pcpu,pmem,etime,args;
echo;
PID=$(pgrep -f "node .*dist/index.js gateway" | head -1);
echo "node pid=$PID";
echo;
echo "== wchan ==";
cat /proc/$PID/wchan 2>/dev/null || true;
echo;
echo "== fd count ==";
ls /proc/$PID/fd 2>/dev/null | wc -l;
echo;
echo "== node active handles via proc only unavailable, checking logs ==";
tail -n 120 /tmp/openclaw/openclaw-2026-04-27.log 2>/dev/null || true
'

Also test if changing the model away from LM Studio stops the post-ready freeze. Since plugins are disabled, set a harmless cloud/default model if you have one configured, or at least inspect current models config:

docker compose run -T --rm openclaw-cli config get agents.defaults.models --json
docker compose run -T --rm openclaw-cli config get models --json

The current line:

#
agent model: lmstudio/openai/gpt-oss-20b

is still suspicious, even with heartbeat disabled, because model/runtime initialization may still touch LM Studio during startup/status paths.

dusty schooner
mental emberBOT
#

The process check picked PID 1 by accident; the real gateway process is PID 6:

6  ... Sl 20.2% ... openclaw-gateway

That's important: it's burning CPU after "ready", so this is an event-loop/busy-loop wedge, not just Docker networking.
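
To put a number on "burning CPU", the process's CPU tick counters can be sampled twice from /proc. A rough sketch, assuming a Linux /proc inside the container; cpu_ticks and busy_check are hypothetical helper names:

```shell
# cpu_ticks STAT_LINE
# Prints utime + stime (in clock ticks) from one line of /proc/<pid>/stat.
cpu_ticks() {
  # Drop "pid (comm) " first: comm may itself contain spaces or parentheses.
  _rest=${1##*") "}
  # Intentional word splitting: $_rest starts at field 3 (state),
  # so utime (field 14) is $12 and stime (field 15) is $13.
  set -- $_rest
  echo $(( ${12} + ${13} ))
}

# busy_check PID SECONDS
# Samples the counters twice and prints how many ticks were used in between.
busy_check() {
  _before=$(cpu_ticks "$(cat "/proc/$1/stat")")
  sleep "$2"
  _after=$(cpu_ticks "$(cat "/proc/$1/stat")")
  echo "pid $1 used $((_after - _before)) ticks in $2s"
}

# Example (not run here), inside the container: busy_check 6 5
```

The tick rate is commonly 100 per second on Linux, so a delta that stays high while no requests are being served points to a busy loop rather than an idle wait.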

Run the process check again against PID 6 explicitly:

docker compose -f docker-compose.yml -f docker-compose.trace.yml exec -T openclaw-gateway sh -lc '
PID=6
echo "== pid $PID status ==";
cat /proc/$PID/status | egrep "State|Threads|VmRSS|voluntary|nonvoluntary";
echo;
echo "== wchan ==";
cat /proc/$PID/wchan || true;
echo;
echo "== thread wchans ==";
for t in /proc/$PID/task/*; do printf "%s " "$(basename "$t")"; cat "$t/wchan" 2>/dev/null; done | head -50;
echo;
echo "== fd count ==";
ls /proc/$PID/fd | wc -l;
'

Next isolation: the only suspicious startup line left is still:

agent model: lmstudio/openai/gpt-oss-20b
#

Temporarily switch away from LM Studio, or at least disable provider startup too:

cat > docker-compose.trace2.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_SKIP_PROVIDERS: "1"
  openclaw-cli:
    environment:
      OPENCLAW_NO_AUTO_UPDATE: "1"
YAML
docker compose -f docker-compose.yml -f docker-compose.trace2.yml down
docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway
sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz
docker compose -f docker-compose.yml -f docker-compose.trace2.yml logs --since=1m --timestamps openclaw-gateway

If OPENCLAW_SKIP_PROVIDERS=1 makes /healthz respond, the wedge is in provider/model initialization, most likely the LM Studio/local model path.

dusty schooner
mental emberBOT
#

OPENCLAW_SKIP_PROVIDERS=1 still hangs, so that mostly rules out LM Studio/provider init.

New strongest suspect: internal startup hooks. Your logs show:

[hooks] loaded 4 internal hook handlers

Gateway schedules a gateway:startup internal hook shortly after startup. If one of those hooks wedges, /healthz would hang exactly like this.

Try disabling internal hooks:

docker compose run -T --rm openclaw-cli config set hooks.internal.enabled false --strict-json

Then restart/test:

#
docker compose -f docker-compose.yml -f docker-compose.trace2.yml down
docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway
sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz
docker compose -f docker-compose.yml -f docker-compose.trace2.yml logs --since=1m --timestamps openclaw-gateway

If that fixes it, the culprit is one of the internal hooks / gateway:startup hook path, likely BOOT/session-memory style startup work.

If it still hangs, get the live PID dynamically:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml exec -T openclaw-gateway sh -lc '
PID=$(pgrep -x openclaw-gateway | head -1)
echo "PID=$PID"
ps -p "$PID" -o pid,ppid,stat,pcpu,pmem,etime,args
cat /proc/$PID/status | egrep "State|Threads|VmRSS|voluntary|nonvoluntary"
cat /proc/$PID/wchan
'
dusty schooner
mental emberBOT
#

Okay, internal hooks are ruled out too. Good isolation. You now have:

  • plugins.enabled=false
  • OPENCLAW_SKIP_CHANNELS=1
  • OPENCLAW_SKIP_PROVIDERS=1
  • heartbeat disabled
  • internal hooks disabled
  • gateway reaches ready
  • but /healthz still hangs

That's now a post-ready Node/event-loop wedge with almost everything optional disabled.

Run the live PID check dynamically now:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml exec -T openclaw-gateway sh -lc '
PID=$(pgrep -x openclaw-gateway | head -1)
echo "PID=$PID"
ps -p "$PID" -o pid,ppid,stat,pcpu,pmem,etime,args
echo
cat /proc/$PID/status | egrep "State|Threads|VmRSS|voluntary|nonvoluntary"
echo
echo "wchan:"
cat /proc/$PID/wchan
echo
echo "thread wchans:"
for t in /proc/$PID/task/*; do printf "%s " "$(basename "$t")"; cat "$t/wchan" 2>/dev/null; done | head -80
'

Also do a clean config-volume A/B test without touching ~/.openclaw:

#
mkdir -p /tmp/openclaw-empty-config /tmp/openclaw-empty-workspace

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway

sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If empty config works, the wedge is in your existing ~/.openclaw state/config. If empty config also hangs, it's the image/runtime/Node/container setup.

dusty schooner
mental emberBOT
#

Two issues in that run:

  1. pgrep -x openclaw-gateway failed because process names are truncated to 15 chars. Use pidof.
  2. The empty config test did not apply because the env vars were entered as separate commands. They need to prefix the docker compose command on the same line.

Run this:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml exec -T openclaw-gateway sh -lc '
PID=$(pidof openclaw-gateway | awk "{print \$1}")
echo "PID=$PID"
ps -p "$PID" -o pid,ppid,stat,pcpu,pmem,etime,args
cat /proc/$PID/status | egrep "State|Threads|VmRSS|voluntary|nonvoluntary"
cat /proc/$PID/wchan
'
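
Why pgrep -x missed it: Linux keeps only the first 15 characters of a process name, and pgrep -x compares the whole pattern against that shortened name. A small illustration using only shell string handling (it emulates the truncation and does not query the kernel):

```shell
# The name as typed in the pgrep pattern.
name="openclaw-gateway"
echo "full name length: ${#name}"

# What the kernel would keep as the process name (first 15 characters).
comm=$(printf '%s' "$name" | cut -c1-15)
echo "kernel comm:      $comm"

# An exact match of the 16-character pattern against the 15-character name fails.
if [ "$name" = "$comm" ]; then
  echo "exact match would succeed"
else
  echo "exact match fails"
fi
```

So pgrep -x openclaw-gatewa, or pgrep -f with a pattern over the full command line, would be alternatives to pidof.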
#

Then do the empty config A/B test exactly like this:

mkdir -p /tmp/openclaw-empty-config /tmp/openclaw-empty-workspace

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml down

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway

sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz
#

If that still hangs with empty config, it's not your .openclaw data; it's the image/runtime/Docker setup.
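
The env-var prefix rule is easy to check in isolation. A minimal sketch with a throwaway variable (DEMO_CONFIG_DIR is a made-up name, not something OpenClaw or Compose reads):

```shell
unset DEMO_CONFIG_DIR

# Prefix form: the assignment goes into the environment of that one command.
DEMO_CONFIG_DIR=/tmp/demo-config sh -c 'echo "prefix:   ${DEMO_CONFIG_DIR:-<unset>}"'

# Separate-line form: this only sets an unexported shell variable,
# so the child process does not see it.
DEMO_CONFIG_DIR=/tmp/demo-config
sh -c 'echo "separate: ${DEMO_CONFIG_DIR:-<unset>}"'

# Exporting makes it visible to every later command started from this shell.
export DEMO_CONFIG_DIR
sh -c 'echo "exported: ${DEMO_CONFIG_DIR:-<unset>}"'
```

The middle case is what happened when the variables were entered as separate commands: docker compose never saw them and fell back to its usual values.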

dusty schooner
mental emberBOT
#

Great, this tells us two things:

  1. With your real config, the gateway process is alive but CPU-busy:
    openclaw-gateway ... 36% CPU ... do_epoll_wait
    
  2. With empty config, it doesn't even listen after 20s, so we need the logs for that run. It may be stuck in onboarding/allow-unconfigured flow or missing required setup.

Run:

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml logs --since=2m --timestamps openclaw-gateway

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml ps

But honestly, at this point we've isolated enough:

  • clean/empty config also doesn't come up cleanly
  • real config only starts when all plugins disabled, then HTTP hangs
  • image is local 4.24/previously local 4.26
  • this smells like local image/runtime/build issue, not just your config

Fastest recovery path now: stop building locally and use the published image/tag if possible, or rebuild from clean source without custom Dockerfile changes. If GHCR 4.25 beta image pulled before, use that with a clean temp config to confirm.

dusty schooner
mental emberBOT
#

You're right, you did do a clean rebuild. Sorry, I shouldn't have implied you hadn't. The empty-config test is actually fine: it's restarting because empty config lacks gateway.mode=local, and your restored stock compose apparently no longer has --allow-unconfigured.

For a valid empty-config test, create a minimal config:

cat > /tmp/openclaw-empty-config/openclaw.json <<'JSON'
{
  "gateway": {
    "mode": "local",
    "bind": "lan",
    "auth": {
      "mode": "none"
    }
  },
  "plugins": {
    "enabled": false
  },
  "channels": {}
}
JSON

Then:

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml down

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway

sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If this minimal config works, your real config/state has the post-ready busy-loop. If it still hangs, it's the current image/runtime/compose behavior.

dusty schooner
#

soham@Sohams-MBP openclaw % curl -i --max-time 5 http://127.0.0.1:18789/healthz
curl: (7) Failed to connect to 127.0.0.1 port 18789 after 0 ms: Couldn't connect to server

mental emberBOT
#

That's different from a "hang": now nothing is listening. We need the minimal-config container logs:

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml ps

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml logs --since=2m --timestamps openclaw-gateway

If it says config validation failed for auth.mode: none, change minimal config to token mode instead:

cat > /tmp/openclaw-empty-config/openclaw.json <<'JSON'
{
  "gateway": {
    "mode": "local",
    "bind": "lan",
    "auth": {
      "mode": "token",
      "token": "dev-local-token"
    }
  },
  "plugins": {
    "enabled": false
  },
  "channels": {}
}
JSON
#

Then recreate again.

dusty schooner
mental emberBOT
#

You wrote the token config but didn't close the heredoc in the paste. Make sure you finish with JSON on its own line:

cat > /tmp/openclaw-empty-config/openclaw.json <<'JSON'
{
  "gateway": {
    "mode": "local",
    "bind": "lan",
    "auth": {
      "mode": "token",
      "token": "dev-local-token"
    }
  },
  "plugins": {
    "enabled": false
  },
  "channels": {}
}
JSON

Then run:

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml down

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway
#

sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz

OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml logs --since=1m --timestamps openclaw-gateway
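
On the heredoc point: the shell keeps reading input until it sees a line that consists of nothing but the delimiter, so a missing or indented JSON line leaves the terminal waiting. A small self-contained check, writing to a temp file rather than any real config (demo_file is a throwaway name):

```shell
demo_file=$(mktemp)

# Quoting the delimiter ('JSON') turns off $variable expansion inside the body.
# The closing JSON must start in column one, with nothing else on the line.
cat > "$demo_file" <<'JSON'
{ "gateway": { "mode": "local" } }
JSON

echo "lines written: $(wc -l < "$demo_file" | tr -d ' ')"
cat "$demo_file"
```

If the prompt changes to a continuation prompt after pasting, the closing line has not been seen yet. The temp file can be deleted afterwards with rm -f "$demo_file".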
#

Great, that's definitive:

  • empty/minimal config works
  • your real config/state causes the hang
  • image/Docker itself is not the primary issue

Notice the key difference:

minimal config: agent model: openai/gpt-5.5  ✅ healthz works
real config:    agent model: lmstudio/openai/gpt-oss-20b  ❌ healthz hangs

Next test: switch your real config off LM Studio as default.

#
# stop the empty-config stack first
OPENCLAW_CONFIG_DIR=/tmp/openclaw-empty-config \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-empty-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml down

# set real config default model away from LM Studio
docker compose run -T --rm openclaw-cli config set agents.defaults.model '{"primary":"openai/gpt-5.5"}' --strict-json

Then boot real config again:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway
sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz
docker compose -f docker-compose.yml -f docker-compose.trace2.yml logs --since=1m --timestamps openclaw-gateway

If that fixes it, the remaining problem is specifically the LM Studio/default model path. Also rotate the Tavily key and gateway token that got pasted publicly.

dusty schooner
mental emberBOT
#

Good: switching the default model to openai/gpt-5.5 did not fix it, so LM Studio is probably not the cause of the post-ready hang.

Now isolate config file vs rest of ~/.openclaw state.

Run this: it uses your real openclaw.json, but none of the rest of your data/state.

mkdir -p /tmp/openclaw-realconfig-only /tmp/openclaw-realconfig-workspace
cp ~/.openclaw/openclaw.json /tmp/openclaw-realconfig-only/openclaw.json

OPENCLAW_CONFIG_DIR=/tmp/openclaw-realconfig-only \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-realconfig-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml down

OPENCLAW_CONFIG_DIR=/tmp/openclaw-realconfig-only \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-realconfig-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway

sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz

OPENCLAW_CONFIG_DIR=/tmp/openclaw-realconfig-only \
OPENCLAW_WORKSPACE_DIR=/tmp/openclaw-realconfig-workspace \
docker compose -f docker-compose.yml -f docker-compose.trace2.yml logs --since=1m --timestamps openclaw-gateway

Interpretation:

  • If this works: openclaw.json is okay; the hang is in some state file under ~/.openclaw (auth profiles, sessions, wiki, canvas, etc.).
  • If this still hangs: the culprit is in openclaw.json itself.

Also, one config warning still stands out:

#
gateway.controlUi.allowInsecureAuth=true

Not saying it's the cause, but once stable, turn that off. And rotate the exposed Tavily key / gateway token.

dusty schooner
#

curl: (28) Operation timed out after 5006 milliseconds with 0 bytes received

mental emberBOT
#

That means real openclaw.json alone is enough to reproduce. So the culprit is in config, not the rest of ~/.openclaw state.

Now reduce the temp copy until it works. Since the minimal config worked, the easiest route is to start from the minimal config and add back only the required safe parts later. For recovery, back up your real config and replace it with this minimal working config:

cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak-bad-config

cat > ~/.openclaw/openclaw.json <<'JSON'
{
  "gateway": {
    "mode": "local",
    "bind": "lan",
    "auth": {
      "mode": "token",
      "token": "dev-local-token"
    }
  },
  "plugins": {
    "enabled": false
  },
  "channels": {},
  "agents": {
    "defaults": {
      "model": {
        "primary": "openai/gpt-5.5"
      },
      "heartbeat": {
        "every": "0m"
      }
    }
  },
  "hooks": {
    "internal": {
      "enabled": false
    }
  }
}
JSON

Then boot normal real config dir:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml down
docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway
sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If that works, we can selectively copy sections back from openclaw.json.bak-bad-config. Start with model provider config, then channels, then plugins. Don't restore all at once.

Also replace dev-local-token with a fresh real token before exposing/using Control UI.

dusty schooner
#

curl: (28) Operation timed out after 5006 milliseconds with 0 bytes received

mental emberBOT
#

If even that minimal real config still times out, then one important difference remains: the real config dir (~/.openclaw) vs /tmp/openclaw-empty-config. So it's not just openclaw.json; it's something else in the config dir that OpenClaw loads even with minimal config.

Next isolate by keeping your minimal ~/.openclaw/openclaw.json, but moving likely sidecar files aside one by one.

First check what top-level state exists:

find ~/.openclaw -maxdepth 2 -type f | sed "s|$HOME|~|" | sort | head -200

Quick reversible isolation for auth/model stores:

mkdir -p ~/.openclaw/rollback-hold/root-files

for f in auth-profiles.json credentials.json models.json plugins.json plugin-registry.json; do
  if [ -f ~/.openclaw/$f ]; then
    mv ~/.openclaw/$f ~/.openclaw/rollback-hold/root-files/$f
  fi
done

Then restart/test:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml down
docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway
sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz
docker compose -f docker-compose.yml -f docker-compose.trace2.yml logs --since=1m --timestamps openclaw-gateway
#

If that works, one of those root state files is corrupt/problematic. If not, we'll move to agent-scoped files next.

dusty schooner
mental emberBOT
#

The root-file move didn't change it. Also your log is back to:

agent model: lmstudio/openai/gpt-oss-20b

So the model change didn't persist or got reverted by a backup/real config switch.

At this point, use the known-good minimal config as the base, then add back only what you need. That's the fastest path to working.

  1. Put the working minimal config into real ~/.openclaw/openclaw.json:
cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak-stuck

cat > ~/.openclaw/openclaw.json <<'JSON'
{
  "gateway": {
    "mode": "local",
    "bind": "lan",
    "auth": {
      "mode": "token",
      "token": "dev-local-token"
    }
  },
  "plugins": { "enabled": false },
  "channels": {},
  "agents": {
    "defaults": {
      "model": { "primary": "openai/gpt-5.5" },
      "heartbeat": { "every": "0m" }
    }
  },
  "hooks": { "internal": { "enabled": false } }
}
JSON
  2. Start real config dir:
docker compose -f docker-compose.yml -f docker-compose.trace2.yml down
docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway
sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz
#

If that works, stop restoring the old file wholesale. Pull sections back manually from openclaw.json.bak-stuck one at a time.

And don't forget: rotate the gateway token and Tavily key you pasted publicly.

dusty schooner
#

curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received

mental emberBOT
#

If the same minimal JSON works in /tmp but times out in ~/.openclaw, then the difference is not openclaw.json anymore; it's other files in ~/.openclaw being loaded by default.

Fastest clean-but-keep-data recovery:

mv ~/.openclaw ~/.openclaw.broken-2026-04-27
mkdir -p ~/.openclaw
cp ~/.openclaw.broken-2026-04-27/openclaw.json ~/.openclaw/openclaw.json

Then put the minimal config in ~/.openclaw/openclaw.json again and start:

cat > ~/.openclaw/openclaw.json <<'JSON'
{
  "gateway": {
    "mode": "local",
    "bind": "lan",
    "auth": {
      "mode": "token",
      "token": "dev-local-token"
    }
  },
  "plugins": { "enabled": false },
  "channels": {},
  "agents": {
    "defaults": {
      "model": { "primary": "openai/gpt-5.5" },
      "heartbeat": { "every": "0m" }
    }
  },
  "hooks": { "internal": { "enabled": false } }
}
JSON

docker compose -f docker-compose.yml -f docker-compose.trace2.yml down
docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway
sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz

This does not delete your data. It moves it to:

~/.openclaw.broken-2026-04-27
#

Once healthy, copy back only needed pieces gradually, not the whole folder. Start with channels/auth after rotating leaked tokens.

dusty schooner
#

curl: (28) Operation timed out after 5004 milliseconds with 0 bytes received

mental emberBOT
#

If it still times out after the "move old ~/.openclaw aside + minimal config" step, we need to verify Docker is actually mounting the new minimal config.

Run these and paste the output:

cat ~/.openclaw/openclaw.json

docker compose -f docker-compose.yml -f docker-compose.trace2.yml exec -T openclaw-gateway sh -lc '
echo "== mounted config ==";
cat /home/node/.openclaw/openclaw.json;
echo;
echo "== files ==";
find /home/node/.openclaw -maxdepth 2 -type f | sort | head -80;
echo;
echo "== process ==";
ps -eo pid,ppid,stat,pcpu,pmem,etime,args
'

Then fresh logs:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml logs --since=2m --timestamps openclaw-gateway

Key thing I'm checking: if logs still say:

agent model: lmstudio/openai/gpt-oss-20b

then Docker is not using the minimal config we think it is. If it says openai/gpt-5.5 and still hangs, then something else in the fresh config dir or image is causing the post-ready HTTP freeze.
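
Instead of comparing the two cat outputs by eye, the host file and a copy of what the container sees can be compared byte for byte. A small sketch using cmp; config_matches and the /tmp/mounted-openclaw.json path are hypothetical names:

```shell
# config_matches HOST_FILE CONTAINER_COPY
# Succeeds when both files contain exactly the same bytes.
config_matches() {
  if cmp -s "$1" "$2"; then
    echo "match: the container sees the same file"
  else
    echo "MISMATCH: the container is reading a different file"
    return 1
  fi
}

# Example (not run here): save the container's view, then compare.
# docker compose -f docker-compose.yml -f docker-compose.trace2.yml exec -T \
#   openclaw-gateway cat /home/node/.openclaw/openclaw.json > /tmp/mounted-openclaw.json
# config_matches ~/.openclaw/openclaw.json /tmp/mounted-openclaw.json
```

A mismatch would mean the volume mount points somewhere other than the directory being edited.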

dusty schooner
mental emberBOT
#

This confirms Docker is mounting the minimal config, but ~/.openclaw still behaves differently than the /tmp clean config.

At this point, fastest safe recovery: use a new clean config dir and keep the old one as archive.

docker compose -f docker-compose.yml -f docker-compose.trace2.yml down

mkdir -p ~/.openclaw-clean ~/.openclaw-clean/workspace

cat > ~/.openclaw-clean/openclaw.json <<'JSON'
{
  "gateway": {
    "mode": "local",
    "bind": "lan",
    "auth": {
      "mode": "token",
      "token": "CHANGE_ME_TO_A_NEW_RANDOM_TOKEN"
    }
  },
  "plugins": { "enabled": false },
  "channels": {},
  "agents": {
    "defaults": {
      "model": { "primary": "openai/gpt-5.5" },
      "heartbeat": { "every": "0m" }
    }
  },
  "hooks": { "internal": { "enabled": false } }
}
JSON
#

Edit .env:

OPENCLAW_CONFIG_DIR=/Users/soham/.openclaw-clean
OPENCLAW_WORKSPACE_DIR=/Users/soham/.openclaw-clean/workspace

Then:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml up -d --force-recreate openclaw-gateway
sleep 20
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If this works, your old data is safe in the previous folder, but don't copy it back wholesale. Re-add only needed pieces gradually.

#

Also use a fresh random gateway token, not the leaked one.

dusty schooner
#

curl: (28) Operation timed out after 5003 milliseconds with 0 bytes received

mental emberBOT
#

If ~/.openclaw-clean also times out, then the earlier /tmp success was likely because the compose env override was different, or .env didn't actually point where we think.

Let's stop changing variables and compare exactly what Compose is launching.

Run:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml config | grep -A30 -E 'openclaw-gateway:|volumes:'

Then test with an explicit one-off override file that forces the clean dir, ignoring .env ambiguity:

cat > docker-compose.clean-dir.yml <<'YAML'
services:
  openclaw-gateway:
    volumes:
      - /Users/soham/.openclaw-clean:/home/node/.openclaw
      - /Users/soham/.openclaw-clean/workspace:/home/node/.openclaw/workspace
  openclaw-cli:
    volumes:
      - /Users/soham/.openclaw-clean:/home/node/.openclaw
      - /Users/soham/.openclaw-clean/workspace:/home/node/.openclaw/workspace
YAML

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml down
docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml up -d --force-recreate openclaw-gateway
sleep 20

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml exec -T openclaw-gateway cat /home/node/.openclaw/openclaw.json
curl -i --max-time 5 http://127.0.0.1:18789/healthz
docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml logs --since=1m --timestamps openclaw-gateway

This removes .env from the equation.

dusty schooner
mental emberBOT
#

Great, this confirms the clean-dir override works. So the practical recovery is:

  • keep old ~/.openclaw as archive
  • run on ~/.openclaw-clean
  • copy back only needed config/data piece by piece

Right now, make this permanent in .env:

OPENCLAW_CONFIG_DIR=/Users/soham/.openclaw-clean
OPENCLAW_WORKSPACE_DIR=/Users/soham/.openclaw-clean/workspace

Then use normal compose:

#
docker compose down
docker compose up -d --force-recreate openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

After that, generate a real token and replace CHANGE_ME...:

NEW_TOKEN="$(openssl rand -hex 32)"
docker compose run -T --rm openclaw-cli config set gateway.auth.token "\"$NEW_TOKEN\"" --strict-json
docker compose restart openclaw-gateway
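
If openssl isn't installed on the machine, a token of the same shape can be produced from /dev/urandom. A fallback sketch, assuming head, od and tr are available (they are on stock macOS and most Linux systems); gen_token is a hypothetical helper name:

```shell
# gen_token
# Prints 64 lowercase hex characters (32 random bytes) followed by a newline.
gen_token() {
  # od prints the bytes as space-separated hex pairs; tr strips the
  # spaces and line breaks so a single unbroken string is left.
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
  echo
}

# Example (not run here): NEW_TOKEN="$(gen_token)"
```

The result can be passed to the same config set command in place of the openssl output.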

Then re-add things slowly:

  1. model config / LM Studio
  2. Telegram
  3. WhatsApp
  4. plugins one at a time

Do not copy old root sqlite/state files back yet (tasks, flows, memory, plugin-runtime-deps, etc.). One of those or a root state interaction is what was wedging the HTTP server.

dusty schooner
#

Can we narrow it down, please? I don't want to lose all my data.

mental emberBOT
#

Yes, absolutely. You don't need to lose data. We proved a clean dir works, so now we can binary-search the old ~/.openclaw contents.

Keep ~/.openclaw-clean as the working target. Copy old data back in chunks, restart/test after each. When it breaks, the last chunk contains the culprit.

Start with the safest state chunks:

OLD="$HOME/.openclaw"
NEW="$HOME/.openclaw-clean"

# 1) identity/devices/credentials
cp -a "$OLD/identity" "$NEW/" 2>/dev/null || true
cp -a "$OLD/devices" "$NEW/" 2>/dev/null || true
cp -a "$OLD/credentials" "$NEW/" 2>/dev/null || true

docker compose restart openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If OK:

# 2) agents, but NOT sessions index first
mkdir -p "$NEW/agents"
cp -a "$OLD/agents/main" "$NEW/agents/" 2>/dev/null || true
mv "$NEW/agents/main/sessions/sessions.json" "$NEW/agents/main/sessions/sessions.json.hold" 2>/dev/null || true

docker compose restart openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz
#

If OK:

# 3) wiki/memory
cp -a "$OLD/wiki" "$NEW/" 2>/dev/null || true
cp -a "$OLD/memory" "$NEW/" 2>/dev/null || true

docker compose restart openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

Then test riskier state one at a time:

for d in tasks flows cron telegram canvas media; do
  echo "== testing $d =="
  cp -a "$OLD/$d" "$NEW/" 2>/dev/null || true
  docker compose restart openclaw-gateway
  sleep 10
  curl -i --max-time 5 http://127.0.0.1:18789/healthz || break
done

Do not copy these yet:

plugin-runtime-deps
openclaw.json backups
update-check.json
#

My bet is one of tasks/, flows/, memory/, or an agent/session state file. This narrows it without deleting anything.
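
The copy, restart, test cycle above can be wrapped in one reusable loop that reports the first chunk that breaks the health check. A sketch only: restore_until_break and gateway_ok are hypothetical helper names, and the check command is passed in so the loop itself does not depend on Docker:

```shell
# restore_until_break OLD_DIR NEW_DIR CHECK_CMD CHUNK...
# Copies each CHUNK from OLD_DIR into NEW_DIR, runs CHECK_CMD after each copy,
# and stops at the first chunk that makes the check fail.
# Nothing is ever removed from OLD_DIR.
restore_until_break() (
  old=$1 new=$2 check=$3
  shift 3
  for chunk in "$@"; do
    [ -n "$chunk" ] || continue
    [ -e "$old/$chunk" ] || { echo "skip $chunk (not present)"; continue; }
    cp -Rp "$old/$chunk" "$new/"
    if "$check"; then
      echo "ok   $chunk"
    else
      echo "FAIL $chunk"
      exit 1   # subshell body: this only ends the function
    fi
  done
)

# Example (not run here): the check restarts the gateway and probes /healthz.
# gateway_ok() {
#   docker compose restart openclaw-gateway >/dev/null &&
#   sleep 10 &&
#   curl -fsS --max-time 5 http://127.0.0.1:18789/healthz >/dev/null
# }
# restore_until_break "$HOME/.openclaw" "$HOME/.openclaw-clean" gateway_ok \
#   tasks flows cron telegram canvas media
```

When it stops on a FAIL line, that chunk is still sitting in the clean dir; move it aside again before continuing with the remaining chunks.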

dusty schooner
#

wait

mental emberBOT
#

Yep, pause. Don't copy anything yet.

Current safe state:

  • old data is still in ~/.openclaw
  • clean working dir is ~/.openclaw-clean
  • nothing needs to be deleted

When you're ready, we'll narrow it slowly and reversibly.

dusty schooner
#

Great, this confirms the clean-dir override works. So the practical recovery is:

keep old ~/.openclaw as archive
run on ~/.openclaw-clean
copy back only needed config/data piece by piece

Right now, make this permanent in .env:

OPENCLAW_CONFIG_DIR=/Users/soham/.openclaw-clean
OPENCLAW_WORKSPACE_DIR=/Users/soham/.openclaw-clean/workspace

Then use normal compose:

docker compose down
docker compose up -d --force-recreate openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

i did this and got
curl: (28) Operation timed out after 5001 milliseconds with 0 bytes received

mental emberBOT
#

Ah, then normal compose is not equivalent to the clean override test. The clean override test also used docker-compose.trace2.yml, which had these important env vars:

OPENCLAW_SKIP_CHANNELS: "1"
OPENCLAW_SKIP_PROVIDERS: "1"
OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
OPENCLAW_NO_AUTO_UPDATE: "1"

Your normal compose likely started more stuff again.

For now, make the working isolation permanent too:

cat > docker-compose.recovery.yml <<'YAML'
services:
  openclaw-gateway:
    environment:
      OPENCLAW_SKIP_CHANNELS: "1"
      OPENCLAW_SKIP_PROVIDERS: "1"
      OPENCLAW_NO_AUTO_UPDATE: "1"
      OPENCLAW_GATEWAY_STARTUP_TRACE: "1"
  openclaw-cli:
    environment:
      OPENCLAW_NO_AUTO_UPDATE: "1"
YAML

Then:

#
docker compose -f docker-compose.yml -f docker-compose.recovery.yml down
docker compose -f docker-compose.yml -f docker-compose.recovery.yml up -d --force-recreate openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If that works, we know the difference is channels/providers starting under normal compose. Keep using docker-compose.recovery.yml while restoring data, then re-enable providers/channels later one at a time.

dusty schooner
#

curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received

mental emberBOT
#

Okay, then let's verify what normal/recovery compose is actually mounting and starting. The last known-working command included three compose files:

-f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml

So use that exact known-good stack for now:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml down
docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml up -d --force-recreate openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If that still works, compare merged config against your "permanent" setup:

#
docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml config > /tmp/compose-working.yml
docker compose -f docker-compose.yml -f docker-compose.recovery.yml config > /tmp/compose-recovery.yml
diff -u /tmp/compose-working.yml /tmp/compose-recovery.yml | sed -n '1,200p'

That diff will show exactly what’s different. My bet: the .env change didn’t fully affect both volume mounts, or docker-compose.clean-dir.yml has explicit volumes that are overriding something normal compose still points at.

dusty schooner
#

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml down
docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml up -d --force-recreate openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

curl: (28) Operation timed out after 5006 milliseconds with 0 bytes received

mental emberBOT
#

Then the “known good” stack changed after copying/editing. We need fresh logs and the mounted config again for this exact stack:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml exec -T openclaw-gateway cat /home/node/.openclaw/openclaw.json

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml logs --since=1m --timestamps openclaw-gateway

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml ps

Also try /healthz from inside:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml exec -T openclaw-gateway node -e "fetch('http://127.0.0.1:18789/healthz', {signal: AbortSignal.timeout(5000)}).then(r=>r.text()).then(console.log).catch(e=>console.error(e.name, e.message))"

If the inside fetch also times out, the process is wedged again. If inside works but host curl times out, it’s Docker port publishing.
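
The decision rule above can be written down as a tiny helper. This is only a sketch: `classify_health` is a hypothetical name, and it expects two exit statuses that you collect yourself (0 meaning the check passed).

```bash
# classify_health INSIDE_RC HOST_RC
#   INSIDE_RC: exit status of the /healthz check run inside the container
#   HOST_RC:   exit status of the host-side curl
classify_health() {
  local inside_rc="$1" host_rc="$2"
  if [ "$inside_rc" -ne 0 ]; then
    # Nothing answers even on the container's own loopback.
    echo "gateway process wedged or not listening"
  elif [ "$host_rc" -ne 0 ]; then
    # Gateway answers inside, so the problem sits between host and container.
    echo "gateway OK inside; suspect Docker port publishing / host path"
  else
    echo "healthy from inside and from host"
  fi
}
```

One caveat: the node one-liner above catches its own error, so it exits 0 even on a timeout. To feed a meaningful status into this helper, the catch handler would need to call `process.exit(1)`. On the host, `curl -f` already returns non-zero on failure.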

dusty schooner
#

soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml exec -T openclaw-gateway cat /home/node/.openclaw/openclaw.json

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml logs --since=1m --timestamps openclaw-gateway

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml ps
{
"gateway": {
"mode": "local",
"bind": "lan",
"auth": {
"mode": "token",
"token": "CHANGE_ME_TO_A_NEW_RANDOM_TOKEN"
}
},
"plugins": { "enabled": false },
"channels": {},
"agents": {
"defaults": {
"model": { "primary": "openai/gpt-5.5" },
"heartbeat": { "every": "0m" }
}
},
"hooks": { "internal": { "enabled": false } }
}
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS
openclaw-openclaw-gateway-1 openclaw:local "docker-entrypoint.s…" openclaw-gateway About a minute ago Up About a minute (healthy) 0.0.0.0:18789-18790->18789-18790/tcp, [::]:18789-18790->18789-18790/tcp
soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml exec -T openclaw-gateway node -e "fetch('http://127.0.0.1:18789/healthz', {signal: AbortSignal.timeout(5000)}).then(r=>r.text()).then(console.log).catch(e=>console.error(e.name, e.message))"
{"ok":true,"status":"live"}
soham@Sohams-MBP openclaw %
soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml exec -T openclaw-gateway node -e "fetch('http://127.0.0.1:18789/healthz', {signal: AbortSignal.timeout(5000)}).then(r=>r.text()).then(console.log).catch(e=>console.error(e.name, e.message))"
{"ok":true,"status":"live"}

mental emberBOT
#

Good: container health is green and inside /healthz works. That means the Gateway itself is fine; if host curl still hangs, it’s the host/Docker port path or stale curl timing.

Try host now:

curl -4 -i --max-time 5 http://127.0.0.1:18789/healthz
curl -6 -i --max-time 5 http://[::1]:18789/healthz

If -4 works, use IPv4/127.0.0.1. If only inside works and host doesn’t, recreate both services + network:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml down
docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml up -d --force-recreate
sleep 10
curl -4 -i --max-time 5 http://127.0.0.1:18789/healthz

Also test the CLI now:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If CLI probe works, you’re back online in recovery mode.

dusty schooner
#

how?? there was no reply at all

mental emberBOT
#

Because your command batch didn’t actually show host curl output. It only showed:

  • mounted config ✅
  • docker compose ps says healthy ✅
  • inside-container fetch(.../healthz) returns OK ✅

But the host curl line/output wasn’t in that paste. So I shouldn’t say host is fixed yet, only that the gateway is healthy inside Docker.

Run just this on the Mac host and paste the result:

curl -4 -i --max-time 5 http://127.0.0.1:18789/healthz

If that hangs while inside-container works, the remaining issue is Docker port forwarding / host binding, not OpenClaw itself.

dusty schooner
#

HTTP/1.1 200 OK
X-Content-Type-Options: nosniff
Referrer-Policy: no-referrer
Permissions-Policy: camera=(), microphone=(self), geolocation=()
Content-Type: application/json; charset=utf-8
Cache-Control: no-store
Date: Mon, 27 Apr 2026 14:06:33 GMT
Connection: keep-alive
Keep-Alive: timeout=5
Content-Length: 27

{"ok":true,"status":"live"}%

mental emberBOT
#

Perfect, host HTTP works now. You’re back to a healthy recovery baseline.

Next verify WebSocket/CLI:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml run -T --rm openclaw-cli gateway probe --timeout 30000

If that passes, you’re operational in recovery mode.

Then we narrow carefully:

  • keep using ~/.openclaw-clean
  • don’t copy old ~/.openclaw wholesale
  • rotate leaked gateway token + Tavily key
  • restore data in chunks only after each health check passes

Also replace CHANGE_ME_TO_A_NEW_RANDOM_TOKEN with a real token:

NEW_TOKEN="$(openssl rand -hex 32)"
docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml run -T --rm openclaw-cli config set gateway.auth.token "\"$NEW_TOKEN\"" --strict-json
docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml restart openclaw-gateway
dusty schooner
#

soham@Sohams-MBP openclaw % docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml run -T --rm openclaw-cli gateway probe --timeout 30000
[+] 1/1t 1/11
✔ Container openclaw-openclaw-gateway-1 Running 0.0s
Container openclaw-openclaw-cli-run-c77943bc7f5b Creating
Container openclaw-openclaw-cli-run-c77943bc7f5b Created
Gateway Status
Reachable: yes
Capability: read-only
Probe budget: 30000ms

Discovery (this machine)
Found 0 gateways via Bonjour (local.)
Tip: if the gateway is remote, mDNS won’t cross networks; use Wide-Area Bonjour (split DNS) or SSH tunnels.

Targets
Local loopback ws://127.0.0.1:18789
Connect: ok (25ms) · Capability: read-only · Read probe: ok
Gateway: a455b3d16513 (172.18.0.2) · linux 6.12.76-linuxkit · app 2026.4.24

mental emberBOT
#

🎉 Nice, gateway is reachable and read probe is OK. You’re officially back online in recovery mode.

Capability: read-only is because the CLI isn’t using the token yet. Either pass it:

docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml run -T --rm openclaw-cli gateway probe --token CHANGE_ME_TO_A_NEW_RANDOM_TOKEN --timeout 30000

or set/rotate a real token and put it in .env.
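
If the token goes into .env, it is worth writing it there without echoing it to the terminal. A rough sketch under assumptions: `gen_token` and `write_token_env` are made-up helper names, and the variable name you pass in has to match whatever your compose file actually reads (this thread never shows it).

```bash
# gen_token: print 64 hex characters (32 random bytes).
gen_token() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback for systems without openssl.
    od -An -tx1 -N32 /dev/urandom | tr -d ' \n'
    echo
  fi
}

# write_token_env FILE VAR: store a fresh token as VAR=... in FILE,
# replacing any earlier VAR line, without printing the token.
write_token_env() {
  local file="$1" var="$2" token tmp
  token="$(gen_token)" || return 1
  tmp="$(mktemp)" || return 1
  if [ -f "$file" ]; then
    # Keep every existing line except the old VAR assignment.
    grep -v "^${var}=" "$file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$var" "$token" >> "$tmp"
  chmod 600 "$tmp"
  mv "$tmp" "$file"
}
```

For example, `write_token_env .env OPENCLAW_GATEWAY_TOKEN`, where `OPENCLAW_GATEWAY_TOKEN` is a guess at the variable name and not something confirmed here. The gateway config itself still needs the same value via the `config set` command.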

Now recovery plan:

  1. Rotate token:

    NEW_TOKEN="$(openssl rand -hex 32)"
    docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml run -T --rm openclaw-cli config set gateway.auth.token "\"$NEW_TOKEN\"" --strict-json
    docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml restart openclaw-gateway

  2. Copy old data back slowly from old ~/.openclaw into ~/.openclaw-clean.

  3. After each copy:

    curl -i --max-time 5 http://127.0.0.1:18789/healthz
    docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml run -T --rm openclaw-cli gateway probe --timeout 30000

Don’t re-enable plugins/channels yet. First restore identity/devices/credentials, then agents without sessions.json, then wiki/memory, then channels one at a time.

dusty schooner
#

guide me through copying

mental emberBOT
#

Yep. We’ll copy in small reversible batches. After each batch, test health. If it breaks, we know the last batch contains the bad file.

Set vars first:

OLD="$HOME/.openclaw"
NEW="$HOME/.openclaw-clean"
COMPOSE="docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml"

0) Backup clean working state

cp -a "$NEW" "$NEW.bak-working-$(date +%Y%m%d-%H%M%S)"

1) Identity / devices / credentials

cp -a "$OLD/identity" "$NEW/" 2>/dev/null || true
cp -a "$OLD/devices" "$NEW/" 2>/dev/null || true
cp -a "$OLD/credentials" "$NEW/" 2>/dev/null || true

$COMPOSE restart openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If OK, continue.

2) Agent files, but hold back session index

mkdir -p "$NEW/agents"
cp -a "$OLD/agents/main" "$NEW/agents/" 2>/dev/null || true
mv "$NEW/agents/main/sessions/sessions.json" "$NEW/agents/main/sessions/sessions.json.hold" 2>/dev/null || true

$COMPOSE restart openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If OK, continue.

3) Wiki + memory

cp -a "$OLD/wiki" "$NEW/" 2>/dev/null || true
cp -a "$OLD/memory" "$NEW/" 2>/dev/null || true

$COMPOSE restart openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

4) Optional state, one by one

for d in cron telegram canvas media; do
  echo "== copying $d =="
  cp -a "$OLD/$d" "$NEW/" 2>/dev/null || true
  $COMPOSE restart openclaw-gateway
  sleep 10
  curl -i --max-time 5 http://127.0.0.1:18789/healthz || break
done

Do not copy yet:

tasks/
flows/
plugin-runtime-deps/
openclaw.json backups

Tell me which step fails, if any.
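
The batch-then-test idea can also be sketched as one loop that stops at the first directory after which the health check fails. Assumptions: `restore_one_by_one` is a hypothetical helper, and the check is any command or function you supply that returns 0 when the gateway is healthy. Unlike `cp ... 2>/dev/null || true`, it reports a missing source directory instead of silently skipping it.

```bash
# restore_one_by_one OLD NEW CHECK NAME...
#   OLD/NEW: source and destination state directories
#   CHECK:   name of a command or function; exit 0 means "healthy"
#   NAME...: entries under OLD to copy, in order
restore_one_by_one() {
  local old="$1" new="$2" check="$3" name
  shift 3
  for name in "$@"; do
    if [ ! -e "$old/$name" ]; then
      echo "skip $name (missing)"
      continue
    fi
    cp -a "$old/$name" "$new/" || return 2
    if "$check"; then
      echo "ok $name"
    else
      # Stop here so the last copied entry is the suspect.
      echo "BROKE AFTER $name"
      return 1
    fi
  done
}
```

The check would typically be a small function that restarts or recreates the gateway, waits, and then runs `curl -fsS --max-time 5 http://127.0.0.1:18789/healthz >/dev/null`.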

dusty schooner
#

soham@Sohams-MBP openclaw % cp -a "$OLD/identity" "$NEW/" 2>/dev/null || true
cp -a "$OLD/devices" "$NEW/" 2>/dev/null || true
cp -a "$OLD/credentials" "$NEW/" 2>/dev/null || true

$COMPOSE restart openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz
zsh: command not found: docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml

mental emberBOT
#

zsh doesn’t word-split an unquoted $COMPOSE the way bash does, so it tried to run the whole string as a single command name. Use a function instead:

occompose() {
  docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml "$@"
}

Then rerun the test part:

occompose restart openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

Your copy commands likely already ran fine.
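
If you would rather keep a variable than a function, an array behaves the same in bash and zsh because each word is stored as its own element. A small sketch; `count_args` is just a throwaway helper to make the difference visible.

```bash
# Helper that only reports how many arguments it received.
count_args() { echo "$#"; }

# Array form: one element per word, expanded with "${COMPOSE[@]}".
COMPOSE=(docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean-dir.yml)

# Prints 8 in both bash and zsh: docker, compose, and three -f/file pairs.
count_args "${COMPOSE[@]}"
```

With that in place the call becomes `"${COMPOSE[@]}" restart openclaw-gateway`. The plain-string form only splits into separate words in bash; in zsh you would need `${=COMPOSE}` to force splitting.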

dusty schooner
#

curl: (28) Operation timed out after 5006 milliseconds with 0 bytes received

mental emberBOT
#

Great, that means Step 1 broke it. Culprit is in one of:

identity/
devices/
credentials/

Revert those three from clean, then test:

NEW="$HOME/.openclaw-clean"

rm -rf "$NEW/identity" "$NEW/devices" "$NEW/credentials"

occompose restart openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If it works again, copy them one at a time:

OLD="$HOME/.openclaw"
NEW="$HOME/.openclaw-clean"

for d in identity devices credentials; do
  echo "== testing $d =="
  rm -rf "$NEW/$d"
  cp -a "$OLD/$d" "$NEW/" 2>/dev/null || true
  occompose restart openclaw-gateway
  sleep 10
  curl -i --max-time 5 http://127.0.0.1:18789/healthz || break
done

Tell me which directory breaks it. My bet is identity/ or devices/paired.json.

dusty schooner
#

soham@Sohams-MBP openclaw % NEW="$HOME/.openclaw-clean"

rm -rf "$NEW/identity" "$NEW/devices" "$NEW/credentials"

occompose restart openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz
[+] restart 0/1
⠸ Container openclaw-openclaw-gateway-1 Restarting 0.3s
curl: (28) Operation timed out after 5004 milliseconds with 0 bytes received

mental emberBOT
#

Maybe restart didn’t fully recreate the bind/container state. Do a full recreate after removing them:

occompose down
occompose up -d --force-recreate openclaw-gateway
sleep 15
curl -i --max-time 5 http://127.0.0.1:18789/healthz
occompose logs --since=1m --timestamps openclaw-gateway

If still hangs, something else got copied into clean already. Show current clean files:

find ~/.openclaw-clean -maxdepth 2 -type f | sed "s|$HOME|~|" | sort
dusty schooner
#

soham@Sohams-MBP openclaw % occompose down
occompose up -d --force-recreate openclaw-gateway
sleep 15
curl -i --max-time 5 http://127.0.0.1:18789/healthz
occompose logs --since=1m --timestamps openclaw-gateway
[+] down 2/2
✔ Container openclaw-openclaw-gateway-1 Removed 0.3s
✔ Network openclaw_default Removed 0.2s
[+] up 2/2
✔ Network openclaw_default Created 0.0s
✔ Container openclaw-openclaw-gateway-1 Started 0.2s
curl: (28) Operation timed out after 5003 milliseconds with 0 bytes received

mental emberBOT
#

Looks like the logs output didn’t make it into the paste: either the command printed nothing or the paste was cut off after the curl timeout. Run these separately:

occompose logs --since=1m --timestamps openclaw-gateway

and:

find ~/.openclaw-clean -maxdepth 2 -type f | sed "s|$HOME|~|" | sort

Also confirm occompose is still the 3-file stack:

type occompose
dusty schooner
#

soham@Sohams-MBP openclaw % occompose logs --since=1m --timestamps openclaw-gateway
soham@Sohams-MBP openclaw % find ~/.openclaw-clean -maxdepth 2 -type f | sed "s|$HOME|~|" | sort
~/.openclaw-clean/canvas/index.html
~/.openclaw-clean/identity/device.json
~/.openclaw-clean/logs/config-audit.jsonl
~/.openclaw-clean/logs/config-health.json
~/.openclaw-clean/openclaw.json
~/.openclaw-clean/openclaw.json.bak
~/.openclaw-clean/openclaw.json.last-good
~/.openclaw-clean/tasks/runs.sqlite
~/.openclaw-clean/tasks/runs.sqlite-shm
~/.openclaw-clean/tasks/runs.sqlite-wal
~/.openclaw-clean/update-check.json
soham@Sohams-MBP openclaw % type occompose
occompose is a shell function
soham@Sohams-MBP openclaw %

mental emberBOT
#

logs --since=1m empty means the container likely hasn’t emitted fresh logs in that window, or we’re looking at a stale/quiet process. Check status and mounted config:

occompose ps

occompose exec -T openclaw-gateway sh -lc '
echo "== config ==";
cat /home/node/.openclaw/openclaw.json;
echo;
echo "== process ==";
ps -eo pid,ppid,stat,pcpu,pmem,etime,args;
echo;
echo "== health inside ==";
node -e "fetch(\"http://127.0.0.1:18789/healthz\", {signal: AbortSignal.timeout(5000)}).then(r=>r.text()).then(console.log).catch(e=>console.error(e.name, e.message))"
'

Also, remove the new tasks state from clean. It reappeared and could be the culprit:

rm -rf ~/.openclaw-clean/tasks

occompose down
occompose up -d --force-recreate openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If that fixes it, tasks/runs.sqlite is the bad state.

dusty schooner
mental emberBOT
#

We’ve narrowed it enough to be careful and systematic:

  • ~/.openclaw-clean + explicit clean-dir compose works.
  • After copying identity/devices/credentials, it broke.
  • Removing those dirs didn’t recover because some runtime files got generated (tasks/, identity/device.json, openclaw.json.last-good, etc.).
  • So we should reset clean-dir back to the known-good baseline, then copy one directory at a time.

Do this reset first:

occompose down

rm -rf ~/.openclaw-clean
mkdir -p ~/.openclaw-clean/workspace

cat > ~/.openclaw-clean/openclaw.json <<'JSON'
{
  "gateway": {
    "mode": "local",
    "bind": "lan",
    "auth": {
      "mode": "token",
      "token": "CHANGE_ME_TO_A_NEW_RANDOM_TOKEN"
    }
  },
  "plugins": { "enabled": false },
  "channels": {},
  "agents": {
    "defaults": {
      "model": { "primary": "openai/gpt-5.5" },
      "heartbeat": { "every": "0m" }
    }
  },
  "hooks": { "internal": { "enabled": false } }
}
JSON

occompose up -d --force-recreate openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If OK, copy only credentials:

cp -a ~/.openclaw/credentials ~/.openclaw-clean/
occompose down && occompose up -d --force-recreate openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

Then only devices:

cp -a ~/.openclaw/devices ~/.openclaw-clean/
occompose down && occompose up -d --force-recreate openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

Then only identity:

cp -a ~/.openclaw/identity ~/.openclaw-clean/
occompose down && occompose up -d --force-recreate openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

Whichever step breaks is the culprit. Given the symptoms, I’d especially suspect identity/device-auth.json or paired device state.

dusty schooner
#

soham@Sohams-MBP openclaw % rm -rf ~/.openclaw-clean
mkdir -p ~/.openclaw-clean/workspace
rm: /Users/soham/.openclaw-clean/workspace: Permission denied
rm: /Users/soham/.openclaw-clean: Directory not empty

mental emberBOT
#

That’s from files/dirs created by the container as a different UID. Don’t fight it with rm -rf yet. Use a new clean dir name:

occompose down

NEWCLEAN="$HOME/.openclaw-clean2"
mkdir -p "$NEWCLEAN/workspace"

cat > "$NEWCLEAN/openclaw.json" <<'JSON'
{
  "gateway": {
    "mode": "local",
    "bind": "lan",
    "auth": {
      "mode": "token",
      "token": "CHANGE_ME_TO_A_NEW_RANDOM_TOKEN"
    }
  },
  "plugins": { "enabled": false },
  "channels": {},
  "agents": {
    "defaults": {
      "model": { "primary": "openai/gpt-5.5" },
      "heartbeat": { "every": "0m" }
    }
  },
  "hooks": { "internal": { "enabled": false } }
}
JSON

Create a new override:

cat > docker-compose.clean2.yml <<'YAML'
services:
  openclaw-gateway:
    volumes:
      - /Users/soham/.openclaw-clean2:/home/node/.openclaw
      - /Users/soham/.openclaw-clean2/workspace:/home/node/.openclaw/workspace
  openclaw-cli:
    volumes:
      - /Users/soham/.openclaw-clean2:/home/node/.openclaw
      - /Users/soham/.openclaw-clean2/workspace:/home/node/.openclaw/workspace
YAML

Then define a new function:

oc2() {
  docker compose -f docker-compose.yml -f docker-compose.trace2.yml -f docker-compose.clean2.yml "$@"
}

Test baseline:

oc2 up -d --force-recreate openclaw-gateway
sleep 10
curl -i --max-time 5 http://127.0.0.1:18789/healthz

If OK, continue the one-directory-at-a-time copy using .openclaw-clean2.
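
On the earlier Permission denied: if ownership really is the cause, listing entries that belong to someone else should show it. A minimal sketch, with `list_foreign_owned` as a made-up helper name. If it prints nothing, the problem is more likely directory permissions than ownership.

```bash
# list_foreign_owned DIR: print entries under DIR that are not owned by
# the current user (matched by numeric UID).
list_foreign_owned() {
  # Errors from unreadable subdirectories are suppressed on purpose.
  find "$1" ! -user "$(id -u)" -print 2>/dev/null
}
```

For example, `list_foreign_owned ~/.openclaw-clean`. Running `ls -ld` on any path it prints shows the owner and mode.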

dusty schooner
#

curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received

mental emberBOT
#

If clean2 baseline times out too, check whether it actually mounted clean2 and whether container health is okay:

oc2 ps

oc2 exec -T openclaw-gateway sh -lc '
echo "== config ==";
cat /home/node/.openclaw/openclaw.json;
echo;
echo "== files ==";
find /home/node/.openclaw -maxdepth 2 -type f | sort;
echo;
echo "== inside health ==";
node -e "fetch(\"http://127.0.0.1:18789/healthz\", {signal: AbortSignal.timeout(5000)}).then(r=>r.text()).then(console.log).catch(e=>console.error(e.name, e.message))"
'
oc2 logs --since=1m --timestamps openclaw-gateway

Also try host curl with IPv4 forced:

curl -4 -i --max-time 5 http://127.0.0.1:18789/healthz

We need to distinguish “gateway wedged” from “host port path weird again.”

dusty schooner
mental emberBOT
#

Great, oc2 baseline is healthy. The earlier timeout was likely timing/race/host curl without -4.

Now continue, but use curl -4 for every check.

Test credentials only

cp -a ~/.openclaw/credentials ~/.openclaw-clean2/

oc2 down
oc2 up -d --force-recreate openclaw-gateway
sleep 10
curl -4 -i --max-time 5 http://127.0.0.1:18789/healthz

If OK:


Test devices only

cp -a ~/.openclaw/devices ~/.openclaw-clean2/

oc2 down
oc2 up -d --force-recreate openclaw-gateway
sleep 10
curl -4 -i --max-time 5 http://127.0.0.1:18789/healthz

If OK:

Test identity only

cp -a ~/.openclaw/identity ~/.openclaw-clean2/

oc2 down
oc2 up -d --force-recreate openclaw-gateway
sleep 10
curl -4 -i --max-time 5 http://127.0.0.1:18789/healthz

If any step fails, tell me which one. If all pass, we’ll move to agent/wiki/memory next.

dusty schooner
#

soham@Sohams-MBP openclaw % cp -a ~/.openclaw/credentials ~/.openclaw-clean2/

oc2 down
oc2 up -d --force-recreate openclaw-gateway
sleep 10
curl -4 -i --max-time 5 http://127.0.0.1:18789/healthz
cp: /Users/soham/.openclaw/credentials: No such file or directory
[+] down 2/2
✔ Container openclaw-openclaw-gateway-1 Removed 0.3s
✔ Network openclaw_default Removed 0.2s
[+] up 2/2
✔ Network openclaw_default Created 0.0s
✔ Container openclaw-openclaw-gateway-1 Started 0.2s
curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
soham@Sohams-MBP openclaw %

smoky aspen
#

This thread has gotten too long, we recommend you start a new one as the context has been bloated and you may start experiencing a degraded experience.

mental emberBOT
#

The copy didn’t happen, so this is just baseline flaking/timing again. Check inside + host after a little longer:

sleep 10
oc2 ps
oc2 exec -T openclaw-gateway node -e "fetch('http://127.0.0.1:18789/healthz', {signal: AbortSignal.timeout(5000)}).then(r=>r.text()).then(console.log).catch(e=>console.error(e.name, e.message))"
curl -4 -i --max-time 5 http://127.0.0.1:18789/healthz

Also note: ~/.openclaw/credentials is gone because earlier we moved it to rollback-hold. Your old credentials dir is likely here:

ls ~/.openclaw/rollback-hold/root-files

Let’s not copy credentials for now. If baseline is stable after the extra wait, test devices next.
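
Since a fixed sleep 10 keeps being too short or too long, polling may give a cleaner signal than a single curl. A sketch only: `wait_until` is a hypothetical helper, not an OpenClaw or Docker command.

```bash
# wait_until ATTEMPTS DELAY CMD...
# Runs CMD up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Returns 0 on the first success, 1 if every attempt fails.
wait_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    # Do not sleep after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s)"
  return 1
}
```

For instance, `wait_until 12 5 curl -4 -fsS --max-time 5 http://127.0.0.1:18789/healthz` keeps trying for roughly one to two minutes. If it still reports "not ready", the gateway is most likely wedged and not merely slow to start.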

#

Yeah, Json is right. This thread is huge now and we’re at the point where mistakes are more likely.

Soham, please start a new help thread with this short summary:

Docker macOS gateway recovery thread.

Current known-good baseline:
- Running OpenClaw 2026.4.24 in Docker
- Clean override stack uses:
  docker-compose.yml + docker-compose.trace2.yml + docker-compose.clean2.yml
- Config dir: ~/.openclaw-clean2
- Minimal config:
  plugins.enabled=false
  channels={}
  agents.defaults.model.primary=openai/gpt-5.5
  agents.defaults.heartbeat.every=0m
  hooks.internal.enabled=false
- Gateway can become healthy with clean config, but baseline sometimes needs extra wait/recheck.
- Old data is still in ~/.openclaw
- Some credentials/root files may have been moved to ~/.openclaw/rollback-hold/root-files
- Need to continue narrowing which copied old state dir breaks healthz.
- Use curl -4 for host checks.
- Do NOT copy old ~/.openclaw wholesale.

Also include the next commands to run:

sleep 10
oc2 ps
oc2 exec -T openclaw-gateway node -e "fetch('http://127.0.0.1:18789/healthz', {signal: AbortSignal.timeout(5000)}).then(r=>r.text()).then(console.log).catch(e=>console.error(e.name, e.message))"
curl -4 -i --max-time 5 http://127.0.0.1:18789/healthz
ls ~/.openclaw/rollback-hold/root-files

And please rotate the leaked gateway/Tavily tokens once stable.

dusty schooner
#

I think I'll just delete everything and install again.

mental emberBOT
#

Totally understandable. Before deleting, do one archive move so you can recover anything later:

docker compose down --remove-orphans

mv ~/.openclaw ~/.openclaw.broken-2026-04-27
mv ~/.openclaw-clean ~/.openclaw-clean.bak-2026-04-27 2>/dev/null || true
mv ~/.openclaw-clean2 ~/.openclaw-clean2.bak-2026-04-27 2>/dev/null || true

mkdir -p ~/.openclaw

Then reset Docker bits:

docker volume rm openclaw_openclaw_cache 2>/dev/null || true
docker image rm openclaw:local 2>/dev/null || true

Use a known version / clean source:

git restore Dockerfile docker-compose.yml
git fetch --tags origin
git checkout v2026.4.24
docker build -t openclaw:local -f Dockerfile .

Then run setup fresh:

./scripts/docker/setup.sh

Do not paste tokens publicly. And rotate the gateway token + Tavily key that were exposed earlier.
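
One caution on the mv lines: if the target name already exists as a directory (for example after running the block twice), mv moves the source inside it and does not rename it. A guarded variant could look like this; `archive_dir` is a made-up helper name.

```bash
# archive_dir DIR [STAMP]: rename DIR to DIR.broken-STAMP.
# Refuses to touch an existing archive; a missing DIR is not an error.
archive_dir() {
  local dir="$1" stamp="${2:-$(date +%Y-%m-%d)}" dest
  dest="$dir.broken-$stamp"
  if [ ! -e "$dir" ]; then
    echo "nothing to archive: $dir"
    return 0
  fi
  if [ -e "$dest" ]; then
    echo "refusing to overwrite $dest" >&2
    return 1
  fi
  mv "$dir" "$dest" && echo "archived $dir -> $dest"
}
```

For example, `archive_dir ~/.openclaw 2026-04-27` produces the same ~/.openclaw.broken-2026-04-27 name as above.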

glad gust
#

I had a similar issue last night. I disabled “bonjour” and it stabilized

mental emberBOT
#

Good datapoint, thanks Henry. For Docker especially, disabling Bonjour is sane: mDNS on bridge networking is flaky.

Soham, if you do one more quick try before full reinstall, add this to Compose/env:

OPENCLAW_DISABLE_BONJOUR: "1"

or in .env:

OPENCLAW_DISABLE_BONJOUR=1

Then recreate:

docker compose down
docker compose up -d --force-recreate openclaw-gateway

But if you’re already set on reinstalling, keep Bonjour disabled in the fresh Docker setup too.

dusty schooner
#

I disabled bonjour as step 1... didn't help... my setup was proper and thorough and had been working seamlessly for a while now.. the recent update broke it

mental emberBOT
#

Yeah, fair. Your setup wasn’t “bad”; the update clearly pushed it into a broken state. We proved a clean config can boot, but your existing state/config path triggers a post-ready hang. That’s exactly the kind of regression that should be filed.

If you’re done debugging, do the archive-not-delete reinstall path:

docker compose down --remove-orphans

mv ~/.openclaw ~/.openclaw.broken-2026-04-27
mv ~/.openclaw-clean ~/.openclaw-clean.bak-2026-04-27 2>/dev/null || true
mv ~/.openclaw-clean2 ~/.openclaw-clean2.bak-2026-04-27 2>/dev/null || true

mkdir -p ~/.openclaw

Then rebuild/install stable: