#gateway disconnects/interruptions

topaz foxBOT
#

That log is a pre-auth gateway WebSocket handshake timeout: something on 127.0.0.1 connected to the gateway port but didn’t finish the OpenClaw connect/auth handshake within ~15s, so the gateway closed it with code 1000.

First thing: what version are you on?

openclaw --version
openclaw status --all
openclaw gateway status
openclaw doctor

Please paste the output from those, especially status --all.

A useful immediate workaround, if the host is slow/loaded, is to raise the gateway handshake timeout:

openclaw config set gateway.handshakeTimeoutMs 30000
openclaw gateway restart

Default is 15000; docs mention this knob specifically for loaded/low-powered hosts where local clients connect during warmup: https://docs.openclaw.ai/gateway/configuration#tune-gateway-websocket-handshake-timeout

Also check logs for a real underlying cause around the same timestamp:

openclaw logs --limit 300 --local-time

If you see anything like device-pair: notify poll failed / Unexpected end of JSON input / devices/pending.json, say so — that’s a known bad state where deleting the corrupted pending device file and restarting fixes it.
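To check for those signatures without reading the whole log by eye, a small filter can help. This is a minimal sketch: `has_pairing_corruption` is a hypothetical helper name, and the patterns are just the three strings quoted above, so treat a match as a hint rather than proof.

```shell
# Read log text on stdin; succeed if any of the pairing-corruption
# signatures quoted in this thread appears.
has_pairing_corruption() {
  grep -Eq 'device-pair: notify poll failed|Unexpected end of JSON input|devices/pending\.json'
}

# Usage (assumes the openclaw CLI is on PATH):
#   openclaw logs --limit 300 --local-time | has_pairing_corruption \
#     && echo "pairing state may be corrupted"
```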

#

Got it. The key thing is that the gateway process is alive, but local RPC is intermittently timing out. That matches your symptoms: slow replies, the typing indicator dropping, and /stop or /status sometimes not getting through.

I’d do this in order:

openclaw doctor --repair
openclaw gateway restart
openclaw gateway status

Your report specifically says the systemd service config is out of date/non-standard and recommends doctor. That PATH warning is worth fixing before chasing deeper ghosts.

Then raise the handshake timeout so local CLI/UI calls don’t get killed while the host is busy:

openclaw config set gateway.handshakeTimeoutMs 30000
openclaw gateway restart

If it still happens after that, grab the actual stall evidence:

openclaw logs --limit 500 --local-time
journalctl --user -u openclaw-gateway.service -n 200 --no-pager

What I’m looking for next: model/tool calls hanging, repeated gateway restarts, plugin/runtime errors, OOM/systemd kills, or event-loop stalls. Also: you’re running from a dirty git checkout on main, so if this continues, I’d strongly consider rebuilding/cleaning that checkout or switching to the packaged install path.
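If the journal output is long, a filter for kill-related lines can narrow it down. This is a rough sketch: `find_kill_evidence` is a hypothetical helper, and the patterns are common Linux kernel and systemd wording that I'm assuming applies here, not strings taken from this thread's logs.

```shell
# Read journal text on stdin and print lines that suggest the process
# was killed (OOM killer or a signal reported by systemd).
# Patterns are assumptions based on typical kernel/systemd messages.
find_kill_evidence() {
  grep -Ei 'out of memory|oom-kill|killed process|code=killed|status=9/KILL'
}

# Usage:
#   journalctl --user -u openclaw-gateway.service -n 200 --no-pager | find_kill_evidence
```

No output means none of these patterns matched, which does not rule out event-loop stalls or hung tool calls.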

late blade
#

After changing the timeout, things are better!

topaz foxBOT
#

Nice, that confirms it was the local gateway handshake timing out under load. 🎉

I’d still run this when you have a minute because your report showed the service config/PATH warning:

openclaw doctor --repair
openclaw gateway restart

If it stays stable after the timeout change, you’re probably good.

If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow indexes it for future users. 🦐

#

That’s a different issue from the gateway timeout: openclaw update is trying to create a temporary git preflight worktree, and git can’t write files into it.

Most likely: disk full or inode exhaustion, usually in /tmp because the preflight worktree is created there.

Run:

df -h /tmp /home/ubuntu
df -ih /tmp /home/ubuntu
ls -ld /tmp
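The free-space check can also be scripted so it runs before each update attempt. A minimal sketch, assuming a POSIX `df` and `awk`; `avail_kb` and `has_room` are hypothetical helper names, and the 2 GiB threshold in the usage line is an arbitrary example, not a documented requirement.

```shell
# Print available space in KiB on the filesystem that holds the given path.
# -P forces the one-line-per-filesystem POSIX format; column 4 is "Available".
avail_kb() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Succeed if the path has at least the requested number of KiB free.
has_room() {
  [ "$(avail_kb "$1")" -ge "$2" ]
}

# Usage: require roughly 2 GiB free in /tmp (example threshold only).
#   has_room /tmp 2097152 || echo "/tmp is tight; clean up before openclaw update"
```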

Also check for stale failed update temp dirs:

ls -lah /tmp | grep openclaw

If /tmp is full, clear stale temp files or reboot the VPS, then rerun:

openclaw update

If disk/inodes look fine, the next suspect is permissions/ownership weirdness in /tmp or the repo checkout. In that case paste:

cd /home/ubuntu/openclaw
git status --short
git worktree list
#

Yep, /tmp is pretty tight: only 885M free, and the preflight worktree has to check out the whole repo. That can easily fail partway through with “unable to write file”.

Clean the stale preflight dir first:

rm -rf /tmp/openclaw-update-preflight-7qyaAM

Then check what’s eating /tmp:

du -h --max-depth=1 /tmp | sort -h

If /tmp/openclaw is logs/cache and huge, don’t nuke it blindly while the gateway is running — paste the du output and we can pick the safe cleanup.

You also only have 6.1G free on /, so if update/build keeps failing, free a couple GB more or reboot to clear tmpfs, then retry:

openclaw update
#

Found it: /tmp/.pnpm-store is taking 1.8G. Since this is a tmpfs and you’re failing during a temp git worktree checkout, that’s the obvious cleanup target.

Run:

rm -rf /tmp/.pnpm-store
rm -rf /tmp/openclaw-update-preflight-*
df -h /tmp
openclaw update
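Before running `rm -rf` with a glob, it can be worth listing what the glob would match. A small sketch: `list_stale_preflight` is a hypothetical helper, and the directory name pattern is the one seen in this thread.

```shell
# List leftover preflight worktrees directly under a temp root (default /tmp).
# The name pattern is taken from the directory names seen in this thread.
list_stale_preflight() {
  find "${1:-/tmp}" -maxdepth 1 -type d -name 'openclaw-update-preflight-*'
}

# Usage: review the list first, then delete only if the paths look right.
#   list_stale_preflight /tmp
#   list_stale_preflight /tmp | while IFS= read -r d; do rm -rf -- "$d"; done
```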

If update recreates a big pnpm store and still runs out, point pnpm’s store at a directory on disk instead of tmpfs. Note that pnpm config set persists the setting, so it applies to later runs too:

mkdir -p ~/.pnpm-store
PNPM_HOME="$HOME/.local/share/pnpm" pnpm config set store-dir "$HOME/.pnpm-store"
openclaw update

But I’d try the simple cleanup first.

topaz foxBOT
#

Your source checkout is now in an unbuilt/broken state — the openclaw shim is trying to run:

/home/ubuntu/openclaw/dist/entry.js

…but dist/entry.js doesn’t exist, probably because the earlier update/preflight failure interrupted things.
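A quick way to confirm that diagnosis is to test for the file directly. This is a sketch only: `checkout_is_built` is a hypothetical helper, and `dist/entry.js` is the path the shim tried to run in this thread.

```shell
# Succeed if the checkout at the given directory has a built entry point.
checkout_is_built() {
  [ -f "$1/dist/entry.js" ]
}

# Usage:
#   checkout_is_built /home/ubuntu/openclaw \
#     || echo "dist/entry.js is missing; the checkout needs a rebuild"
```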

Repair the checkout manually, without using openclaw:

cd /home/ubuntu/openclaw
pnpm install
pnpm build
pnpm ui:build
pnpm link --global
hash -r
openclaw --version

Then restart the gateway:

openclaw gateway restart
openclaw gateway status

If pnpm itself is missing, run:

corepack enable

then retry the pnpm install line.

#

errno -122 = EDQUOT / disk quota exceeded. So the update got further, but the dev preflight build is still running out of writable space while copying bundled Codex runtime deps inside /tmp/openclaw-update-preflight-*.

Move OpenClaw’s temp worktree off /tmp:

mkdir -p ~/tmp/openclaw-update ~/.pnpm-store
rm -rf /tmp/openclaw-update-preflight-*

pnpm config set store-dir "$HOME/.pnpm-store"

TMPDIR="$HOME/tmp/openclaw-update" openclaw update --channel dev

If that still fails, you likely need more free disk than the current ~6GB on /. Check:

df -h "$HOME" /tmp
quota -s 2>/dev/null || true
du -h --max-depth=1 ~ | sort -h | tail -20

The important bit: /tmp is a 3.9G tmpfs here, and dev update/preflight can exceed that because it creates a full worktree + installs/builds + stages plugin runtime deps.
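The TMPDIR override above can be wrapped so the directory is always created first and the variable is scoped to one command. A minimal sketch: `with_tmpdir` is a hypothetical helper, and it assumes the command being run honors TMPDIR, which is what the suggestion in this thread relies on.

```shell
# Run a command with TMPDIR pointing at the given directory,
# creating the directory first. TMPDIR is set only for that command.
with_tmpdir() {
  dir=$1
  shift
  mkdir -p "$dir" || return 1
  env TMPDIR="$dir" "$@"
}

# Usage:
#   with_tmpdir "$HOME/tmp/openclaw-update" openclaw update --channel dev
```

Using `env` rather than a plain variable prefix keeps the assignment from leaking into the calling shell, whatever kind of command is passed in.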

late blade
#

Is it possible to specify a custom dir as the temp dir for update?