#multiple gateways running at once?


small forge
#

After updating, OpenClaw started complaining that multiple gateways are running at once. It started when I updated to 2026.4.11 and got worse when I updated to 2026.4.26, and I'm not sure how to resolve it. The gateway does eventually work, but only about 10 minutes after each restart. The output of openclaw gateway status is at https://pastebin.com/m6cDrw1y and the version is OpenClaw 2026.4.26 (be8c246). The issue also appears randomly when interacting with the agent.

night heartBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

cosmic basinBOT
#

I don’t think this paste shows multiple gateways actually running. It shows the same PID as both the systemd runtime and the process holding port 18789 (pid 105621). The real failure is: gateway process is running, but the WebSocket probe to ws://127.0.0.1:18789 times out, so it isn’t responsive yet. (pastebin.com)

Can you run these and paste the output?

openclaw gateway status --deep
openclaw logs --limit 500 --plain
systemctl --user status openclaw-gateway.service
systemctl --user list-units 'openclaw*'

If --deep reports extra/stale services, that’s the duplicate-gateway path. Docs say most installs should run one gateway, and gateway status --deep is specifically for finding stale launchd/systemd/schtasks services. (docs.openclaw.ai)

If --deep still shows only one gateway, this looks more like a 2026.4.x startup/readiness regression: systemd starts it, port binds, but the gateway doesn’t answer RPC for ~10 minutes. In that case I’d file a GitHub issue with the paste + the four outputs above: https://github.com/openclaw/openclaw/issues

small forge
#

the pastebin was the deep status. Heres the logs. https://pastebin.com/3h6TbxGE ignore the openrouter billing report. my main agent is running on ollama right now.

cosmic basinBOT
#

Yep, that makes sense: the “multiple gateways” message is misleading here.

Your deep status shows one systemd gateway PID owning port 18789, but the RPC/WebSocket probe is timing out. The logs point harder at session-store churn:

  • sessions.json was huge: before:1447, capped to 500
  • repeated sessions.json.bak... backed up session store file before rotation
  • repeated 40–53s gateway handshake timeouts
  • Discord reconnects happen during those timeouts

So I’d try session maintenance first:

openclaw sessions cleanup --all-agents --dry-run

If that looks sane, back up sessions, then enforce:

cp -a ~/.openclaw/agents/main/sessions ~/.openclaw/agents/main/sessions.backup.$(date +%Y%m%d-%H%M%S)
openclaw sessions cleanup --all-agents --enforce
openclaw doctor
openclaw gateway restart
openclaw gateway status --deep --require-rpc

Also check for old rotation config:

openclaw config get session.maintenance

If you see rotateBytes, run:

openclaw doctor --fix

The bonjour warnings are probably noisy; the concerning part is the session-store backup/rotation spam + gateway handshake timeouts. If cleanup doesn’t fix it, this is worth a GitHub issue with both pastebins, because 2026.4.26 appears to be repeatedly doing heavy session-store writes during normal operation.
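
To put numbers on that spam for a bug report, the two markers can be counted in a saved log. This is a minimal sketch: count_churn_markers is a hypothetical helper name, and the marker strings are taken from the wording in this thread, so the exact log text is an assumption that may need adjusting.

```shell
# Hypothetical helper: count how often each churn marker appears in a log file.
# The marker strings are assumptions based on the log excerpts quoted above.
count_churn_markers() {
  log_file=$1
  for marker in 'sessions.json.bak' 'handshake timeout'; do
    # -F: fixed string, -i: ignore case, -c: count matching lines.
    # grep exits non-zero on zero matches, so "|| true" keeps the loop going.
    n=$(grep -c -i -F -- "$marker" "$log_file" || true)
    printf '%s\t%s\n' "$n" "$marker"
  done
}
```

For example, `count_churn_markers /tmp/openclaw/openclaw-2026-04-28.log` prints one tab-separated count per marker, which is easier to paste into an issue than the raw log.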

small forge
#

After that, when I try to retrieve the logs I get a message that the gateway is not reachable. After running the cleanup, this is what I get once I can finally pull the logs: https://pastebin.com/xTtva5XD

cosmic basinBOT
#

Ah, got it. Cleanup didn’t fix the root issue — the new paste still shows the gateway getting wedged by session/memory churn.

Two notable things:

  1. Tons of agent:main:dream... sessions are being kept.
  2. sessions.json.bak... backed up session store file before rotation is still happening repeatedly around gateway timeouts.

I’d temporarily turn off the two memory systems that are creating/running these background sessions, then restart:

openclaw config set plugins.entries.memory-core.config.dreaming.enabled false --strict-json
openclaw config set plugins.entries.active-memory.config.enabled false --strict-json
openclaw doctor --fix
openclaw gateway restart

Then wait ~2 minutes and check:

openclaw gateway status --deep --require-rpc

If openclaw logs can’t connect while it’s wedged, use local systemd/file logs instead:

journalctl --user -u openclaw-gateway.service -n 300 --no-pager
tail -n 300 /tmp/openclaw/openclaw-2026-04-28.log

If disabling dreaming/active-memory stabilizes it, that’s a solid bug report: 2026.4.26 + memory-core/active-memory session writes causing gateway RPC handshake timeouts. Docs refs: https://docs.openclaw.ai/concepts/dreaming, https://docs.openclaw.ai/concepts/active-memory, https://docs.openclaw.ai/cli/sessions

small forge
#

It hasn't stabilized. After it came back online and I interacted with my agent, it said two gateways were running. That lasted a few seconds, then it went back to normal.

cosmic basinBOT
#

Yeah — at this point I’d stop changing settings. This looks like a 2026.4.x gateway/RPC regression, not normal config drift.

Important distinction: if ss/status shows one PID holding 18789, that’s not actually two gateways — it’s the current gateway owning the port while RPC is wedged, and the diagnostic text is confusing.

Please capture this while it’s happening:

date
timeout 8s openclaw gateway status --deep --json
ss -ltnp 'sport = :18789'
ps -eo pid,ppid,lstart,cmd | grep -E 'openclaw|node' | grep -v grep
systemctl --user status openclaw-gateway.service --no-pager
journalctl --user -u openclaw-gateway.service -n 300 --no-pager

If you need stability meanwhile, I’d roll back to 2026.4.10 since you said it started at 2026.4.11:

small forge
#

I checked the logs, and they show the same session cleanup activity while the issue is happening.

cosmic basinBOT
#

That basically confirms it: this is session-store maintenance/rotation wedging the gateway, not real duplicate gateways.

Do this next — especially the rotateBytes unset:

openclaw config get session.maintenance --json
openclaw config unset session.maintenance.rotateBytes
openclaw doctor --fix

Then shrink the store more aggressively for now:

cp -a ~/.openclaw/agents/main/sessions ~/.openclaw/agents/main/sessions.backup.$(date +%Y%m%d-%H%M%S)
openclaw config set session.maintenance.maxEntries 100 --strict-json
openclaw sessions cleanup --all-agents --enforce
openclaw gateway restart

If it still does the cleanup/backup spam after that, I’d roll back to 2026.4.10 for stability and file a bug. The key evidence is: same PID owns port 18789 + gateway RPC timeouts + repeated sessions.json.bak... / cleanup lines during the outage.

small forge
#

I already did those steps and they didn't change anything. rotateBytes is not set in the config.

cosmic basinBOT
#

Got it — sorry, then my rotateBytes guess was a dead end.

At this point the cleanest test is: prove whether the existing session store is the trigger by moving it aside temporarily. This is reversible, but it will make OpenClaw start with fresh session history until you restore it.

openclaw config set cron.enabled false --strict-json
openclaw config set plugins.entries.memory-core.config.dreaming.enabled false --strict-json
openclaw config set plugins.entries.active-memory.config.enabled false --strict-json

openclaw gateway stop

mv ~/.openclaw/agents/main/sessions ~/.openclaw/agents/main/sessions.quarantine.$(date +%Y%m%d-%H%M%S)
mkdir -p ~/.openclaw/agents/main/sessions

openclaw gateway restart
openclaw gateway status --deep --require-rpc

If that stabilizes it, the bug is definitely “2026.4.26 chokes on existing session store/session artifacts,” not duplicate gateways.

If it still happens with a fresh sessions dir + cron/memory disabled, I’d stop spending time on config and roll back for stability:

npm i -g openclaw@2026.4.10
openclaw gateway restart

Also: don’t upload the quarantined sessions publicly — it can contain private transcript/tool data. For a GitHub issue, sizes/counts + logs are enough.

vernal jackal
#

I have the same issue: it reports that two gateways are open.
I ran sudo ss -tulpn | grep :18789, and the result was:

tcp   LISTEN 0      511               127.0.0.1:18789      0.0.0.0:*    users:(("openclaw-gatewa",pid=45576,fd=29))            
tcp   LISTEN 0      511                   [::1]:18789         [::]:*    users:(("openclaw-gatewa",pid=45576,fd=28))  

I tried doctor, restarting the gateway, and restarting the VPS; nothing helped.
This started happening after the latest update.

cosmic basinBOT
#

That ss output is not two gateways — it’s one openclaw-gateway process (pid=45576) listening on both IPv4 loopback (127.0.0.1) and IPv6 loopback (::1). That part is normal.
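
A quick way to check this without reading the columns by eye is to count the distinct PIDs in the ss output. This is a minimal sketch, assuming the `pid=N` field format shown in the paste above; count_listener_pids is a hypothetical helper name.

```shell
# Hypothetical helper: count distinct PIDs in `ss -tulpn` style output on stdin.
# Two listener rows with the same pid= value still count as one process.
count_listener_pids() {
  grep -o 'pid=[0-9][0-9]*' | sort -u | wc -l | tr -d ' '
}
```

Used as `sudo ss -tulpn | grep ':18789' | count_listener_pids`, a result of 1 means a single process holds the port, even when it listens on both 127.0.0.1 and ::1.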

The warning is probably coming from OpenClaw’s probe logic seeing more than one reachable target (e.g. “configured remote” + “local loopback”), not necessarily two processes.

Can you paste these outputs?

openclaw --version
openclaw gateway status --deep
openclaw gateway probe

If --deep shows “Other gateway-like services detected” or both system/user systemd services, then we can clean up the duplicate supervisor. If it only shows one PID like your ss output, this smells like a latest-update false-positive/regression rather than an actual duplicate gateway.

Docs reference: https://docs.openclaw.ai/gateway/multiple-gateways and https://docs.openclaw.ai/gateway/gateway-lock 🦐

vernal jackal
#

OpenClaw 2026.4.26 (be8c246)

Service: systemd (enabled)
File logs: /tmp/openclaw/openclaw-2026-04-29.log
Command: /usr/local/bin/node /usr/local/lib/node_modules/openclaw/dist/index.js gateway --port 18789
Service file: ~/.config/systemd/user/openclaw-gateway.service
Service env: OPENCLAW_GATEWAY_PORT=18789

Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json

Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.

Runtime: running (pid 918, state active, sub running, last exit 0, reason 0)
Warm-up: launch agents can take a few seconds. Try again shortly.
Connectivity probe: failed
Probe target: ws://127.0.0.1:18789
  gateway closed (1000): 
Capability: unknown

Port 18789 is already in use.
- pid 918 root: openclaw-gateway (127.0.0.1:18789)
- Gateway already running locally. Stop it (openclaw gateway stop) or use a different port.
Listening: 127.0.0.1:18789
Gateway Status
Reachable: no
Capability: unknown
Probe budget: 3000ms

Discovery (this machine)
Found 0 gateways via Bonjour (local.)
Tip: if the gateway is remote, mDNS won’t cross networks; use Wide-Area Bonjour (split DNS) or SSH tunnels.

Targets
Local loopback ws://127.0.0.1:18789
  Connect: failed - gateway closed (1000):  · Capability: unknown

cosmic basinBOT
#

Yep — this confirms it’s not actually two gateways.

Port 18789 is already in use is pointing at the same gateway process that Runtime: running reports. The real failure is this part:

Connectivity probe: failed
gateway closed (1000):

So: gateway process starts, binds the port, but the WebSocket probe is being closed / timing out during startup. The Pastebin shows the same pattern with a timeout, not a second process. (pastebin.com)
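
That same-process check can be scripted against saved status text. This is a rough sketch, assuming the output keeps the `Runtime: running (pid N, ...)` and `- pid N ...` line shapes seen in this thread; same_gateway_pid is a hypothetical helper name, not an OpenClaw command.

```shell
# Hypothetical helper: read status text on stdin and succeed only when the
# PID on the "Runtime: running" line matches the PID on the port-holder line.
# The line shapes are assumptions copied from the output pasted in this thread.
same_gateway_pid() {
  status_text=$(cat)
  runtime_pid=$(printf '%s\n' "$status_text" |
    sed -n 's/^Runtime: running (pid \([0-9][0-9]*\).*/\1/p' | head -n 1)
  port_pid=$(printf '%s\n' "$status_text" |
    sed -n 's/^[[:space:]]*- pid \([0-9][0-9]*\) .*/\1/p' | head -n 1)
  # Fail when no runtime PID was found, so missing lines do not count as a match.
  [ -n "$runtime_pid" ] && [ "$runtime_pid" = "$port_pid" ]
}
```

For example, `openclaw gateway status --deep | same_gateway_pid && echo "one gateway owns the port"` only prints the message when both lines name the same PID.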

Please paste this next:

openclaw status --all
openclaw logs --follow

If logs are huge, grab the startup errors with:

grep -Ei "error|warn|gateway|listen|ready|lock|EADDRINUSE|timeout|closed|bootstrap|Unhandled" /tmp/openclaw/openclaw-2026-04-29.log | tail -150

Also useful read-only check:

ps -ef | grep '[o]penclaw.*gateway'
systemctl --user status openclaw-gateway --no-pager -l
sudo systemctl status openclaw-gateway --no-pager -l

OpenClaw’s docs say multiple gateways should only be intentional with isolated profiles/ports, and gateway status --deep is the right check for stale duplicate services. Your output doesn’t show that yet — it shows one bound gateway that isn’t responding correctly. (docs.openclaw.ai)

vernal jackal
#
Gateway connection details:
  Gateway target: ws://127.0.0.1:18789
  Source: local loopback
  Config: /root/.openclaw/openclaw.json
  Bind: loopback

✓ Config: /root/.openclaw/openclaw.json
✓ Secret diagnostics (0)
✓ Restart sentinel: none
! Port 18789
  Port 18789 is already in use.
  - pid 918 root: openclaw-gateway (127.0.0.1:18789)
  - Gateway already running locally. Stop it (openclaw gateway stop) or use a different port.
! Tailscale exposure: off · daemon unknown
✓ Skills: 10 eligible · 0 missing · /root/.openclaw/workspace
✓ Plugin compatibility (none)

Gateway restart attempts (tail): /root/.openclaw/logs/gateway-restart.log
  [2026-04-26T17:38:10Z] openclaw restart attempt source=update target=openclaw-gateway.service
  [2026-04-26T17:38:10Z] openclaw restart done source=update
  [2026-04-26T17:59:21Z] openclaw restart attempt source=update target=openclaw-gateway.service
  [2026-04-26T17:59:21Z] openclaw restart done source=update
  [2026-04-29T12:41:16Z] openclaw restart attempt source=update target=openclaw-gateway.service
  [2026-04-29T12:41:16Z] openclaw restart done source=update
! Channel issues skipped (gateway unreachable)

Gateway health:
  gateway closed (1000): 

Pasteable debug report. Auth tokens redacted.

cosmic basinBOT
#

This output still points to one gateway process, not two:

  • Runtime: running (pid 918)
  • Port 18789 ... pid 918 root: openclaw-gateway

So the “port already in use / gateway already running” bit is a red herring: the port is in use by the gateway itself.

The real problem is:

Gateway health:
  gateway closed (1000):

That means the gateway accepts the socket but closes during/around the connect handshake. If it only becomes healthy ~10 minutes later, something in startup is probably blocking readiness.

Please grab these next — they should show the actual reason:

curl -i http://127.0.0.1:18789/healthz
curl -i http://127.0.0.1:18789/readyz

journalctl --user -u openclaw-gateway -b --no-pager -n 250
tail -n 250 /tmp/openclaw/openclaw-2026-04-29.log

And this filtered one is especially useful:

grep -Ei "gateway ready|startup|sidecar|closed before connect|handshake timeout|plugin|channel|error|warn|uncaught|unhandled" /tmp/openclaw/openclaw-2026-04-29.log | tail -200
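
To measure how long the gateway actually takes to become ready after a restart, a probe can be retried in a loop. This is a minimal sketch: wait_until_ok is a hypothetical helper name, and it assumes the /readyz endpoint mentioned above returns a non-2xx status until the gateway is ready, which this thread does not confirm.

```shell
# Hypothetical helper: retry a probe command until it succeeds or a timeout passes.
# Usage: wait_until_ok <timeout_seconds> <interval_seconds> <command> [args...]
# On success, prints the approximate seconds waited (sleep time only, not the
# time spent inside the probe itself). Returns 1 on timeout.
wait_until_ok() {
  timeout_s=$1
  interval_s=$2
  shift 2
  waited=0
  while ! "$@" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep "$interval_s"
    waited=$((waited + interval_s))
  done
  printf '%s\n' "$waited"
}
```

For example, running `wait_until_ok 900 5 curl -fsS http://127.0.0.1:18789/readyz` right after a restart gives a rough time-to-ready figure that can go into a bug report alongside the logs.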

Also: you’re running this as root (/root/.openclaw, pid owner root). That’s okay if intentional, but if you ever installed OpenClaw as a normal user too, that’s where duplicate service confusion can creep in. The current output itself only shows the root-owned gateway. 🦐

vernal jackal
#

So basically it kept repeating this line:

Apr 29 19:17:30 localhost node[918]: 2026-04-29T19:17:30.934+02:00 [plugins] device-pair: notify poll failed: Failed to parse JSON file: /root/.openclaw/devices/pending.json | Unexpected end of JSON input

Yeah, it looks like the issue was on my end: I had emptied the file that held the stuck pairing device, when it would have been better to delete it.
Now the gateway works fine with no more errors. I can open the Control UI through an SSH tunnel, and my agent responds to me.
Thank you very much, you are a very good crustacean.

cosmic basinBOT
#

Nice, glad we found it 🦐

Yep: an empty/truncated pending.json is invalid JSON, so deleting it is safer than leaving a zero-byte file there. That explains why the gateway was listening on the port but closing health/probe connections during startup.
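
For anyone hitting the same symptom, blank state files can be spotted with a small check. This is a minimal sketch: find_blank_json is a hypothetical helper name, and it only catches zero-byte or whitespace-only files. A file that is truncated but non-empty would still need a real JSON parser to detect.

```shell
# Hypothetical helper: list *.json files in a directory that are zero-byte
# or contain only whitespace. Such files can never parse as valid JSON.
find_blank_json() {
  dir=$1
  for f in "$dir"/*.json; do
    # Skip the unexpanded glob pattern when the directory has no .json files.
    [ -f "$f" ] || continue
    # grep succeeds only if the file has at least one non-whitespace character.
    if ! grep -q '[^[:space:]]' "$f"; then
      printf '%s\n' "$f"
    fi
  done
}
```

For example, `find_blank_json ~/.openclaw/devices` would list an emptied pending.json like the one in this thread. Review each hit before deleting anything.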

If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow indexes it for the next person. Happy to help!