Timed out after 60s waiting for gateway port 18789 to become healthy.
Gateway process is running but port 18789 is still free (startup hang/crash loop or very slow VM startup).
Service runtime: status=running, state=active, pid=14222
Gateway port 18789 status: free.
Gateway restart timed out after 60s waiting for health checks.
#Gateway Timeout Issue
That usually means the supervisor is alive, but the gateway itself never finishes binding to port 18789, so this is more likely a startup crash/hang than a port conflict.
First, what version are you on?
openclaw --version
Then please paste these, in order:
openclaw status --all
openclaw gateway status --deep
openclaw doctor
openclaw logs --follow
For logs --follow, just grab the first startup error after a restart, then Ctrl+C.
Common causes I’d check next:
- old/stale service config after an upgrade
- service pointing at the wrong Node/OpenClaw install, especially with nvm/fnm/asdf
- crash before bind
- very slow VM boot/startup
Paste those outputs and I’ll help narrow it down.
openclaw --version
- OpenClaw 2026.4.14 (323493f)
Nice, that version is recent enough, so I wouldn’t blame an old known bug first.
Next I’d want these 3:
openclaw gateway status --deep
openclaw doctor
openclaw logs --follow
If openclaw logs --follow doesn’t show anything useful because RPC is down, use the file log fallback:
tail -f "$(ls -t /tmp/openclaw/openclaw-*.log | head -1)"
Please paste:
- the full gateway status --deep
- anything suspicious from doctor
- the first startup error from the logs after a restart
Also, if doctor offers to rewrite/repair the service config, that’s worth accepting.
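If the log is long, a small helper can pull out just that first startup error. This is only a sketch: first_startup_error is a made-up name, it assumes each boot starts with a line containing "[gateway] loading configuration" (as in the foreground output later in this thread), and it assumes failures contain "error", "fatal", or "exception". Adjust the patterns if your log looks different.

```shell
# Hypothetical helper: print the first error-looking line since the most
# recent boot marker in a log file. Returns 1 if no such line is found.
# The boot marker and the error keywords are assumptions, not OpenClaw guarantees.
first_startup_error() {
  awk '
    # A new boot marker resets the search, so only the latest boot counts.
    /\[gateway\] loading configuration/ { first = ""; next }
    # Remember only the first matching line after the marker.
    first == "" && tolower($0) ~ /error|fatal|exception/ { first = $0 }
    END { if (first != "") print first; else exit 1 }
  ' "$1"
}
```

Usage, with the same newest-file lookup as above: first_startup_error "$(ls -t /tmp/openclaw/openclaw-*.log | head -1)"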
Config warnings:\n- plugins.entries.acpx: plugin disabled (disabled in config) but config is present
- plugins.entries.memory-core: plugin disabled (disabled in config) but config is present
- plugins.entries.memory-wiki: plugin disabled (disabled in config) but config is present
- plugins.entries.telegram: plugin disabled (disabled in config) but config is present
🦞 OpenClaw 2026.4.14 (323493f) — We ship features faster than Apple ships calculator updates.
│
◇
Service: LaunchAgent (loaded)
File logs: /tmp/openclaw/openclaw-2026-04-15.log
Command: /Users/<user>/.nvm/versions/node/v24.13.1/bin/node /Users/<user>/.nvm/versions/node/v24.13.1/lib/node_modules/openclaw/dist/index.js gateway --port 18789
Service file: ~/Library/LaunchAgents/ai.openclaw.gateway.plist
Service env: OPENCLAW_GATEWAY_PORT=18789
Service config looks out of date or non-standard.
Service config issue: Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled. (Run `openclaw gateway install --force` to remove embedded service token.)
Service config issue: Gateway service PATH includes version managers or package managers; recommend a minimal PATH. (/Users/<user>/.nvm/versions/node/v24.13.1/bin)
Service config issue: Gateway service uses Node from a version manager; it can break after upgrades. (/Users/<user>/.nvm/versions/node/v24.13.1/bin/node)
Service config issue: System Node 22 LTS (22.14+) or Node 24 not found; install it before migrating away from version managers.
Recommendation: run "openclaw doctor" (or "openclaw doctor --repair").
Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json
Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.
Runtime: running (pid 14222, state active)
Warm-up: launch agents can take a few seconds. Try again shortly.
RPC probe: failed
RPC target: ws://127.0.0.1:18789
timeout
Port 18789 is already in use.
- pid 14222 <user>: openclaw-gateway (127.0.0.1:18789)
- Gateway already running locally. Stop it (openclaw gateway stop) or use a different port.
Listening: 127.0.0.1:18789
Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting
Yep, the important part is not the plugin warnings, those are just noise here.
The real red flags are these:
- LaunchAgent is running from nvm
- service config is flagged as out of date/non-standard
- token is embedded in the service
- macOS launchd + version-managed Node is a pretty common footgun
So my guess is the gateway is dying before it ever binds 18789.
I’d try this next:
brew install node
openclaw doctor --repair
openclaw gateway install --force
openclaw gateway restart
If it still fails, then run this in the foreground and paste the first error:
openclaw gateway --verbose
And/or paste the top startup failure from:
tail -n 120 /tmp/openclaw/openclaw-2026-04-15.log
So, short version: this looks much more like a broken LaunchAgent/runtime setup than a gateway port problem.
Ah, this changes the picture a bit.
The gateway is binding 127.0.0.1:18789 now, but the RPC probe is timing out. So this looks less like “can’t start” and more like “starts listening, then hangs during/after startup.”
A couple things to grab next.
1) Service logs
tail -n 120 ~/.openclaw/logs/gateway.err.log
tail -n 120 ~/.openclaw/logs/gateway.log
tail -n 120 /tmp/openclaw/openclaw-gateway.log
2) Smoke-test the core gateway on a different port, with channels/canvas skipped
OPENCLAW_SKIP_CHANNELS=1 \
OPENCLAW_SKIP_CANVAS_HOST=1 \
openclaw gateway --port 18999 --bind loopback
Then in another terminal:
openclaw gateway call health --url ws://127.0.0.1:18999 --timeout 3000
What this tells us:
- if 18999 works, something in normal startup, likely channel/plugin/canvas/service env, is wedging the main service
- if 18999 also hangs, then it’s a more core gateway/runtime problem
Also, the nvm LaunchAgent setup is still suspicious, but this latest output says the immediate blocker is the RPC hang, not the bind itself.
Paste the log tails plus whether the 18999 smoke test answers, and we can narrow it down fast.
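If you want to separate "port never binds" from "port binds but RPC hangs" without going through the CLI, a plain TCP check is enough. A minimal sketch, assuming bash (it relies on bash's /dev/tcp redirection); wait_for_port is a made-up helper name, not an OpenClaw command:

```shell
# Hypothetical helper: return 0 as soon as host:port accepts a TCP connection,
# or 1 if it is still closed after the given number of seconds (default 60).
# Uses bash's /dev/tcp redirection, so run it with bash.
wait_for_port() {
  host=$1
  port=$2
  timeout=${3:-60}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # The subshell opens and immediately drops the connection.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}
```

For example, wait_for_port 127.0.0.1 18789 60 succeeding while the RPC probe still times out would point at a hang after bind rather than a failure to bind.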
11:34:24 [gateway] loading configuration…
11:34:24 Config warnings:\n- plugins.entries.acpx: plugin disabled (disabled in config) but config is present
- plugins.entries.memory-core: plugin disabled (disabled in config) but config is present
- plugins.entries.memory-wiki: plugin disabled (disabled in config) but config is present
- plugins.entries.telegram: plugin disabled (disabled in config) but config is present
11:34:24 [gateway] resolving authentication…
11:34:24 [gateway] starting...
11:34:43 [lcm] Plugin loaded (enabled=true, db=/Users/<user>/.openclaw/lcm.db, threshold=0.75)
11:34:43 [lcm] Compaction summarization model: openai-codex/gpt-5.4 (default)
11:34:43 [gateway] starting HTTP server...
11:34:44 [gateway] agent model: openai-codex/gpt-5.4
11:34:44 [gateway] ready (5 plugins: device-pair, discord, lossless-claw, phone-control, talk-voice; 19.8s)
11:34:44 [gateway] log file: /tmp/openclaw/openclaw-2026-04-15.log
11:34:44 [gateway] starting channels and sidecars...
11:34:44 [gateway/channels] skipping channel start (OPENCLAW_SKIP_CHANNELS=1 or OPENCLAW_SKIP_PROVIDERS=1)
11:34:45 [heartbeat] started
Nice, that narrows it down a lot.
That smoke test looks healthy, so the core gateway is fine. The problem is very likely in the normal post-startup path, meaning when it begins channels/sidecars, with Discord or Discord voice as the first suspects.
Next I’d do a normal foreground boot and capture the last lines after:
[gateway] starting channels and sidecars...
openclaw gateway stop
openclaw gateway --verbose
Then paste the last 20 to 40 lines, especially anything tagged like:
[discord]
[gateway/channels]
[voice]
[talk]
So, current diagnosis:
- ✅ gateway core boots
- ✅ HTTP server comes up
- ✅ skip-channels mode is healthy
- ❌ something in normal channel/sidecar startup is wedging the RPC path
If you paste that normal --verbose tail, we should be able to pin down whether it’s Discord itself, Discord voice, or another sidecar.
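If you saved the foreground output to a file, this sketch prints only what came after the most recent sidecar marker. after_sidecars is a hypothetical helper, and the marker text is taken from the log lines in this thread:

```shell
# Hypothetical helper: print every line that follows the LAST occurrence of
# the "starting channels and sidecars" marker in the given file.
after_sidecars() {
  awk '
    # Each new marker discards what was collected for earlier boots.
    /starting channels and sidecars/ { n = 0; seen = 1; next }
    seen { buf[++n] = $0 }
    END { for (i = 1; i <= n; i++) print buf[i] }
  ' "$1"
}
```

Usage: after_sidecars /tmp/openclaw/openclaw-2026-04-15.log | tail -n 40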
openclaw gateway call health --url ws://127.0.0.1:18999 --timeout 3000
Config warnings:\n- plugins.entries.acpx: plugin disabled (disabled in config) but config is present
- plugins.entries.memory-core: plugin disabled (disabled in config) but config is present
- plugins.entries.memory-wiki: plugin disabled (disabled in config) but config is present
- plugins.entries.telegram: plugin disabled (disabled in config) but config is present
🦞 OpenClaw 2026.4.14 (323493f)
I can't fix your code taste, but I can fix your build and your backlog.
│
◇ Config warnings ──────────────────────────────────────────────╮
│ │
│ - plugins.entries.acpx: plugin disabled (disabled in config) │
│ but config is present │
│ - plugins.entries.memory-core: plugin disabled (disabled in │
│ config) but config is present │
│ - plugins.entries.memory-wiki: plugin disabled (disabled in │
│ config) but config is present │
│ - plugins.entries.telegram: plugin disabled (disabled in │
│ config) but config is present │
│ │
├────────────────────────────────────────────────────────────────╯
│
◇
Gateway call failed: Error: gateway url override requires explicit credentials
Fix: pass --token or --password (or gatewayToken in tools).
Config: /Users/<user>/.openclaw/openclaw.json
Yep, that specific error is expected.
When you pass an explicit --url, OpenClaw wants explicit auth too, so the health call needs a local token/password. Don’t paste that token here.
If you want to finish that check locally, use your configured token, for example:
openclaw gateway call health \
--url ws://127.0.0.1:18999 \
--token "$(openclaw config get gateway.auth.token)" \
--timeout 3000
But honestly, the earlier ready ... skipping channel start log already told us the important part: skip-channels boot is healthy.
So the next useful step is still the same one:
openclaw gateway stop
openclaw gateway --verbose
Then paste the last lines after:
[gateway] starting channels and sidecars...
That should show which startup path is wedging, most likely Discord or Discord voice.
11:43:04 [gateway] starting HTTP server...
11:43:04 [canvas] host mounted at http://127.0.0.1:18789/__openclaw__/canvas/ (root /Users/<user>/.openclaw/canvas)
11:43:04 [gateway] agent model: openai-codex/gpt-5.4
11:43:04 [gateway] ready (4 plugins: device-pair, lossless-claw, phone-control, talk-voice; 13.4s)
11:43:04 [gateway] log file: /tmp/openclaw/openclaw-2026-04-15.log
11:43:04 [gateway] starting channels and sidecars...
11:43:05 [bedrock-mantle-discovery] Mantle IAM token generation unavailable
11:43:20 Registered plugin command: /pair (plugin: device-pair)
11:43:20 Registered plugin command: /phone (plugin: phone-control)
11:43:20 Registered plugin command: /voice (plugin: talk-voice)
11:43:21 [plugins] [lcm] Migration successful — 28 tables: conversations, sqlite_sequence, messages, summaries, message_parts, summary_messages, summary_parents, context_items, large_files, messages_fts, messages_fts_data, messages_fts_idx, messages_fts_content, messages_fts_docsize, messages_fts_config, summaries_fts, summaries_fts_data, summaries_fts_idx, summaries_fts_content, summaries_fts_docsize, summaries_fts_config, conversation_bootstrap_state, summaries_fts_cjk, summaries_fts_cjk_data, summaries_fts_cjk_idx, summaries_fts_cjk_content, summaries_fts_cjk_docsize, summaries_fts_cjk_config
11:43:21 Registered plugin command: /lcm (plugin: lossless-claw)
11:43:21 [heartbeat] started
11:43:21 [ws] ← open remoteAddr=127.0.0.1 remotePort=62645 localAddr=127.0.0.1 localPort=18789 endpoint=127.0.0.1:62645->127.0.0.1:18789 conn=6b722b65…73a5
11:43:22 [ws] ← connect client=openclaw-tui clientDisplayName=openclaw-tui version=2026.4.12 mode=ui clientId=openclaw-tui platform=darwin auth=token
11:43:22 [ws] → hello-ok methods=128 events=24 presence=2 stateVersion=2
11:43:22 [ws] → event health seq=1 clients=1 presenceVersion=2 healthVersion=3
11:43:22 [ws] ⇄ res ✓ agents.list 2ms id=489d01c4…887c
11:43:22 [bedrock-mantle-discovery] Mantle IAM token generation unavailable
Okay, this is actually really helpful.
That foreground run is healthy:
- gateway says ready
- the WS client connects
- hello-ok returns
- health event is sent
So the gateway itself is fine when started manually. That points back to the LaunchAgent/service install as the main problem.
Two notes:
- bedrock-mantle-discovery ... unavailable is usually harmless unless you’re using Bedrock.
- In this run, Discord is missing from the loaded plugin list, which is a bit weird since it showed up earlier. So there may be a service/env/auth difference too.
What I’d do next
Short-term workaround: if you just need it working now, keep using the foreground gateway.
Proper fix: repair the macOS service install.
brew install node
openclaw doctor --repair
openclaw gateway install --force
openclaw gateway restart
If it still acts weird after that, go one step harder:
openclaw gateway uninstall
openclaw gateway install
openclaw gateway restart
One extra check while the foreground gateway is up
openclaw channels status --probe
Please paste the Discord part of that output, because the missing Discord plugin in this healthy run is the one odd clue left.
So, current diagnosis: not a broken gateway, mostly a broken LaunchAgent/service setup, with a possible Discord auth/config mismatch on top.
11:48:00 [diagnostic] lane task error: lane=main durationMs=10246 error="Error: Context engine "lossless-claw" factory returned an invalid ContextEngine: info.id must match registered id "lossless-claw"."
11:48:00 [diagnostic] lane task error: lane=session:agent:<agent1>:main durationMs=10249 error="Error: Context engine "lossless-claw" factory returned an invalid ContextEngine: info.id must match registered id "lossless-claw"."
11:48:01 [heartbeat] failed: Outbound not configured for channel: discord
Aha, that is a real smoking gun.
This part is broken:
Context engine "lossless-claw" factory returned an invalid ContextEngine:
info.id must match registered id "lossless-claw"
So right now I’d stop chasing ports and treat lossless-claw as the bad component. That plugin is registering an invalid context engine, which can absolutely cause weird runtime behavior.
Fastest unblock
Switch back to the built-in context engine and restart:
openclaw config set plugins.slots.contextEngine legacy
openclaw gateway restart
If you want to be extra safe, also disable the plugin for now:
openclaw plugins disable lossless-claw
openclaw gateway restart
After it’s stable
Then inspect/update the plugin:
openclaw plugins inspect lossless-claw
openclaw plugins update lossless-claw
If it still comes back broken after update, I’d leave it off and file a bug with the plugin maintainer.
Also, this line:
[heartbeat] failed: Outbound not configured for channel: discord
looks like downstream noise, not the root cause.
So my recommendation is:
- set context engine back to legacy
- restart
- check openclaw gateway status --deep again
If you paste that next status output, we can confirm whether this was the blocker.
still seeing this error
Restarted LaunchAgent: gui/501/ai.openclaw.gateway
Timed out after 60s waiting for gateway port 18789 to become healthy.
Gateway process is running but port 18789 is still free (startup hang/crash loop or very slow VM startup).
Service runtime: status=running, state=active, pid=68565
Gateway port 18789 status: free.
Gateway restart timed out after 60s waiting for health checks.
Got it. Then the lossless-claw issue is real, but it’s not the thing causing this specific LaunchAgent timeout.
At this point the pattern is:
- manual foreground gateway works
- LaunchAgent gateway does not
So this is now looking very strongly like a service/runtime path or launchd env problem.
Please paste these next, all together:
openclaw gateway status --deep
tail -n 120 ~/.openclaw/logs/gateway.err.log
tail -n 120 ~/.openclaw/logs/gateway.log
which node
which openclaw
node -v
What I’m looking for:
- whether the LaunchAgent is still pointing at ~/.nvm/...
- whether launchd stderr shows the real startup failure
- whether your interactive shell and the service are using different Node/OpenClaw binaries
If you need it working right now, the good news is the foreground gateway path already proved healthy, so running:
openclaw gateway --verbose
is a valid temporary workaround while we fix the LaunchAgent.
Service: LaunchAgent (loaded)
File logs: /tmp/openclaw/openclaw-2026-04-15.log
Command: /Users/<user>/.nvm/versions/node/v24.13.1/bin/node /Users/<user>/.nvm/versions/node/v24.13.1/lib/node_modules/openclaw/dist/index.js gateway --port 18789
Service file: ~/Library/LaunchAgents/ai.openclaw.gateway.plist
Service env: OPENCLAW_GATEWAY_PORT=18789
Service config looks out of date or non-standard.
Service config issue: Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled. (Run `openclaw gateway install --force` to remove embedded service token.)
Service config issue: Gateway service PATH includes version managers or package managers; recommend a minimal PATH. (/Users/<user>/.nvm/versions/node/v24.13.1/bin)
Service config issue: Gateway service uses Node from a version manager; it can break after upgrades. (/Users/<user>/.nvm/versions/node/v24.13.1/bin/node)
Recommendation: run "openclaw doctor" (or "openclaw doctor --repair").
Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json
Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.
Runtime: running (pid 68565, state active)
Warm-up: launch agents can take a few seconds. Try again shortly.
RPC probe: failed
RPC target: ws://127.0.0.1:18789
timeout
Port 18789 is already in use.
- pid 68565 <user>: openclaw-gateway (127.0.0.1:18789)
- Gateway already running locally. Stop it (openclaw gateway stop) or use a different port.
Listening: 127.0.0.1:18789
Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting
Yep, this confirms it.
The LaunchAgent is still pinned to:
/Users/<user>/.nvm/versions/node/v24.13.1/...
So the service never actually moved off nvm, and that’s the thing I’d fix now.
Recommended fix
Use the official installer so OpenClaw gets its own stable runtime, then reinstall the service:
curl -fsSL --proto '=https' --tlsv1.2 https://openclaw.ai/install-cli.sh | bash
openclaw gateway install --force
openclaw gateway restart
Then verify:
openclaw gateway status --deep
What you want to see is that the Command: line no longer points into ~/.nvm/....
Why this matters
Right now the service is launching from a version-manager Node path, and the audit is still flagging exactly that:
- embedded gateway token
- non-standard PATH
- Node from nvm
That’s the strongest signal in the whole thread.
Temporary workaround
Until that’s fixed, the manual foreground run is still your reliable path:
openclaw gateway --verbose
If you rerun status --deep after the installer + reinstall, paste just the Command: block and I’ll tell you immediately whether it’s finally on the right runtime.
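If you'd rather check the Command: line yourself, here is a rough classifier. It is a sketch under assumptions: service_runtime_kind is a made-up name, it expects the status output on stdin with a "Command:" line like the one pasted above, and the directory names it treats as version managers (.nvm, .fnm, .asdf) are my guess at the common defaults.

```shell
# Hypothetical helper: read `gateway status --deep` output on stdin and
# classify the runtime named on its "Command:" line.
# Prints version-manager, openclaw-managed, other, or unknown (exit 1).
service_runtime_kind() {
  cmd=$(sed -n 's/^[[:space:]]*Command:[[:space:]]*//p' | head -n 1)
  case $cmd in
    '') echo "unknown"; return 1 ;;
    # Checked first so a mixed command line is still flagged.
    */.nvm/*|*/.fnm/*|*/.asdf/*) echo "version-manager" ;;
    */.openclaw/*) echo "openclaw-managed" ;;
    *) echo "other" ;;
  esac
}
```

Usage: openclaw gateway status --deep | service_runtime_kind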
what does curl -fsSL --proto '=https' --tlsv1.2 https://openclaw.ai/install-cli.sh | bash do
It downloads the OpenClaw CLI installer script over HTTPS and runs it with bash.
For install-cli.sh, per the docs, it:
- installs a local Node runtime under ~/.openclaw/tools/node-v...
- installs OpenClaw under ~/.openclaw
- writes a wrapper at ~/.openclaw/bin/openclaw
- if a gateway service from that prefix is already loaded, it does a best-effort:
openclaw gateway install --force
openclaw gateway restart
That’s why I suggested it here, it’s a clean way to get your LaunchAgent off the current ~/.nvm/... runtime.
The curl flags mean:
-f: fail on HTTP errors
-sS: quiet output, but still show errors
-L: follow redirects
--proto '=https': only allow HTTPS
--tlsv1.2: require TLS 1.2+
If you want to inspect it before running, totally reasonable:
curl -fsSL --proto '=https' --tlsv1.2 https://openclaw.ai/install-cli.sh -o /tmp/install-cli.sh
less /tmp/install-cli.sh
Or just see its help:
curl -fsSL --proto '=https' --tlsv1.2 https://openclaw.ai/install-cli.sh | bash -s -- --help
should i install inside .openclaw?
Nope, you don’t need to cd into .openclaw first.
Just run it from a normal terminal anywhere, and install-cli.sh will use ~/.openclaw as the default prefix automatically unless you override it with --prefix.
So, short version:
- yes, it installs to ~/.openclaw by default
- no, you do not need to manually install “inside” it yourself
If you want, the normal command is just:
curl -fsSL --proto '=https' --tlsv1.2 https://openclaw.ai/install-cli.sh | bash
Then check:
openclaw gateway status --deep
and make sure the Command: line is no longer pointing into ~/.nvm/....
but this command won't override my current openclaw config correct?
Correct, by default it should not reset or overwrite your existing ~/.openclaw/openclaw.json.
For install-cli.sh, the main effects are:
- install/update the OpenClaw runtime under ~/.openclaw
- write/update the ~/.openclaw/bin/openclaw wrapper
- best-effort refresh/restart the gateway service
What it doesn’t do by default is run onboarding and replace your config. install-cli.sh defaults to no onboarding.
The main thing it can change automatically is the service install, not your normal config.
If you want to be extra safe first, back up the config:
cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak
So, short answer: service/runtime changes yes, config reset no.
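If you expect to repeat this a few times, a timestamped copy avoids overwriting an earlier backup. A minimal sketch; backup_config is a made-up helper name, and the default path is simply the config location shown in the status output:

```shell
# Hypothetical helper: copy a config file to <file>.<timestamp>.bak and
# print the backup path. Defaults to ~/.openclaw/openclaw.json.
backup_config() {
  src=${1:-"$HOME/.openclaw/openclaw.json"}
  if [ ! -f "$src" ]; then
    echo "no config at $src" >&2
    return 1
  fi
  dest="$src.$(date +%Y%m%d-%H%M%S).bak"
  # -p keeps the original mode and timestamps on the copy.
  cp -p "$src" "$dest" && echo "$dest"
}
```

Running backup_config with no arguments backs up the default config and prints where the copy went.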
> curl -fsSL --proto '=https' --tlsv1.2 https://openclaw.ai/install-cli.sh | bash
Installing Node 22.22.0 (user-space)...
Installing OpenClaw (latest)...
added 752 packages in 3m
Refreshing loaded gateway service...
Warning: gateway service restart failed; continuing.
OpenClaw installed (OpenClaw 2026.4.14 (323493f)).
Good, that got the new runtime installed. The warning just means the installer couldn’t fully flip the running LaunchAgent automatically.
Now use the new wrapper explicitly, so we don’t accidentally call the old nvm install from your current shell:
~/.openclaw/bin/openclaw --version
~/.openclaw/bin/openclaw gateway install --force
~/.openclaw/bin/openclaw gateway restart
~/.openclaw/bin/openclaw gateway status --deep
What we want to confirm in the final output:
- the Command: line should no longer point into ~/.nvm/...
- ideally it should point into something under ~/.openclaw/...
Paste the Command: block from that last status --deep and I’ll sanity-check it.
Restarted LaunchAgent: gui/501/ai.openclaw.gateway
Timed out after 60s waiting for gateway port 18789 to become healthy.
Gateway process is running but port 18789 is still free (startup hang/crash loop or very slow VM startup).
Service runtime: status=running, state=active, pid=77859
Gateway port 18789 status: free.
Gateway restart timed out after 60s waiting for health checks.
Tip: openclaw gateway status --deep
Tip: openclaw doctor
We’re close, but that restart output alone isn’t enough.
What I need now is the new service details after the installer, specifically whether the LaunchAgent is still pointing at ~/.nvm/... or finally moved to ~/.openclaw/....
Please run these exact commands with the new wrapper and paste them:
~/.openclaw/bin/openclaw gateway status --deep
tail -n 120 ~/.openclaw/logs/gateway.err.log
tail -n 120 ~/.openclaw/logs/gateway.log
The key thing I’m looking for is the Command: line in status --deep.
- if it still says ~/.nvm/..., the service didn’t actually flip
- if it says ~/.openclaw/..., then we’ve moved past the runtime issue and the stderr log should show the real startup failure
That’ll tell us which branch to take next.
I don't see Command: logs
No worries, then just paste the full output of:
~/.openclaw/bin/openclaw gateway status --deep
If you specifically want the actual process command line, use one of these:
pgrep -af openclaw
or, with the PID from the restart output:
ps -p 77859 -o command=
(Use the current PID if it changed.)
And I still want the stderr tail too:
tail -n 120 ~/.openclaw/logs/gateway.err.log
So, if easiest, send these 3:
~/.openclaw/bin/openclaw gateway status --deep
pgrep -af openclaw
tail -n 120 ~/.openclaw/logs/gateway.err.log
That’ll give us the missing piece.
Config warnings:\n- plugins.entries.acpx: plugin disabled (disabled in config) but config is present
- plugins.entries.memory-core: plugin disabled (disabled in config) but config is present
- plugins.entries.memory-wiki: plugin disabled (disabled in config) but config is present
- plugins.entries.telegram: plugin disabled (disabled in config) but config is present
- plugins.entries.lossless-claw: plugin disabled (disabled in config) but config is present
🦞 OpenClaw 2026.4.14 (323493f) — Making 'I'll automate that later' happen now.
│
◇
Service: LaunchAgent (loaded)
File logs: /tmp/openclaw/openclaw-2026-04-15.log
Command: /Users/<user>/.openclaw/tools/node-v22.22.0/bin/node /Users/<user>/.openclaw/lib/node_modules/openclaw/dist/index.js gateway --port 18789
Service file: ~/Library/LaunchAgents/ai.openclaw.gateway.plist
Service env: OPENCLAW_GATEWAY_PORT=18789
Service config looks out of date or non-standard.
Service config issue: Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled. (Run `openclaw gateway install --force` to remove embedded service token.)
Service config issue: Gateway service PATH includes version managers or package managers; recommend a minimal PATH. (/Users/<user>/.nvm/versions/node/v24.13.1/bin)
Recommendation: run "openclaw doctor" (or "openclaw doctor --repair").
Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json
Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.
Runtime: running (pid 80393, state active)
Warm-up: launch agents can take a few seconds. Try again shortly.
RPC probe: failed
RPC target: ws://127.0.0.1:18789
timeout
Port 18789 is already in use.
- pid 80393 <user>: openclaw-gateway (127.0.0.1:18789)
- Gateway already running locally. Stop it (openclaw gateway stop) or use a different port.
Nice, that’s progress.
The important bit is this:
Command: /Users/<user>/.openclaw/tools/node-v22.22.0/bin/node ...
So the LaunchAgent is finally using the ~/.openclaw runtime, not the old ~/.nvm/... Node anymore.
What’s still wrong:
- service still has embedded OPENCLAW_GATEWAY_TOKEN
- service PATH still includes the old nvm path
- RPC probe still times out even though the port is listening
So the next move is:
~/.openclaw/bin/openclaw doctor --repair
~/.openclaw/bin/openclaw gateway restart
~/.openclaw/bin/openclaw gateway status --deep
If doctor --repair asks to rewrite/fix the service, accept it.
If it’s still broken after that, then I need the actual launchd stderr:
tail -n 120 ~/.openclaw/logs/gateway.err.log
tail -n 120 ~/.openclaw/logs/gateway.log
So, short version: we fixed the runtime path, now we need doctor to clean up the stale service env.
From the last 120 lines of both logs, these are the main issues worth calling out:
~/.openclaw/logs/gateway.err.log
Repeated WebSocket failures:
handshake timeout
closed before connect
a couple with reason=connect failed
Delivery failures to Discord:
Outbound not configured for channel: discord
these are marked as permanent and moved to failed/
Auth problem with Codex:
Token refresh failed: 401
message says the refresh token was already used and you need to sign in again
Model pricing bootstrap timeouts:
pricing bootstrap failed: TimeoutError
One LCM summarization fallback:
no summary model candidates resolved
FALLING BACK TO EMERGENCY TRUNCATION
Repeated config warnings:
several plugins are disabled but still have config present: acpx, memory-core, memory-wiki, telegram, lossless-claw
Non-fatal runtime warnings:
deprecated punycode
experimental SQLite warning
Shutdown warning once:
websocket server close exceeded 1000ms; forcing shutdown continuation
~/.openclaw/logs/gateway.log
Mostly normal lifecycle events: startup, ready, sidecars starting, restarts on signals.
One user-visible transient error:
chat.history unavailable during gateway startup
followed shortly by successful sessions.list, so this looks like a startup timing issue rather than a persistent failure.
Multiple SIGTERM / SIGUSR1 restarts are present, but they look intentional/graceful, not crashes.
Most important items to fix first:
Re-authenticate Codex because the refresh token is invalid/reused.
Investigate why WebSocket connections to 127.0.0.1:18789 are timing out or failing to connect.
Either configure Discord outbound or stop generating Discord deliveries.
Clean up the disabled plugin configs if those warnings are unwanted.
Yep, that’s useful, and I’d rank it a bit differently:
What actually matters first
- Re-auth Codex
  - your default model path has been openai-codex/gpt-5.4
  - the logs show refresh_token_reused / sign in again
  - that can cascade into pricing bootstrap failures and model-resolution weirdness
- Then re-run doctor + restart
  - now that the service is on ~/.openclaw/..., this is the right cleanup pass
What I would mostly ignore for now
- disabled plugin config warnings
- punycode deprecation
- experimental SQLite warning
- LCM emergency truncation
- Discord outbound failure, unless Discord delivery is something you actively need right now
That Discord error looks more like downstream noise:
Outbound not configured for channel: discord
My next exact commands
~/.openclaw/bin/openclaw models auth login --provider openai-codex
~/.openclaw/bin/openclaw doctor --repair
~/.openclaw/bin/openclaw gateway restart
~/.openclaw/bin/openclaw gateway status --deep
If doctor --repair offers to rewrite/fix the service, accept it.
So, short version: I’d fix Codex auth first, then doctor/restart, then see what’s left. The pricing/bootstrap and some of the startup weirdness may disappear once auth is clean.
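To see which of those symptoms dominates without summarizing by hand, you can count them straight from the log. A rough sketch: count_log_symptoms is a hypothetical helper, and the patterns are just the phrases quoted from your logs above, so extend the list as needed.

```shell
# Hypothetical helper: count how many lines in a log file contain each
# known symptom phrase. Output is "<count><TAB><phrase>", one per line.
count_log_symptoms() {
  for pattern in \
    'handshake timeout' \
    'closed before connect' \
    'Outbound not configured for channel' \
    'Token refresh failed' \
    'pricing bootstrap failed'
  do
    # -F matches the phrase literally; -c prints the matching line count.
    printf '%s\t%s\n' "$(grep -cF -- "$pattern" "$1")" "$pattern"
  done
}
```

Usage: count_log_symptoms ~/.openclaw/logs/gateway.err.log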
RPC probe: failed
RPC target: ws://127.0.0.1:18789
gateway closed (1006):
Gateway port 18789 is not listening (service appears running).
Last gateway error: - plugins.entries.lossless-claw: plugin disabled (disabled in config) but config is present
Logs: ~/.openclaw/logs/gateway.log
Errors: ~/.openclaw/logs/gateway.err.log
That lossless-claw line is almost certainly not the real cause, it’s just the last warning the status command saw.
The important part is this:
- gateway closed (1006)
- port 18789 is not listening
- service still looks “running”
That usually means the gateway is dying during boot or closing before the listener stays up.
At this point I need the raw log lines, not the status summary. Please do a fresh restart and paste the raw tail from both logs:
~/.openclaw/bin/openclaw gateway restart
tail -n 80 ~/.openclaw/logs/gateway.err.log
tail -n 80 ~/.openclaw/logs/gateway.log
Also, one yes/no question: did this succeed earlier?
~/.openclaw/bin/openclaw models auth login --provider openai-codex
If Codex re-auth never completed, that still matters. But right now the key thing is: we need the actual fatal lines from the newest restart, because the status summary is hiding the real failure.
- plugins.entries.lossless-claw: plugin disabled (disabled in config) but config is present
2026-04-15T14:00:38.098-04:00 [ws] closed before connect conn=0d8d248d-7470-4082-80c8-5ba5154372f0 peer=127.0.0.1:63282->127.0.0.1:18789 remote=127.0.0.1 fwd=n/a origin=n/a host=127.0.0.1:18789 ua=n/a code=1006 reason=n/a
2026-04-15T14:04:53.522-04:00 Config warnings:\n- plugins.entries.acpx: plugin disabled (disabled in config) but config is present
- plugins.entries.memory-core: plugin disabled (disabled in config) but config is present
- plugins.entries.memory-wiki: plugin disabled (disabled in config) but config is present
- plugins.entries.telegram: plugin disabled (disabled in config) but config is present
- plugins.entries.lossless-claw: plugin disabled (disabled in config) but config is present
2026-04-15T14:07:08.785-04:00 [model-pricing] pricing bootstrap failed: TimeoutError: The operation was aborted due to timeout
Those lines still look like symptoms, not the actual boot failure:
- closed before connect = probe/client symptom
- disabled plugin config warnings = noisy but not fatal
- pricing bootstrap failed: TimeoutError = annoying, usually not a gateway-killer by itself
So the real blocker still hasn’t shown up in what you pasted.
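If you want to triage the raw tail yourself before pasting it, one option is to hide the three symptom patterns above and see what is left. This is a rough sketch: the pattern list comes only from the lines in this thread (it is not an official list of benign messages), and `candidate_failures` is a hypothetical helper name.

```python
import re

# Symptom patterns seen in this thread; an assumption, not an official list.
NOISE_PATTERNS = [
    re.compile(r"plugin disabled \(disabled in config\) but config is present"),
    re.compile(r"\[ws\] closed before connect"),
    re.compile(r"\[model-pricing\] pricing bootstrap failed"),
]


def candidate_failures(lines):
    """Return non-empty lines that match none of the known-noise patterns."""
    return [
        line
        for line in lines
        if line.strip() and not any(p.search(line) for p in NOISE_PATTERNS)
    ]
```

Whatever survives the filter is where the real boot failure is most likely to be. The unfiltered tail is still what should be pasted into the thread.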
The one clearly actionable thing from earlier is still Codex auth. If you have not already done this successfully, do it now:
~/.openclaw/bin/openclaw models auth login --provider openai-codex
~/.openclaw/bin/openclaw gateway restart
~/.openclaw/bin/openclaw gateway status --deep
If you already did re-auth Codex, then I need the exact raw tail from the newest restart, not a filtered summary, because the summary keeps hiding the real failure.
Please paste exactly this output, verbatim:
tail -n 80 ~/.openclaw/logs/gateway.err.log
tail -n 80 ~/.openclaw/logs/gateway.log
No filtering this time, even if it’s noisy. We’re missing the actual fatal line.
can we check if openclaw is running with nvm or node
Yes, and based on the status --deep output you pasted, we already did.
Right now the actual LaunchAgent runtime is:
/Users/<user>/.openclaw/tools/node-v22.22.0/bin/node
So it is not running on nvm anymore.
What’s still true is that the service PATH contains an old nvm path:
/Users/<user>/.nvm/versions/node/v24.13.1/bin
So:
- runtime executable: .openclaw Node ✅
- PATH environment: still has nvm residue ⚠️
If you want to confirm again live, run either of these:
~/.openclaw/bin/openclaw gateway status --deep
or, with the current PID:
ps -p <pid> -o command=
So the main issue is no longer “is it using nvm?”
The main issue now is why the gateway still dies/closes during startup even on the new .openclaw runtime.
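The runtime-versus-PATH distinction can be checked mechanically. A small sketch, assuming you copy the service's PATH value out of the status --deep output by hand; `nvm_path_entries` is a hypothetical helper, and the example path uses a placeholder user name.

```python
def nvm_path_entries(path_value: str) -> list[str]:
    """Return the PATH entries that point into an nvm install."""
    return [entry for entry in path_value.split(":") if "/.nvm/" in entry]


if __name__ == "__main__":
    # Placeholder value; substitute the PATH shown for your own service.
    service_path = "/Users/example/.nvm/versions/node/v24.13.1/bin:/usr/bin:/bin"
    for entry in nvm_path_entries(service_path):
        print("nvm residue:", entry)
```

An empty result means the PATH is clean. A non-empty one only tells you the residue exists, not that it is what breaks startup.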
How do I know if openclaw is actively processing my run? Where can I check the current active run status?
Yep, a few good places:
Quick check
openclaw status
openclaw status --deep
Live gateway health
openclaw health
openclaw health --verbose
Best “is it actively doing something right now?” view
openclaw logs --follow
See recently active sessions
openclaw sessions --active 120 --verbose
If you’re asking from Discord/chat, /status is the quick in-chat view, and /agents shows active runs/bindings when applicable.
So, short version:
- health of gateway → openclaw status --deep
- live activity → openclaw logs --follow
- which sessions were active → openclaw sessions --active 120 --verbose
{"type":"message","id":"c15fd1ee","parentId":"ba74c9e6","timestamp":"2026-04-20T10:54:38.051Z","message":{"role":"user","content":[{"type":"text","text":"[Mon 2026-04-20 16:22 GMT+5:30] An async command the user already approved has completed"}],"timestamp":1776682478040}}
{"type":"message","id":"c05b1251","parentId":"c15fd1ee","timestamp":"2026-04-20T10:56:33.262Z","message":{"role":"assistant","content":[{"type":"text","text":"The command failed again due to Windows path syntax issues, likely because the `exec \openclaw-demo.txt' -Value 'helloworld'"","host":"node","node":"windows-browser-node","pty":false}}],"api":"openai-completions","provider":"vllm","model":"Qwen/Qwen3-30B-A3B-Instruct-2507","usage":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"totalTokens":0,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1776682478052,"responseId":"chatcmpl-1bdc47c5cee940759eafba1cf1f9f87c"}}
{"type":"message","id":"815f8202","parentId":"c05b1251","timestamp":"2026-04-20T10:56:33.466Z","message":{"role":"toolResult","toolCallId":"chatcmpl-tool-3d54c8c8c28d44b7960369d8da11d949","toolName":"exec","content":[{"type":"text","text":"{\n nd this interpreter/runtime command"},"isError":false,"timestamp":1776682593459}}
{"type":"message","id":"10d719e7","parentId":"815f8202","timestamp":"2026-04-20T10:58:30.952Z","message":{"role":"assistant","content":[{"type":"text","text":"The system denied the execution due to security restrictions on `powershell` commands via `exec` on this node — ZIP archive or base64-encoded version for easy transfer?"}],"api":"openai-completions","provider":"vllm","model":"Qwen/Qwen3-30B-A3B-Instruct-2507","usage":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"totalTokens":0,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"stop","timestamp":1776682593467,"responseId":"chatcmpl-f43c24fb2dfc47d5b78b8f7035092f2c"}}
Help me understand each stage of this log
Yep. This log is basically a 4-step chain:
1) Synthetic user event
{"id":"c15fd1ee", ... "role":"user", "text":"An async command ... has completed"}
This is not a normal human message. It’s a system-generated wakeup telling the agent that a previously approved background command finished.
2) Assistant reacts and tries to use a tool
{"id":"c05b1251", "parentId":"c15fd1ee", "role":"assistant", ... "stopReason":"toolUse"}
The assistant looked at that completion event and decided it needed a tool call, here exec.
Important bits:
- parentId points to the previous event
- stopReason: "toolUse" means “the model paused to call a tool”
- provider/model shows which LLM produced this step
- host: "node", node: "windows-browser-node" means it targeted that node
- pty: false means no pseudo-terminal
3) Tool result comes back
{"id":"815f8202", "parentId":"c05b1251", "role":"toolResult", "toolName":"exec", ...}
This is the raw reply from the exec tool.
Important bits:
- toolCallId links it to the assistant’s tool request
- toolName: "exec" says which tool answered
- isError: false means the tool layer didn’t mark it as a hard transport error
- but the returned text still looks like a denial/security failure
So “not a tool crash” does not mean “the command succeeded”.
4) Assistant interprets the tool result and replies normally
{"id":"10d719e7", "parentId":"815f8202", "role":"assistant", ... "stopReason":"stop"}
Now the assistant has the tool output, interprets it, and gives the human-facing conclusion:
- execution was denied
- reason: security restrictions on PowerShell via exec on that node
- then it suggests alternatives
stopReason: "stop" means this was a normal final answer, not another tool call.
How to read the chain quickly
Think of it as:
- wake event
- assistant decides what to do
- tool runs and returns
- assistant explains result
A few field notes
- id = this message’s id
- parentId = what it is replying to / continuing from
- timestamp = when that message object was recorded
- responseId = provider-side response id
- usage all zero = that backend didn’t report token usage here
The key signal here is the transition:
- assistant says “toolUse”
- toolResult comes back
- final assistant message explains denial
If you want, I can also annotate this exact JSON line by line.
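For longer session logs, the same chain can be walked programmatically by following parentId. This is a minimal sketch, assuming one JSON object per line and a single linear chain. The sample keeps only the fields discussed above; `chain_summary` is a hypothetical helper, not an OpenClaw API.

```python
import json


def chain_summary(jsonl_text: str) -> list[str]:
    """Follow parentId links from the first record and label each step."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    ids = {r["id"] for r in records}
    child_of = {r["parentId"]: r for r in records}
    # The starting record is the one whose parent is outside this excerpt.
    current = next(r for r in records if r["parentId"] not in ids)
    steps = []
    while current is not None:
        msg = current["message"]
        label = msg["role"]
        if "stopReason" in msg:
            label += f" (stopReason={msg['stopReason']})"
        if msg["role"] == "toolResult":
            label += f" (tool={msg['toolName']}, isError={msg['isError']})"
        steps.append(f"{current['id']}: {label}")
        current = child_of.get(current["id"])
    return steps


# Simplified stand-in for the four records pasted above (most fields omitted).
SAMPLE = "\n".join(json.dumps(r) for r in [
    {"id": "c15fd1ee", "parentId": "ba74c9e6", "message": {"role": "user"}},
    {"id": "c05b1251", "parentId": "c15fd1ee",
     "message": {"role": "assistant", "stopReason": "toolUse"}},
    {"id": "815f8202", "parentId": "c05b1251",
     "message": {"role": "toolResult", "toolName": "exec", "isError": False}},
    {"id": "10d719e7", "parentId": "815f8202",
     "message": {"role": "assistant", "stopReason": "stop"}},
])

if __name__ == "__main__":
    for step in chain_summary(SAMPLE):
        print(step)
```

Reading the labels top to bottom gives the wake event, the tool request, the tool result, and the final answer in order.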
I am trying to understand the flow. How does the tool work?
What happens if I interrupt the node connection in the middle of an exec tool call?
Let's consider both cases (the tool-call and tool-result stages): if I interrupt the connection during each of these stages, what will happen? And how does the node exec timeoutSeconds come into the picture?
Yep, roughly like this:
Flow for exec host=node
- User message arrives
- Model decides to call exec
- Gateway handles that tool call
- For host=node, Gateway talks to the node with:
  - system.run.prepare
  - then system.run
- Node host spawns the process locally
- Node sends back node.invoke.result
- Gateway turns that into a toolResult
- Model does the next LLM call and writes the final assistant reply
So there are really two boundaries:
- tool call sent to node
- tool result returned from node
If you interrupt the node connection during the tool-call stage
Meaning: the node disconnects before the Gateway gets a result back.
Case A, disconnected before request is delivered
The run fails quickly with something like:
- NOT_CONNECTED
- failed to send invoke to node
Case B, disconnected after request was sent, but before result comes back
Gateway keeps a pending invoke in memory.
If the node disconnects here, Gateway rejects that pending invoke with an error like:
node disconnected (system.run)
So from the agent’s point of view, that tool call fails.
If you interrupt during the tool-result stage
This splits into 2 subcases:
Case C, command finished on node, but result has not reached Gateway yet
Then the result can be lost.
Why:
- node.invoke.result sending is effectively best-effort
- if the socket dies right there, Gateway may never receive the result
So the agent may see:
- disconnect error, or
- node invoke timed out
even though the command may actually have already run on the node.
Case D, Gateway already received the tool result
Then that exec is basically done.
If you disconnect the node after that, it does not affect the finished tool call. It only affects the next node tool call.
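The four cases can be condensed into a small lookup. This is only a summary of the description above, not Gateway code; the enum, the table, and the helper are hypothetical names. "Command may have run" for case A is read as false because the request never reached the node.

```python
from enum import Enum


class DisconnectPoint(Enum):
    BEFORE_DELIVERY = "A"   # request never reached the node
    AWAITING_RESULT = "B"   # request sent, result not back yet
    RESULT_IN_FLIGHT = "C"  # command finished, result lost on the wire
    AFTER_RESULT = "D"      # Gateway already has the result


# (tool call fails for the agent, command may have run on the node)
OUTCOMES = {
    DisconnectPoint.BEFORE_DELIVERY: (True, False),
    DisconnectPoint.AWAITING_RESULT: (True, True),
    DisconnectPoint.RESULT_IN_FLIGHT: (True, True),
    DisconnectPoint.AFTER_RESULT: (False, True),
}


def describe(point: DisconnectPoint) -> str:
    """Render one case as a short human-readable line."""
    fails, may_have_run = OUTCOMES[point]
    agent_view = "tool call fails" if fails else "tool call unaffected"
    node_view = "command may have run" if may_have_run else "command did not run"
    return f"Case {point.value}: {agent_view}; {node_view}"


if __name__ == "__main__":
    for point in DisconnectPoint:
        print(describe(point))
```

The practical takeaway is cases B and C: the agent sees a failure while the command may still have run, so a retry is only safe if the command is idempotent.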
What happens to the actual process on the node?
Important nuance:
If the node disconnects mid-run, the child process on the node may still keep running locally.
It is not automatically canceled just because the WebSocket dropped.
What stops it is the command timeout.
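As a generic illustration of "the timeout is what kills the child", here is how the same idea looks with Python's standard subprocess module. This is not the node host's implementation; it only shows a parent that enforces a deadline on a child process. `run_with_timeout` is a hypothetical helper.

```python
import subprocess
import sys


def run_with_timeout(argv: list[str], timeout_s: float) -> str:
    """Run argv; report whether it finished or was killed at the deadline."""
    try:
        # On timeout, subprocess.run kills the child, waits for it,
        # then raises TimeoutExpired.
        subprocess.run(argv, timeout=timeout_s, check=False)
        return "completed"
    except subprocess.TimeoutExpired:
        return "killed-by-timeout"


if __name__ == "__main__":
    slow_child = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow_child, timeout_s=0.5))
```

Nothing here depends on a network connection to whoever asked for the command, which is why a dropped WebSocket alone does not stop the process.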
How timeoutSeconds / timeout fits in
For the exec tool, the user-facing field is timeout in seconds.
Internally, for host=node:
- that becomes system.run.params.timeoutMs
- the node host uses that timeout to kill the child process
- Gateway waits a little longer for the result, about:
exec timeout + 5 seconds
with a minimum of 10s
So if you set:
timeout = 30
then roughly:
- node kills the process at ~30s
- gateway waits until ~35s for the result to come back
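The arithmetic can be written down directly. This is a sketch of the rule as described here, reading the 10s minimum as applying to the Gateway's total wait; the function names are made up for illustration and the exact internal values may differ.

```python
def to_timeout_ms(exec_timeout_s: float) -> int:
    """Convert the user-facing timeout (seconds) to milliseconds."""
    return int(exec_timeout_s * 1000)


def gateway_wait_seconds(exec_timeout_s: float) -> float:
    """Approximate Gateway wait: exec timeout + 5s, never below 10s."""
    return max(exec_timeout_s + 5, 10)


if __name__ == "__main__":
    for timeout_s in (2, 30):
        print(
            f"timeout={timeout_s}s -> node kills at ~{to_timeout_ms(timeout_s)}ms, "
            f"gateway waits ~{gateway_wait_seconds(timeout_s)}s"
        )
```

With timeout = 30 this gives the ~30s / ~35s split above. With a very short timeout such as 2, the Gateway would still wait about 10s under this reading.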