#Gateway has been consistently flagging errors. Although there are no apparent impacts at the moment,

1 messages · Page 1 of 1 (latest)

knotty snow
#

Gateway 問題整理

目前 lossless-claw 已重裝完成且正常載入,OpenClaw config 也有效;剩下的是 gateway 相關異常。

我做了哪些處理

  • 重新安裝 lossless-claw-enhanced,並修正外掛設定 schema 問題
  • 檢查並保留你的遠端連線相關設定,不改這些值:
    • gateway.port = 18789
    • gateway.bind = "loopback"
    • gateway.tailscale.mode = "serve"
    • 原本的 gateway token 值不變
  • 重裝 LaunchAgent:
    • 原本 service 用的是舊 entrypoint dist/entry.js
    • 我重裝後已改成新版 dist/index.js
  • 重啟 gateway service,確認 service 能起來
  • 檢查 gateway log、gateway status、doctor、probe、health
  • 清理 1 個與 gateway 無直接關係的 orphan transcript
    • 9619d250-3a31-4b80-8ea2-70498572f116.jsonl 已封存為 .deleted.20260412-051500

目前觀察到的事實

  • openclaw config validate:通過
  • lossless-claw:正常載入,Plugins: Errors: 0
  • gateway process 實際有啟動
  • 但 CLI 探測仍異常
    • openclaw gateway health 失敗
    • openclaw gateway probe 失敗
    • 錯誤都是 gateway closed (1006 abnormal closure (no close frame))
  • doctor / status 仍有不一致回報
    • 一邊顯示 Gateway not running
    • 另一邊又顯示 Runtime: running

我為何判斷問題還沒解決

  • 因為 gateway health/probe 仍然無法成功握手
  • doctor/status 對 gateway 運行狀態的判讀仍不一致
  • gateway service 還持續被報:
    • Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled

為何我沒有再往下硬修

  • 你的要求是不能破壞目前透過 Tailscale + ssh 的遠端連線能力
  • 我已確認 gateway 實際在跑、Tailscale serve 也有啟動
  • 再往下做的動作會開始碰這些高風險區域:
    • 手動改 LaunchAgent 裡的 OPENCLAW_GATEWAY_TOKEN
    • 改 gateway auth/token 解析方式
    • 深入調整 probe/health 的握手路徑
  • 這些可能讓遠端設備暫時或永久無法連回來,所以我停在保守邊界

目前最可能的根因

  • 比較像是 OpenClaw 2026.4.10 的 gateway service/probe/doctor 行為異常,而不是單純設定寫錯
  • 具體跡象:
    • 我已重裝 gateway install --force
    • entrypoint 已更新
    • 但 installer 仍把 OPENCLAW_GATEWAY_TOKEN 寫回 plist
    • doctor 仍持續要求「reinstall」同一件事
    • 同時實際服務又確實能啟動並接受 control UI 連線
quiet martenBOT
terse bearBOT
#

你的判斷我覺得大致合理,而且停在那個邊界是對的

我查了 docs 和目前程式碼,重點有 3 個:

  1. Runtime: runningRPC probe: failed 不是真矛盾
    docs 也有寫,前者是 supervisor/LaunchAgent 視角,後者才是 CLI 真正去連 Gateway RPC 的結果。
    參考: https://docs.openclaw.ai/help/faqhttps://docs.openclaw.ai/cli/gateway

  2. doctor 現在對 gateway closed ... 的訊息判讀 太粗
    目前程式碼裡,doctor 只要看到錯誤字串含 gateway closed,就會直接報 Gateway not running.
    所以你看到「明明 running,doctor 卻說 not running」, 這個現象本身很像是 doctor/probe 診斷面有問題,不一定代表 process 真的沒跑。

  3. Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled. 這條 像是 service audit / installer 行為有問題
    目前 audit 確實會在 service 仍帶 inline OPENCLAW_GATEWAY_TOKEN 時丟這個 warning。
    如果你已經 openclaw gateway install --force 過,結果還是被重新寫回 plist,那我也會偏向判成 2026.4.10 附近的 bug / regression,不是單純你設定錯。

我建議下一步先只做零風險讀取,不要碰 token rotation / 手改 LaunchAgent:

#
openclaw --version
openclaw gateway status --json
openclaw health --verbose
launchctl getenv OPENCLAW_GATEWAY_TOKEN

如果可以,再補這個也很有價值:

launchctl getenv OPENCLAW_GATEWAY_PASSWORD

我現在最想確認的是兩件事:

  • 你實際版本是不是 2026.4.10
  • launchctl getenv OPENCLAW_GATEWAY_TOKEN 有沒有殘留值覆蓋 config
#

因為 docs 也明寫了,macOS 上如果之前跑過 launchctl setenv OPENCLAW_GATEWAY_TOKEN ...,它會持續覆蓋 config,造成很怪的 probe/auth 行為。
參考: https://docs.openclaw.ai/cli/doctor

所以我的結論是:

  • 先不要再硬修,你停手是對的
  • 目前看起來比較像 diagnostics/service install bug
  • 但還差 openclaw --version + gateway status --json + launchctl getenv ... 這幾個證據,才能更準確判

你把這幾個輸出貼上來,我幫你一起把它收斂成「安全 workaround」或「可直接提 issue 的最小 repro」。

knotty snow
#
 openclaw --version

  OpenClaw 2026.4.10 (44e5b62)

  openclaw gateway status --json

  {
    "logFile": "/tmp/openclaw/openclaw-2026-04-12.log",
    "service": {
      "label": "LaunchAgent",
      "loaded": true,
      "loadedText": "loaded",
      "notLoadedText": "not loaded",
      "command": {
        "programArguments": [
          "/Users/openclawfoox/.local/share/fnm/node-versions/v24.14.0/installation/bin/node",
          "/Users/openclawfoox/.local/share/fnm/node-versions/v24.14.0/installation/lib/node_modules/openclaw/dist/
  index.js",
          "gateway",
          "--port",
          "18789"
        ],
        "environment": {
          "OPENCLAW_GATEWAY_PORT": "18789"
        },
        "sourcePath": "/Users/openclawfoox/Library/LaunchAgents/ai.openclaw.gateway.plist"
      },
      "runtime": {
        "status": "running",
        "state": "active",
        "pid": 33933,
        "cachedLabel": false
      },
      "configAudit": {
        "ok": false,
        "issues": [
          {
            "code": "gateway-token-embedded",
            "message": "Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled.",
            "detail": "Run `openclaw gateway install --force` to remove embedded service token.",
            "level": "recommended"
          }
        ]
      }
    },
    "config": {
      "cli": {
        "path": "/Users/openclawfoox/.openclaw/openclaw.json",
        "exists": true,
        "valid": true,
        "controlUi": {
          "allowedOrigins": [
            "https://felixdemac-mini.tailf46291.ts.net",
            "http://127.0.0.1:18789",
            "http://localhost:18789"
          ]
        }
      },
      ```
分兩段提供,你可以先暫時靜默(1/2)
#
"daemon": {
        "path": "/Users/openclawfoox/.openclaw/openclaw.json",
        "exists": true,
        "valid": true,
        "controlUi": {
          "allowedOrigins": [
            "https://felixdemac-mini.tailf46291.ts.net",
            "http://127.0.0.1:18789",
            "http://localhost:18789"
          ]
        }
      }
    },
    "gateway": {
      "bindMode": "loopback",
      "bindHost": "0.0.0.0",
      "port": 18789,
      "portSource": "service args",
      "probeUrl": "ws://127.0.0.1:18789",
      "probeNote": "Loopback-only gateway; only local clients can connect."
    },
    "port": {
      "port": 18789,
      "status": "busy",
      "listeners": [
        {
          "pid": 33933,
          "command": "node",
          "address": "127.0.0.1:18789"
        }
      ],
      "hints": [
        "Another process is listening on this port."
      ]
    },
    "rpc": {
      "ok": false,
      "error": "gateway closed (1006): ",
      "url": "ws://127.0.0.1:18789"
    },
    "health": {
      "healthy": true,
      "staleGatewayPids": []
    },
    "extraServices": []
  }

  launchctl getenv OPENCLAW_GATEWAY_TOKEN



  補一句重點:launchctl getenv OPENCLAW_GATEWAY_TOKEN 是空的,表示沒有看到你提到的那種 launchctl setenv 殘留值。現在更像
  是 LaunchAgent plist 內嵌 token 與 diagnostics/service install 行為本身有問題。

可以了,這是第二段,資訊已全部提供完畢,你可以回覆了(2/2)

terse bearBOT
#

這份資料很夠了,謝謝你整理得這麼乾淨。
我現在的判斷是:

這非常像 2026.4.10 的 diagnostics / service-install 問題,不像單純設定錯。

幾個關鍵點:

  • launchctl getenv OPENCLAW_GATEWAY_TOKEN 是空的,所以不是那種 launchctl setenv 殘留覆蓋 config。

  • gateway status --json 裡同時出現

    • runtime.status = running
    • health.healthy = true
    • rpc.ok = false / gateway closed (1006)

    這表示 process 活著,但 CLI 的 WS/RPC probe 路徑有問題。

  • doctorgateway closed ... 直接歸類成 Gateway not running,這會造成你看到的誤導性不一致。這部分我查 code 也確認了。

  • 另外補一個小細節: service.command.environment--json 輸出裡是有過濾的,只顯示安全 env keys,所以那裡沒看到 token,不能拿來反證 plist 裡一定沒有 token。configAudit 看的資料比 JSON 顯示的多。

#

所以我會把它拆成兩個問題:

  1. probe / doctor 診斷面有 bug
    尤其是 1006 被粗暴映射成「not running」

  2. service audit / install 對 token embedded 的處理也可能有 bug
    因為你已經 install --force 過,卻還是持續要求做同一件事

我建議的路線

推薦 A, 最安全: 先不要再修 auth/plist,直接保留現況,收斂成 issue。

因為現在:

  • Tailscale serve 在跑
  • 遠端能力沒壞
  • 再往下碰 token / plist 是高風險區

如果你要補最後一個純讀取證據,我只會加這個:

#
plutil -p ~/Library/LaunchAgents/ai.openclaw.gateway.plist | grep OPENCLAW_GATEWAY_TOKEN

這能直接確認 plist 是否真的 inline 了 token,不會改任何東西。
但就算不補,我也覺得已經夠像 bug 了。

我會怎麼寫結論

你現在可以很有把握地說:

  • OpenClaw 2026.4.10 (44e5b62)
  • Gateway service 實際 running,port 也有 listener
  • health.healthy = true
  • 但 CLI RPC probe 對 ws://127.0.0.1:18789 失敗,報 gateway closed (1006)
  • doctor 同時誤報 Gateway not running
  • launchctl getenv OPENCLAW_GATEWAY_TOKEN 為空,排除 launchctl setenv 殘留
  • gateway-token-embedded warning 在 install --force 後仍持續出現
knotty snow
#

好,請提供,我馬上提交

terse bearBOT
#

可以,這份你幾乎可以直接貼。
我幫你整理成 GitHub issue draft 了:

Title

gateway status/doctor disagree on 2026.4.10: runtime is running and health is healthy, but RPC probe closes with 1006 and service audit keeps reporting embedded OPENCLAW_GATEWAY_TOKEN after reinstall

Body

## Summary

On macOS, OpenClaw 2026.4.10 appears to have an inconsistency in gateway diagnostics / service audit.

The Gateway process is actually running, listening on loopback, and Tailscale Serve is active. Control UI connectivity also works. However:
#
- `openclaw gateway probe` / `openclaw gateway health` fail
- `openclaw gateway status --json` shows `runtime.status = running`
- the same JSON also shows `health.healthy = true`
- but `rpc.ok = false` with `gateway closed (1006):`
- `openclaw doctor` reports `Gateway not running`
- service audit keeps reporting `Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled`
- rerunning `openclaw gateway install --force` did not clear that warning

This looks more like a diagnostics / service-install regression than a simple config mistake.

## Version

```bash
openclaw --version
OpenClaw 2026.4.10 (44e5b62)
#

## Environment

- macOS user LaunchAgent install
- Node from fnm:
  - `/Users/openclawfoox/.local/share/fnm/node-versions/v24.14.0/installation/bin/node`
- Gateway config characteristics:
  - `gateway.port = 18789`
  - `gateway.bind = "loopback"`
  - `gateway.tailscale.mode = "serve"`
  - existing gateway token was preserved, not rotated

## What I did

- Reinstalled the gateway service with `openclaw gateway install --force`
#
- Confirmed the LaunchAgent entrypoint was updated from old `dist/entry.js` to new `dist/index.js`
- Restarted the gateway service
- Validated config successfully
- Checked gateway logs, status, doctor, probe, and health
- Did **not** rotate token or hand-edit plist because this machine currently relies on Tailscale + SSH remote access and I wanted to avoid breaking remote connectivity

## Observed behavior

### Gateway is clearly alive

- gateway log shows:
  - `gateway ready`
  - `tailscale serve enabled: https://felixdemac-mini.tailf46291.ts.net/`
  - `webchat connected`
#
### But CLI probes fail

`openclaw gateway status --json` excerpt:

```json
{
  "service": {
    "label": "LaunchAgent",
    "loaded": true,
    "command": {
      "programArguments": [
        "/Users/openclawfoox/.local/share/fnm/node-versions/v24.14.0/installation/bin/node",
        "/Users/openclawfoox/.local/share/fnm/node-versions/v24.14.0/installation/lib/node_modules/openclaw/dist/index.js",
        "gateway",
        "--port",
        "18789"
#

],
"environment": {
"OPENCLAW_GATEWAY_PORT": "18789"
},
"sourcePath": "/Users/openclawfoox/Library/LaunchAgents/ai.openclaw.gateway.plist"
},
"runtime": {
"status": "running",
"state": "active",
"pid": 33933,
"cachedLabel": false
},
"configAudit": {
"ok": false,
"issues": [
{
"code": "gateway-token-embedded",

#

"message": "Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled.",
"detail": "Run openclaw gateway install --force to remove embedded service token.",
"level": "recommended"
}
]
}
},
"gateway": {
"bindMode": "loopback",
"bindHost": "0.0.0.0",
"port": 18789,
"portSource": "service args",
"probeUrl": "ws://127.0.0.1:18789",
"probeNote": "Loopback-only gateway; only local clients can connect."
},
"port": {
"port": 18789,

#

"status": "busy",
"listeners": [
{
"pid": 33933,
"command": "node",
"address": "127.0.0.1:18789"
}
]
},
"rpc": {
"ok": false,
"error": "gateway closed (1006): ",
"url": "ws://127.0.0.1:18789"
},
"health": {
"healthy": true,
"staleGatewayPids": []

#

}
}


### launchctl env override check

I also checked for a stale `launchctl setenv OPENCLAW_GATEWAY_TOKEN` override:

```bash
launchctl getenv OPENCLAW_GATEWAY_TOKEN

Result: empty / unset.

So this does not appear to be caused by a lingering launchctl env override.

#
## Why this looks wrong

There seem to be multiple inconsistencies at once:

1. `runtime.status = running`
2. `health.healthy = true`
3. port listener exists on `127.0.0.1:18789`
4. but RPC probe still fails with `gateway closed (1006):`
5. `doctor` reports `Gateway not running`
6. service audit keeps telling me to reinstall even after reinstall

That combination makes it look like either:

- the probe / doctor classification is wrong for this failure mode, or
- the service audit / install flow for embedded `OPENCLAW_GATEWAY_TOKEN` on macOS LaunchAgent is not converging correctly, or
#
- both

## Expected behavior

One of these should happen:

- if the Gateway is actually healthy enough for local RPC, `rpc.ok` should succeed
- if the Gateway is running but the probe is failing for auth / close-code reasons, diagnostics should report that precisely instead of `Gateway not running`
- if `openclaw gateway install --force` is the correct fix for `gateway-token-embedded`, rerunning it should make the warning disappear
- otherwise the audit message should explain why reinstall is not sufficient

## Actual behavior

- Gateway is operational enough to serve Control UI / Tailscale Serve
- local CLI RPC probe still closes with 1006