#cant start gateway

1 messages · Page 1 of 1 (latest)

silent bridgeBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> âś… Mark Solution

vivid yokeBOT
#

That message means the service supervisor thinks the Gateway process is “running”, but the Gateway never actually bound/listened on :18789 (so health checks can’t connect). That’s almost always a crash on startup or a config/profile mismatch.

Run these and paste the output (they’re safe / read-only):

  1. Version (important)
openclaw --version
  1. Gateway probe details
openclaw gateway status

Look specifically for these lines in the output:

  • Probe target:
  • Listening:
  • Last gateway error:
#
  • and whether it shows Config (cli) vs Config (service) pointing at different places
  1. Pasteable full report (tokens redacted)
openclaw status --all
  1. Logs (this is usually where the real error is)
  • If you can:
    openclaw logs --follow
    
    then in another terminal:
    openclaw gateway restart
    
#
  • On Linux (service):
    journalctl --user -u openclaw-gateway.service -n 200 --no-pager
    
    (or openclaw-gateway-<profile>.service if you use a profile)

Common fixes once we see the above:

  • If Config (cli) ≠ Config (service) → you’re editing one config but the daemon is running another; usually fixed by reinstalling the service from the right profile/env:
    openclaw gateway install --force
    openclaw gateway restart
    
  • If Last gateway error mentions config parse / unknown keys / missing env vars → fix that config entry and restart.

Docs reference (matches this exact symptom): https://docs.openclaw.ai/help/faq#why-does-openclaw-gateway-status-say-runtime-running-but-rpc-probe-failed and log locations: https://docs.openclaw.ai/gateway/troubleshooting#log-locations

pliant berry
#

2026.3.2 /opt/homebrew/opt/node/bin/node /opt/homebrew/Cellar/openclaw-cli/2026.3.2/libexec/lib/node_modules/openclaw/dist/index.js gateway --port 18789
Service file: ~/Library/LaunchAgents/ai.openclaw.gateway.plist
Service env: OPENCLAW_GATEWAY_PORT=18789

Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json

Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.

Runtime: stopped (state active)
RPC probe: failed
RPC target: ws://127.0.0.1:18789
gateway closed (1006 abnormal closure (no close frame)): no close reason
Gateway target: ws://127.0.0.1:18789
Source: cli --url
Config: /Users/djryan/.openclaw/openclaw.json

Service is loaded but not running (likely exited immediately).
File logs: /tmp/openclaw/openclaw-2026-03-07.log
Launchd stdout (if installed): /Users/djryan/.openclaw/logs/gateway.log
Launchd stderr (if installed): /Users/djryan/.openclaw/logs/gateway.err.log

#

Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /Users/djryan/.openclaw/openclaw.json
Bind: loopback

vivid yokeBOT
#

That openclaw gateway status output basically answers why: launchd is loaded, but the Gateway process exits immediately, so the health check never sees a stable listener on 127.0.0.1:18789.

Next step is just: get the crash reason from logs.

Please paste the last ~100–200 lines of both:

tail -n 200 /Users/djryan/.openclaw/logs/gateway.err.log
tail -n 200 /tmp/openclaw/openclaw-2026-03-07.log

If you want the fastest “show me the stack trace right now” route, run it in the foreground (so the error prints directly):

openclaw gateway stop
openclaw gateway run --port 18789
#

(If it still instantly exits, the terminal output will usually name the exact config key / channel / file that’s crashing it.)

Once you paste the stderr/log tail, I can tell you the exact fix (most often it’s a bad config entry or missing env var, and the log will point to it).

Docs for log locations (matches what your status output shows): https://docs.openclaw.ai/gateway/troubleshooting#log-locations

#

That snippet (Gateway target… Config… Bind…) is just the client/probe header — it doesn’t include the part we need (the actual crash / exception explaining why the launchd service exits).

Can you paste the log tail from the paths your gateway status already pointed to?

tail -n 200 /Users/djryan/.openclaw/logs/gateway.err.log
tail -n 200 /tmp/openclaw/openclaw-2026-03-07.log

If gateway.err.log is empty/nonexistent, run:

ls -la /Users/djryan/.openclaw/logs/

One important macOS gotcha while we’re here: if the Gateway works when you run it in a terminal but dies only as a service, it’s often missing env vars under launchd. Fix is usually putting keys in ~/.openclaw/.env (or enabling env.shellEnv.enabled). Docs: https://docs.openclaw.ai/help/faq#i-started-the-gateway-via-the-service-and-my-env-vars-disappeared-what-now

#

But first: paste the error log tail — it’ll tell us exactly which config/channel is killing the process.

pliant berry
#

Last gateway error: 2026-03-07T18:37:29.525-05:00 Gateway start blocked: set │
│ gateway.mode=local (current: unset) or pass --allow-unconfigured.

vivid yokeBOT
#

Your own status output has the root cause:

Last gateway error: … Gateway start blocked: set gateway.mode=local (current: unset) …

So the service is exiting immediately because local gateway mode isn’t enabled in config.

Fix (pick one):

Option A (recommended): set it via CLI

openclaw config set gateway.mode local
openclaw gateway restart
openclaw gateway status

Option B: edit config file directly

Edit ~/.openclaw/openclaw.json and add:

pliant berry
#

so all my models are gone i need to set my models back up to alibaba cloud coding plan with https://coding-intl.dashscope.aliyuncs.com/v1 and these modes set up like this openclaw models aliases add daily bailian/glm-5
openclaw models aliases add code bailian/qwen3-coder-next
openclaw models aliases add doc bailian/qwen3.5-plus
openclaw models aliases add vision bailian/kimi-k2.5
openclaw models aliases add batch bailian/MiniMax-M2.5
openclaw models aliases add fallback bailian/glm-4.7

vivid yokeBOT
#

If your aliases point at bailian/..., OpenClaw will only accept them if you have a custom provider named bailian configured again (that’s why it feels like “all models are gone”).

1) Make sure the API key is available to the service (launchd)

On macOS the gateway service won’t inherit your shell env, so put this in:

~/.openclaw/.env

DASHSCOPE_API_KEY=your_key_here

(Then restart the gateway after changes.)

Docs: https://docs.openclaw.ai/help/faq#i-started-the-gateway-via-the-service-and-my-env-vars-disappeared-what-now

2) Re-add the provider in ~/.openclaw/openclaw.json

Add something like:

#
{
  models: {
    mode: "merge",
    providers: {
      bailian: {
        baseUrl: "https://coding-intl.dashscope.aliyuncs.com/v1",
        apiKey: "${DASHSCOPE_API_KEY}",
        api: "openai-completions",
        models: [
          { id: "glm-5", name: "daily" },
          { id: "qwen3-coder-next", name: "code" },
          { id: "qwen3.5-plus", name: "doc" },
          { id: "kimi-k2.5", name: "vision" },
          { id: "MiniMax-M2.5", name: "batch" },
          { id: "glm-4.7", name: "fallback" },
#
        ],
      },
    },
  },
}

Note: Alibaba’s documented OpenAI-compatible base URLs are usually the .../compatible-mode/v1 ones (example: https://dashscope-intl.aliyuncs.com/compatible-mode/v1). If coding-intl.../v1 isn’t actually OpenAI-compatible, you’ll see 404/endpoint errors and should switch to the compatible-mode URL. Official doc: https://help.aliyun.com/zh/model-studio/compatibility-of-openai-with-dashscope

3) Restart + re-add your aliases

openclaw gateway restart
openclaw models status

Then your alias commands (exactly as you wrote) should work again:

#
openclaw models aliases add daily    bailian/glm-5
openclaw models aliases add code     bailian/qwen3-coder-next
openclaw models aliases add doc      bailian/qwen3.5-plus
openclaw models aliases add vision   bailian/kimi-k2.5
openclaw models aliases add batch    bailian/MiniMax-M2.5
openclaw models aliases add fallback bailian/glm-4.7

If you paste openclaw models status after step (2), I can tell you immediately whether it’s seeing the provider + baseUrl + “missing auth” vs “ok” (that’s the quickest sanity check).

pliant berry
#

for step one that copy and paste is broken I need it again

vivid yokeBOT
#

Step 1 again (macOS service env):

Create/edit this file on the gateway host:

~/.openclaw/.env

Add one line (no quotes, no spaces around =):

DASHSCOPE_API_KEY=YOUR_REAL_KEY_HERE

Then restart:

openclaw gateway restart
#

Don’t paste the real key into Discord.

pliant berry
#

i dont want to edit anything manusally and mess something up i just need the commands please

vivid yokeBOT
#

Commands-only route (no manual file editing). Run these locally in Terminal. Don’t paste your real key into Discord.

0) (If you haven’t already) ensure the gateway is allowed to start

openclaw config set gateway.mode local

1) Store your DashScope key in config (so launchd has it too)

This writes it into ~/.openclaw/openclaw.json under env.vars:

openclaw config set env.vars.DASHSCOPE_API_KEY "<PASTE_YOUR_KEY_HERE>"

2) Recreate your provider (bailian) pointing at that base URL

#
openclaw config set models.mode merge
openclaw config set models.providers.bailian.baseUrl "https://coding-intl.dashscope.aliyuncs.com/v1"
openclaw config set models.providers.bailian.api "openai-completions"
openclaw config set models.providers.bailian.apiKey '${DASHSCOPE_API_KEY}'

Now set the model list (JSON5; use --strict-json so it doesn’t get mis-parsed):

openclaw config set models.providers.bailian.models '[{id:"glm-5",name:"glm-5"},{id:"qwen3-coder-next",name:"qwen3-coder-next"},{id:"qwen3.5-plus",name:"qwen3.5-plus"},{id:"kimi-k2.5",name:"kimi-k2.5"},{id:"MiniMax-M2.5",name:"MiniMax-M2.5"},{id:"glm-4.7",name:"glm-4.7"}]' --strict-json

3) Validate + restart

openclaw config validate
#
openclaw gateway restart
openclaw models status

4) Re-add your aliases

openclaw models aliases add daily    bailian/glm-5
openclaw models aliases add code     bailian/qwen3-coder-next
openclaw models aliases add doc      bailian/qwen3.5-plus
openclaw models aliases add vision   bailian/kimi-k2.5
openclaw models aliases add batch    bailian/MiniMax-M2.5
openclaw models aliases add fallback bailian/glm-4.7

If openclaw models status shows errors like “endpoint not found”, swap the baseUrl to Alibaba’s documented OpenAI-compatible endpoint (intl):

#
openclaw config set models.providers.bailian.baseUrl "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
openclaw gateway restart
pliant berry
#

got this error Error: Config validation failed: models.providers.bailian.baseUrl: Invalid input: expected string, received undefined

vivid yokeBOT
#

That validation error means your config currently has a models.providers.bailian object, but its baseUrl ended up missing/undefined (usually from a partial/failed config set).

Fastest “commands-only” fix: set the entire provider object in one shot (so it can’t be half-configured):

openclaw config set models.providers.bailian '{baseUrl:"https://coding-intl.dashscope.aliyuncs.com/v1",api:"openai-completions",apiKey:"${DASHSCOPE_API_KEY}",models:[{id:"glm-5",name:"glm-5"},{id:"qwen3-coder-next",name:"qwen3-coder-next"},{id:"qwen3.5-plus",name:"qwen3.5-plus"},{id:"kimi-k2.5",name:"kimi-k2.5"},{id:"MiniMax-M2.5",name:"MiniMax-M2.5"},{id:"glm-4.7",name:"glm-4.7"}]}' --strict-json

Then:

openclaw config validate
openclaw gateway restart
openclaw models status

If you want to sanity-check what’s in config right now before/after:

#
openclaw config get models.providers.bailian
openclaw config get models.providers.bailian.baseUrl
pliant berry
#

do i put my api key where it says dashscope api key?

vivid yokeBOT
#

Yes — but only in the one place that’s meant to hold the secret, not in the provider JSON blob.

You have two options:

Option A (commands-only, writes key into ~/.openclaw/openclaw.json)

Run this locally (don’t paste the real key into Discord):

openclaw config set env.vars.DASHSCOPE_API_KEY "<PASTE_YOUR_REAL_KEY_HERE>"

Then in the provider config we set apiKey:"${DASHSCOPE_API_KEY}" — that ${…} is just a reference to the env var you set above (you do not paste the key there).

Option B (keeps key out of openclaw.json, uses ~/.openclaw/.env)

Put this line in ~/.openclaw/.env:

#
DASHSCOPE_API_KEY=YOUR_REAL_KEY_HERE

Either way, after setting it:

openclaw gateway restart
openclaw models status
pliant berry
#

default is still some claude opus

vivid yokeBOT
#

To change the default away from Opus, set it explicitly:

openclaw models set daily

(or skip the alias and set the full ref:)

openclaw models set bailian/glm-5

Then confirm:

openclaw models status --plain
#

If it still doesn’t switch, paste the output of:

openclaw models aliases list
openclaw models status
pliant berry
#

it switched but im getting error 500 java.lang.IllegalArgumentException: Illegal group reference

vivid yokeBOT
#

That 500 java.lang.IllegalArgumentException: Illegal group reference is coming from the upstream Alibaba/DashScope server, not from OpenClaw itself. It usually means you’re hitting an endpoint that isn’t actually compatible with the OpenAI-style request OpenClaw is sending (or it’s choking on something like $... in the payload).

First thing to try: switch your base URL to the official OpenAI-compatible DashScope endpoint (/compatible-mode/v1) and retry.

If you’re outside mainland China:

openclaw config set models.providers.bailian.baseUrl "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
openclaw gateway restart
openclaw models status --probe --probe-provider bailian

(US region endpoint is often https://dashscope-us.aliyuncs.com/compatible-mode/v1.)

If that still throws the same 500, paste the log snippet around it so we can see exactly what request path/body it’s dying on:

#
openclaw logs --limit 200

and copy the lines around “Illegal group reference”.

pliant berry
#

it says status ok

vivid yokeBOT
#

Nice — “status ok” usually means:

  • the Gateway is healthy again, and
  • the provider/auth config is now valid.

Quick final sanity check (this does a real request):

openclaw models status --probe --probe-provider bailian

And/or just send a simple “hi” message to the bot and confirm you get a normal reply (no 500s).

If it’s working now, please right-click the solution message → Apps → Mark Solution so Answer Overflow can index it for the next person.

pliant berry
#

still getting that message 500

vivid yokeBOT
#

If models status is OK but you still get a 500 when you actually chat, we need to figure out whether:

  • the provider is failing on any request, or
  • it only fails when OpenClaw sends the tool schema (common with some OpenAI-compatible endpoints).

Please run this and paste the output:

openclaw models status --probe --probe-provider bailian

Then, right after you trigger a 500 in chat, grab the log tail and paste the few lines around the error:

openclaw logs --limit 200
#

Quick isolation test (no tools)

If you want a fast confirmation that it’s “tools payload breaks this provider”, temporarily disable tools for your glm-5 entry and try one chat message:

openclaw config set models.providers.bailian.models[0].compat.supportsTools false --strict-json
openclaw gateway restart
  • If the 500 goes away: it’s almost certainly the provider choking on OpenClaw’s tool/function schema.
  • If the 500 stays: it’s something else (model id / endpoint / auth), and the logs/probe output will show it.
pliant berry
#

cant be either of those

#

hello

vivid yokeBOT
#

Okay — thanks. Two quick clarifying Qs before I give the cleanest workaround:

  • When you say “cant be either of those”, do you mean you aren’t allowed to change the provider baseUrl and you also aren’t allowed to globally disable tools?
  • Or do you mean you can’t change anything at all (no config edits)?

If you can edit config but can’t change baseUrl or fully disable tools, do this: narrow the tool set only for the Bailian provider so the provider won’t receive the large tool schema (this keeps tools globally, only limits what’s sent to that provider).

Try one of these (pick one and run it, then restart):

  1. Provider-wide minimal profile (sends almost no tools)
openclaw config set 'tools.byProvider["bailian"].profile' "minimal"
openclaw gateway restart
  1. Provider-wide messaging profile (keeps messaging-only tools)
#
openclaw config set 'tools.byProvider["bailian"].profile' "messaging"
openclaw gateway restart
  1. Provider/model-specific minimal (only affects that model)
openclaw config set 'tools.byProvider["bailian/glm-5"].profile' "minimal"
openclaw gateway restart
  1. Explicit small allowlist for that provider (fine-grained)
openclaw config set 'tools.byProvider["bailian"].allow' '["message","session_status"]' --strict-json
openclaw gateway restart
#

Why this helps

  • The Java 500 (Illegal group reference) often comes from the provider choking on the tool/function schema. Narrowing the tools sent to that provider is the least-invasive way to stop the failing request without changing endpoints or globally removing tooling.

After you try one, test the same chat that produced the 500. If it still fails, run:

openclaw models status --probe --probe-provider bailian
openclaw logs --limit 200

and paste the non-secret lines around the error (don’t paste API keys). If you can’t edit config at all, say so and I’ll give the next-best option (fallback model or proxy workaround).

#

Hey — if you’re still getting the Bailian 500, the next clean workaround is: don’t send the big tool schema to that provider (even a “normal chat” includes tool definitions unless you restrict them).

Also: config paths don’t support quoted keys (["bailian"]). Use bracket notation without quotes.

Try this:

openclaw config set tools.profile coding
openclaw config set tools.byProvider[bailian].profile minimal
openclaw gateway restart

Then test a simple message again.

If you still get 500, paste: