#Randomly agent no longer sending replies on any channel

waxen quest
#

Was messing around and last thing that worked was setting browser: headless: true.

Since then, my OpenClaw Control UI chat agents no longer send replies (I see the "working on reply" UI, though), and I see the same pattern on Discord.

The gateway log seems to show healthchecks per reply:

08:21 PM [discord] client initialized as 1475658592928534621; awaiting gateway readiness
08:21 PM [ws] ⇄ res ✓ health 124ms conn=3781ae87…de9b id=4BA6D6CC…48E6
08:22 PM [ws] ⇄ res ✓ health 76ms conn=3781ae87…de9b id=0A062986…4DA7
08:22 PM [ws] ⇄ res ✓ health 56ms cached=true conn=3781ae87…de9b id=FEB114E1…0C1B
08:22 PM [ws] ⇄ res ✓ health 66ms cached=true conn=3781ae87…de9b id=C88EFDC8…35EB
08:23 PM [ws] ⇄ res ✓ health 67ms cached=true conn=3781ae87…de9b id=F8D2987F…DF2B
08:24 PM [ws] ⇄ res ✓ health 62ms cached=true conn=3781ae87…de9b id=BC3CA14B…4D74
08:24 PM [ws] ⇄ res ✓ channels.status 1456ms conn=c5d0b1ea…0b59 id=9c45b630…260c
08:24 PM [ws] ⇄ res ✓ health 61ms cached=true conn=3781ae87…de9b id=3783AFC9…55EF
08:25 PM [ws] ⇄ res ✓ health 57ms cached=true conn=3781ae87…de9b id=2F452931…5598
08:25 PM [ws] ⇄ res ✓ health 66ms cached=true conn=3781ae87…de9b id=B15EBE75…A1B1
08:26 PM [ws] ⇄ res ✓ health 68ms cached=true conn=3781ae87…de9b id=D616F4DA…9C92

OpenClaw doctor seems to just zip through; normally, if something is wrong, it stops and asks me to fix it.

How can I fix or diagnose this?

frozen flumeBOT
#

420ElectricLove first thing: what does openclaw --version show? Version matters a lot for weird “accepted but no final reply” cases.

The log lines you pasted are mostly WebSocket health/status replies, not the agent turn itself. So I’d diagnose this as “gateway/UI alive, agent run or delivery path not completing.” Fast isolation path:

openclaw --version
openclaw config validate
openclaw status --all
openclaw status --deep
openclaw models status --probe
openclaw channels status --probe
openclaw sessions --all-agents --active 120 --verbose

Then tail logs while sending exactly one short test message in Control UI or Discord:

openclaw logs --follow

Also test the agent path without Discord/UI delivery:

openclaw agent --agent main --message "reply exactly: diagnostic ok" --json --timeout 120

How to read the result:

  • If agent --message hangs/fails too → model/auth/tool runtime issue. models status --probe + logs should show the real error.
  • If agent --message replies but Discord/Control don’t → delivery/routing/session/UI issue. Look at channels status --probe and delivery errors in logs.
  • If status says active work is stuck forever, do a safe restart escape hatch:
openclaw gateway restart --safe --skip-deferral

Since this started after browser.headless, I’d also temporarily revert it just to remove the variable:

openclaw config unset browser.headless
openclaw config unset agents.defaults.sandbox.browser.headless
openclaw gateway restart --safe --skip-deferral

Headless browser should not normally break plain chat replies unless the agent is stuck trying to initialize/use browser tooling, or the config change exposed another bad config/state issue.

Docs for the above:
https://docs.openclaw.ai/help/faq
https://docs.openclaw.ai/cli/agent

waxen quest
#

I can't even reply in this chat now?

frozen flumeBOT
#

I can see this reply from you, so Discord/thread posting itself is working. If you mean your OpenClaw agent can’t reply in this same Discord thread, that fits the same failure pattern: inbound message is being accepted, UI shows work starting, but the final agent/delivery path never completes.

Two quick things to try now:

openclaw --version
openclaw logs --follow

Then send one message to your bot and watch for the first non-healthcheck error. The important lines are usually around agent, model, delivery, session, or a stack trace — the repeated ws ✓ health lines are just the dashboard/clients checking in.
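
To make that first non-healthcheck line easier to spot, here is a minimal sketch of a filter that drops the repeated health replies. It assumes the log lines keep the `res ✓ health <N>ms` shape pasted earlier in this thread; `filter_health_noise` is a hypothetical helper name, not an OpenClaw command.

```shell
# Hypothetical helper: hide gateway health-check replies so other lines stand out.
# Assumption: health replies contain " res <mark> health <N>ms", as in the pasted log.
filter_health_noise() {
  grep -Ev ' res [^ ]+ health [0-9]+ms'
}

# Two sample lines copied from the log above; only the channels.status line survives.
printf '%s\n' \
  '08:22 PM [ws] ⇄ res ✓ health 76ms conn=3781ae87…de9b id=0A062986…4DA7' \
  '08:24 PM [ws] ⇄ res ✓ channels.status 1456ms conn=c5d0b1ea…0b59 id=9c45b630…260c' \
  | filter_health_noise
```

For live use, the same function can sit at the end of the follow command: `openclaw logs --follow | filter_health_noise`.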

If it’s stuck with active work forever, restart with the stuck-work escape hatch:

openclaw gateway restart --safe --skip-deferral

And since the last known config change was browser headless, temporarily undo it and restart:

openclaw config unset browser.headless
openclaw config unset agents.defaults.sandbox.browser.headless
openclaw gateway restart --safe --skip-deferral

If you paste openclaw --version and the log lines from one fresh attempted reply, I’ll help read them.

waxen quest
#

Something called "clyde" said I violated some rule when I sent it to you

~/.openclaw % openclaw --version
OpenClaw 2026.5.7 (eeef486)

frozen flumeBOT
#

Ah — “Clyde” is Discord’s built-in safety/AutoMod layer, not OpenClaw. It can sometimes flag pasted logs/commands weirdly. Annoying, but it doesn’t mean your gateway changed anything.

You’re on 2026.5.7, so next I’d focus on runtime logs from one fresh attempt. Please run:

openclaw logs --follow

Then send your bot a tiny message like ping. Copy the first error-ish lines around that attempt — anything mentioning agent, model, delivery, session, tool, browser, timeout, or a stack trace. You can redact tokens/IDs.

Also run these two probes:

openclaw models status --probe
openclaw agent --agent main --message "reply exactly: diagnostic ok" --json --timeout 120

If the agent --message command hangs/fails, we know it’s not Discord; it’s the agent/model/tool runtime. If it returns diagnostic ok, then the agent works and we look at delivery/channel/session routing.

And if you want the blunt “unstick it” move first:

openclaw gateway restart --safe --skip-deferral

Then test again. I’m suspicious something is stuck in an active agent turn or tool init, not that browser.headless directly broke Discord replies.

waxen quest
#

03:35:18+00:00 warn agent/embedded {"subsystem":"agent/embedded"} empty response detected: runId=efe34490-848f-4106-a52f-feecf50b1e31 sessionId=a20e31ed-df80-4cfd-85df-95a546d5c800 provider=anthropic/claude-sonnet-4-6 — retrying 1/1 with visible-answer continuation
03:35:23+00:00 warn agent/embedded {"subsystem":"agent/embedded"} empty response retries exhausted: runId=efe34490-848f-4106-a52f-feecf50b1e31 sessionId=a20e31ed-df80-4cfd-85df-95a546d5c800 provider=anthropic/claude-sonnet-4-6 attempts=1/1 — surfacing incomplete-turn error
03:35:23+00:00 warn agent/embedded {"subsystem":"agent/embedded"} incomplete turn detected: runId=efe34490-848f-4106-a52f-feecf50b1e31 sessionId=a20e31ed-df80-4cfd-85df-95a546d5c800 stopReason=stop payloads=0 — surfacing error to user
03:35:23+00:00 warn model-fallback/decision {"subsystem":"model-fallback/decision"} {"event":"model_fallback_decision","tags":["error_handling","model_fallback","candidate_failed"],"runId":"efe34490-848f-4106-a52f-feecf50b1e31","sessionId":"a20e31ed-df80-4cfd-85df-95a546d5c800","lane":"main","decision":"candidate_failed","requestedProvider":"anthropic","requestedModel":"claude-sonnet-4-6","candidateProvider":"anthropic","candidateModel":"claude-sonnet-4-6","attempt":1,"total":1,"reason":"format","code":"incomplete_result","errorPreview":"anthropic/claude-sonnet-4-6 ended with an incomplete terminal response","errorHash":"sha256:f7accb6d21ef","fallbackStepType":"fallback_step","fallbackStepFromModel":"anthropic/claude-sonnet-4-6","fallbackStepFromFailureReason":"format","fallbackStepFromFailureDetail":"anthropic/claude-sonnet-4-6 ended with an incomplete terminal response","fallbackStepChainPosition":1,"fallbackStepFinalOutcome":"chain_exhausted","isPrimary":true,"requestedModelMatched":true,"fallbackConfigured":false} model fallback decision
03:35:33+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ health 65ms cached=true conn=828da172…b776 id=BD2785F2…E93A

frozen flumeBOT
#

Good, that log is the smoking gun. This is not Discord and probably not browser headless directly.

Your agent is reaching Anthropic, but anthropic/claude-sonnet-4-6 is returning a terminal stop with zero visible assistant payloads:

empty response retries exhausted
incomplete turn detected ... stopReason=stop payloads=0
fallbackConfigured=false

So OpenClaw has nothing to send. That’s why Control UI and Discord both look like “working…” and then no useful reply.
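
If you want to pull those facts out of the long model_fallback_decision line without reading it by eye, here is a minimal sketch using only grep, cut, and tr. It assumes the log line is compact JSON like the one pasted above; `extract_field` is a hypothetical helper, and `jq` would be more robust if you have it installed.

```shell
# Hypothetical helper: print the first value found for a "key":value pair on stdin.
# Assumption: compact JSON with no spaces around the colon, as in the pasted log line.
extract_field() {
  grep -oE "\"$1\":(\"[^\"]*\"|[a-z0-9]+)" | head -n 1 | cut -d: -f2- | tr -d '"'
}

# Sample: a shortened copy of fields from the log line in this thread.
line='{"reason":"format","code":"incomplete_result","fallbackStepFinalOutcome":"chain_exhausted","fallbackConfigured":false}'

# Print one key=value pair per field of interest.
for key in code reason fallbackStepFinalOutcome fallbackConfigured; do
  printf '%s=%s\n' "$key" "$(printf '%s\n' "$line" | extract_field "$key")"
done
```

The combination `code=incomplete_result` with `fallbackConfigured=false` is the part that matters here: one failed candidate and nothing to fall back to.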

Try this isolation test:

openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json
openclaw agent --agent main --message "reply exactly: diagnostic ok" --json --timeout 120

Then immediate fix path: switch to any other configured working model, or add a fallback so this doesn’t hard-stop when Sonnet returns empty:

openclaw models status --probe
openclaw models list --plain
openclaw models set <provider/model-that-probes-ok>
# optional but recommended:
openclaw models fallbacks add <another-provider/model-that-probes-ok>
openclaw gateway restart --safe --skip-deferral

I’m deliberately not naming a “best model” here because live model availability changes fast; use models status --probe/models list and #models for current community recommendations.

waxen quest
#

I am just weirded out this would randomly happen.

~/.openclaw % openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json
GatewayClientRequestError: FailoverError: anthropic/claude-sonnet-4-6 ended with an incomplete terminal response
~/.openclaw % openclaw agent --agent main --message "reply exactly: diagnostic ok" --json --timeout 120
GatewayClientRequestError: FailoverError: anthropic/claude-sonnet-4-6 ended with an incomplete terminal response

Is it possible that something else changed under the hood? E.g., this should be going through an API key, and I feel like the config wizardry loves to try OAuth despite the TOS violations. I'd also expect Anthropic to be reporting downtime if this broke at random. What are your thoughts?

frozen flumeBOT
#

Yeah, I’d be weirded out too. But your two tests narrow this a lot:

infer model run --gateway ... => incomplete terminal response
agent ... => incomplete terminal response

That means it’s failing before Discord, before Control UI delivery, before browser tools, and before normal agent complexity. A tiny direct gateway model call to anthropic/claude-sonnet-4-6 is coming back as “finished, but no visible text.”

Could something change under the hood? Yes, a few things are mutable even if you didn’t edit config:

  • Anthropic can change behavior/availability for a specific model or auth lane without a full status-page outage.
  • OpenClaw can rotate/select a different Anthropic auth profile if multiple profiles exist or one got cooled down.
  • A session/model override can pin you to a model/auth route that differs from what you think the default is.
  • Provider-side response shape changes can expose an OpenClaw parser/transport bug.

I would check the auth route next, without dumping secrets:

openclaw models auth list --provider anthropic --json
openclaw models auth order get --provider anthropic
openclaw models status --probe --json

Look for whether the active Anthropic profile is API-key-ish vs Claude/OAuth/setup-token-ish, and whether auth.unusableProfiles has anything interesting. Don’t paste raw auth-profiles.json.

If you want to force API-key only, find the API-key profile id from models auth list, then lock the order, e.g.:

openclaw models auth order set --provider anthropic --agent main anthropic:default
openclaw gateway restart --safe --skip-deferral

Use the actual API-key profile id shown on your machine, not blindly anthropic:default if yours differs.

My read: this doesn’t smell like “Discord broke” or “browser headless broke chat.” It smells like either Anthropic’s exact model/auth lane returning an empty response, or OpenClaw 2026.5.7 mishandling that Anthropic response. If locking to the API-key profile still fails on the tiny infer model run, I’d switch models temporarily and file an issue with the exact repro + version: https://github.com/openclaw/openclaw/issues

waxen quest
#

Only really weird thing I saw was as follows - what the heck is anthropic:openai. ???

"probes": {
  "startedAt": 1778557531198,
  "finishedAt": 1778557556833,
  "durationMs": 25635,
  "totalTargets": 8,
  "options": {
    "profileIds": [],
    "timeoutMs": 8000,
    "concurrency": 2,
    "maxTokens": 8
  },
  "results": [
    {
      "provider": "anthropic-openai",
      "label": "env",
      "source": "env",
      "mode": "api_key",
      "status": "no_model",
      "reasonCode": "no_model",
      "error": "No model available for probe"
    },
    {
      "provider": "claude-cli",
      "label": "env",
      "source": "env",
      "mode": "api_key",
      "status": "no_model",
      "reasonCode": "no_model",
      "error": "No model available for probe"
    },
frozen flumeBOT
#

Yeah, that name is cursed-looking 😅

Important distinction: that snippet says:

"provider": "anthropic-openai"

not an auth profile like anthropic:openai.

anthropic-openai looks like an internal/provider-compat entry that knows about ANTHROPIC_API_KEY, but your probe result says:

"status": "no_model",
"error": "No model available for probe"

So it is not the thing answering your chats. It found/considered a provider target, but there are no models configured under that provider id, so it skipped probing it. Same with claude-cli in that snippet.

Your actual failing route from the logs/tests is still:

anthropic/claude-sonnet-4-6

So I’d treat anthropic-openai ... no_model as suspicious-looking but probably harmless noise unless your config explicitly references it.

Check whether it’s in your config/state:

openclaw config get models.providers.anthropic-openai
openclaw config get agents.defaults.model
openclaw models auth list --provider anthropic --json
openclaw models auth order get --provider anthropic

If models.providers.anthropic-openai is empty/null, ignore that probe row.

What I’m more interested in is the Anthropic profile list/order. If you expected API key usage, lock the API-key profile first/only:

openclaw models auth order set --provider anthropic --agent main <your-api-key-profile-id>
openclaw gateway restart --safe --skip-deferral

Then rerun:

openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json

If it still returns empty on the API-key profile, this is either Anthropic returning a weird empty response for that model/account/route, or an OpenClaw transport/parser bug for that response.

waxen quest
#

~/.openclaw % openclaw config get models.providers.anthropic-openai
openclaw config get agents.defaults.model
openclaw models auth list --provider anthropic --json

🦞 OpenClaw 2026.5.7 (eeef486)
If something's on fire, I can't extinguish it—but I can write a beautiful postmortem.

Config path not found: models.providers.anthropic-openai

🦞 OpenClaw 2026.5.7 (eeef486)
I'm not saying your workflow is chaotic... I'm just bringing a linter and a helmet.

{
"primary": "anthropic/claude-sonnet-4-6"
}
{
"agentId": "main",
"agentDir": "~/.openclaw/agents/main/agent",
"authStatePath": "~/.openclaw/agents/main/agent/auth-state.json",
"provider": "anthropic",
"profiles": []
}
~/.openclaw % openclaw models auth order get --provider anthropic

🦞 OpenClaw 2026.5.7 (eeef486) — I'm not AI-powered, I'm AI-possessed. Big difference.

Agent: main
Provider: anthropic
Auth state file: ~/.openclaw/agents/main/agent/auth-state.json
Order override: (none)

frozen flumeBOT
#

That’s actually useful: profiles: [] means OpenClaw is not using a saved Anthropic OAuth/auth-profile for main.

If Anthropic calls are still reaching the provider, the credential is almost certainly coming from env/config — usually ANTHROPIC_API_KEY on the gateway host / ~/.openclaw/.env. So your “is it secretly OAuth?” concern looks unlikely from this output.

Also:

Config path not found: models.providers.anthropic-openai

So anthropic-openai is just probe noise from env/provider discovery, not an active configured provider.

The real problem is now very narrow:

agents.defaults.model.primary = "anthropic/claude-sonnet-4-6"
profiles = []
fallbacks = none

So one bad response path for that exact Anthropic model via API key breaks every channel.

I’d do one direct Anthropic API sanity check next. This bypasses OpenClaw’s parser entirely:

set -a
[ -f ~/.openclaw/.env ] && . ~/.openclaw/.env
set +a

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":16,"messages":[{"role":"user","content":"reply exactly: ok"}]}'

Interpretation:

  • If curl returns normal text ok → likely OpenClaw bug/regression in Anthropic handling. File with repro + version.
  • If curl returns empty/weird/error → it’s Anthropic/account/model/API-key-side, not OpenClaw delivery.
  • If curl says unknown model / permission / quota → switch primary to a probed-ok model.
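
If reading raw JSON is a pain, the three cases above can be sketched as a small classifier you pipe the curl output into. This is a rough sketch, not an official tool: `classify_anthropic_response` is a hypothetical name, and it assumes the API returns compact JSON with a top-level `"type":"error"` envelope on failures and a `"type":"text"` content block on success.

```shell
# Hypothetical helper: classify a raw /v1/messages response body read from stdin.
# Assumption: compact JSON (no spaces after colons), as the API normally returns it.
classify_anthropic_response() {
  body="$(cat)"
  case "$body" in
    *'"type":"error"'*)            echo "api-error" ;;  # unknown model, permission, quota, ...
    *'"type":"text","text":""'*)   echo "empty" ;;      # text block present but blank
    *'"type":"text","text":"'*)    echo "text-ok" ;;    # visible assistant text came back
    *)                             echo "empty" ;;      # no text block at all
  esac
}

# Sample success body (trimmed); prints "text-ok".
printf '%s\n' '{"type":"message","role":"assistant","content":[{"type":"text","text":"ok"}],"stop_reason":"end_turn"}' \
  | classify_anthropic_response
```

To use it for real, add `-s` to the curl command and pipe its output into `classify_anthropic_response`. `text-ok` points at OpenClaw, `empty` points at the Anthropic side, and `api-error` means switching to a probed-ok model.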

Immediate workaround regardless: add a fallback or switch primary:

openclaw models status --probe
openclaw models set <provider/model-that-probes-ok>
# or keep Sonnet primary but add a fallback:
openclaw models fallbacks add <provider/model-that-probes-ok>
openclaw gateway restart --safe --skip-deferral

I agree it’s random-feeling, but the evidence says “provider/model/API route started returning no visible content,” not Discord/browser/auth-profile weirdness.

waxen quest
#

~/.openclaw % set -a
[ -f ~/.openclaw/.env ] && . ~/.openclaw/.env
set +a

curl https://api.anthropic.com/v1/messages
-H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01"
-H "content-type: application/json"
-d '{"model":"claude-sonnet-4-6","max_tokens":16,"messages":[{"role":"user","content":"reply exactly: ok"}]}'
{"model":"claude-sonnet-4-6","id":"msg_01ANo9ekG5C6WK2m287NE5Yv","type":"message","role":"assistant","content":[{"type":"text","text":"ok"}],"stop_reason":"end_turn","stop_sequence":null,"stop_details":null,"usage":{"input_tokens":11,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":4,"service_tier":"standard","inference_geo":"global"}}curl: (3) URL rejected: Malformed input to a URL function

frozen flumeBOT
#

That curl result is big: Anthropic API key + model are healthy. The final curl: (3) is just from the accidental \ -H line-continuation/spacing; it already got a valid Anthropic response first.

So now I’d call this an OpenClaw/Gateway Anthropic handling bug or runtime-path regression, not Anthropic downtime and not OAuth. Direct API says:

"content":[{"type":"text","text":"ok"}], "stop_reason":"end_turn"

OpenClaw gateway says the same tiny prompt ended with no visible payload. That’s a very clean repro.

One last useful split before filing:

openclaw infer model run --local --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json
openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json
  • If --local works but --gateway fails → gateway-managed provider/runtime path bug.
  • If both fail but raw curl works → shared OpenClaw Anthropic provider parsing/SDK path bug.
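
The split above can be written down as a tiny decision helper if you want to script the comparison. This is only a sketch: `interpret_split` is a hypothetical name, and it assumes both probe commands exit non-zero when they fail, which is worth confirming on your build.

```shell
# Hypothetical helper: map two exit statuses (0 = success) to the reading above.
# Usage: interpret_split <local_status> <gateway_status>
interpret_split() {
  local_status="$1"
  gateway_status="$2"
  if [ "$local_status" -eq 0 ] && [ "$gateway_status" -ne 0 ]; then
    echo "gateway-managed provider/runtime path"
  elif [ "$local_status" -ne 0 ] && [ "$gateway_status" -ne 0 ]; then
    echo "shared OpenClaw Anthropic provider path"
  elif [ "$local_status" -eq 0 ]; then
    echo "both paths ok"
  else
    echo "local fails but gateway works: recheck the local environment"
  fi
}

# Example: local succeeded (0), gateway failed (1).
interpret_split 0 1
```

In practice you would run each probe, save `$?` right after it, and pass the two saved values in.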

Workaround to get your bot breathing again:

openclaw models set <another-probed-ok-model>
# or add fallback once you pick a working model:
openclaw models fallbacks add <another-probed-ok-model>
openclaw gateway restart --safe --skip-deferral

For the issue, include:

  • OpenClaw 2026.5.7 (eeef486)
  • openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json fails with incomplete terminal response
  • raw Anthropic /v1/messages with same API key/model returns text ok
  • models auth list --provider anthropic shows no profiles, so env API key path

That should be actionable for maintainers.

waxen quest
#

I seem to be good at finding esoteric bugs that only impact me and won't reproduce. 🙁

~/.openclaw % openclaw infer model run --local --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json
openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json
{
"ok": true,
"capability": "model.run",
"transport": "local",
"provider": "anthropic",
"model": "claude-sonnet-4-6",
"attempts": [],
"outputs": [
{
"text": "ok",
"mediaUrl": null
}
]
}
GatewayClientRequestError: FailoverError: anthropic/claude-sonnet-4-6 ended with an incomplete terminal response

frozen flumeBOT
#

Oof. I know that feeling — but this one is not a vibes-only ghost bug. You now have a crisp split:

local provider path:   works
running Gateway path:  fails

That means your API key/model are fine, and the installed CLI code can parse Anthropic fine. The bad state is specifically in the currently running Gateway process: stale process, different env/config, runtime state, or a gateway-only regression.

First thing I’d try, because it’s cheap and very plausible:

openclaw gateway restart --safe --skip-deferral
openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json

If that fixes it, it was a wedged/stale Gateway runtime after config hot-reload / browser fiddling.

If it still fails, check whether the service is actually running the same install/version/config as your CLI:

openclaw gateway status
openclaw health --verbose
openclaw status --all

On macOS especially, I’d look for a stale LaunchAgent / nvm path situation: CLI says 2026.5.7, but the Gateway service can still be launched from an older global Node/npm path if Node versions changed. gateway status usually gives clues about the service command/path.

For a maintainer-quality repro, your current evidence is already strong:

# works
openclaw infer model run --local --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json
# fails
openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json

If restart doesn’t fix it, file that. It’s way more reproducible than “my bot stopped replying.” It says: “Gateway Anthropic path returns incomplete terminal response while local provider path and raw Anthropic API both return text.”

waxen quest
#

~/.openclaw % openclaw gateway restart --safe --skip-deferral
openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json
error: unknown option '--skip-deferral'
GatewayClientRequestError: FailoverError: anthropic/claude-sonnet-4-6 ended with an incomplete terminal response

frozen flumeBOT
#

Yep, that one’s on me — I checked newer docs than your installed 2026.5.7. In your build, --skip-deferral doesn’t exist yet. Sorry about that.

Since the restart command errored, the Gateway did not restart, so the follow-up gateway probe was testing the same broken process.

Use the supported restart path for your version:

openclaw gateway restart --safe

If that says it’s waiting/blocked or doesn’t actually restart, check available flags:

openclaw gateway restart --help
openclaw gateway status

If --force is listed, the “unstick it now” version is likely:

openclaw gateway restart --force

Then rerun:

openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json

The key thing: we need a real Gateway process restart, because your local provider path works and only the running gateway path is bad.

waxen quest
#

btrahan ~/.openclaw % openclaw gateway restart --force

🦞 OpenClaw 2026.5.7 (eeef486) — I'll refactor your busywork like it owes me money.

21:00:24 [restart] killing 1 stale gateway process(es) before restart: 89762
Restarted LaunchAgent: gui/501/ai.openclaw.gateway
btrahan ~/.openclaw % openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json
GatewayClientRequestError: FailoverError: anthropic/claude-sonnet-4-6 ended with an incomplete terminal response

frozen flumeBOT
#

That error is useful: it means Discord probably isn’t the first thing broken. The Gateway is able to start, but a gateway-managed model smoke test is ending with no visible assistant text, so channel replies would fail everywhere too.

Please run these in order and paste the output/log excerpt:

openclaw infer model run --local --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json
openclaw status --all
openclaw logs --follow

Then trigger one failing message / rerun the --gateway probe while logs are following, and paste the lines around Agent couldn't generate a response, incomplete_result, anthropic, or FailoverError.

How to read it:

  • if --local fails too → Anthropic auth/model/provider issue.
  • if --local succeeds but --gateway fails → Gateway embedded runtime/config/state issue.
  • if model probes succeed but channels don’t reply → then check channel transport with:
waxen quest
#

The output from these commands is far too long to send you over Discord. Can you revise the strategy so I can cut and paste some useful information over to you?

frozen flumeBOT
#

Yep — let’s shrink this to the useful bits.

Please paste only the output from these:

openclaw --version
openclaw gateway status
openclaw infer model run --local --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json 2>&1 | tail -80
openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json 2>&1 | tail -120

Then grab a focused log slice instead of the full log:

LOG="$(ls -t /tmp/openclaw/openclaw-*.log | head -1)"
echo "LOG=$LOG"
grep -Ei "incomplete terminal|Agent couldn't generate|FailoverError|anthropic|claude-sonnet|stopReason|error|429|401|403|500" "$LOG" | tail -120

If that still feels too big, paste just:

  • the --local result
  • the --gateway result
  • the last ~40 lines from the grep

Please quickly redact API keys/tokens if any show up, though OpenClaw logs should usually redact them. 🦞
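
If you'd rather not eyeball the output for secrets, a minimal sketch of a redaction filter is below. `redact_secrets` is a hypothetical helper, and it assumes Anthropic API keys start with `sk-ant-`; it will not catch other token formats, so still skim the result before pasting.

```shell
# Hypothetical helper: mask anything that looks like an Anthropic API key.
# Assumption: keys start with "sk-ant-" followed by letters, digits, "_" or "-".
redact_secrets() {
  sed -E 's/sk-ant-[A-Za-z0-9_-]+/sk-ant-REDACTED/g'
}

# Example with an obviously fake key; prints "x-api-key: sk-ant-REDACTED".
printf '%s\n' 'x-api-key: sk-ant-api03-EXAMPLE_not-a-real-key' | redact_secrets
```

It can go at the end of the grep pipeline above, after `tail -120`.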

waxen quest
#

~/.openclaw % openclaw --version
openclaw gateway status
openclaw infer model run --local --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json 2>&1 | tail -80
openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json 2>&1 | tail -120
OpenClaw 2026.5.7 (eeef486)

🦞 OpenClaw 2026.5.7 (eeef486) — Half butler, half debugger, full crustacean.



Service: LaunchAgent (loaded)
File logs: /tmp/openclaw/openclaw-2026-05-11.log
Command: /opt/homebrew/opt/node/bin/node /opt/homebrew/lib/node_modules/openclaw/dist/index.js gateway --port 18789
Service file: ~/Library/LaunchAgents/ai.openclaw.gateway.plist
Working dir: ~/.openclaw
Service env: OPENCLAW_GATEWAY_PORT=18789

Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json

Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.

Runtime: running (pid 91028, state active)
Connectivity probe: ok
Capability: admin-capable

Listening: 127.0.0.1:18789
Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting
{
"ok": true,
"capability": "model.run",
"transport": "local",
"provider": "anthropic",
"model": "claude-sonnet-4-6",
"attempts": [],
"outputs": [
{
"text": "ok",
"mediaUrl": null
}
]
}
GatewayClientRequestError: FailoverError: anthropic/claude-sonnet-4-6 ended with an incomplete terminal response

frozen flumeBOT
#

Perfect, that narrows it a lot: Anthropic auth + the model itself are OK. The break is specifically in the Gateway-managed embedded runtime path.

Next, paste this focused log slice only:

LOG="/tmp/openclaw/openclaw-2026-05-11.log"

grep -Ei "model.run|incomplete terminal|incomplete_result|Agent couldn't generate|FailoverError|anthropic|claude-sonnet|stopReason|finalAssistant|harness|bootstrap|session" "$LOG" | tail -160

And one more small check:

openclaw health --json 2>&1 | tail -80

If the grep output is still huge, paste the last 60 lines only. I’m looking for whether the gateway runtime is returning stop with empty text, getting stuck in bootstrap/session state, or swallowing the final assistant payload.

waxen quest
#

["error_handling","model_fallback","candidate_failed"],"runId":"a76ee978-ca88-4ac3-b037-64c033e209b3","decision":"candidate_failed","requestedProvider":"anthropic","requestedModel":"claude-sonnet-4-6","candidateProvider":"anthropic","candidateModel":"claude-sonnet-4-6","attempt":1,"total":1,"reason":"format","code":"incomplete_result","errorPreview":"anthropic/claude-sonnet-4-6 ended with an incomplete terminal response","errorHash":"sha256:f7accb6d21ef","fallbackStepType":"fallback_step","fallbackStepFromModel":"anthropic/claude-sonnet-4-6","fallbackStepFromFailureReason":"format","fallbackStepFromFailureDetail":"anthropic/claude-sonnet-4-6 ended with an incomplete terminal response","fallbackStepChainPosition":1,"fallbackStepFinalOutcome":"chain_exhausted","isPrimary":true,"requestedModelMatched":true,"fallbackConfigured":false},"2":"model fallback decision","_meta":{"runtime":"node","runtimeVersion":"26.0.0","hostname":"unknown","name":"{"subsystem":"model-fallback/decision"}","parentNames":["openclaw"],"date":"2026-05-12T04:08:39.127Z","logLevelId":4,"logLevelName":"WARN","path":{"fullFilePath":"file:///opt/homebrew/lib/node_modules/openclaw/dist/subsystem-CxWoQXRD.js:149:51","fileName":"subsystem-CxWoQXRD.js","fileNameWithLine":"subsystem-CxWoQXRD.js:149","fileColumn":"51","fileLine":"149","filePath":"/opt/homebrew/lib/node_modules/openclaw/dist/subsystem-CxWoQXRD.js","filePathWithLine":"/opt/homebrew/lib/node_modules/openclaw/dist/subsystem-CxWoQXRD.js:149","method":"logToFile"}},"time":"2026-05-11T21:08:39.127-07:00","hostname":"Mac.bad","message":"model fallback decision","traceId":"e011e06608feffcc107c6c8183a93cad","spanId":"9bf37be74468c2f3","traceFlags":"01"}
{"0":"{"subsystem":"gateway/ws"}","1":"⇄ res ✗ agent errorCode=UNAVAILABLE errorMessage=FailoverError: anthropic/claude-sonnet-4-6 ended with an incomplete terminal response runId=a76ee978-ca88-4ac3-b037-64c033e209b3 error=FailoverError: anthropic/clau

frozen flumeBOT
#

That confirms the classifier is only seeing an empty/invalid terminal payload. One thing jumps out from your log: the Gateway is running on "runtimeVersion":"26.0.0" via /opt/homebrew/opt/node/bin/node.

I’d try Node 22 LTS for the gateway. Node 26 is very new, and the failure being local provider OK / gateway embedded runtime empty is exactly the kind of thing I’d want to eliminate first.

First, paste this tiny check:

node --version
/opt/homebrew/opt/node/bin/node --version

If that says v26.x, switch the service to Node 22 and reinstall/restart the LaunchAgent:

brew install node@22
brew unlink node
brew link --overwrite --force node@22
hash -r
node --version
openclaw gateway install --force
openclaw gateway restart --force
openclaw gateway status
openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json

If you don’t want to relink Homebrew globally, don’t do that yet — paste the two node-version lines and we can choose a less invasive route.

If Node 22 doesn’t fix it, next useful log slice is just this run-id scoped context:

RUN="a76ee978-ca88-4ac3-b037-64c033e209b3"
LOG="/tmp/openclaw/openclaw-2026-05-11.log"
grep -F "$RUN" "$LOG" | tail -80

But I’d chase the Node 26 mismatch first.

waxen quest
#

~/.openclaw % node --version
/opt/homebrew/opt/node/bin/node --version
v26.0.0
v26.0.0

Well, if this is the issue, it's really weird that it would show up in the middle of my session like this. I'd been just vibing for hours and then blammo: screeching halt.

frozen flumeBOT
#

Yeah, that does feel weird — but it can happen if the running Gateway was fine for hours, then after openclaw gateway restart --force the LaunchAgent relaunched under whatever /opt/homebrew/opt/node/bin/node points at now. So the break may not have happened “mid-session” so much as “the restart picked up Node 26 and the gateway runtime changed underneath you.”

I’m not saying Node 26 is definitely the root cause, but it’s a very plausible first variable to eliminate. OpenClaw supports Node >=22.16.0; 22 LTS is the boring-good choice here.

Less invasive than relinking all Homebrew node:

brew install node@22
PATH="/opt/homebrew/opt/node@22/bin:$PATH" node --version
PATH="/opt/homebrew/opt/node@22/bin:$PATH" openclaw gateway install --force
openclaw gateway restart --force
openclaw gateway status
openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json

After gateway status, check the Command: line. We want it to point at node@22, not /opt/homebrew/opt/node/bin/node if that still resolves to v26.
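
A minimal sketch for that check, assuming the status output keeps the `Command: <node binary> <entrypoint> ...` layout seen in this thread. The helper name `gateway_node_bin` is hypothetical, not part of OpenClaw:

```shell
# gateway_node_bin: read `openclaw gateway status` output on stdin and
# print the Node binary named on the first "Command:" line.
# Assumption: the second whitespace-separated field is the Node binary.
gateway_node_bin() {
  awk '$1 == "Command:" && !seen { print $2; seen = 1 }'
}

# Usage (left commented so the snippet has no side effects):
#   bin=$(openclaw gateway status | gateway_node_bin)
#   echo "gateway runs under: $bin"
#   "$bin" --version
```

If the printed path is /opt/homebrew/opt/node/bin/node rather than the node@22 one, the service is still on whatever the default Homebrew node resolves to.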

If it still fails on Node 22, then we stop blaming runtime weirdness and dig into the run-id scoped logs.

waxen quest
#

Service: LaunchAgent (loaded)
File logs: /tmp/openclaw/openclaw-2026-05-11.log
Command: /opt/homebrew/opt/node@22/bin/node /opt/homebrew/lib/node_modules/openclaw/dist/index.js gateway --port 18789
Service file: ~/Library/LaunchAgents/ai.openclaw.gateway.plist
Working dir: ~/.openclaw
Service env: OPENCLAW_GATEWAY_PORT=18789

Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json

Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.

Runtime: running (pid 91989, state active)
Connectivity probe: ok
Capability: admin-capable

Listening: 127.0.0.1:18789
Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting
{
"ok": true,
"capability": "model.run",
"transport": "gateway",
"provider": "anthropic",
"model": "claude-sonnet-4-6",
"attempts": [],
"outputs": [
{
"text": "ok",
"mediaUrl": null
}
]
}

frozen flumeBOT
#

Nice — that’s the smoking gun. The gateway was launching under Node 26 after restart; reinstalling the LaunchAgent with Node 22 fixed the gateway model path.

Do one real channel test now. If replies are back, you’re done. If a channel still doesn’t answer, it’s a separate channel transport/session issue, so run:

openclaw channels status --probe

and paste only the failing channel’s block.

I’d also keep Node 22 pinned for OpenClaw for now. The weird “worked for hours then died” fits the restart picking up a newer Homebrew node symlink, not Anthropic suddenly breaking mid-session. 🦞
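
To keep that pin honest before a restart, something like the following could gate the reinstall. It is a rough sketch: `is_node22` is a made-up helper name, and it only inspects the version string you hand it:

```shell
# is_node22: succeed when a `node --version` string (for example "v22.16.0")
# has major version 22, fail for anything else.
is_node22() {
  case "$1" in
    v22.*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Usage (left commented so the snippet has no side effects):
#   ver=$(/opt/homebrew/opt/node@22/bin/node --version)
#   is_node22 "$ver" || echo "expected Node 22, got $ver" >&2
```

It does not check the minimum patch level, only the major version.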

waxen quest
#

Okay, that fixed it. Man, what a bummer that was

frozen flumeBOT
#

Ugh, yeah — that’s a deeply annoying failure mode. Glad we caught it though: after restart, the LaunchAgent picked up Homebrew Node 26, and the Gateway embedded runtime went sideways while local model auth still looked fine.

For future searchers: fix was reinstalling/restarting the gateway service with Node 22:

brew install node@22
PATH="/opt/homebrew/opt/node@22/bin:$PATH" openclaw gateway install --force
openclaw gateway restart --force

Then verify openclaw gateway status shows Command: /opt/homebrew/opt/node@22/bin/node ... and openclaw infer model run --gateway ... returns text.
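
A small sketch for scripting the second check, assuming the probe's --json output contains an "ok": true field as in the output pasted earlier in this thread. `probe_ok` is a hypothetical helper, and it does a plain text match rather than a real JSON parse:

```shell
# probe_ok: read the gateway probe's JSON output on stdin and succeed
# only if it contains an "ok": true field (text match, not a JSON parse).
probe_ok() {
  grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'
}

# Usage (left commented so the snippet has no side effects):
#   openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 \
#     --prompt "reply exactly: ok" --json | probe_ok \
#     && echo "gateway model path OK" \
#     || echo "gateway model path still failing" >&2
```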

If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow indexes it for the next poor soul who gets crab-walked by Node 26. 🦞

waxen quest
#

Can you please give me a prompt to hand to my openclaw agent to keep carrying this fix forward? Homeboy's first note was "qmd is broken - node version mismatch" so he needs help managing node versions moving forward until this is resolved.

frozen flumeBOT
#

Absolutely — hand your agent this:

Context: OpenClaw stopped replying across all channels after a gateway restart. Local model probe worked, but gateway model probe failed with:

GatewayClientRequestError: FailoverError: anthropic/claude-sonnet-4-6 ended with an incomplete terminal response

Diagnosis: Anthropic auth/model were fine. The Gateway LaunchAgent had started under Homebrew Node v26.0.0 (`/opt/homebrew/opt/node/bin/node`). Reinstalling the Gateway LaunchAgent with Node 22 fixed it.

Known-good fix:

brew install node@22
PATH="/opt/homebrew/opt/node@22/bin:$PATH" openclaw gateway install --force
openclaw gateway restart --force
openclaw gateway status
openclaw infer model run --gateway --model anthropic/claude-sonnet-4-6 --prompt "reply exactly: ok" --json

Success criteria:
- `openclaw gateway status` shows Command using `/opt/homebrew/opt/node@22/bin/node`
- gateway model probe returns `{ text: "ok" }`
- channels resume replying

Please help me keep OpenClaw pinned to Node 22 LTS until Node 26 compatibility is confirmed. Before any gateway restart/reinstall/update, check:

node --version
/opt/homebrew/opt/node@22/bin/node --version
openclaw gateway status

If the LaunchAgent command points at Node 26 or `/opt/homebrew/opt/node/bin/node`, reinstall the gateway service with Node 22 in PATH. Do not assume `openclaw --version` is enough; the service runtime Node is what matters.

If this reappears, first compare: