#Subagents keep failing and are unreliable
Yeah, there are related issues, especially around sessions_spawn + ACP/Claude Code and "quiet" long-running child sessions timing out. The pattern "main thread stays active, subagents die" also fits proxy/concurrency pressure.
First thing: can you paste:
openclaw --version
openclaw status --all
And from the Discord thread/session:
/subagents list
/subagents info <id>
/subagents log <id> 200 tools
A quick mitigation to try is to serialize subagents temporarily:
{
  agents: {
    defaults: {
      subagents: {
        maxConcurrent: 1,
        runTimeoutSeconds: 900,
      },
    },
  },
}
Subagents run in one dedicated subagent queue lane, and the default maxConcurrent is 8, but the docs warn that they still share gateway resources, so maxConcurrent is the safety valve. If your "CLI proxy to Claude Code" is a single local Claude Code/proxy process, parallel child runs can easily be what tips it over. (docs.openclaw.ai)
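To make the "safety valve" idea concrete, here is a minimal Python sketch of a concurrency cap in front of a shared backend. It is not OpenClaw's implementation: `run_children`, the semaphore, and the 10 ms sleep are illustrative assumptions standing in for subagent runs that all hit one local proxy process.

```python
import asyncio


async def run_children(num_children: int, max_concurrent: int) -> int:
    """Run hypothetical child tasks under a cap; return the peak concurrency seen."""
    gate = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def child() -> None:
        nonlocal in_flight, peak
        async with gate:  # at most max_concurrent children get past this point
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for a request to the shared proxy
            in_flight -= 1

    await asyncio.gather(*(child() for _ in range(num_children)))
    return peak
```

With eight children, a cap of 8 lets all eight reach the backend at once, while a cap of 1 forces them through one at a time. That is the behaviour the maxConcurrent: 1 test is meant to produce.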
Known related bugs I found: ACP child sessions hanging with Claude Code/no output, ACP spawn failures after Task Flow changes, and provider streaming idle drops that show up as subagent timeouts. (github.com)
Also: if this is claude-max-api-proxy, that path is a community OpenAI-compatible proxy over Claude Code, not an officially supported Anthropic/OpenClaw path, so I'd treat it as more fragile under parallel load. (docs.openclaw.ai)
OpenClaw status --all
Overview
Item                Value
------------------  -----
Version             2026.4.26
OS                  macos 26.2 (arm64)
Node                24.13.1
Config              ~/.openclaw/openclaw.json
Dashboard           http://127.0.0.1:18789/
Tailscale exposure  off
Channel             stable (default)
Update              pnpm · up to date · npm latest 2026.4.26
Gateway             local · ws://127.0.0.1:18789 (local loopback) · reachable 75ms · auth token · Keyans-Mac-mini.
                    local (192.168.1.166) app 2026.4.26 macos 26.2
Security            Run: openclaw security audit --deep
Gateway self        unknown
Gateway service     LaunchAgent installed · loaded · running (pid 3980, state active)
Node service        LaunchAgent not installed
Agents              3 total · 1 bootstrapping · 1 active · 352 sessions
Secrets             none
Channels
Channel  Enabled  State  Detail
-------  -------  -----  ------
Agents
Agent              Bootstrap file  Sessions  Active   Store
-----------------  --------------  --------  -------  -------------------------------------------------
main (Vash)        ABSENT          292       1m ago   ~/.openclaw/agents/main/sessions/sessions.json
bond (Bond)        ABSENT          59        5h ago   ~/.openclaw/agents/bond/sessions/sessions.json
pikabot (PikaBot)  PRESENT         1         44d ago  ~/.openclaw/agents/pikabot/sessions/sessions.json
Diagnosis (read-only)
@glacial estuary i have more that I can't paste
Thanks, the status output helps. This doesn't look like basic gateway/service breakage; 2026.4.26 is running and reachable. The likely failure is in the child-run/provider path.
Most suspicious match: there's an open issue where subagents die as "timed out" because the provider/proxy streaming connection goes idle after roughly 60s; runTimeoutSeconds doesn't help because the drop happens at the stream level, below OpenClaw's outer run timer. That issue specifically involved claude-opus via a custom-localhost proxy, so your Claude Code proxy setup is very close. (github.com)
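A small Python sketch of why raising the outer timer does not rescue a run that is cut off by an idle limit. Everything here is a hypothetical model, not OpenClaw or proxy code: `quiet_stream`, `read_with_idle_limit`, and `run_child` are made-up names, and the durations are scaled down from seconds to fractions of a second.

```python
import asyncio
from typing import AsyncIterator


async def quiet_stream(gap_seconds: float) -> AsyncIterator[str]:
    """Hypothetical provider stream: one chunk, a silent gap, then another chunk."""
    yield "first chunk"
    await asyncio.sleep(gap_seconds)  # the child is busy; nothing is sent
    yield "second chunk"


async def read_with_idle_limit(stream: AsyncIterator[str], idle_limit: float) -> str:
    """Consume the stream, giving up if any single gap exceeds idle_limit."""
    iterator = stream.__aiter__()
    while True:
        try:
            await asyncio.wait_for(iterator.__anext__(), timeout=idle_limit)
        except asyncio.TimeoutError:
            return "idle drop"
        except StopAsyncIteration:
            return "completed"


async def run_child(gap: float, idle_limit: float, run_timeout: float) -> str:
    """Wrap the reader in an outer run timer (stand-in for runTimeoutSeconds)."""
    try:
        return await asyncio.wait_for(
            read_with_idle_limit(quiet_stream(gap), idle_limit),
            timeout=run_timeout,
        )
    except asyncio.TimeoutError:
        return "run timeout"
```

In this model, a gap of 0.2 with an idle limit of 0.05 ends in an idle drop whether the outer run timeout is 1.0 or 5.0. Only shortening the quiet gap or raising the idle limit changes the outcome, which is why smaller tasks or a longer provider timeout are the levers worth testing.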
Two things I'd try first:
{
  agents: {
    defaults: {
      subagents: {
        maxConcurrent: 1,
        runTimeoutSeconds: 0, // no OpenClaw outer subagent timeout while testing
      },
    },
  },
}
Subagents default to a dedicated lane with maxConcurrent: 8, and the docs explicitly say they still share gateway resources, so with a single Claude Code/proxy backend, serializing is the right first test. (docs.openclaw.ai)
If your proxy is configured as an OpenAI-compatible/custom provider, also consider raising the provider HTTP timeout, replacing <providerId> with the actual provider id:
{
  models: {
    providers: {
      "<providerId>": {
        timeoutSeconds: 300,
      },
    },
  },
}
That only helps if OpenClaw's provider/request guard is what's aborting; if Claude Code or the proxy itself drops idle streams, split tasks into smaller pieces or keep concurrency at 1. Provider timeout config is documented under models.providers.*.timeoutSeconds. (docs.openclaw.ai)
Can you paste the details of one failed run?
/subagents list
/subagents info <id>
/subagents log <id> 200 tools
Also useful:
openclaw logs --limit 500 --plain | grep -Ei 'subagent|acp|acpx|timeout|idle|provider|profileFailureReason'
One extra note: your status shows pikabot has BOOTSTRAP present, but if the failing subagents are under main/bond, that's probably unrelated.