#Subagents keep failing and are unreliable

fossil leaf

All subagents end up timing out or failing to complete their jobs, while the main thread stays active. This is with a CLI proxy to Claude Code Opus 4.7. The issue began after the latest update. Are there any known issues with subagents?

glacial estuary BOT

Yeah, there are related issues, especially around sessions_spawn + ACP/Claude Code and “quiet” long-running child sessions timing out. The pattern “main thread stays active, subagents die” also fits proxy/concurrency pressure.

First thing: can you paste:

openclaw --version
openclaw status --all

And from the Discord thread/session:

/subagents list
/subagents info <id>
/subagents log <id> 200 tools

A quick mitigation to try: serialize subagents temporarily:

{
  agents: {
    defaults: {
      subagents: {
        maxConcurrent: 1,
        runTimeoutSeconds: 900,
      },
    },
  },
}

Subagents run in one dedicated subagent queue lane, and the default maxConcurrent is 8, but the docs warn that they still share gateway resources, so maxConcurrent is the safety valve. If your “CLI proxy to Claude Code” is a single local Claude Code/proxy process, parallel child runs can easily be the thing that tips it over. (docs.openclaw.ai)
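
To make the safety-valve idea concrete, here is a minimal Python sketch of how a concurrency cap limits the number of child runs hitting one backend at a time. It is only an illustration and not how OpenClaw implements its lane: run_children, peak_in_flight, and the sleep that stands in for a backend call are all hypothetical names and assumptions.

```python
import asyncio


async def run_children(num_children: int, max_concurrent: int,
                       work_seconds: float = 0.01) -> int:
    """Run fake child jobs behind a cap and return the peak number in flight."""
    gate = asyncio.Semaphore(max_concurrent)  # hypothetical stand-in for maxConcurrent
    in_flight = 0
    peak = 0

    async def child() -> None:
        nonlocal in_flight, peak
        async with gate:  # wait here until a slot is free
            in_flight += 1
            peak = max(peak, in_flight)
            # Stand-in for one request to the single shared backend (assumption).
            await asyncio.sleep(work_seconds)
            in_flight -= 1

    await asyncio.gather(*(child() for _ in range(num_children)))
    return peak


def peak_in_flight(num_children: int, max_concurrent: int) -> int:
    """Synchronous wrapper so the sketch can be called from plain code."""
    return asyncio.run(run_children(num_children, max_concurrent))
```

With a cap of 1 the backend never sees more than one child at a time, while a cap of 8 lets all eight children hit it at once, which is the load a single proxy process may not survive.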

Known related bugs I found: ACP child sessions hanging with Claude Code/no output, ACP spawn failures after Task Flow changes, and provider streaming idle drops that show up as subagent timeouts. (github.com)

Also: if this is claude-max-api-proxy, that path is a community OpenAI-compatible proxy over Claude Code, not an officially supported Anthropic/OpenClaw path, so I’d treat it as more fragile under parallel load. (docs.openclaw.ai)

fossil leaf

openclaw status --all

Overview
Item                Value
------------------  ----------------------------------------------------------------------------------------------
Version             2026.4.26
OS                  macos 26.2 (arm64)
Node                24.13.1
Config              ~/.openclaw/openclaw.json
Dashboard           http://127.0.0.1:18789/
Tailscale exposure  off
Channel             stable (default)
Update              pnpm · up to date · npm latest 2026.4.26
Gateway             local · ws://127.0.0.1:18789 (local loopback) · reachable 75ms · auth token · Keyans-Mac-mini.
                    local (192.168.1.166) app 2026.4.26 macos 26.2
Security            Run: openclaw security audit --deep
Gateway self        unknown
Gateway service     LaunchAgent installed · loaded · running (pid 3980, state active)
Node service        LaunchAgent not installed
Agents              3 total · 1 bootstrapping · 1 active · 352 sessions
Secrets             none

Channels
Channel  Enabled  State  Detail
-------  -------  -----  ------

Agents
Agent              Bootstrap file  Sessions  Active   Store
-----------------  --------------  --------  -------  -------------------------------------------------
main (Vash)        ABSENT          292       1m ago   ~/.openclaw/agents/main/sessions/sessions.json
bond (Bond)        ABSENT          59        5h ago   ~/.openclaw/agents/bond/sessions/sessions.json
pikabot (PikaBot)  PRESENT         1         44d ago  ~/.openclaw/agents/pikabot/sessions/sessions.json

Diagnosis (read-only)

@glacial estuary I have more that I can't paste

glacial estuary BOT

Thanks, the status output helps. This doesn’t look like basic gateway/service breakage; 2026.4.26 is running and reachable. The likely failure is in the child-run/provider path.

Most suspicious match: there’s an open issue where subagents die as “timed out” because the provider/proxy streaming connection goes idle at around 60s; runTimeoutSeconds doesn’t help because the drop happens at a layer below OpenClaw’s outer run timer. That issue specifically involved claude-opus via a custom-localhost proxy, so your Claude Code proxy setup is very close. (github.com)
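
As a rough illustration of why the outer run timer does not save you here, this Python sketch models a streamed run as a list of silent gaps between chunks and checks which limit trips first. It is an assumption-laden toy, not OpenClaw or proxy code: classify_run, idle_limit, and run_limit are hypothetical names, and run_limit=None stands in for having no outer timeout.

```python
from typing import Iterable, Optional


def classify_run(chunk_gaps: Iterable[float], idle_limit: float,
                 run_limit: Optional[float]) -> str:
    """Return 'idle_drop', 'run_timeout', or 'completed' for a simulated run.

    chunk_gaps: seconds of silence before each streamed chunk arrives.
    idle_limit: silence after which the provider/proxy cuts the stream.
    run_limit:  total seconds allowed for the run, or None for no outer limit.
    """
    elapsed = 0.0
    for gap in chunk_gaps:
        # The stream is cut once silence reaches idle_limit, so the run never
        # waits longer than that inside a single gap.
        silent = min(gap, idle_limit)
        if run_limit is not None and elapsed + silent >= run_limit:
            return "run_timeout"
        if gap >= idle_limit:
            return "idle_drop"
        elapsed += gap
    return "completed"
```

A run with gaps of 5, 10, 75, and 5 seconds dies as an idle drop under a 60-second idle limit whether the outer limit is 900 seconds or absent, which is why changing only the outer timeout leaves the symptom in place.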

Two things I’d try first:

{
  agents: {
    defaults: {
      subagents: {
        maxConcurrent: 1,
        runTimeoutSeconds: 0, // no OpenClaw outer subagent timeout while testing
      },
    },
  },
}

Subagents default to a dedicated lane with maxConcurrent: 8, and the docs explicitly say they still share gateway resources, so with a single Claude Code/proxy backend, serializing is the right first test. (docs.openclaw.ai)

If your proxy is configured as an OpenAI-compatible/custom provider, also consider raising the provider HTTP timeout, replacing <providerId> with the actual provider id:

{
  models: {
    providers: {
      "<providerId>": {
        timeoutSeconds: 300,
      },
    },
  },
}

That only helps if OpenClaw’s provider/request guard is what’s aborting the request; if Claude Code or the proxy itself drops idle streams, split tasks into smaller pieces or keep concurrency at 1. Provider timeout config is documented under models.providers.*.timeoutSeconds. (docs.openclaw.ai)

Can you paste one failed run’s details?

/subagents list
/subagents info <id>
/subagents log <id> 200 tools

Also useful:

openclaw logs --limit 500 --plain | grep -Ei 'subagent|acp|acpx|timeout|idle|provider|profileFailureReason'
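
If the filtered output is still too long to paste, a small Python sketch like the one below can summarize how many lines mention each keyword from the grep pattern above. It works on log text you have already saved or copied; the triage function name and the sample lines in the usage note are hypothetical and not real OpenClaw log output.

```python
from collections import Counter

# Same keywords as the grep pattern above, matched case-insensitively.
KEYWORDS = ["subagent", "acp", "acpx", "timeout", "idle",
            "provider", "profileFailureReason"]


def triage(log_text: str) -> Counter:
    """Count, per keyword, how many log lines contain it (case-insensitive)."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        lowered = line.lower()
        for keyword in KEYWORDS:
            if keyword.lower() in lowered:
                counts[keyword] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = (
        "gateway started\n"
        "subagent run aborted: provider stream idle\n"
        "ACPX spawn failed\n"
    )
    for keyword, count in triage(sample).most_common():
        print(f"{keyword}: {count}")
```

Pasting just the per-keyword counts plus a handful of matching lines is usually enough to tell an idle-stream drop apart from an ACP spawn failure.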

One extra note: your status shows pikabot has BOOTSTRAP present, but if the failing subagents are under main/bond, that’s probably unrelated.