#Spawned sub agents always fail on selected model

summer gladeBOT Feb 12, 2026, 7:06 AM

What you’re seeing is usually OpenClaw doing exactly what it’s designed to do: it treats a provider timeout as a failover-worthy failure, puts the current auth profile into a cooldown/disabled state, and then walks the fallback list.

Key bits of how it works (docs):

- Failover runs in two stages: (1) auth-profile rotation inside the provider, then (2) model fallback via agents.defaults.model.fallbacks.
  Docs: https://docs.openclaw.ai/concepts/model-failover
- Sub-agents started with sessions_spawn still participate in the same failover logic, and your explicit sessions_spawn.model can be ignored if the model isn’t valid/allowed.
  Docs: https://docs.openclaw.ai/tools/subagents
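
If the two-stage order is hard to picture, here is a rough Python model of it. This is only a sketch of the idea, not OpenClaw's code: Provider, ProviderTimeout, run_with_failover, the profile names and the "openrouter/example-model" id are all made up for illustration.

```python
from dataclasses import dataclass, field


class ProviderTimeout(Exception):
    """Stand-in for a provider timeout or any other failover-worthy error."""


@dataclass
class Provider:
    profiles: list[str]                               # auth profiles, in rotation order
    cooling_down: set[str] = field(default_factory=set)


def run_with_failover(model_chain, providers, call):
    """Try each model in order; within a model, rotate auth profiles first."""
    attempts = []
    for model in model_chain:                         # stage 2: primary, then fallbacks
        provider = providers[model.split("/", 1)[0]]
        for profile in provider.profiles:             # stage 1: profile rotation
            if profile in provider.cooling_down:
                continue                              # skipped without a request
            attempts.append((model, profile))
            try:
                return call(model, profile), attempts
            except ProviderTimeout:
                provider.cooling_down.add(profile)    # unusable for later attempts
    raise RuntimeError(f"all models and profiles failed: {attempts}")


# Hypothetical setup: the primary provider times out, the fallback answers.
providers = {
    "anthropic": Provider(["anthropic:default"]),
    "openrouter": Provider(["openrouter:default"]),
}


def fake_call(model, profile):
    if model.startswith("anthropic/"):
        raise ProviderTimeout()
    return f"ok from {model}"


result, attempts = run_with_failover(
    ["anthropic/claude-opus-4-5", "openrouter/example-model"], providers, fake_call
)
```

After this run the Anthropic profile sits in cooling_down, so a second call would skip it without even sending a request. That is the "why is it skipping my main provider?" effect in miniature.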

Settings / states that commonly “trip” failover when you think it shouldn’t

The Anthropic profile is in cooldown/disabled (even if your account has quota)

OpenClaw persists cooldown/disabled state in the agent’s auth-profiles.json under usageStats (cooldownUntil / disabledUntil). A single bad streak can cause “why is it skipping my main provider?” confusion.
Inspect it with:
```
openclaw models status --json
```
and look at auth.unusableProfiles. (This is explicitly called out in the FAQ + provider docs.)
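
If you'd rather script that check, something like the following Python sketch could work. It assumes only that the JSON output has an auth object with an unusableProfiles list, as mentioned above; the layout of each entry is not assumed, and the sample payload is made up.

```python
import json
import subprocess


def load_status():
    """Run `openclaw models status --json` and parse its output."""
    proc = subprocess.run(
        ["openclaw", "models", "status", "--json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)


def unusable_profiles(status):
    """Return auth.unusableProfiles, or an empty list if the key is missing."""
    return (status.get("auth") or {}).get("unusableProfiles") or []


# Made-up payload, only to show the lookup; real entries may look different.
sample = {"auth": {"unusableProfiles": ["anthropic:default"]}}
for entry in unusable_profiles(sample):
    print(json.dumps(entry))
```

On a real install you would call unusable_profiles(load_status()) instead of using the sample.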

Your “quota limit” errors are actually on the fallback providers
Even if Anthropic is fine, if agents.defaults.model.fallbacks includes other providers (say OpenRouter or Gemini) that aren’t authenticated or funded, you’ll see failures that look like “quota” (or “no API key”) errors once the fallback kicks in.

If you want “Claude only, no wandering”, set fallbacks to empty / remove non-Claude entries.
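
A small Python sketch of that cleanup, treating the config as a plain dict. The only key taken from the docs is agents.defaults.model.fallbacks; the "primary" key, the helper name and the example model ids are assumptions for illustration.

```python
import copy


def keep_provider_fallbacks(config, provider="anthropic"):
    """Return a copy of config whose fallbacks only list models from `provider`."""
    pinned = copy.deepcopy(config)
    model_cfg = (
        pinned.setdefault("agents", {})
        .setdefault("defaults", {})
        .setdefault("model", {})
    )
    prefix = provider + "/"
    model_cfg["fallbacks"] = [
        m for m in model_cfg.get("fallbacks", []) if m.startswith(prefix)
    ]
    return pinned


# Hypothetical config; "primary" and both fallback ids are placeholders.
config = {
    "agents": {"defaults": {"model": {
        "primary": "anthropic/claude-opus-4-5",
        "fallbacks": ["openrouter/example-model", "anthropic/example-fallback"],
    }}}
}
claude_only = keep_provider_fallbacks(config)
```

With no matching entries the list simply ends up empty, which is the "no wandering" setup.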

Your sub-agent model override is being rejected
From the sub-agent docs: if sessions_spawn.model is an invalid value, it’s skipped and the sub-agent runs on the default model (with a warning in the tool result). This can make it look like “I pinned Claude but it still rotated”.

Make sure the model you pass is:

- an exact provider/model string (e.g. anthropic/claude-opus-4-5), and
- present in agents.defaults.models (that catalog acts like an allowlist for selection).
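
Both conditions can be pre-checked before you spawn. This Python sketch is not how OpenClaw validates the override internally; check_spawn_model is a made-up helper, and it assumes agents.defaults.models is a mapping (or list) keyed by the full provider/model string.

```python
def check_spawn_model(model, config):
    """Return reasons the override may be rejected; an empty list means it looks fine."""
    problems = []
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        problems.append("not in provider/model form")
    allowed = config.get("agents", {}).get("defaults", {}).get("models", {})
    if model not in allowed:                  # works for a dict or a list
        problems.append("not listed in agents.defaults.models")
    return problems


# Hypothetical catalog with a single allowed model.
config = {"agents": {"defaults": {"models": {"anthropic/claude-opus-4-5": {}}}}}

ok = check_spawn_model("anthropic/claude-opus-4-5", config)
bare = check_spawn_model("claude-opus-4-5", config)
```

Here ok comes back empty, while bare reports both problems because the provider prefix is missing and the bare name is not in the catalog.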

You’re debugging the wrong step: sub-agent run vs announce step
sessions_spawn is “run sub-agent” then “announce back”. Either of those can hit a timeout/failover. If the announce step fails, it can look like “the sub-agent failed” even if the work basically completed.

To see which step actually failed, use:

- /subagents list
- /subagents log <id> [limit] [tools]

Timeouts: what’s actually configurable (and the defaults)

These are the ones that most often matter for this symptom:

- Agent runtime timeout (hard abort of a run): agents.defaults.timeoutSeconds, default 600s (10 min).
  Docs: https://docs.openclaw.ai/concepts/agent-loop
- agent.wait timeout (wait-only, does not stop the run): default 30s, override via timeoutMs on the wait call.
  Docs: https://docs.openclaw.ai/concepts/agent-loop
- Sub-agent run timeout (optional): sessions_spawn.runTimeoutSeconds, default 0 (no limit).
  Docs: https://docs.openclaw.ai/tools/subagents
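
The difference between a wait-only timeout and a hard abort is easy to mix up, so here is a generic Python analogy using concurrent.futures. It has nothing to do with OpenClaw's internals; slow_run and the timings are made up, and the point is only that a waiter giving up does not stop the work.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as WaitTimeout


def slow_run():
    """Stand-in for an agent run that outlasts the wait timeout."""
    time.sleep(0.3)
    return "run finished"


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_run)
    try:
        future.result(timeout=0.05)   # like a wait call with a short timeoutMs
        wait_timed_out = False
    except WaitTimeout:
        wait_timed_out = True         # the waiter gave up...
    final = future.result()           # ...but the run itself kept going
```

So a "timeout" reported by a wait call does not by itself mean the run was aborted; only the runtime timeout (or a sub-agent run timeout, if you set one) does that.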

Other common tool timeouts (usually not the cause of “provider timeout”, but good to know):

- tools.web.search.timeoutSeconds: default 30
- tools.web.fetch.timeoutSeconds: default 30
- tools.exec.timeoutSec: default 1800
- media understanding timeouts: default 60/60/120 (image/audio/video)

(Those defaults are documented in gateway configuration: https://docs.openclaw.ai/gateway/configuration)

“Recommended” values if you’re using Claude

Start by not changing anything unless you’ve confirmed you’re hitting the agent runtime abort.
If you do long multi-step runs (tools + analysis), bump:
- agents.defaults.timeoutSeconds: 600 → 1200 (or 1800 if you often do long tool chains)
Keep web/tool timeouts near defaults unless you know they’re the bottleneck.
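
If you do decide to bump it, this Python sketch shows the change on a config loaded as a dict. The key agents.defaults.timeoutSeconds and the 600s default come from the docs above; bump_agent_timeout is a made-up helper, and how you load and save the config file is left out on purpose.

```python
import copy
import json


def bump_agent_timeout(config, seconds=1200):
    """Return a copy of config with agents.defaults.timeoutSeconds raised to `seconds`."""
    updated = copy.deepcopy(config)
    defaults = updated.setdefault("agents", {}).setdefault("defaults", {})
    current = defaults.get("timeoutSeconds", 600)    # 600 is the documented default
    defaults["timeoutSeconds"] = max(current, seconds)  # never lowers an existing value
    return updated


# Start from an empty config, i.e. everything at defaults.
updated = bump_agent_timeout({})
print(json.dumps(updated, indent=2))
```

Using max() means a value you already raised (say 1800) is left alone instead of being pulled back down to 1200.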

OpenClaw does not document a per-provider HTTP request timeout knob for Anthropic in the main config; the “provider timeout” you see is typically coming from the underlying request/stream timing out or being aborted by the run timeout.

