#Getting work done with agents

hidden elbow Apr 13, 2026, 3:27 PM

Hey Krill, running a 20-agent production hive on v2026.4.2 (one instance on v2026.4.9). Need your best practices and guidance on pushing OpenClaw hard.

Our setup: 20 agents, 3 gateway instances on one VPS. GPT-5.4 primary (600K context), GPT-5.3-Codex (300K), Gemini (200K). Agents use sessions_spawn, exec (security:full), read/write, memory/memory_search, message, gateway WebSocket API, and the openclaw agent CLI. We use Discord/Telegram as surfaces.

Config: compaction.memoryFlush: true, reserveTokensFloor: 40000, bootstrapMaxChars: 12000. No nightly resets; sessions persist until manual /new.

Questions:

1. Async / parallel agent work

Right now sessions_spawn is synchronous: Agent A blocks until Agent B finishes. We want Agent A to kick off work on Agent B (or multiple agents) and continue without waiting. Fire-and-forget spawning.

- Is there a way to do async sessions_spawn?
- Can an agent spawn itself into a separate session to offload heavy work?
- Can an agent run multiple spawns in parallel?
- What happens if you spawn an agent that already has an active session on another channel?
- Can we use openclaw agent session via exec backgrounded (&) as a workaround?

Wait for part 2 of the message below.

stable plinthBOT Apr 13, 2026, 3:28 PM

Got it, I’ll wait for part 2.

hidden elbow Apr 13, 2026, 3:28 PM

2. Session management at scale

Our busiest agent hit a 17 MB session file today and started compaction-looping (compaction fires but can't shrink below the limit because tool outputs are dense). Context overflow at 13:48, 14:13, 14:29 UTC.

- What's the recommended way to prevent sessions from bloating when agents do heavy exec work (build logs, file reads, curl responses)?
- Is there a max session file size setting or auto-reset threshold?
- Can we trigger /new programmatically via the gateway API or CLI?
- Any compaction improvements between v2026.4.2 and v2026.4.9?

3. Best practices for a large hive

Open floor: what should we be doing that we might not be? Config tweaks, session lifecycle, tool usage patterns, memory optimization, anything. We want to squeeze maximum productivity out of 20 agents running heavy workloads (shell commands, code deploys, API calls, file operations) without hitting the walls we're hitting.

4. What's new or coming?

Any v2026.4.9+ features we should upgrade for? Anything on the roadmap for background execution, async spawning, or session size management?

And can an agent spawn itself to complete work, then be tasked to report back when done, so it gives me feedback on completion via Discord or similar?

stable plinthBOT Apr 13, 2026, 3:38 PM

Let me verify the session lifecycle bits and the 4.2 → 4.9 changes before I give you the production guidance.

I checked the docs and source for v2026.4.2 and v2026.4.9.

Short version: yes, the pattern you want is supported, but I’d upgrade the whole hive off 4.2. Your biggest wins are shorter-lived sessions, aggressive tool-result pruning, and worker subagents with lightContext.

1) Async / parallel agent work

Is `sessions_spawn` fire-and-forget?

Yes. In both v2026.4.2 and v2026.4.9, sessions_spawn returns immediately after the child is queued/registered, with status: "accepted", runId, and childSessionKey.

So if you’re seeing “Agent A waits for Agent B to finish”, that is not the intended runtime behavior. Usually that means one of these:

the parent agent is choosing to wait/poll
you’re actually on a different path, like sessions_send or custom relay logic
the spawn is doing setup work first, like thread bind, attachments, session patch, then returning, so it feels synchronous even though it is not waiting for completion

Can an agent spawn itself?

Yes. In fact, the default subagents.allowAgents behavior is effectively same-agent only, so self-spawn is the normal default.

That means this pattern is valid:

main agent gets heavy task
main agent sessions_spawns itself into agent:<sameAgentId>:subagent:<uuid>
child works in isolated session
child auto-announces result back to requester chat when done

Can an agent run multiple spawns in parallel?

Yes. There is a built-in cap:

maxChildrenPerAgent, default 5
maxSpawnDepth, default 1

So parallel fan-out works, but it is bounded.

What if the target agent already has an active session on another channel?

It does not hijack that session.

A spawn creates a new isolated child session like:

agent:<agentId>:subagent:<uuid>

So the agent can have:

its normal Discord/Telegram sessions
plus one or more spawned worker sessions

The main caveat is shared workspace/process contention, not session collision. If two sessions mutate the same repo/files at once, you can still step on yourself.

Can we use `openclaw agent ... &` via `exec` as a workaround?

Technically, sure. I would not use it as the primary pattern.

Why I’m wary of it:

it bypasses normal subagent lifecycle tracking

no built-in completion announce flow
no subagent registry / steer / kill ergonomics
easier to orphan processes
still easy to bloat the parent transcript with exec output

Use sessions_spawn for OpenClaw-to-OpenClaw work, and runtime: "acp" if you’re really spawning an external coding harness.

Best pattern for “spawn, continue, then tell me when done”

Yes, this works now.

Use:

sessions_spawn
optionally thread: true, mode: "session" for a persistent worker thread
optionally lightContext: true for cheaper worker boot
do not poll in a loop
let the built-in completion announce post back when finished

If the parent should just wait for completion, use sessions_yield after spawning.

2) Session management at scale

17 MB session file, compaction looping

That’s not really a “file size” problem, it’s a dense transcript problem.

If your agents dump:

build logs
giant read outputs
raw curl bodies
repeated tool results

then compaction has to summarize a swamp of low-signal text. memoryFlush: true helps preserve durable notes, but it does not solve raw transcript obesity by itself.

Is there a max session file size or auto-reset threshold?

Not a per-transcript one that I found.

What exists:

session.reset.idleMinutes / session.resetByType.* for lifecycle resets
/new and /reset
session.maintenance.* for store/disk hygiene
maxDiskBytes, highWaterBytes, rotateBytes, maxEntries

But those are not “reset active session when transcript hits 17 MB”.

There is auto-reset recovery for certain compaction/context-failure paths in source, but that is failure handling, not a general “reset when big” threshold.
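
Since there is no built-in size threshold, one option is an external watchdog that flags oversized transcripts so something else can reset them. A minimal sketch, assuming transcripts are plain files in a single directory; the function name `list_big_sessions` and the `SESSIONS_DIR` variable are hypothetical, not OpenClaw features:

```shell
# list_big_sessions DIR LIMIT_BYTES
# Prints "<size-in-bytes> <path>" for every regular file in DIR that is
# larger than LIMIT_BYTES. It only reports; it does not reset anything.
list_big_sessions() {
  local dir="$1" limit_bytes="$2" f size
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    size=$(wc -c <"$f")
    # $((size)) normalizes any padding some wc implementations add.
    if [ "$((size))" -gt "$limit_bytes" ]; then
      echo "$((size)) $f"
    fi
  done
}

# Example (hypothetical path): flag anything over 10 MiB.
# list_big_sessions "$SESSIONS_DIR" $((10 * 1024 * 1024))
```

Run from cron, the flagged list could feed /new or sessions.reset. Mapping a file name back to a session key is left out here because it depends on how your store is laid out.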

Can `/new` be triggered programmatically?

Yes, via gateway RPC.

If you’re already on the WebSocket API, there is a real RPC:

sessions.reset
params: { key, reason: "new" | "reset" }

That is the clean programmatic equivalent.

Simplest user-level path is still sending /new, but for automation, sessions.reset is the one.

Compaction improvements from 4.2 to 4.9?

Yes, meaningful ones:

lightContext fix for spawned subagents landed in 4.9 (#62264)
- this matters a lot for your hive
pluggable compaction provider registry landed by 4.9 (#56224)
subagent completion delivery got better
- inherit requester delivery target
- longer announce timeout
- Telegram thread target normalization
heartbeat transcript truncation race fix landed (#60998)

So yes, 4.9 is materially better than 4.2 for your use case.

3) What I’d change for a 20-agent production hive

A. Stop immortal sessions

Right now, “persist until manual /new” is hurting you.

I’d add idle resets at least for heavy worker surfaces.

Example direction:

{
  session: {
    resetByType: {
      group: { mode: "idle", idleMinutes: 120 },
      thread: { mode: "idle", idleMinutes: 240 }
    }
  }
}

For Discord thread-bound worker sessions, also use:

channels.discord.threadBindings.idleHours
channels.discord.threadBindings.maxAgeHours

B. Turn on aggressive `contextPruning`

This is probably your biggest direct fix for exec / read / curl bloat.

{
  agents: {
    defaults: {
      contextPruning: {
        mode: "cache-ttl",
        ttl: "5m",
        keepLastAssistants: 3,
        minPrunableToolChars: 8000,
        softTrim: { maxChars: 2000, headChars: 800, tailChars: 800 },
        hardClear: { enabled: true, placeholder: "[Old tool result cleared]" }
      }
    }
  }
}

The default minPrunableToolChars: 50000 is often too forgiving for log-heavy agents.
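
To make the softTrim numbers concrete, here is a rough illustration of head-plus-tail trimming. It assumes, from the parameter names only, that softTrim keeps the first headChars and last tailChars of an oversized result and drops the middle. This is not OpenClaw's implementation; `soft_trim` is a hypothetical helper, and it counts bytes rather than characters:

```shell
# soft_trim [MAX] [HEAD] [TAIL]
# Reads stdin. If the input is at most MAX bytes it is passed through
# unchanged; otherwise only the first HEAD and last TAIL bytes are kept,
# with a marker line in between.
soft_trim() {
  local max="${1:-2000}" head_n="${2:-800}" tail_n="${3:-800}"
  local tmp size
  tmp=$(mktemp)
  cat >"$tmp"
  size=$(wc -c <"$tmp")
  if [ "$((size))" -le "$max" ]; then
    cat "$tmp"
  else
    head -c "$head_n" "$tmp"
    printf '\n[... %d bytes trimmed ...]\n' "$((size - head_n - tail_n))"
    tail -c "$tail_n" "$tmp"
  fi
  rm -f "$tmp"
}
```

With the values from the snippet (2000 / 800 / 800), a 50 KB build log would shrink to roughly 1.6 KB: the start of the log, a marker, and the end where errors usually are.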

C. Keep verbose output out of the transcript

This matters a lot.

For heavy shell/API work, prefer:

redirect logs to files
return only exit code + summary + tail
read targeted slices, not whole blobs

Good patterns:

cmd > build.log 2>&1
then only inspect tail -n 80 build.log
curl -sS -o response.json ... then jq only what matters
read with offset/limit, not whole files
use rg, sed -n, jq, python -c to extract signal
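
The "exit code + summary + tail" pattern can be wrapped in one small helper that agents call instead of running noisy commands directly. A minimal sketch; `run_quiet` and the `TAIL_LINES` variable are names made up for this example:

```shell
# run_quiet LOGFILE CMD [ARGS...]
# Runs CMD with stdout and stderr redirected to LOGFILE, then prints a
# one-line summary followed by the last TAIL_LINES lines (default 80).
# Returns the exit code of CMD.
run_quiet() {
  local log="$1"; shift
  local tail_lines="${TAIL_LINES:-80}"
  local rc=0 total
  "$@" >"$log" 2>&1 || rc=$?
  total=$(wc -l <"$log")
  echo "exit=$rc lines=$((total)) log=$log"
  tail -n "$tail_lines" "$log"
  return "$rc"
}

# Example: TAIL_LINES=40 run_quiet build.log <your build command>
```

The transcript then carries at most TAIL_LINES + 1 lines per run, and the full log stays on disk for targeted follow-up reads.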

D. Use subagents as disposable workers

Main agent should stay relatively clean.

Pattern I’d use:

human-facing orchestrator
many short-lived worker subagents
workers on cheaper model
lightContext: true
strict no-poll guidance

E. Upgrade and unify versions

Mixed 4.2 + 4.9 in one hive is asking for weirdness.

At minimum, get everything to the same version. Honestly I’d go past 4.9 now. The status output here shows 2026.4.12 available.

F. Keep `reserveTokensFloor: 40000`, but don’t expect it to solve this alone

Your current 40000 is reasonable.

But if the transcript is mostly giant tool output, increasing reserve just makes compaction happen earlier, not better.

G. Consider a better compaction model

If the main runtime model is busy doing execution-heavy work, a dedicated compaction model can help:

agents.defaults.compaction.model

That helps summary quality, but it is still secondary to pruning and reset hygiene.

H. Cap fan-out on purpose

I’d keep:

maxChildrenPerAgent: 3 to 5
maxSpawnDepth: 2 max, unless you truly need deeper trees

Deep recursive agent swarms get expensive and messy fast.

4) My concrete recommendations for your setup

If this were my hive, I’d do this first:

Upgrade all gateways to the same 2026.4.x, ideally latest
Enable contextPruning on every heavy worker
Add idle/max-age resets for worker threads/sessions

Use lightContext: true on spawned workers after upgrade
Stop dumping raw logs/API bodies into the transcript
Use self-spawned disposable workers for long jobs
Restrict cross-agent spawning explicitly, don’t use ["*"] unless you really mean it

Answers to your last question

can an agent spawn itself, finish work, then give me feedback via Discord when complete?

Yes. That’s exactly a good fit for sessions_spawn.

Use a child session, let it run, and let the built-in completion announce deliver the result back to the requester thread/channel. That is cleaner than backgrounding openclaw agent yourself.

If you want, I can turn this into a recommended openclaw.json snippet for a 20-agent hive, with sane defaults for worker spawning, pruning, idle reset, and Discord thread bindings.

hidden elbow Apr 13, 2026, 3:42 PM

What details or answers do you need from us so you can help us?

Krill — follow-up. Three things:

contextPruning config — You gave us the snippet. Does this go in agents.defaults in openclaw.json, or per-agent override? Can we set different pruning aggressiveness per agent (e.g. aggressive for Tony/Forge, lighter for Banner)?

Upgrade path 4.2 → 4.12 — We have 3 gateways: main hive (4.2), Doc (unknown), Meta You (4.9). What's the cleanest upgrade path? Can we do rolling upgrades or does the whole hive need to go down? Any breaking config changes between 4.2 and 4.12 we should know about?

Recommended openclaw.json snippet — You offered to produce one for a 20-agent hive with sane defaults for worker spawning, pruning, idle reset, and Discord thread bindings. Yes please. That would be the most valuable single artifact you could give us.

Also, you have free rein here. We need your help, so don't be shy and don't hold back. Ask questions as well, since you know the best-practice setups that would help us.

stable plinthBOT Apr 13, 2026, 3:46 PM

I’m going to verify the per-agent pruning override shape and sanity-check the 4.2 → 4.12 upgrade risks before I give you the config artifact.

Yep, here’s the blunt version.

1) `contextPruning` placement

Today, it is a global/default knob, not a per-agent one.

From the current docs/schema I checked:

supported: agents.defaults.contextPruning
not documented/exposed: agents.list[].contextPruning

So for Tony/Forge aggressive, Banner lighter, the answer is:

not in one gateway config today, at least not cleanly via supported config
best current workarounds:
1. pick one global pruning policy that is “aggressive but safe”
2. put the heaviest agents on a separate gateway/config
3. upstream/patch per-agent pruning support later

If you already have 3 gateways, I’d seriously consider using that split:

worker-heavy gateway: aggressive pruning
doc/analysis gateway: lighter pruning
meta/orchestrator gateway: medium

That’s the cleanest way to get different pruning behavior right now.

2) Upgrade path, 4.2 → 4.12

Can you do rolling upgrades?

Yes, if the gateways are actually separate instances with separate state dirs / session stores / channel ownership.

I would not keep mixed versions around longer than necessary, and I would not have two versions touching the same state/profile/session store at once.

My recommended path

For each gateway, in order:

openclaw --version
openclaw update --dry-run
back up config/state
upgrade one non-critical gateway first
openclaw doctor
openclaw gateway restart
test:
- normal chat reply
- sessions_spawn
- Discord completion announce
- Telegram delivery
- /new or sessions.reset

then move to the next gateway

Does the whole hive need to go down?

Not necessarily.
If these are truly separate gateways, do it one gateway at a time.

Anything breaking between 4.2 and 4.12?

I did not find a big “subagents/session config broke” cliff in the areas you care about.

What I did find as likely upgrade pain:

stricter config/auth validation
gateway bind/auth guardrails are stricter
post-upgrade issues are often:
- wrong gateway.mode
- wrong remote URL target
- non-loopback bind without valid auth

config drift now being enforced

So I’d treat this as a careful rolling upgrade, not a blind one.

Biggest reasons I’d upgrade

For your use case, the meaningful wins are:

lightContext for spawned subagents is properly honored by 4.9+
subagent completion delivery got more reliable
compaction path improved
transcript/heartbeat race fixes landed
latest stable reduces “weird mixed-behavior” risk

Honestly, I would go straight to 4.12, not stop at 4.9.

3) Recommended `openclaw.json` snippet

This is a starting template, not gospel. I tuned it for:

worker spawning
transcript hygiene
session expiry
Discord thread-bound workers
avoiding immortal giant sessions

{
  agents: {
    defaults: {
      // Reduces repeated bootstrap injection on safe continuation turns.
      contextInjection: "continuation-skip",


      // You already run a smaller bootstrap. Keep it.
      bootstrapMaxChars: 12000,

      compaction: {
        memoryFlush: true,
        reserveTokensFloor: 40000,
        // Optional, if compaction quality is weak:
        // model: "provider/model"
      },

      // Global only today, not per-agent.
      contextPruning: {
        mode: "cache-ttl",
        ttl: "5m",


        // Keep recent context intact, trim older noisy tool output.
        keepLastAssistants: 3,
        minPrunableToolChars: 8000,

        softTrimRatio: 0.3,
        hardClearRatio: 0.5,

        softTrim: {
          maxChars: 2000,
          headChars: 800,
          tailChars: 800
        },

        hardClear: {
          enabled: true,
          placeholder: "[Old tool result cleared]"
        }
      },

      subagents: {
        // Global concurrency safety valve. Tune to VPS size.
        maxConcurrent: 6,

        // Prevent one session from exploding into too many children.
        maxChildrenPerAgent: 4,

        // Allow orchestrator -> worker nesting, but not crazy recursion.
        maxSpawnDepth: 2,

        // Clean up idle child sessions automatically.
        archiveAfterMinutes: 60,

        // Default child timeout when omitted in sessions_spawn.
        runTimeoutSeconds: 1800

        // Optional default worker model:
        // model: "provider/model"
      }
    },

    list: [
      {
        id: "main",
        // Keep cross-agent spawning explicit, not wildcard, if possible.

        subagents: {
          allowAgents: ["main", "tony", "forge", "banner"]
        }
      },

      {
        id: "tony",
        subagents: {
          allowAgents: ["tony", "forge"]
          // Optional per-agent worker model:
          // model: "openai/gpt-5.3-codex"
        }
      },

      {
        id: "forge",
        subagents: {
          allowAgents: ["forge", "tony"]
          // model: "openai/gpt-5.3-codex"
        }
      },

      {
        id: "banner",
        subagents: {
          allowAgents: ["banner"]
          // model: "google/..."
        }
      }
    ]

  },

  session: {
    // Kill immortal sessions. This matters.
    resetByType: {
      group: { mode: "idle", idleMinutes: 120 },
      thread: { mode: "idle", idleMinutes: 240 },
      direct: { mode: "idle", idleMinutes: 360 }
    },

    // Optional: be stricter about copying giant parent history into new threads.
    parentForkMaxTokens: 60000,

    maintenance: {
      mode: "enforce",

      pruneAfter: "14d",
      maxEntries: 800,
      rotateBytes: "10mb",
      resetArchiveRetention: "14d",
      maxDiskBytes: "2gb",
      highWaterBytes: "1.5gb"
    },

    threadBindings: {
      enabled: true,
      idleHours: 24,
      maxAgeHours: 72
    }
  },

  channels: {
    discord: {
      threadBindings: {
        enabled: true,
        idleHours: 24,
        maxAgeHours: 72,
        spawnSubagentSessions: true
      }
    }
  }
}

What I would change operationally, beyond config

These matter as much as the JSON:

For heavy `exec`

Do not dump raw logs into the transcript if you can avoid it.

Prefer:

cmd > build.log 2>&1
then inspect only tail -n 80 build.log
summarize to a file or short tool output

For `curl`

Prefer:

curl -sS -o response.json ...
then jq only important fields

For `read`

Prefer:

targeted offset/limit
or rg / sed -n / structured extraction
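
For the targeted-slice idea, a tiny helper around sed -n is enough when an agent only needs a window of a large file. A sketch; `slice_lines` is a hypothetical name, not an OpenClaw tool:

```shell
# slice_lines FILE START COUNT
# Prints COUNT lines of FILE beginning at line START (1-based), and stops
# reading as soon as the window has been printed.
slice_lines() {
  local file="$1" start="$2" count="$3"
  local end=$((start + count - 1))
  sed -n "${start},${end}p;${end}q" "$file"
}

# Example: show 20 lines of context around a failure near line 1200.
# slice_lines build.log 1190 20
```

The trailing `q` matters on big logs: sed quits at the end of the window instead of scanning the rest of the file.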

For subagents

Teach orchestrators this rule:

spawn child
do not poll in a loop
either continue working or sessions_yield
let completion announce come back naturally

That one prompt rule alone will make the hive feel way more async.

My honest recommendation on pruning split

Because pruning is global today, if Tony/Forge are dramatically noisier than Banner, I would do one of these:

Option A, simplest

Use the snippet above globally, accept one middle-ground pruning policy.

Option B, best if you really care

Split agents by gateway:

Gateway A: heavy workers, aggressive pruning
Gateway B: doc/research, lighter pruning
Gateway C: orchestrator/meta, medium pruning

Option C, future

Patch in per-agent contextPruning.

My recommendation: Option B if the workload difference is real.

The 5 questions that would help me tighten this further

If you answer these, I can give you a much sharper v2 config:

Do the 3 gateways share any Discord bot token / Telegram account, or are they fully separate surfaces?
Are the 20 agents all in one gateway config, or split across the 3 gateways?
What are the VPS specs? CPU cores, RAM, disk.
Which agents mutate the same repo/workspace concurrently?
Do any agents actually need old tool output verbatim later, or is summary/tail enough?

If you want, next I can give you:

a rolling upgrade checklist

a hive prompt rule set for orchestrators/workers
and a Tony/Forge/Banner gateway split recommendation based on your actual topology

hidden elbow Apr 13, 2026, 4:00 PM

Krill — answers to your 5 questions:
1. Do the 3 gateways share bot tokens?
No. Fully separate surfaces per gateway. Each gateway has its own Discord bot tokens and Telegram bots. No overlap.
2. How are agents split across gateways?
All 3 gateways run on the same VPS (KVM2 — 187.77.177.182), different Linux users and service units:

Gateway 1 (main, root, port 18789): 18 agents — Tony, Pepper, Happy, Forge, Argus, Loki, Vision, Rhodes, Banner, Hank, Einstein, Sammy, Grant, Cara, Voice, Simba, Mason, Clay. Currently on v2026.4.2.
Gateway 2 (Doc, /home/doc, separate systemd): 2 agents — Doc, Intel. Version unknown (likely 4.2 or older).
Gateway 3 (Meta You, /home/metayou, port 18810): 3 agents — James, Atlas, Blaire. Currently on v2026.4.9 (isolated binary, independently upgradeable).
Total: 23 agents across 3 gateways on one VPS. A 4th instance (Jordan/Sabain, port 18800, Docker) runs 3 client agents — separate, ignore it.

3. VPS specs?
KVM2 (the one that matters): Current utilisation ~35% disk, ~34% RAM. Running Node v22.22.0, Linux 6.8.0. Exact CPU/RAM total not documented — can get Tony to report nproc and free -h if you need it for tuning maxConcurrent.
KVM4 (72.61.18.103): 4 vCPU, 16GB RAM, 200GB NVMe — websites only, no agents.
4. Which agents mutate the same repo/workspace concurrently?

Tony + Forge both touch james-agent-surface and mission_control_v2_app. Standing rule: Forge builds, Tony deploys — but overlap happens.
Mason + Clay share the web build workspace. Mason specs, Clay executes via SSH to KVM4.
Everyone else operates in isolated workspaces.

Answer 5 and more coming below, wait for the next message before replying.

5. Do any agents need old tool output verbatim?
No. Summary/tail is enough for all agents. Heavy research (Einstein) goes to Notion pages (external memory, Tier 5) — not kept in transcript. Aggressive pruning is safe across the board.
Bonus context for your pruning split recommendation:
The noisiest agents by far are Tony (exec-heavy orchestrator, builds, deploys, shell work) and Forge (dev, Claude Code CLI, codebase reads/writes). Banner is a thinking partner — mostly conversation, minimal tool output. The business ops agents (Pepper, Happy, Loki, etc.) are moderate — mostly Notion reads/writes and web searches.
Given all 3 gateways are on one VPS, Option A (single global policy) is probably the pragmatic call for now. Tony and Forge are on the main gateway with 16 other agents — splitting them to a separate gateway means re-binding Discord channels and re-wiring MQ, which is a bigger lift than the pruning benefit.
Yes please to all three: rolling upgrade checklist, hive prompt rule set for orchestrators/workers, and the gateway split recommendation.

stable plinthBOT Apr 13, 2026, 4:01 PM

Got it, I’ll wait for #5 and the rest.

Perfect, that’s enough context to be opinionated.

My recommendation, based on your topology

For now, keep the 3-gateway layout as-is.
I would not rebind Tony/Forge to a new gateway yet.

Your pragmatic move is:

upgrade all 3 gateways to 4.12
apply one aggressive pruning policy on the main gateway
add idle/max-age session resets
tighten orchestrator/worker prompt rules
add a shared-workspace lock discipline for Tony/Forge and Mason/Clay

That gets you most of the gain without re-wiring surfaces.

1) Rolling upgrade checklist

Because your gateways are on separate users, separate services, separate tokens, you can do this cleanly with rolling upgrades.

Order I would use

Phase 1, canary

Gateway 3, Meta You, 4.9 → 4.12

already closest to target
isolated
only 3 agents
best low-risk canary

Phase 2, small secondary

Gateway 2, Doc, unknown → 4.12

small blast radius
confirms older instance behavior

Phase 3, main hive

Gateway 1, main, 4.2 → 4.12

do this last
biggest surface area
most likely to expose behavior drift

Per-gateway checklist

For each gateway:

A. Inventory first

Run:

openclaw --version
openclaw status
openclaw doctor
openclaw update --dry-run

If it’s a package install, use the normal package-manager upgrade flow. If it’s source-managed, openclaw update is fine.

B. Back up before touching it

Back up:

openclaw.json
state dir / sessions
any workspace-local AGENTS/system prompt files
service unit/env overrides

C. Upgrade binary first, do not change config yet

This matters.

Do version upgrade first, restart, validate.
Then do config changes like pruning/reset/thread bindings in a second pass.

That way if something breaks, you know whether it was version or config.

D. Restart

openclaw gateway restart

E. Validate the exact things you care about

On each upgraded gateway, test:

normal user reply

sessions_spawn returns immediately
child completion announce posts back correctly
/new or sessions.reset works
one heavy exec run
one Discord thread-bound worker flow
one Telegram delivery flow
openclaw logs --follow stays boring

F. Only then apply config changes

After the gateway passes on stock behavior:

aggressive contextPruning
resetByType
thread binding idle/maxAge
any subagent concurrency tuning

Then restart again.

What I would watch for during upgrade

The likely gotchas are not “subagents broke”, they’re:

stricter config validation
wrong gateway.mode / remote URL confusion
bind/auth guardrails
service env mismatch after upgrade

2) Hive prompt rule set, orchestrators + workers

I’d put something like this into the main hive’s agent guidance.

Orchestrator rules

Use this for Tony, Pepper-style coordinators, Mason, etc.

## Hive orchestration rules

- `sessions_spawn` is fire-and-forget.
- After calling `sessions_spawn`, do not wait for the child to finish unless explicitly asked.
- Do not poll `sessions_list`, `sessions_history`, or `subagents list` in a loop.
- If waiting for completion is actually required, use `sessions_yield` and let completion arrive as the next event.
- Prefer `lightContext: true` for spawned workers unless the child truly needs full session history.
- Prefer one worker per distinct task chunk, not one giant do-everything child.


## Transcript hygiene

- Never paste raw build logs, giant command output, or full API responses into chat unless explicitly asked.
- For heavy shell work, redirect output to a file and report only: status, key findings, and the last relevant lines.
- For HTTP/API work, save payloads to a file and extract only the fields that matter.
- For large reads, inspect targeted slices, not whole files.

## Shared workspace safety

- Never let two agents write to the same checkout at the same time.
- If a workspace is shared, either:
  - use a lock file / explicit ownership handoff, or
  - use separate worktrees/checkouts.
- Tony deploys, Forge builds. If Forge is actively modifying or building the shared repo, Tony should not mutate it.

- Mason plans/specs, Clay executes. One writer at a time.

## Completion style

- Child workers should return concise result summaries:
  - what changed
  - where it changed
  - success/failure
  - blockers
  - next recommended step
- Do not return raw noise when a summary is enough.

Worker rules

Good for Forge-ish spawned workers.

## Worker session rules

- You are a disposable worker session.
- Finish the assigned task and return a concise result.
- Do not busy-poll for parent status.
- Do not ask to keep working when the task is already clear.
- Prefer writing artifacts/logs to files over dumping them into the transcript.
- If the task touches a shared repo, verify ownership before making changes.

3) Tony / Forge / Mason / Clay safety rule

This is the part I’m most worried about, honestly.

Your biggest operational risk is not model context, it’s concurrent mutation of the same checkout.

I would enforce this immediately

Tony + Forge

Forge: build / code / branch / artifact work
Tony: deploy / release / ops
No concurrent writes to the same checkout
handoff should be explicit:
- lock file
- artifact path
- commit/worktree reference
- “Tony may deploy now” marker

Mason + Clay

Mason should not write in the same execution checkout Clay is using
Clay should execute from a stable handoff target, not Mason’s actively changing workspace

If you do nothing else, fix this. It will save you more pain than another 20k reserve tokens.
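
One way to make the lock-file handoff concrete is a directory lock, since mkdir either creates the directory or fails, with no window in between. A minimal sketch, assuming both agents can reach the same filesystem path; `acquire_lock`, `release_lock`, and the lock location are hypothetical:

```shell
# acquire_lock LOCK_DIR OWNER
# Succeeds only if LOCK_DIR did not exist yet; records OWNER inside it.
acquire_lock() {
  local lock_dir="$1" owner="$2"
  if mkdir "$lock_dir" 2>/dev/null; then
    printf '%s\n' "$owner" >"$lock_dir/owner"
    return 0
  fi
  echo "locked by: $(cat "$lock_dir/owner" 2>/dev/null || echo unknown)" >&2
  return 1
}

# release_lock LOCK_DIR OWNER
# Removes the lock only if OWNER is the recorded holder.
release_lock() {
  local lock_dir="$1" owner="$2"
  [ "$(cat "$lock_dir/owner" 2>/dev/null)" = "$owner" ] || return 1
  rm -rf "$lock_dir"
}

# Example flow (paths are placeholders):
#   acquire_lock /path/to/repo.lock forge && <build> && release_lock /path/to/repo.lock forge
#   acquire_lock /path/to/repo.lock tony  && <deploy> && release_lock /path/to/repo.lock tony
```

This does not handle a holder that crashes without releasing, so you would still want a stale-lock rule (for example, a human or orchestrator clears locks older than some agreed age). Separate worktrees avoid the problem entirely if the extra disk is acceptable.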

4) Gateway split recommendation

Right now

I agree with you, Option A is correct.

Keep:

Gateway 1 = main hive, aggressive pruning
Gateway 2 = doc/intel, same or slightly lighter later
Gateway 3 = meta, same defaults for now

No surface rebinding yet.

When I would revisit a split

Only if one of these becomes true:

Tony/Forge account for most compaction incidents
main gateway CPU/RAM starts spiking during build waves
Discord/Telegram latency gets noticeably worse during heavy worker bursts
main gateway logs show repeated congestion from exec/subagent load

If you eventually split later

The first candidates to move are:

Tony
Forge
Mason
Clay

That would create a dedicated execution gateway and leave business/chat agents on the main hive.

But I would not do that yet.

5) My tuning recommendation, right now

Because aggressive pruning is safe for all of you, I’d use it broadly.

Main gateway

Use the aggressive policy from earlier, and I’d actually be comfortable going a bit harder:

ttl: "5m"
minPrunableToolChars: 8000
keep hardClear enabled

If Tony/Forge still bloat sessions after that, the next move is not “less pruning”, it’s:

stricter transcript hygiene in prompts
more file-backed logs
faster session turnover

Session lifecycle

For the main hive, I’d absolutely enable:

group idle reset
thread idle reset
Discord thread maxAgeHours

Immortal worker threads are killing you.

6) One extra config tweak I do like for you

This one is worth keeping if you haven’t already:

{
  agents: {
    defaults: {
      contextInjection: "continuation-skip"
    }
  }
}

It reduces repeated bootstrap reinjection on safe continuation turns, which helps exactly the kind of long-running orchestrator sessions you have.

7) What I want from you next, if you want a tighter v3

Two tiny things would let me tune concurrency more confidently:

from KVM2:

nproc
free -h

if you have it, rough answer to:

how many heavy jobs run at once during peak, 2, 5, 10+?

With that, I can give you a more confident recommendation for:

agents.defaults.maxConcurrent
agents.defaults.subagents.maxConcurrent
maxChildrenPerAgent

If you want, next I can turn this into a copy-paste rollout plan with:

“upgrade gateway 3”
“validate”
“apply config”
“restart”
“move to gateway 2”
“move to gateway 1”

basically a runbook your team can execute line by line.

hidden elbow Apr 13, 2026, 8:12 PM

Hey Krill, here's what you asked for.

1. KVM2 specs:

nproc: 2

               total        used        free      shared  buff/cache   available
Mem:           7.8Gi       3.7Gi       627Mi       1.1Mi       3.7Gi       4.0Gi
Swap:             0B          0B          0B

2. Peak concurrency:

During build waves (Forge building + Clay SSH'ing to KVM4 + Tony deploying + Argus monitoring + a standup runner), we hit roughly 5–8 concurrent agent sessions with 2–3 heavy exec jobs at the same time. Not 10+. Normal steady-state is lower — maybe 3–4 active sessions.

3. Yes please — turn the upgrade plan into a copy-paste rollout runbook we can execute line by line. The phased order you gave (Meta You → Doc → Main) is exactly right.

Also requesting all three deliverables:

Rolling upgrade runbook (line-by-line)

Hive prompt rule set (orchestrator + worker templates)

Gateway split recommendation (when to revisit)

One version check: you noted Meta You at 4.9 — we'll confirm with openclaw --version on that instance before starting Phase 1. It may still be on 4.5.

stable plinthBOT Apr 13, 2026, 8:14 PM

That hardware changes my recommendation a bit.

On 2 vCPU, 7.8 GiB RAM, no swap, I would be conservative, not ambitious.

Revised tuning for KVM2

I’d dial my earlier numbers back to roughly:

agents.defaults.maxConcurrent: 3
agents.defaults.subagents.maxConcurrent: 4
agents.defaults.subagents.maxChildrenPerAgent: 3

Reason: on a 2-core box, your bottleneck is not just OpenClaw logic, it’s overlapping exec, builds, SSH, and Node processes fighting for the same tiny CPU budget.

Also, I’m a bit worried about 0 swap. Even a modest swapfile would make this box much less brittle during build waves.
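For reference, a minimal bash sketch of the usual Linux swapfile steps. It only prints the commands so they can be reviewed before anyone runs them. The function name `swapfile_plan`, the 2G default size, and the `/swapfile` path are assumptions for illustration, not something the thread specifies, and on some filesystems a `fallocate`-backed swapfile is rejected and `dd` has to be used instead.

```shell
# Hypothetical helper: prints (does not execute) a typical swapfile setup.
# $1 = size (assumed default 2G), $2 = path (assumed default /swapfile).
swapfile_plan() {
  local size="${1:-2G}" path="${2:-/swapfile}"
  cat <<EOF
sudo fallocate -l $size $path
sudo chmod 600 $path
sudo mkswap $path
sudo swapon $path
echo '$path none swap sw 0 0' | sudo tee -a /etc/fstab
EOF
}
```

Run `swapfile_plan 2G /swapfile`, read the five printed lines, then paste them one at a time. `free -h` should show a non-zero Swap row afterwards.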

1) Copy-paste rollout runbook

I’m splitting this into:

binary upgrade pass
config rollout pass

Do not combine them. Upgrade binaries first, validate, then change config.

Phase 0, preflight, once

0.1 Freeze the blast radius

During the rollout window:

avoid big deploys
avoid long Forge builds
avoid Mason/Clay overlapping jobs
avoid manual prompt experiments on Tony

0.2 Save current config/state

Run this once per gateway user and save it somewhere safe.

Meta You

sudo -iu metayou bash -lc '
set -e
date
openclaw --version || true
openclaw status || true
openclaw doctor || true
cp -av ~/.openclaw ~/.openclaw.backup.$(date +%Y%m%d-%H%M%S) || true
'

Doc

sudo -iu doc bash -lc '
set -e
date
openclaw --version || true
openclaw status || true
openclaw doctor || true
cp -av ~/.openclaw ~/.openclaw.backup.$(date +%Y%m%d-%H%M%S) || true
'

Main

bash -lc '
set -e
date
openclaw --version || true
openclaw status || true
openclaw doctor || true
cp -av ~/.openclaw ~/.openclaw.backup.$(date +%Y%m%d-%H%M%S) || true
'

Phase 1, Meta You canary

1.1 Inspect current version

sudo -iu metayou bash -lc '
set -e
openclaw --version
openclaw status
openclaw doctor
openclaw update --dry-run || true
'

1.2 Upgrade

If openclaw update --dry-run looks sane and this install supports it:

sudo -iu metayou bash -lc '
set -e
openclaw update --yes --no-restart
openclaw gateway restart
sleep 5
openclaw --version
openclaw status
'

If this instance is package-managed and openclaw update is not the right path, use your normal package-manager update flow for that user, then:

sudo -iu metayou bash -lc '
set -e
openclaw gateway restart
sleep 5
openclaw --version
openclaw status
'

1.3 Smoke test

Test these manually on Meta You:

normal chat reply
one sessions_spawn
one child completion announce
one /new
one Telegram/Discord delivery path

If Meta You looks good, continue.

Phase 2, Doc gateway

2.1 Inspect current version

sudo -iu doc bash -lc '
set -e
openclaw --version
openclaw status
openclaw doctor
openclaw update --dry-run || true
'

2.2 Upgrade

sudo -iu doc bash -lc '
set -e
openclaw update --yes --no-restart || true
openclaw gateway restart
sleep 5
openclaw --version
openclaw status
'

If package-managed, use that upgrade path, then restart.

2.3 Smoke test

Same checks:

normal reply
sessions_spawn
completion announce
/new
one actual doc/intel workflow

If clean, continue.

Phase 3, Main gateway

3.1 Inspect current version

bash -lc '
set -e
openclaw --version
openclaw status
openclaw doctor
openclaw update --dry-run || true
'

3.2 Upgrade

bash -lc '
set -e
openclaw update --yes --no-restart || true
openclaw gateway restart
sleep 5
openclaw --version
openclaw status
'

If package-managed, use that upgrade path, then restart.

3.3 Main smoke test

Before config changes, validate:

Tony normal reply
Forge normal reply
one sessions_spawn
one child result announce
one heavy-ish exec
one /new
one Discord thread-bound worker flow
one Telegram delivery flow

If that all works, move to config rollout.

Phase 4, config rollout

4.1 Main gateway config, conservative for 2 vCPU

Apply this style of tuning on the main gateway first.

{
  agents: {
    defaults: {
      maxConcurrent: 3,
      contextInjection: "continuation-skip",
      bootstrapMaxChars: 12000,


      compaction: {
        memoryFlush: true,
        reserveTokensFloor: 40000
      },

      contextPruning: {
        mode: "cache-ttl",
        ttl: "5m",
        keepLastAssistants: 3,
        minPrunableToolChars: 8000,
        softTrimRatio: 0.3,
        hardClearRatio: 0.5,
        softTrim: {
          maxChars: 2000,

          headChars: 800,
          tailChars: 800
        },
        hardClear: {
          enabled: true,
          placeholder: "[Old tool result cleared]"
        }
      },

      subagents: {
        maxConcurrent: 4,
        maxChildrenPerAgent: 3,
        maxSpawnDepth: 2,
        archiveAfterMinutes: 60,
        runTimeoutSeconds: 1800

      }
    }
  },

  session: {
    resetByType: {
      group: { mode: "idle", idleMinutes: 120 },
      thread: { mode: "idle", idleMinutes: 240 },
      direct: { mode: "idle", idleMinutes: 360 }
    },
    parentForkMaxTokens: 60000,
    maintenance: {
      mode: "enforce",
      pruneAfter: "14d",
      maxEntries: 800,

      rotateBytes: "10mb",
      resetArchiveRetention: "14d",
      maxDiskBytes: "2gb",
      highWaterBytes: "1.5gb"
    },
    threadBindings: {
      enabled: true,
      idleHours: 24,
      maxAgeHours: 72
    }
  },

  channels: {
    discord: {
      threadBindings: {

        enabled: true,
        idleHours: 24,
        maxAgeHours: 72,
        spawnSubagentSessions: true
      }
    }
  }
}

4.2 Restart main after config

bash -lc '
set -e
openclaw gateway restart

sleep 5
openclaw status
'

4.3 Re-test main

Watch specifically for:

Tony and Forge no longer stuffing giant tool output into context
sessions_spawn feeling async again
less compaction churn
lower prompt growth over time

4.4 Roll same config shape to Meta You and Doc

You can keep the same structure there.
If those gateways are lighter, you can later loosen concurrency, but I would start uniform.

2) Hive prompt rule set, copy-paste templates

Orchestrator template

Use this for Tony, Pepper, Mason-type agents.

## Orchestration rules

- `sessions_spawn` is fire-and-forget.
- After spawning a worker, continue with other useful work unless explicitly told to wait.
- Do not poll `sessions_list`, `sessions_history`, or `subagents list` in loops.
- If waiting for completion is actually required, use `sessions_yield`.
- Prefer `lightContext: true` for spawned workers unless the worker truly needs full session history.
- Prefer small focused workers over one giant worker.


## Transcript hygiene

- Never dump raw build logs, full API payloads, or giant file contents into the transcript unless explicitly asked.
- For shell work, redirect verbose output to a file and report only status, key findings, and the final relevant lines.
- For HTTP/API work, save payloads to a file and extract only the fields that matter.
- For large reads, inspect targeted slices, not whole files.

## Shared repo safety

- Never allow two agents to write to the same checkout at the same time.
- If a workspace is shared, use either:
  - a lock file / ownership handoff, or
  - separate worktrees/checkouts.
- Tony deploys, Forge builds.
- Mason plans/specs, Clay executes.
- If another agent owns the shared workspace, do not mutate it.

## Result format

Return concise summaries:
- what changed
- where it changed
- success/failure
- blockers
- next step
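To make the transcript-hygiene rule concrete, here is a minimal bash sketch of "redirect verbose output to a file and report only status and the final lines". The wrapper name `run_quiet` and the `RUN_QUIET_TAIL` variable are hypothetical, not OpenClaw features. The idea is that an agent calls this through exec instead of running the noisy command directly.

```shell
# Hypothetical wrapper: run a command, keep its full output in a log file,
# and print only a short report (exit status, log path, last few lines).
# $1 = log file path, remaining args = the command to run.
run_quiet() {
  local log_file="$1"; shift
  local status=0
  "$@" >"$log_file" 2>&1 || status=$?
  echo "status: $status"
  echo "log: $log_file"
  echo "last lines:"
  # RUN_QUIET_TAIL is an assumed knob; defaults to the last 5 lines.
  tail -n "${RUN_QUIET_TAIL:-5}" "$log_file"
  return "$status"
}
```

Example: `run_quiet /tmp/build.log make all` puts a handful of lines into the session instead of the whole build log, and the full log stays on disk for targeted reads later.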

Worker template

Use this for Forge-ish spawned worker sessions.

## Worker rules

- You are a disposable worker session.
- Finish the assigned task and return a concise result.
- Do not busy-poll for parent status.
- Prefer writing artifacts and logs to files over dumping them into the transcript.
- If the task touches a shared repo, verify ownership before writing.
- Return summary, changed files, blockers, and recommended next step.

Shared workspace handoff template

If you want a lightweight human-readable handoff convention, use a file like:

# WORKSPACE_LOCK.md


Current owner: Forge
Task: build verification for mission_control_v2_app
Started: 2026-04-13T20:00Z
Expected release handoff: Tony after build artifact is ready
Status: active

That sounds almost stupidly simple, but it works.
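If you want agents to take and release that lock from exec rather than by hand, a minimal bash sketch could look like the following. The function names are hypothetical and the lock is purely advisory: it relies on bash's `noclobber` option so that creating the file fails when it already exists, and it only helps if every agent actually calls it before writing.

```shell
# Hypothetical helpers for an advisory WORKSPACE_LOCK.md.
# acquire_workspace_lock <workspace dir> <owner> <task>
acquire_workspace_lock() {
  local workspace="$1" owner="$2" task="$3"
  local lock="$workspace/WORKSPACE_LOCK.md"
  # With noclobber set, '>' refuses to overwrite an existing file,
  # so "create only if absent" happens in a single step.
  if ( set -o noclobber
       printf '# WORKSPACE_LOCK.md\n\nCurrent owner: %s\nTask: %s\nStarted: %s\nStatus: active\n' \
         "$owner" "$task" "$(date -u +%Y-%m-%dT%H:%MZ)" > "$lock" ) 2>/dev/null; then
    echo "acquired: $owner"
  else
    echo "blocked: $(grep -m1 '^Current owner:' "$lock")"
    return 1
  fi
}

# release_workspace_lock <workspace dir> <owner>
# Removes the lock only if the caller is the recorded owner.
release_workspace_lock() {
  local workspace="$1" owner="$2"
  local lock="$workspace/WORKSPACE_LOCK.md"
  grep -q "^Current owner: $owner\$" "$lock" 2>/dev/null && rm -f "$lock"
}
```

A worker that gets the `blocked:` line should stop and report it instead of writing to the checkout.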

3) Gateway split recommendation, when to revisit

My current recommendation

Do not split Tony/Forge off yet.

Why:

you already have 3 gateways
all 3 live on the same 2-core box
the real resource bottleneck is the host, not just routing shape
surface rebinding cost is real
you’ll get more benefit right now from pruning + reset discipline + concurrency caps

Revisit only if one of these becomes true

Trigger A

Tony/Forge still dominate compaction / overflow incidents after pruning and reset fixes.

Trigger B

Main gateway becomes visibly laggy during build waves.

Trigger C

You move a worker gateway to a different host or give KVM2 materially more resources.

Trigger D

Tony/Forge/Mason/Clay need tighter operational isolation than the current shared main hive can give.

If you do split later

First candidates:

Tony
Forge
Mason
Clay

But I’d only do it when you can also improve compute isolation, not just shuffle configs on the same 2-core VPS.

4) My final opinionated take

For this exact box, the winning stack is:

4.12 everywhere
aggressive pruning
shorter-lived sessions
strict no-log-dumps prompt rules
conservative concurrency
shared-workspace ownership discipline

Not more fan-out.
Not more concurrency.
Not another gateway split on the same tiny host.

That’s the part I feel strongest about.

If this helps, I can do one more pass and turn the config + prompt templates into a clean “Main Gateway v1” bundle with:

final openclaw.json block
final orchestrator prompt
final worker prompt
final shared-workspace rule block

all ready to paste.

hidden elbow Apr 13, 2026, 8:22 PM

If we upgrade KVM2 to 4 vCPU / 16 GB, or even to a KVM 8 (8 vCPU cores, 32 GB RAM, 400 GB NVMe disk space, 32 TB bandwidth), how would you adjust the concurrency numbers? I expect you'll go back to your original maxConcurrent: 6 recommendation, which is where we want to be.

We will also be growing instances and agents, so what do you think?

stable plinthBOT Apr 13, 2026, 8:22 PM

Yeah, on bigger boxes I’d loosen it, but not linearly.

Short version:

• 4 vCPU / 16 GB: I’d move to about **maxConcurrent: 4**, **subagents.maxConcurrent: 6**, **maxChildrenPerAgent: 4**
• 8 vCPU / 32 GB: I’d move to about **maxConcurrent: 6**, **subagents.maxConcurrent: 8**, **maxChildrenPerAgent: 5**

So yes, on 8 / 32, my earlier “6” becomes pretty sane again.

My table

| Host size      | agents.defaults.maxConcurrent | agents.defaults.subagents.maxConcurrent | maxChildrenPerAgent |
| -------------- | ----------------------------- | --------------------------------------- | ------------------- |
| 2 vCPU / 8 GB  | 3                             | 4                                       | 3                   |
| 4 vCPU / 16 GB | 4                             | 6                                       | 4                   |
| 8 vCPU / 32 GB | 6                             | 8                                       | 5                   |
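The same table as a tiny bash lookup, in case you want to script it. This is only a sketch: the function name `suggest_caps` is hypothetical, it keys on vCPU count alone (the table pairs each tier with a RAM size, which this ignores), and a host between tiers gets the lower tier's numbers.

```shell
# Hypothetical lookup for the table above.
# $1 = vCPU count. Prints: maxConcurrent subagents.maxConcurrent maxChildrenPerAgent
suggest_caps() {
  local vcpus="$1"
  if [ "$vcpus" -ge 8 ]; then
    echo "6 8 5"
  elif [ "$vcpus" -ge 4 ]; then
    echo "4 6 4"
  else
    # 2 vCPU tier; also used for anything smaller (assumption).
    echo "3 4 3"
  fi
}
```

On the box itself, `suggest_caps "$(nproc)"` prints the three numbers to put into the config.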

Why I still wouldn’t go crazy

Your real bottleneck is not just “agent sessions”, it’s:

• builds
• exec
• SSH
• code tooling
• concurrent writes to shared repos

So even on 8 / 32, I would not jump straight to 10+ top-level concurrent runs unless you’ve also:

• separated worker vs chat roles better
• reduced shared-workspace contention
• kept transcript hygiene tight
• added some swap
• observed stable behavior for a few days

What matters most for concurrency

In practice, the order is:

CPU cores
RAM
swap exists or not
how many heavy exec/build jobs overlap
whether browser/Chrome-style tools are active

Disk and bandwidth are nice, but they are not what sets these concurrency caps.

If you upgrade and also grow agents

Then I’d start thinking in gateway roles, not just bigger numbers:

• Gateway A, orchestrator/chat: lower concurrency, cleaner sessions
• Gateway B, workers/execution: higher concurrency, more subagents, more exec
• Gateway C, docs/research/meta: lighter, cheaper, more memory-oriented

That’s where scaling starts to feel clean instead of chaotic.

My opinionated take

If you go to 4 / 16

Good upgrade, worth doing.
I’d still keep the current topology and just relax caps a bit.

If you go to 8 / 32

Now it becomes worth revisiting a real worker gateway split later.
That’s the point where Tony/Forge/Mason/Clay on a more execution-oriented gateway starts making sense.

How I’d raise limits safely

Don’t jump all at once.

After upgrade:

1. set the new caps
2. run for 48 hours
3. watch:
• p95 reply latency
• compaction frequency
• CPU saturation during build waves
• memory pressure
• queue/backlog feel
4. only then bump again

So if you land on 8 / 32, I’d start at:

• maxConcurrent: 5
• subagents.maxConcurrent: 7
• maxChildrenPerAgent: 4

and only move to 6 / 8 / 5 after it proves stable.

If you want, I can turn that into a capacity-planning matrix next, like:

• current KVM2
• 4/16 target
• 8/32 target
• recommended gateway roles
• recommended caps per gateway role

hidden elbow Apr 13, 2026, 8:31 PM

Going KVM 8 (8 vCPU / 32 GB). Please send the Main Gateway v1 bundle with caps tuned for that box. We'll start conservative per your advice (maxConcurrent: 5, subagents: 7, children: 4) and step up to 6/8/5 after 48h observation. And yes — adding a swapfile.

stable plinthBOT Apr 13, 2026, 8:32 PM

Nice, that’s a much healthier target box.

Here’s the Main Gateway v1 bundle for KVM 8 (8 vCPU / 32 GB), tuned to start conservative at 5 / 7 / 4, then step to 6 / 8 / 5 after 48h if stable.

Main Gateway v1

1) `openclaw.json` merge block

Use this as the main gateway baseline.

{
  agents: {
    defaults: {

      // Start conservative on KVM 8. After 48h stable, bump to 6.
      maxConcurrent: 5,

      contextInjection: "continuation-skip",
      bootstrapMaxChars: 12000,

      compaction: {
        memoryFlush: true,
        reserveTokensFloor: 40000
      },

      contextPruning: {
        mode: "cache-ttl",
        ttl: "5m",

        keepLastAssistants: 3,
        minPrunableToolChars: 8000,

        softTrimRatio: 0.3,
        hardClearRatio: 0.5,

        softTrim: {
          maxChars: 2000,
          headChars: 800,
          tailChars: 800
        },

        hardClear: {
          enabled: true,
          placeholder: "[Old tool result cleared]"

        }
      },

      subagents: {
        // Start conservative on KVM 8. After 48h stable, bump to 8.
        maxConcurrent: 7,

        // Start conservative. After 48h stable, bump to 5.
        maxChildrenPerAgent: 4,

        maxSpawnDepth: 2,
        archiveAfterMinutes: 60,
        runTimeoutSeconds: 1800
      }
    },


    list: [
      {
        id: "tony",
        subagents: {
          allowAgents: ["tony", "forge"],
          model: "openai/gpt-5.3-codex"
        }
      },
      {
        id: "forge",
        subagents: {
          allowAgents: ["forge", "tony"],
          model: "openai/gpt-5.3-codex"
        }

      },
      {
        id: "mason",
        subagents: {
          allowAgents: ["mason", "clay"],
          model: "openai/gpt-5.3-codex"
        }
      },
      {
        id: "clay",
        subagents: {
          allowAgents: ["clay", "mason"],
          model: "openai/gpt-5.3-codex"
        }
      },

      {
        id: "banner",
        subagents: {
          allowAgents: ["banner"]
        }
      }
    ]
  },

  session: {
    resetByType: {
      group: { mode: "idle", idleMinutes: 120 },
      thread: { mode: "idle", idleMinutes: 240 },
      direct: { mode: "idle", idleMinutes: 360 }
    },


    parentForkMaxTokens: 60000,

    maintenance: {
      mode: "enforce",
      pruneAfter: "14d",
      maxEntries: 1000,
      rotateBytes: "10mb",
      resetArchiveRetention: "14d",
      maxDiskBytes: "4gb",
      highWaterBytes: "3gb"
    },

    threadBindings: {
      enabled: true,

      idleHours: 24,
      maxAgeHours: 72
    }
  },

  channels: {
    discord: {
      threadBindings: {
        enabled: true,
        idleHours: 24,
        maxAgeHours: 72,
        spawnSubagentSessions: true
      }
    }
  }
}

After 48h, if stable

Change just these 3 lines:

agents.defaults.maxConcurrent = 6
agents.defaults.subagents.maxConcurrent = 8
agents.defaults.subagents.maxChildrenPerAgent = 5

2) Orchestrator prompt block

Use this for Tony, Mason, Pepper-style agents.

## Main Gateway orchestration rules

- `sessions_spawn` is fire-and-forget.
- After spawning a worker, continue with other useful work unless explicitly told to wait.
- Do not poll `sessions_list`, `sessions_history`, or `subagents list` in loops.
- If waiting for completion is actually required, use `sessions_yield`.
- Prefer `lightContext: true` for spawned workers unless the worker truly needs full session history.
- Prefer small focused workers over one giant worker.

## Transcript hygiene

- Never dump raw build logs, full API payloads, or giant file contents into the transcript unless explicitly asked.
- For shell work, redirect verbose output to a file and report only:
  - status
  - key findings
  - final relevant lines
- For HTTP/API work, save payloads to a file and extract only the fields that matter.
- For large reads, inspect targeted slices, not whole files.

## Shared workspace safety

- Never allow two agents to write to the same checkout at the same time.
- If a workspace is shared, use either:
  - a lock file / explicit ownership handoff, or
  - separate worktrees/checkouts.
- Tony deploys, Forge builds.
- Mason plans/specs, Clay executes.
- If another agent owns the shared workspace, do not mutate it.


## Result contract

Return concise summaries:
- what changed
- where it changed
- success/failure
- blockers
- next step

3) Worker prompt block

Use this for Forge-ish and execution worker sessions.

## Worker session rules

- You are a disposable worker session.
- Finish the assigned task and return a concise result.
- Do not busy-poll for parent status.
- Prefer writing artifacts and logs to files over dumping them into the transcript.
- If the task touches a shared repo, verify ownership before writing.
- Return:
  - summary
  - changed files
  - blockers
  - recommended next step

4) Shared workspace rule block

This is the part I’d make explicit.

## Shared workspace ownership

The following workspaces are shared and must be treated as single-writer:

- Tony + Forge:
  - james-agent-surface
  - mission_control_v2_app

- Mason + Clay:
  - shared web build workspace

Rules:
- Only one agent may actively write in a shared checkout at a time.
- Use a lock file or separate worktree before modifying shared code.
- Tony deploys, Forge builds.
- Mason plans/specs, Clay executes.
- If ownership is unclear, stop and return blocked status instead of guessing.

5) Simple lock file template

Drop this into shared repos if you want a dead-simple coordination pattern.

# WORKSPACE_LOCK.md

Current owner: Forge
Task: build verification for mission_control_v2_app
Started: 2026-04-13T20:00Z
Expected handoff: Tony after build artifact is ready
Status: active

That’s not fancy, but it prevents a lot of dumb collisions.

6) Stability checklist for the first 48h

Before you step from 5 / 7 / 4 to 6 / 8 / 5, I’d want all of these true:

no context overflow loop
no repeated compaction storms
no user-visible reply lag during build waves
no main gateway memory pressure
no shared-workspace collisions between Tony/Forge or Mason/Clay
no “why is the agent waiting on its own spawned child?” weirdness

If those stay clean for 48h, bump the caps.

7) My final opinion on this bundle

This is a good v1 for the bigger box.

The main wins are still:

aggressive pruning
shorter-lived sessions
clear spawn behavior
shared-workspace discipline

The bigger box lets you loosen concurrency, but it does not remove the need for those rules.

If you want, next I can do a Meta/Doc lighter bundle too, with milder caps and a simpler prompt set.

hidden elbow Apr 13, 2026, 8:54 PM

coming in a few messages

For Krill 🦐:

KVM 8 is live and verified — 8 vCPU, 32 GB RAM, 400 GB NVMe, 4G swap. Tony confirmed. Main Gateway v1 bundle received and QA'd. Three flags from QA:

agents.list is partial — only 5 of 18 agents. We'll MERGE your subagent blocks into existing per-agent configs, not replace the whole list.

subagents.model: "openai/gpt-5.3-codex" for spawned workers — will verify OpenClaw handles per-spawn model overrides this way before applying.

Shared workspace list — we'll expand it as needed during rollout.

Now I need the Meta/Doc lighter bundle you offered, with updated context:

Meta You instance (same KVM 8 box, separate gateway on port 18810, openclaw-meta-you.service under metayou Linux user):

Currently: 3 agents — James (Opus 4.6), Atlas (Sonnet 4.6), Blaire (GPT-5.4 via OpenAI Pro)

Coming soon: FINN — a financial copilot agent. Voice + text + file upload (CSV/PDF financial statements). Dashboard, weekly reports, plain-language Q&A. Pilot first, then productised for multiple clients.

Growth plan: ~4 more agents beyond FINN, totalling ~7-8 on this gateway. One will likely be a shared services / builder agent (like our Mason/Clay pattern on main).

Auth: Anthropic Max 5x ($100/mo) for James/Atlas. Blaire + future agents on OpenAI Pro. FINN will likely be OpenAI Pro (GPT-5.4 or 5.3-Codex).

Key constraint: Anthropic agents need lean bootstrap (proven at 3.3KB/agent). Output costs 5x input on Anthropic.

Voice: FINN will use voice (like our existing Voice agent on main — Gemini Flash via ElevenLabs). Voice sessions are typically short but can spike.

one more message coming below, wait to reply

File processing: FINN will ingest uploaded financial docs (PDFs, CSVs). This means disk I/O and potentially larger context windows during analysis.

Doc instance (separate box, own OpenClaw install):

Currently: 2 agents — Doc (GPT-5.4), Intel (GPT-5.3-Codex)
Growth plan: Staying at 2. No expansion planned. Light workload — audits, research, version watch.
Just needs a sensible lighter config that matches the main bundle's patterns (pruning, session management) but with minimal concurrency.

Jordan/Sabain instance (port 18800, Docker, same KVM 8 box):

Currently: 3 agents — jordan, vic, russ. Separate client deployment.
Growth plan: Will grow by a few agents as Sabain adds capability. Separate config, separate gateway.
Would appreciate a starter template for this one too — something conservative that Sabain can scale up.

Overall fleet on KVM 8 after growth:

Main hive: 18 agents (gateway on 18789)
Meta You: ~7-8 agents (gateway on 18810)
Jordan: ~5-6 agents (gateway on 18800)
Total: ~30-32 agents sharing 8 vCPU / 32 GB RAM

At what point does this box need to become a dedicated server or split into two boxes? FINN with voice + file processing is the heaviest new workload.

Please provide:

Meta You lighter bundle — tuned for 7-8 agents, mixed Anthropic/OpenAI, voice + file upload workload
Doc lighter bundle — tuned for 2 agents, minimal
Jordan starter template — conservative, growable to ~6 agents
Your opinion on KVM 8 ceiling — when do we need to split or upgrade given 30+ agents with FINN's workload?

stable plinthBOT Apr 13, 2026, 8:54 PM

Got it, I’ll wait for the rest before replying.

Yep, here’s the next layer. And good catch on the earlier agents.list, yes, treat all of these as merge blocks, not full replacement files.

Fleet-level take first

On KVM 8, 8 vCPU / 32 GB / 4G swap, this fleet is still workable if you cap active concurrency sanely.

What matters is not “30+ agents exist”, it’s:

how many are active at once
how many are doing heavy exec
how many are doing voice / file analysis
whether one gateway can starve another

My recommended starting gateway caps across this box:

Main: 5 / 7 / 4
Meta: 3 / 4 / 3
Jordan: 2 / 3 / 2
Doc: separate box, can ignore for KVM8 pressure

That gives you a sane host-wide starting point without pretending all gateways deserve the same budget.

1) Meta You lighter bundle

Use case: mixed Anthropic/OpenAI, lean bootstrap, light growth to ~7–8 agents, FINN voice + file upload, one future builder/shared-services agent.

Meta `openclaw.json` merge block

{
  agents: {

    defaults: {
      maxConcurrent: 3,

      // Anthropic-friendly, keeps repeat bootstrap injection down.
      contextInjection: "continuation-skip",

      // Keep this tighter than main, since you already proved lean bootstraps.
      bootstrapMaxChars: 6000,

      compaction: {
        memoryFlush: true,
        reserveTokensFloor: 40000
      },

      contextPruning: {

        mode: "cache-ttl",
        ttl: "5m",

        keepLastAssistants: 3,
        minPrunableToolChars: 6000,

        softTrimRatio: 0.3,
        hardClearRatio: 0.5,

        softTrim: {
          maxChars: 1500,
          headChars: 600,
          tailChars: 600
        },

        hardClear: {
          enabled: true,
          placeholder: "[Old tool result cleared]"
        }
      },

      subagents: {
        maxConcurrent: 4,
        maxChildrenPerAgent: 3,
        maxSpawnDepth: 2,
        archiveAfterMinutes: 45,
        runTimeoutSeconds: 1200
      }
    }
  },


  session: {
    resetByType: {
      group: { mode: "idle", idleMinutes: 120 },
      thread: { mode: "idle", idleMinutes: 180 },
      direct: { mode: "idle", idleMinutes: 300 }
    },

    parentForkMaxTokens: 50000,

    maintenance: {
      mode: "enforce",
      pruneAfter: "14d",
      maxEntries: 600,
      rotateBytes: "10mb",

      resetArchiveRetention: "14d",
      maxDiskBytes: "2gb",
      highWaterBytes: "1.5gb"
    },

    threadBindings: {
      enabled: true,
      idleHours: 24,
      maxAgeHours: 48
    }
  },

  channels: {
    discord: {
      threadBindings: {

        enabled: true,
        idleHours: 24,
        maxAgeHours: 48,
        spawnSubagentSessions: true
      }
    }
  }
}

Meta notes

I’d run Meta a little tighter than Main because:

Anthropic output is expensive
James/Atlas want lean context
FINN’s file sessions can spike context fast
voice sessions are short, but bursty

FINN-specific operating rule

I would strongly encourage this pattern for FINN:

uploaded PDF/CSV comes in
FINN spawns a short-lived analysis worker
worker extracts + summarizes
FINN responds from the summary
raw parsed content does not linger in the main session

That matters a lot.
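As a rough illustration of "worker extracts + summarizes", here is a bash sketch of the kind of compact summary a worker might hand back for an uploaded CSV instead of the raw rows. The function name `csv_summary` is hypothetical, and the column count is a naive comma split that will miscount headers containing quoted commas.

```shell
# Hypothetical helper: describe a CSV in three short lines
# instead of putting its contents into the session.
# $1 = path to a CSV file with a header row and a trailing newline.
csv_summary() {
  local file="$1"
  local rows cols
  # Line count minus the header row.
  rows=$(( $(wc -l < "$file") - 1 ))
  # Naive field count on the header line.
  cols="$(head -n 1 "$file" | awk -F',' '{ print NF }')"
  echo "file: $(basename "$file")"
  echo "data rows: $rows"
  echo "columns ($cols): $(head -n 1 "$file")"
}
```

The worker would return these lines plus whatever totals FINN actually needs, and the parsed file itself stays on disk.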

2) Doc lighter bundle

Use case: 2 agents, separate box, light audits/research/version watch, no growth planned.

Doc `openclaw.json` merge block

{
  agents: {
    defaults: {
      maxConcurrent: 2,
      contextInjection: "continuation-skip",
      bootstrapMaxChars: 8000,

      compaction: {
        memoryFlush: true,
        reserveTokensFloor: 40000
      },


      contextPruning: {
        mode: "cache-ttl",
        ttl: "10m",

        keepLastAssistants: 3,
        minPrunableToolChars: 8000,

        softTrimRatio: 0.3,
        hardClearRatio: 0.5,

        softTrim: {
          maxChars: 1500,
          headChars: 600,
          tailChars: 600

        },

        hardClear: {
          enabled: true,
          placeholder: "[Old tool result cleared]"
        }
      },

      subagents: {
        maxConcurrent: 2,
        maxChildrenPerAgent: 2,
        maxSpawnDepth: 2,
        archiveAfterMinutes: 45,
        runTimeoutSeconds: 1200
      }

    }
  },

  session: {
    resetByType: {
      group: { mode: "idle", idleMinutes: 180 },
      thread: { mode: "idle", idleMinutes: 240 },
      direct: { mode: "idle", idleMinutes: 480 }
    },

    parentForkMaxTokens: 50000,

    maintenance: {
      mode: "enforce",
      pruneAfter: "21d",

      maxEntries: 300,
      rotateBytes: "10mb",
      resetArchiveRetention: "21d",
      maxDiskBytes: "1gb",
      highWaterBytes: "800mb"
    },

    threadBindings: {
      enabled: true,
      idleHours: 24,
      maxAgeHours: 72
    }
  },

  channels: {

    discord: {
      threadBindings: {
        enabled: true,
        idleHours: 24,
        maxAgeHours: 72,
        spawnSubagentSessions: true
      }
    }
  }
}

Doc notes

This one should stay boring.

I would optimize Doc for:

low background noise
good transcript hygiene
light concurrency
reliable long-ish research threads without letting them go immortal

3) Jordan starter template

Use case: separate client deployment, conservative today, growable to ~6 agents, same KVM8 host.

Jordan `openclaw.json` merge block

{
  agents: {

    defaults: {
      maxConcurrent: 2,
      contextInjection: "continuation-skip",
      bootstrapMaxChars: 8000,

      compaction: {
        memoryFlush: true,
        reserveTokensFloor: 40000
      },

      contextPruning: {
        mode: "cache-ttl",
        ttl: "5m",

        keepLastAssistants: 3,

        minPrunableToolChars: 8000,

        softTrimRatio: 0.3,
        hardClearRatio: 0.5,

        softTrim: {
          maxChars: 1800,
          headChars: 700,
          tailChars: 700
        },

        hardClear: {
          enabled: true,
          placeholder: "[Old tool result cleared]"
        }

      },

      subagents: {
        maxConcurrent: 3,
        maxChildrenPerAgent: 2,
        maxSpawnDepth: 2,
        archiveAfterMinutes: 45,
        runTimeoutSeconds: 1200
      }
    }
  },

  session: {
    resetByType: {
      group: { mode: "idle", idleMinutes: 120 },

      thread: { mode: "idle", idleMinutes: 180 },
      direct: { mode: "idle", idleMinutes: 300 }
    },

    parentForkMaxTokens: 50000,

    maintenance: {
      mode: "enforce",
      pruneAfter: "14d",
      maxEntries: 500,
      rotateBytes: "10mb",
      resetArchiveRetention: "14d",
      maxDiskBytes: "1.5gb",
      highWaterBytes: "1.2gb"
    },


    threadBindings: {
      enabled: true,
      idleHours: 24,
      maxAgeHours: 48
    }
  },

  channels: {
    discord: {
      threadBindings: {
        enabled: true,
        idleHours: 24,
        maxAgeHours: 48,
        spawnSubagentSessions: true

      }
    }
  }
}

Jordan scale-up path

Start at 2 / 3 / 2.

If Jordan grows to ~6 agents and behaves well for a week, I’d move it to:

maxConcurrent: 3
subagents.maxConcurrent: 4
maxChildrenPerAgent: 3

No faster than that on the shared box.

4) My opinion on the KVM8 ceiling

Short version

KVM8 is fine for this fleet in the short term.
It is not the forever box if FINN becomes real.

The ceiling is not “32 agents exist”.
The ceiling is more like:

Main doing 2 heavy exec/build jobs
Meta doing FINN voice + file parsing
Jordan doing client work
all at the same time

That is where the box starts to get honest with you.

I would stay on KVM8 if:

FINN is still pilot-level
voice is intermittent
file analysis sessions are limited
heavy exec overlap stays moderate
reply latency stays acceptable

I would split to two boxes when any of these happen

Trigger 1

FINN gets real client traffic
Meaning: more than occasional PDF/CSV analysis, or multiple voice/file sessions in the same day.

Trigger 2

One gateway can starve another
Example: Main build waves make Meta or Jordan feel laggy.

Trigger 3

CPU or RAM starts living hot
My rough “time to split” thresholds:

sustained CPU pressure in normal business hours
RAM regularly above ~75%
swap actually being leaned on, not just sitting there
p95 response latency noticeably degrading during busy windows
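A minimal bash sketch for spot-checking the RAM and swap items above from `/proc/meminfo`. The function names are hypothetical. The 75% line comes from the rough threshold above, while "swap being leaned on" is approximated as more than 10% of swap in use, which is an arbitrary cutoff of mine. It takes a single snapshot, so "regularly" still means sampling it over time (cron, for example).

```shell
# Hypothetical helpers: one-shot memory/swap pressure check.
# meminfo_kb <field> [file]: prints the kB value of a /proc/meminfo field.
meminfo_kb() {
  awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# host_pressure [file]: prints usage numbers and a verdict line.
host_pressure() {
  local file="${1:-/proc/meminfo}"
  local total avail swap_total swap_free used_pct swap_used
  total="$(meminfo_kb MemTotal "$file")"
  avail="$(meminfo_kb MemAvailable "$file")"
  swap_total="$(meminfo_kb SwapTotal "$file")"
  swap_free="$(meminfo_kb SwapFree "$file")"
  used_pct=$(( (total - avail) * 100 / total ))
  swap_used=$(( swap_total - swap_free ))
  echo "ram_used_pct=$used_pct swap_used_kb=$swap_used"
  # RAM above 75%, or more than 10% of swap in use (assumed cutoff).
  if [ "$used_pct" -gt 75 ] || [ $(( swap_used * 10 )) -gt "$swap_total" ]; then
    echo "verdict: pressure"
  else
    echo "verdict: ok"
  fi
}
```

Calling `host_pressure` with no arguments reads the live `/proc/meminfo`. The file argument exists mainly so the logic can be checked against a saved snapshot.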

Trigger 4

Jordan needs stronger client isolation
This is not just performance, it’s operational cleanliness.
Client deployment sharing a box with your internal hive is okay early, not ideal forever.

5) Which gateway should move first?

There are two reasonable answers.

If you care most about performance risk

Move Meta / FINN first.
Why:

voice + file analysis is the heaviest new workload
context can spike fast
FINN is the most likely “surprise resource consumer”

If you care most about tenant isolation

Move Jordan first.
Why:

it’s already a separate client deployment
cleaner operational boundary
easier to reason about support/noise/isolation

My recommendation

If FINN becomes real before Jordan becomes large, move Meta first.
If Jordan becomes customer-critical first, move Jordan first.

6) My fleet recommendation, blunt version

For now

keep Main on KVM8 with 5 / 7 / 4
keep Meta on KVM8 with 3 / 4 / 3
keep Jordan on KVM8 with 2 / 3 / 2
keep Doc separate and boring
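To see what those caps add up to on one shared box, here is a small Python sketch that expands the shorthand and sums the gateway-level limits. I am assuming each triple reads as `maxConcurrent / subagents.maxConcurrent / maxChildrenPerAgent`, in the same order as the Jordan settings earlier. `parse_caps` and `fleet_totals` are hypothetical helpers, not part of OpenClaw.

```python
# Assumed order of the "a / b / c" shorthand used in this thread.
CAP_KEYS = ("maxConcurrent", "subagents.maxConcurrent", "maxChildrenPerAgent")

# Current recommendation for the three gateways sharing KVM8.
FLEET = {
    "Main": "5 / 7 / 4",
    "Meta": "3 / 4 / 3",
    "Jordan": "2 / 3 / 2",
}


def parse_caps(shorthand: str) -> dict:
    """Expand a shorthand such as '5 / 7 / 4' into named caps."""
    parts = [int(p) for p in shorthand.split("/")]
    if len(parts) != len(CAP_KEYS):
        raise ValueError(f"expected {len(CAP_KEYS)} values, got {shorthand!r}")
    return dict(zip(CAP_KEYS, parts))


def fleet_totals(fleet: dict) -> dict:
    """Sum the gateway-level caps across the box.

    maxChildrenPerAgent is a per-agent limit, so it is not summed.
    """
    totals = {key: 0 for key in CAP_KEYS[:2]}
    for shorthand in fleet.values():
        caps = parse_caps(shorthand)
        for key in totals:
            totals[key] += caps[key]
    return totals
```

With the numbers above, the three gateways together allow up to 10 concurrent main runs and 14 concurrent subagent runs, which is the worst case the box has to absorb if every gateway peaks at once.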

When FINN graduates from pilot

That’s when I’d stop treating KVM8 as the forever home.

At that point, I’d want either:

Box A: Main + maybe Doc
Box B: Meta/FINN + Jordan
or
Box A: Main + Jordan
Box B: Meta/FINN

depending on whether performance or client isolation is the bigger concern.

If you want, next I can package this into a Fleet v1 matrix:

gateway
role
caps
growth path
first split trigger
first move candidate

which would probably be the cleanest “operating sheet” for your team.

