#Best way to use Claude Code with OpenClaw


hardy sundial
#

What's the best way to use Claude Code as a coding subagent? I don't think OpenClaw can use TUI well, though I could be mistaken? Would it be best for the main "orchestrator" agent to spawn a second "coding manager" agent that can then spawn different instances of Claude Code/Codex?

nimble fjordBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

merry badgerBOT
#

OpenClaw can work with Claude Code/Codex fine, as long as you treat them as interactive CLIs and run them via exec with a PTY. The mistake is trying to “drive the TUI” like a GUI app; the winning pattern is: launch CLI → let it do edits in a workdir → stream logs back → answer prompts if it asks.

1) The core pattern (works for Claude Code)

Run Claude Code in a real project dir (or a scratch dir) with PTY:

# one-shot / mostly non-interactive
bash pty:true workdir:~/your-project command:"claude 'Implement X. Run tests. Summarize changes.'"

# long-running in background (recommended)
bash pty:true workdir:~/your-project background:true command:"claude 'Refactor Y. Keep behavior. Run tests.'"

Then monitor / interact:

process action:log sessionId:<id>
process action:poll sessionId:<id>
process action:submit sessionId:<id> data:"yes"   # answer prompts

Why PTY matters: without it, these CLIs often hang or output becomes unusable.

2) Do you need a “coding manager” sub-agent?

Usually: no. The main orchestrator can just spawn multiple background Claude/Codex processes and track them via process.

A “coding manager” agent becomes useful when you want:

  • Parallelism with coordination (N coding runs + one agent summarizing/merging results)
  • A stricter “coding-only” instruction set (style, tests, commit discipline)
  • To keep the main agent’s context clean (planner stays planner)

But it’s optional overhead. Start simple: orchestrator spawns coding CLIs directly.

#

3) Parallel work: isolate with worktrees (strongly recommended)

If you plan to run multiple Claude Code/Codex instances in parallel, don’t point them all at the same working tree. Use git worktrees (or separate branches/dirs) so they don’t fight over files.
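As a rough sketch of that isolation step, assuming one branch and one directory per worker: the `worktreeCommand` helper, the `../wt-` directory layout, and the `agent/` branch prefix are illustrative names, not OpenClaw conventions. Only `git worktree add -b <branch> <path> <start-point>` is standard git.

```typescript
// Builds the argv for `git worktree add` so each coding session gets its own
// checkout and branch. Directory and branch naming here are assumptions.
function worktreeCommand(worker: string, baseBranch: string): string[] {
  // Reduce the worker name to a slug that is safe in branch and path names.
  const slug = worker
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  if (slug === "") {
    throw new Error(`worker name "${worker}" has no usable characters`);
  }
  // git worktree add -b <new-branch> <path> <start-point>
  return ["git", "worktree", "add", "-b", `agent/${slug}`, `../wt-${slug}`, baseBranch];
}

console.log(worktreeCommand("Claude UI", "main").join(" "));
```

The manager (or you) would run the resulting command once per worker, then start each coding session with its workdir set to that worktree path.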

4) Practical tip: prefer “one-shot prompts”, not “TUI steering”

Even though OpenClaw can send keystrokes to an interactive session, it’s more reliable to phrase tasks so Claude Code can complete them without back-and-forth:

  • “If you need a choice, stop and ask one question.”
  • “Run tests; if failing, fix; otherwise summarize.”

If you share how you want to invoke Claude Code (interactive vs one-shot) and whether you want parallel runs, I can suggest a clean orchestration layout (single agent vs manager + workers).

hardy sundial
merry badgerBOT
#

You’re right to be skeptical of “start a fresh CLI process for every prompt” if you mean new Claude Code instance each time — you’ll lose conversation state and it’ll tend to re-scan/re-orient.

Two ways to avoid that (pick based on how interactive you want things):

A) Keep a single long-lived Claude Code process per task/worktree

Instead of a one-shot run such as claude -p '…' (print mode, which exits after responding), start Claude Code in its interactive mode and keep it running in the background. Then OpenClaw just streams logs and sends the next instruction into the same PTY session. That preserves Claude Code’s internal conversation context, so it doesn’t “rediscover” anything each turn.

This is usually the best “coding subagent” shape:

  • 1 worktree = 1 running Claude Code session
  • Orchestrator sends short deltas (“now implement X”, “fix failing test Y”), not full re-prompts

B) If you do one-shot, make it stateful via a “context pack” file

When you restart the process each time, you can still make it efficient by having the first run produce artifacts that future runs reference:

  1. First run: “read docs + map the repo, then write CONTEXT.md (architecture, key dirs/files, how to run tests)”
  2. Subsequent runs: “Use CONTEXT.md, only open files you need, don’t re-scan the whole repo.”

That way the “rediscovery” cost is mostly paid once, and later prompts stay small.

Do you need a “coding manager” agent?

Only if you want extra orchestration (multiple parallel worktrees/sessions, merging results, enforcing conventions). Otherwise, the main orchestrator can directly manage multiple long-lived Claude Code PTY sessions.

If you tell me whether you want (1) one coding session at a time vs parallel, and (2) interactive Claude Code UI vs non-interactive runs, I can sketch the cleanest topology.

#

If you literally mean “new Claude Code process per turn”, yep — that’s inefficient, because you throw away the agent’s conversational state and it tends to re-orient / re-scan.

Two solid patterns that avoid “rediscovery”:

  1. Long‑lived Claude Code session (recommended)
  • Start one Claude Code instance in a given workdir (often one per git worktree/branch)
  • Keep it running (PTY + background), and have the orchestrator send follow-up instructions into the same session
  • That preserves Claude Code’s internal context, so subsequent prompts are deltas (“now fix failing test X”), not “re-learn repo”
  2. Stateless runs, but with a durable repo briefing
    If you do want one-shot processes, make the first run produce artifacts the next runs can rely on:
  • CONTEXT.md (architecture + key entrypoints + how to run tests)
  • PLAN.md / TODO.md (current task state, decisions made)
    Then every subsequent run starts with: “Use CONTEXT.md/PLAN.md; don’t re-scan the repo; only open files relevant to this change.”

In practice, (1) is best for “Claude Code as a coding subagent”, and (2) is best for “fire-and-forget batch jobs” where you don’t want to babysit an interactive session.

hardy sundial
#

Since I am an actual programmer, and the tasks I'm assigning it will about half the time be pretty complicated, maybe a solution for both?

like orchestrator (Cerebras) -> Codex if it's a short task
but orchestrator (Cerebras) -> Manager (Cerebras) -> code managers (Claude Code/Codex)?

Also, isn't it really token intensive to stream a TUI? I could be wrong.

I don't think one shotting is on the table btw

merry badgerBOT
#

A “both” setup makes sense, but I’d frame it slightly differently:

1) Don’t choose between one-shot vs persistent — choose persistent sessions + optional quick workers

If one-shot is off the table (reasonable), you can still get the “fast path” by keeping two long-lived coding sessions around:

  • Fast executor (Codex CLI or Claude Code) for small edits / quick refactors
  • Deep executor (Claude Code or Codex) for bigger multi-step work

They can both live in their own git worktree so they don’t stomp each other.

The orchestrator just routes tasks to one of the existing sessions instead of spinning up a fresh process each turn.

2) Orchestrator → Manager → Coding sessions: yes, if you want clean separation

A dedicated “coding manager” agent is useful when you want the orchestrator to stay high-level and not get spammed with terminal output.

Manager responsibilities:

  • create/select worktrees
  • start/stop the long-lived claude / codex PTY processes
  • poll logs occasionally
  • summarize + report deltas back to orchestrator
  • maintain a tiny “handoff” file per worktree (STATUS.md, PLAN.md, “what I changed / what’s next”)

That last bullet is huge: it keeps continuity even if anything restarts.

3) “Isn’t streaming a TUI token intensive?”

It can be, but only if you treat it like “live screen streaming into the LLM”.

In practice you don’t do that. You do:

  • run the TUI in the background (PTY)
  • only pull logs when needed (e.g., milestone complete, question asked, error)
  • pull small chunks (offset/limit), and have the manager summarize

So the cost driver is usually Claude Code/Codex doing the work, not OpenClaw “watching the UI”. OpenClaw only spends tokens on whatever terminal output you choose to feed back into the reasoning loop.
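A minimal sketch of the "pull small chunks" idea, assuming the manager already has a raw log string from a log pull. The `tailForModel` name is made up for illustration, and the escape-sequence regex only covers common CSI codes (colors, cursor moves), not every terminal control sequence.

```typescript
// Strips ANSI CSI escape sequences and keeps only the last few non-empty
// lines, so one log pull costs a bounded number of tokens.
function tailForModel(raw: string, maxLines: number): string {
  if (maxLines <= 0) return "";
  // CSI sequence: ESC [ parameter bytes, intermediate bytes, one final byte.
  const withoutAnsi = raw.replace(/\u001b\[[0-?]*[ -\/]*[@-~]/g, "");
  const lines = withoutAnsi
    .split(/\r?\n/)
    .map((line) => line.trimEnd())
    .filter((line) => line.length > 0);
  return lines.slice(-maxLines).join("\n");
}

const sampleLog = "\u001b[32mok\u001b[0m\n\nrunning tests\nall tests passed\n";
console.log(tailForModel(sampleLog, 2));
```

Whatever comes back from a helper like this is what the manager would summarize, rather than the full repaint stream.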

#

4) A concrete routing heuristic (works well IRL)

  • If it’s “edit 1–3 files / obvious change / no big design”: send to fast executor session
  • If it’s “touches many modules / needs planning / likely test iteration”: send to deep executor session
  • If you want parallel: spin a new worktree + start another coding session, but keep the manager as the single point of contact.
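The first two rules could be sketched roughly like this. The `TaskEstimate` shape and the threshold of 3 files are assumptions lifted from the bullets above, and the estimates themselves would come from the orchestrator's own judgment of the task.

```typescript
type Executor = "fast" | "deep";

// Rough, orchestrator-supplied guesses about a task (hypothetical shape).
interface TaskEstimate {
  filesTouched: number;     // how many files the change probably edits
  needsDesign: boolean;     // planning or architectural decisions expected
  expectsTestLoop: boolean; // likely to need several test/fix iterations
}

// Small, obvious changes go to the fast session; everything else goes deep.
function pickExecutor(task: TaskEstimate): Executor {
  const small =
    task.filesTouched <= 3 && !task.needsDesign && !task.expectsTestLoop;
  return small ? "fast" : "deep";
}

console.log(pickExecutor({ filesTouched: 2, needsDesign: false, expectsTestLoop: false }));
console.log(pickExecutor({ filesTouched: 12, needsDesign: true, expectsTestLoop: true }));
```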

If you want, describe your typical “complicated” task (refactor? new feature? bug hunt?) and how many parallel threads you expect, and I’ll suggest a clean topology (number of worktrees/sessions, and what artifacts to have them write so context never gets lost).

hardy sundial
#

I would imagine a "simple task" would be like write me a quick Python file to scrape something... maybe this could go up to modifying a few files.

A "long" task could range anywhere from adding a (long) feature to making a new project entirely. We need a "manager" when multiple agents run in parallel, right? To help resolve merge conflicts and orchestrate? Or no

merry badgerBOT
#

You don’t need a manager for parallelism, but you usually want (a) isolation and (b) an integration step. A “manager” is basically the component that makes (b) not suck.

Simple vs long tasks (your definitions make sense)

  • Simple (new script / touch a few files): one coding session is fine. You can still run it in a worktree/branch, but it’s not mandatory.
  • Long (big feature / new project): keep a persistent coding session, and have it maintain PLAN.md + STATUS.md (or similar) so state survives interruptions and the orchestrator stays oriented.

Parallel runs: what you actually need

What you must have is separate working directories (git worktrees/branches) per agent/session. That alone prevents most chaos.

A “manager” is optional, but becomes very useful when you have 2+ workers because it can:

  • create/track worktrees (one per worker)
  • throttle log ingestion (only pull relevant chunks, summarize)
  • keep a global plan (“who owns what”)
  • do the integration loop: merge/rebase worker branches one-by-one + run tests

Merge conflicts: can the manager solve them?

Sometimes, but the best strategy is to avoid conflicts rather than “get good at resolving them”:

  • split work by module/area (worker A touches API, worker B touches UI, etc.)
  • agree on “owned files” per worker
  • merge frequently (small diffs), don’t let branches drift for hours

When conflicts still happen, a manager can attempt resolution, but IMO you still want:

  • a final “integration” pass (tests + sanity check) by either you or a dedicated integration session

A clean topology that fits what you’re proposing

  • Orchestrator (fast/cheap model): plans, splits tasks, decides which worker gets what
  • Manager (same cheap model): sets up worktrees, starts/stops coding sessions, summarizes progress, integrates merges
  • Workers (Claude Code/Codex, persistent PTY sessions): do the actual coding inside their own worktree

If you usually only run one coding session at a time, you can skip the manager. If you expect parallel work even occasionally, the manager pays for itself quickly just by keeping things organized and keeping the orchestrator from drowning in terminal output.

hardy sundial
#

Just so we get our vocabulary right, "orchestrator" is just referring to the main openclaw model/agent, I guess that's layer 0?

The implementation you mention won't always work for agents, because tasks could involve multiple agents touching main.go, for example

merry badgerBOT
#

Yep — in this thread, “orchestrator” = the top-level OpenClaw agent that receives your request and decides what tools/sub-agents to run. Calling it “layer 0” is a good mental model.

On the “multiple agents touching main.go” point: you’re also right. “Just split by module” is a nice-to-have, but real work often has shared choke points (entrypoints, routing tables, DI wiring, etc.). The trick is to change the collaboration rule, not pretend the choke points don’t exist.

What works in practice: single-writer for shared files

When a file is a hotspot (main.go, router.ts, AppDelegate, etc.), make it owned by exactly one session at a time:

  • Workers can still run in parallel, but they either:
    1. avoid editing main.go (they add new packages/files + tell the owner what hook to add), or
    2. produce instructions/diffs for main.go, and the owner applies them.

This keeps parallelism while eliminating the worst merge-conflict failure mode (two LLMs “helpfully” rewriting the same block differently).
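Here is one way the single-writer rule might look in code, as a sketch only. `planEdits`, `ProposedEdit`, and the `owners` map are hypothetical names; the idea is that a worker's edits to a hotspot it does not own get routed to the owner as instructions instead of being applied.

```typescript
interface ProposedEdit {
  path: string;
  summary: string;
}

interface EditPlan {
  applyDirectly: ProposedEdit[];  // safe for this worker to make itself
  forwardToOwner: ProposedEdit[]; // hand to the hotspot's owner as instructions
}

// `owners` maps a hotspot path (e.g. "main.go") to the one session allowed to
// edit it. Paths not listed are treated as unowned and open to any worker.
function planEdits(
  worker: string,
  edits: ProposedEdit[],
  owners: Map<string, string>,
): EditPlan {
  const plan: EditPlan = { applyDirectly: [], forwardToOwner: [] };
  for (const edit of edits) {
    const owner = owners.get(edit.path);
    if (owner !== undefined && owner !== worker) {
      plan.forwardToOwner.push(edit);
    } else {
      plan.applyDirectly.push(edit);
    }
  }
  return plan;
}

const hotspotOwners = new Map<string, string>([["main.go", "integrator"]]);
const workerPlan = planEdits(
  "worker-a",
  [
    { path: "pkg/scraper/scraper.go", summary: "new scraper package" },
    { path: "main.go", summary: "register scraper command" },
  ],
  hotspotOwners,
);
console.log(workerPlan.forwardToOwner.map((edit) => edit.path));
```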

Alternative if you really want concurrent edits: “integration queue”

If multiple sessions must edit main.go, you can still do it, but serialize the integration:

  • Each worker works in its own worktree/branch
  • The manager (or a designated “integrator” session) merges/cherry-picks one worker at a time, runs tests, fixes conflicts, then proceeds

Conflicts will still happen, but they’re handled in one place, with one set of eyes/context, instead of becoming a distributed mess.
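A sketch of that queue, with the git and test steps injected as callbacks so the loop itself stays simple. All names here (`integrateInOrder`, `IntegrationSteps`) are illustrative. The callbacks are synchronous for brevity, whereas real ones would shell out to git and your test runner. This version sends a failing branch back for follow-up rather than resolving conflicts in place.

```typescript
interface IntegrationSteps {
  // Returns true when the merge applied cleanly. On failure the callback is
  // expected to leave the tree clean (for example by aborting the merge).
  merge: (branch: string) => boolean;
  runTests: () => boolean; // true when the suite passes
  undoMerge: (branch: string) => void;
}

interface IntegrationResult {
  merged: string[];
  rejected: string[]; // branches to send back for conflict or test fixes
}

// Integrates one worker branch at a time: merge, test, keep or undo.
function integrateInOrder(
  branches: string[],
  steps: IntegrationSteps,
): IntegrationResult {
  const result: IntegrationResult = { merged: [], rejected: [] };
  for (const branch of branches) {
    if (!steps.merge(branch)) {
      result.rejected.push(branch);
      continue;
    }
    if (steps.runTests()) {
      result.merged.push(branch);
    } else {
      steps.undoMerge(branch);
      result.rejected.push(branch);
    }
  }
  return result;
}
```

Because only one branch is in flight at a time, any test failure can be attributed to the branch that was just merged.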

Practical heuristic

  • Parallelize creation (new files, new packages, new components)
  • Serialize wiring (entrypoints + registries + global config)

That gets you most of the speedup while keeping the repo coherent.

If you tell me your typical “shared file” hotspots (just Go main/router, or lots of cross-cutting edits?), I can suggest a concrete ownership scheme (which files should be single-writer, and what artifacts workers should output so the integrator can wire things fast).

hardy sundial
merry badgerBOT
#

Yep — you can wire this into OpenClaw pretty cleanly. Think in terms of (1) OpenClaw agents/sub-agents for planning + coordination and (2) long-lived CLI processes for the actual “coding sessions.”

1) The simplest integration (no special config): “spawn a code-manager sub-agent”

Use OpenClaw Sub-Agents as the “manager”, and have that manager run Claude Code/Codex as background PTY processes (one per worktree). This gives you parallelism + continuity without turning the main chat into terminal soup.

Docs: https://docs.openclaw.ai/tools/subagents

Key limitation to know up front: sub-agents can’t spawn more sub-agents (no nested fan-out). So your “manager” shouldn’t try to create worker subagents — it should create worker processes (Claude Code/Codex) via exec/process.

2) Make it reliable: use worktrees + an “integrator” rule (hot files like main.go)

In OpenClaw terms, you implement your earlier strategy like this:

  • For parallel work: manager creates one git worktree per worker.
  • Start one long-lived Claude Code/Codex process per worktree (PTY + background).
  • Adopt a single-writer/integrator rule for choke-point files (like main.go):
    • Workers avoid editing main.go directly; they output “integration instructions”
    • The integrator branch/session applies those changes and merges workers one-by-one + runs tests
#

This is the part that makes “multiple agents touch main.go” not explode.

3) Token cost: don’t “stream the TUI”, sample logs + write status files

It is token-expensive if you continuously ingest full-screen redraw output.

Instead:

  • only read logs on events (agent asks a question, tests fail, milestone done)
  • pull small chunks (tail-ish) rather than full screens
  • have the manager maintain STATUS.md / PLAN.md in the repo and just read those most of the time

That way OpenClaw isn’t paying tokens to watch a terminal repaint.

4) If you want this as a “real” component: make a dedicated agent + skill

If you want it to feel first-class/reusable (instead of re-prompting the pattern), do both:

  • Create a dedicated agent (e.g. code-manager) with the tools you want enabled/disabled (and a cheaper model if you like).
  • Add a workspace or managed skill that teaches the exact worktree/PTY/integration workflow.
hardy sundial
#

Mac Mini (local). Yes, Claude Code and Codex CLI will be running on the same machine. How do you determine when an agent asks questions or completes, instead of just logging that it wrote files?

And yes, I want this as a real component... I work on programming projects all day

merry badgerBOT
#

On a local Mac mini where Claude Code/Codex run on the same box, you basically have two signaling channels back into OpenClaw:

1) “Process exited” (automatic, reliable)

If you start Claude/Codex via OpenClaw exec in background, OpenClaw will emit a system event when it exits (default: tools.exec.notifyOnExit: true). That’s your clean “done” signal for anything that can terminate.

Docs: https://docs.openclaw.ai/tools/exec (see notifyOnExit) + https://docs.openclaw.ai/gateway/background-process

But for a long-lived TUI session that stays running, “exit” never happens—so you need an explicit protocol.

2) “In-band events” (you define a protocol; this is how you detect questions/milestones)

There’s no universal, reliable way for OpenClaw to know “Claude is asking a question” vs “Claude printed stuff” just by looking at stdout. TUIs aren’t structured.

What works in practice is: teach your coding session to self-report:

  • When it needs input, it must print a sentinel like:
    • OPENCLAW_NEEDS_INPUT: <question>
  • When it completes a milestone, print:
    • OPENCLAW_DONE: <summary>
  • And optionally, to wake OpenClaw immediately, have it run:
    • openclaw system event --mode now --text "Coder needs input: <question>"
    • openclaw system event --mode now --text "Coder done: <summary>"

Docs for the wake mechanism: https://docs.openclaw.ai/cli/system (openclaw system event)

That gives you a deterministic “question asked / done” signal without streaming the whole TUI.
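On the reading side, a minimal sketch of picking those sentinels out of a log chunk. `parseSentinels` and `CoderEvent` are hypothetical names, and the sketch assumes the sentinel starts its own line, which a TUI that decorates output may not guarantee.

```typescript
interface CoderEvent {
  kind: "needs_input" | "done";
  text: string;
}

// Scans a chunk of terminal output for the two sentinel prefixes and returns
// them in order. Lines without a sentinel are ignored.
function parseSentinels(logChunk: string): CoderEvent[] {
  const events: CoderEvent[] = [];
  for (const line of logChunk.split(/\r?\n/)) {
    const match = /^\s*OPENCLAW_(NEEDS_INPUT|DONE):\s*(.*)$/.exec(line);
    if (match === null) continue;
    const kind = match[1] === "DONE" ? "done" : "needs_input";
    events.push({ kind, text: (match[2] ?? "").trim() });
  }
  return events;
}

const chunk = [
  "Edited pkg/api/handler.go",
  "OPENCLAW_NEEDS_INPUT: Should the endpoint require auth?",
  "OPENCLAW_DONE: Added handler and tests",
].join("\n");
console.log(parseSentinels(chunk));
```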

Is TUI streaming token intensive?

Only if you keep calling process log and feeding big terminal dumps back into the model. The cheap way is:

  • don’t “live stream”
  • poll logs only on events (or in small tail chunks)
  • make the coder write STATUS.md / PLAN.md so the manager reads files instead of terminal noise

Turning this into a “real component” in OpenClaw

In OpenClaw, the durable way to productize a workflow is:

  1. Create a dedicated agent (your “code-manager” brain)
    Docs: https://docs.openclaw.ai/concepts/multi-agent
    CLI helper: openclaw agents add code-manager

  2. Use sub-agents so your main chat can spawn a manager run without blocking
    Docs: https://docs.openclaw.ai/tools/subagents
    (Also note: sub-agents can’t spawn more sub-agents, so the manager should spawn CLI processes, not more subagent runs.)

  3. Add a skill that encodes your protocol (worktrees, single-writer for main.go, sentinel events, etc.)
    Docs: https://docs.openclaw.ai/tools/skills

If you want, I can sketch the exact “manager runbook” I’d put into a SKILL.md (including the sentinel strings + the single-writer/integration queue rules for hotspot files like main.go).

hardy sundial
#

I'm not proficient with OpenClaw yet, so to be clear: when we make the dedicated agent, do we instruct it to use sub agents and the skill?

merry badgerBOT
#

You don’t really “instruct the dedicated agent to use sub-agents” as a requirement — you use the dedicated agent in two different roles, and they’re independent:

1) Dedicated agent = a persona + workspace + defaults

The code-manager agent is where you put the durable behavior:

  • its own AGENTS.md (“always use git worktrees; single-writer for main.go; maintain PLAN.md/STATUS.md; use PTY for Claude/Codex; emit OPENCLAW_DONE / OPENCLAW_NEEDS_INPUT…”)
  • optionally a skill that encodes that workflow (so it’s packaged / reusable, and shows up in the agent’s skill list automatically)

So: yes, the dedicated agent should “have the skill available” (installed into its workspace skills folder or shared ~/.openclaw/skills). Once installed, you typically don’t need to keep reminding it — OpenClaw injects eligible skills into the prompt.

Skills docs: https://docs.openclaw.ai/tools/skills

2) Sub-agents = how the main orchestrator delegates without blocking

Your main (layer 0) orchestrator agent uses sessions_spawn to run a background turn under the code-manager agent when you want coding work done.

So: yes, the orchestrator is what uses sub-agents to hand work to the code-manager (so your main chat stays responsive).

#

Sub-agents docs: https://docs.openclaw.ai/tools/subagents

The one config bit you’ll likely need

By default an agent can only spawn sub-agents under itself. To let main spawn a run under code-manager, set:

{
  agents: {
    list: [
      {
        id: "main",
        subagents: { allowAgents: ["code-manager"] },
      },
      { id: "code-manager" },
    ],
  },
}

(From the sub-agents doc: agents.list[].subagents.allowAgents.)

Important constraint (so you design it right)

Sub-agents cannot spawn sub-agents (no nesting). So your code-manager agent should not try to spawn worker subagents. For parallelism it should spawn CLI processes (Claude Code / Codex) via exec + process (PTY, background), and coordinate them itself.

If you want, I can outline the “minimal practical setup” in order (create agent → install skill → add allowAgents → what to put in code-manager’s AGENTS.md), so you can copy/paste it.

hardy sundial
#

Wait, I think I may have found a better way to do this. At least for Codex, it supports an SDK. So couldn't we create a plugin for calling it?

# Codex SDK

If you use Codex through the Codex CLI, the IDE extension, or Codex Web, you can also control it programmatically.

Use the SDK when you need to:

  • Control Codex as part of your CI/CD pipeline
  • Create your own agent that can engage with Codex to perform complex engineering tasks
  • Build Codex into your own internal tools and workflows
  • Integrate Codex within your own application

TypeScript library

The TypeScript library provides a way to control Codex from within your application that is more comprehensive and flexible than non-interactive mode.

Use the library server-side; it requires Node.js 18 or later.

Installation

To get started, install the Codex SDK using npm:

npm install @openai/codex-sdk

Usage

Start a thread with Codex and run it with your prompt.



import { Codex } from "@openai/codex-sdk";

const codex = new Codex();
const thread = codex.startThread();
const result = await thread.run(
  "Make a plan to diagnose and fix the CI failures"
);

console.log(result);

Call run() again to continue on the same thread, or resume a past thread by providing a thread ID.

// running the same thread
const result = await thread.run("Implement the plan");

console.log(result);

// resuming past thread

const threadId = "<thread-id>";
const thread2 = codex.resumeThread(threadId);
const result2 = await thread2.run("Pick up where you left off");

console.log(result2);

For more details, check out the TypeScript repo.

merry badgerBOT
#

Yep — for Codex specifically, a plugin tool wrapping the Codex SDK can be a cleaner integration than driving the TUI.

It buys you two big things:

  1. Persistent state without a long-lived terminal (keep a threadId, call run() again)
  2. Structured “needs input vs done” without guessing from terminal output (you can require JSON output and parse it)

Claude Code can also be driven programmatically (print mode via claude -p, plus an SDK), but running it via PTY as described earlier still works. So a hybrid is totally reasonable: Codex via SDK tool, Claude via CLI tool.

How you’d integrate this into OpenClaw (high level)

A) Create a dedicated agent (your “component boundary”)

Make a code-manager agent and put your durable workflow rules there (AGENTS.md + skills). Your main/orchestrator spawns it as a sub-agent when you want coding work done.

Docs: https://docs.openclaw.ai/concepts/multi-agent
Sub-agent spawn rules: https://docs.openclaw.ai/tools/subagents

#

B) Add a plugin that registers an optional tool (e.g. codex_sdk_run)

OpenClaw plugins can register agent tools via JSON schema.

Docs:

You’d implement a tool roughly like:

  • input: { repoId?, worktreePath?, threadId?, prompt }
  • output: { threadId, status: "needs_input"|"done"|"progress", question?, summary?, nextSteps? }

and inside the tool you:

  • startThread() if no threadId
  • resumeThread(threadId) otherwise
  • call thread.run(...)
#

C) Make “questions/done” deterministic (this answers your earlier question)

Instead of trying to infer “is this a question” from logs, you force the SDK run to comply:

In the prompt you send via the tool, require something like:

“Respond with ONLY valid JSON:
{ "status": "needs_input" | "done", "question": "...?", "summary": "...", "next": ["..."] }”

Then your plugin parses JSON. If parsing fails, return {status:"progress", rawText: ...}.
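The parse-with-fallback step could look roughly like this. It is a sketch under the assumption that the reply is the raw text of the turn; `parseTurnStatus` and the `TurnStatus` type are illustrative names, with field names following the JSON shape quoted above.

```typescript
type TurnStatus =
  | { status: "needs_input"; question: string }
  | { status: "done"; summary: string; next: string[] }
  | { status: "progress"; rawText: string };

// Parses the model's reply as the agreed JSON shape. Anything that is not
// valid JSON, or is missing the expected fields, is reported as "progress".
function parseTurnStatus(reply: string): TurnStatus {
  try {
    const parsed: unknown = JSON.parse(reply);
    if (typeof parsed === "object" && parsed !== null) {
      const obj = parsed as Record<string, unknown>;
      const status = obj["status"];
      const question = obj["question"];
      const summary = obj["summary"];
      const next = obj["next"];
      if (status === "needs_input" && typeof question === "string") {
        return { status: "needs_input", question };
      }
      if (status === "done" && typeof summary === "string") {
        const steps = Array.isArray(next)
          ? next.filter((n: unknown): n is string => typeof n === "string")
          : [];
        return { status: "done", summary, next: steps };
      }
    }
  } catch {
    // Not JSON at all: fall through to the "progress" fallback below.
  }
  return { status: "progress", rawText: reply };
}

console.log(parseTurnStatus('{"status":"needs_input","question":"Which DB?"}'));
console.log(parseTurnStatus("Still working on it..."));
```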

That’s vastly more reliable (and cheaper) than streaming a terminal.

One important config detail

If your plugin tool is registered as optional: true (recommended), you must allow it for the agent that should use it (e.g. code-manager) via tool allowlists. From the docs you can allow a specific tool name, the plugin id, or group:plugins.

Docs: https://docs.openclaw.ai/plugins/agent-tools (see “Optional tool (opt-in)”)

#

How this fits with your “real component” goal

  • Agent (code-manager) = your long-lived policies + workspace + skills
  • Skill = “how to run parallel worktrees / single-writer main.go / integration queue”
  • Plugin tool (codex_sdk_run) = the programmatic Codex “worker” with persistent threads + structured status

If you want, I can give you a minimal folder skeleton for the plugin (openclaw.plugin.json, index.ts, and the config snippets to (1) enable the plugin, (2) allow the tool only for code-manager, (3) allow main to spawn sub-agents under code-manager).

#

Yep — for Codex specifically, a plugin tool wrapping @openai/codex-sdk is a legitimately nicer integration than “drive the TUI and parse stdout”.

One key nuance from the SDK README though: the SDK is a structured wrapper around the codex CLI (it spawns the CLI and exchanges JSONL over stdin/stdout). So it’s not “pure API”, but you still get the big wins: structured events, thread continuity, and no TUI repaint spam.

Why this is better (practically)

The SDK gives you:

  • Persistent threads (resumable via threadId, stored under ~/.codex/sessions)
  • runStreamed() → an async stream of structured events like:
    • item.completed (often includes tool/file-change-ish artifacts)
    • turn.completed (hard “done” signal + usage)
  • Structured output via outputSchema so you can force:
    • { status: "ok" | "action_required", summary: "...", question?: "..." }
      This is the clean answer to “how do I detect questions vs progress?”

So your OpenClaw agent/tool doesn’t have to guess whether Codex is asking something — the schema forces it to say so.

#

How it fits into the OpenClaw architecture you were building

  • Keep the code-manager agent (policies, worktrees, single-writer rules, integration queue).
  • Add a plugin tool like codex_sdk_turn.
  • The manager uses codex_sdk_turn for Codex work, and uses exec+pty only for Claude Code (since Claude Code is still CLI/TUI-first).

What the plugin tool would look like (conceptually)

Tool input:

  • prompt
  • threadId? (optional; resume if present)
  • workingDirectory (worktree path)
  • outputSchema? (optional but recommended)

Tool output:

  • threadId (always)
  • status: "ok" | "action_required"
  • summary
  • question?
  • maybe items / eventsSummary / changedFiles (whatever you want to surface)

And internally it runs something like:

  • const codex = new Codex()
  • const thread = threadId ? codex.resumeThread(threadId) : codex.startThread({ workingDirectory })
  • await thread.runStreamed(prompt, { outputSchema })
  • aggregate events until turn.completed
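The "aggregate events" step might be sketched like this. To keep it runnable without the SDK installed, `StreamEvent` is a local stand-in: only the two event names mentioned above are modeled, and the payload fields (`item.text`, `usage.output_tokens`) are assumptions to check against the SDK's actual types. In a real tool you would presumably call `foldEvent` once per event as the stream yields it.

```typescript
// Local stand-in for streamed events; field names are assumptions.
interface StreamEvent {
  type: string; // e.g. "item.completed" or "turn.completed"
  item?: { type: string; text?: string };
  usage?: { input_tokens: number; output_tokens: number };
}

interface TurnSummary {
  finished: boolean;
  completedItems: number;
  lastText: string;
  outputTokens: number;
}

// Folds one event into the running summary without mutating the input.
function foldEvent(summary: TurnSummary, event: StreamEvent): TurnSummary {
  if (event.type === "item.completed") {
    return {
      ...summary,
      completedItems: summary.completedItems + 1,
      lastText: event.item?.text ?? summary.lastText,
    };
  }
  if (event.type === "turn.completed") {
    return {
      ...summary,
      finished: true,
      outputTokens: event.usage?.output_tokens ?? 0,
    };
  }
  return summary; // ignore event types this sketch does not model
}

function summarizeTurn(events: StreamEvent[]): TurnSummary {
  const empty: TurnSummary = {
    finished: false,
    completedItems: 0,
    lastText: "",
    outputTokens: 0,
  };
  return events.reduce(foldEvent, empty);
}
```

The tool would return something like this summary (plus the threadId) to the manager instead of the raw event stream.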

OpenClaw specifics you’ll need to wire

If you want, I can paste a minimal plugin skeleton (openclaw.plugin.json + index.ts) and the exact JSON5 config snippets to:

  1. load the plugin from ~/.openclaw/extensions/...
  2. enable it
  3. allow the tool only for code-manager (not globally)
  4. allow main to spawn sub-agent runs under code-manager
hardy sundial
#

Interestingly enough, both Codex and Claude Code actually have subagent support built in. So maybe we could just have instructions that say: For any UI design/frontend tasks, use Claude, and for everything else use Codex.

merry badgerBOT
#

That can work, with one important clarification:

  • Claude Code/Codex “subagents” are internal to those tools. OpenClaw won’t see them as separate workers it can route/monitor/merge—OpenClaw just sees one running Codex/Claude process (or one SDK thread) doing “whatever it does internally”.

So the sane hybrid is:

Recommended shape: OpenClaw picks the runner, runner uses its own subagents

  1. OpenClaw (or your code-manager agent) decides which engine owns this task:
    • UI/UX/front-end heavy → Claude Code
    • backend/infra/general → Codex
  2. Within that engine, let it use its built-in subagent/decomposition features however it wants.

That keeps orchestration simple while still letting Codex/Claude do multi-step work efficiently.

But “UI vs everything else” needs one more rule: shared-file ownership

Even if you route by domain, you’ll still hit hotspots (main.go, shared config, routers). So add one extra policy:

  • If a task will touch shared choke-point files, designate an integrator/owner for those files (often “the backend/Codex side”), and have the other side propose changes as instructions/diffs, not direct edits.

Otherwise you’ll still spend time resolving “two agents rewrote the same block differently”.

Practical routing heuristic (more robust than “UI vs everything”)

Instead of guessing by task wording, route by what’s being changed:

  • If it’s mostly web/, ui/, frontend/, .css, .tsx, component libs → Claude Code
  • If it’s cmd/, pkg/, .go, infra, CI, DB, services → Codex
  • If it spans both: split into two worktrees and merge sequentially (integrator merges one branch at a time + runs tests)

Token/cost note

This doesn’t magically reduce model usage—internal subagents still cost tokens. The big savings vs a TUI-driven approach is that you avoid feeding tons of terminal repaint output back into OpenClaw.

If you want, I can give you a copy/paste “routing + ownership” section you can drop into your code-manager agent’s AGENTS.md so it consistently applies these rules.

hardy sundial
#

Also, how do plugins work? Will the code-manager agent "see" the plugin? I'll expose different endpoints for talking to a specific thread, listing threads, etc. Should I also add a webhook or whatever for when messages are done? Also, when I said separate by task, I meant that the agent will ask Codex to do most things, and Codex can then spawn its own subagents. Same thing with Claude. I'm sure we can figure out some sort of worktree for the two agents if they work together... but within the Codex and Claude CLIs, subagents have automatically handled worktrees

merry badgerBOT
#

Plugins are code that runs inside the Gateway process (TypeScript module loaded at runtime). They can register tools (what the LLM “sees”), plus optional CLI commands/RPC/services.

Will the code-manager agent “see” the plugin?

Yes — if the plugin is enabled and the tool is allowed for that agent.

  • If you register your tool as required (default), it’ll show up for all agents (unless you deny it).
  • If you register it as optional (recommended for side-effecty stuff), then you must explicitly allow it in config via the agent’s tool allowlist.

Docs:

So the shape is:

  1. enable plugin → restart gateway
  2. allow tool for code-manager (and maybe deny for main if you want)

“Different endpoints”: tool API design

#

In OpenClaw you usually expose this as either:

  • one tool with an action field (run_turn, list_threads, resume, get_status, stop), or
  • several tools (codex_thread_run, codex_thread_list, …)

Either is fine. One-tool-with-action is simpler to wire into allowlists.

Do you need a webhook to know when it’s done?

Not necessarily:

  • If your tool call is synchronous, “done” is just the tool returning.
  • If you want it async/background, do a start → poll pattern:
    • codex_start(...) -> { jobId/threadId }
    • codex_poll(jobId) -> { status: running|needs_input|done, ... }

That’s usually cleaner than webhooks in OpenClaw-land.
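
To make the start → poll idea concrete, here is a minimal sketch of the bookkeeping such a tool pair might keep. It is plain TypeScript with no OpenClaw or Codex SDK calls; the status values mirror the ones above, while `codexStart`, `codexPoll`, `reportProgress`, and the in-memory `jobs` map are assumptions for illustration only.

```typescript
// Hypothetical job registry behind codex_start / codex_poll.
// Status values mirror the text above; everything else is an assumption.
type JobStatus = "running" | "needs_input" | "done";

interface Job {
  status: JobStatus;
  output: string[];
}

const jobs = new Map<string, Job>();
let nextJobNumber = 1;

// Registers the job and returns immediately; a real plugin would launch the work here.
function codexStart(prompt: string): { jobId: string } {
  const jobId = `job-${nextJobNumber++}`;
  jobs.set(jobId, { status: "running", output: [`prompt: ${prompt}`] });
  return { jobId };
}

// Called by the background worker as the run progresses.
function reportProgress(jobId: string, line: string, status: JobStatus): void {
  const job = jobs.get(jobId);
  if (!job) throw new Error(`unknown job ${jobId}`);
  job.output.push(line);
  job.status = status;
}

// Read-only snapshot, so the agent can poll as often as it likes.
function codexPoll(jobId: string): { status: JobStatus; output: string[] } {
  const job = jobs.get(jobId);
  if (!job) throw new Error(`unknown job ${jobId}`);
  return { status: job.status, output: job.output.slice() };
}
```

Because polling only reads state, a missed or repeated poll never loses information, which is the main reliability argument for this pattern over push callbacks.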

If you do want push notifications, the OpenClaw-native mechanism is a system event + heartbeat wake (so the agent notices immediately), but I’d start with polling because it’s simpler and very reliable.

#

Your “separate by task” idea (Codex/Claude internal subagents)

That’s a good plan. Just keep in mind:

  • OpenClaw treats “Codex is running” as one worker, even if Codex internally spawns subagents.
  • Same for Claude Code.

So you still want one top-level rule to prevent repo chaos when both engines collaborate:

  • don’t run both against the same working tree simultaneously
  • use worktrees (one for “Codex engine”, one for “Claude engine”), then merge sequentially

Even if the CLIs manage internal worktrees, having OpenClaw enforce “engine-level isolation” avoids the worst collisions.
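
Here is one way the engine-level isolation could be sketched: a small planner that emits the git commands for one worktree per engine and a one-at-a-time merge. The git subcommands (`worktree add -b`, `merge --no-ff`) are standard; the branch naming, directory layout, and function names are assumptions chosen for the example.

```typescript
// Hypothetical planner: one worktree and branch per engine, merged one at a time.
// Paths and branch names are illustrative; the git subcommands are standard.
type Engine = "codex" | "claude";

// Command that creates an isolated worktree (and branch) for one engine.
function worktreeSetup(repoDir: string, engine: Engine): string {
  const branch = `engine/${engine}`;
  const path = `${repoDir}-${engine}`;
  return `git -C ${repoDir} worktree add -b ${branch} ${path}`;
}

// Commands for merging engine branches sequentially, testing after each merge.
function sequentialMerge(repoDir: string, engines: Engine[], testCommand: string): string[] {
  const steps: string[] = [];
  for (const engine of engines) {
    steps.push(`git -C ${repoDir} merge --no-ff engine/${engine}`);
    steps.push(testCommand); // run tests after every merge, not once at the end
  }
  return steps;
}
```

Returning the commands as strings instead of executing them keeps the sketch side-effect free; the code-manager could run each step through whatever exec mechanism it already uses and stop at the first failing test run.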

If you want, I can suggest an MVP plugin tool surface (exact schema + start/poll/runTurn) that matches the Codex SDK capabilities (threadId, runStreamed(), outputSchema) and fits nicely into a code-manager agent workflow.

hardy sundial
#

How do we tell the main agent to spawn a code manager agent? Will the main agent be able to find out what the code manager is doing to communicate it back to me on telegram? Does the code manager dissolve itself once the task is done?

merry badgerBOT
#

To make “main → code-manager” work in OpenClaw you use sub-agents, spawned under a dedicated agentId.

1) How do you tell the main agent to spawn code-manager?

Three parts:

A) Create the agent

openclaw agents add code-manager

B) Allow main to spawn sub-agents under code-manager

In config (JSON5), on the main agent:

{
  agents: {
    list: [
      {
        id: "main",
        subagents: { allowAgents: ["code-manager"] },
      },
      { id: "code-manager" },
    ],
  },
}

Docs: https://docs.openclaw.ai/tools/subagents (see “Cross-Agent Spawning”)

C) Actually spawn it (from chat)

Then you just ask in chat something like:

#

“Spawn a sub-agent under code-manager to implement X in repo Y. Use Codex SDK tool if available; maintain STATUS.md.”

The main agent will call sessions_spawn(..., agentId:"code-manager") behind the scenes.

2) Can main see what code-manager is doing?

Yes, in two ways:

  • Automatically at the end: sub-agents announce their result back to the requester chat when done.
  • While it’s running: you (or the main agent) can use:
    • /subagents list
    • /subagents log <id> [limit] [tools]
    • /subagents info <id>

Docs: https://docs.openclaw.ai/tools/subagents (section “Managing Sub-Agents”)

If you want nice progress updates (not raw logs), have code-manager write/update a STATUS.md and you can ask main to fetch/quote it periodically.
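
For a sense of what a machine-maintained STATUS.md could look like, here is a minimal sketch of a renderer. The file layout, the `TaskStatus` fields, and the state names are all assumptions; the thread only says the file should exist and be kept up to date.

```typescript
// Hypothetical STATUS.md layout; the thread only says the file should be maintained.
interface TaskStatus {
  name: string;
  state: "todo" | "in_progress" | "done" | "blocked";
  note?: string;
}

// Builds the full file contents so each update is a plain overwrite.
function renderStatus(title: string, tasks: TaskStatus[]): string {
  const done = tasks.filter((t) => t.state === "done").length;
  const lines = [`# STATUS: ${title}`, "", `Progress: ${done}/${tasks.length} done`, ""];
  for (const t of tasks) {
    const box = t.state === "done" ? "[x]" : "[ ]";
    const note = t.note ? ` (${t.note})` : "";
    lines.push(`- ${box} ${t.name}: ${t.state}${note}`);
  }
  return lines.join("\n") + "\n";
}
```

Regenerating the whole file from structured state avoids partial edits drifting out of sync, and gives main a short, predictable document to quote.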

#

3) “Communicate it back to me on Telegram”

By default, the sub-agent’s announcement goes back to the same chat/thread that started it.

So:

  • If you spawn from Telegram, you’ll get the result on Telegram.
  • If you spawn from Discord, you’ll get the result on Discord.

Cross-posting from Discord → Telegram is possible, but it’s an extra explicit step (main would send a Telegram message after it receives the announce).

4) Does code-manager dissolve itself when done?

Sub-agent sessions are auto-archived after a while (default 60 minutes). They don’t vanish instantly, and transcripts are preserved.

You can also request immediate cleanup via the cleanup option in sessions_spawn (defaults to "keep"). Docs: https://docs.openclaw.ai/tools/subagents (see parameters + “Auto-Archive”).

hardy sundial
#

Then, there are also default instructions that always need to be passed to the code manager

merry badgerBOT
#

Yes — this is exactly the kind of thing that fits well as a skill, with one caveat about how it triggers:

What a skill can/can’t do

  • A skill is primarily instruction text injected into the agent’s prompt. It can strongly nudge the agent to always route “build stuff” through your code-manager flow.
  • But it won’t deterministically auto-run just because you said “make something” in natural language. The model still chooses what to do.
  • If you want it to be deterministic, use an explicit slash command (e.g. /make …) or a plugin command.

Recommended: make a /make skill (user-invocable)

Create a skill named make that you invoke like:

/make build a scraper that…

In SKILL.md, set user-invocable: true and write instructions like:

  • spawn a sub-agent under code-manager (sessions_spawn with agentId:"code-manager")
  • pass the user’s request verbatim
  • require the manager to maintain PLAN.md + STATUS.md
  • use Codex via your plugin tool (SDK) for non-UI tasks, Claude Code for UI tasks
  • enforce worktree isolation + single-writer rules for hotspots

Docs (skills): https://docs.openclaw.ai/tools/skills
Docs (subagents): https://docs.openclaw.ai/tools/subagents

If you want “when I say make X” to always trigger

That’s more of a plugin command than a skill.

Plugins can register auto-reply commands that run without invoking the LLM (good for routing / dispatch), or register a tool and then the agent uses it.

Docs (plugins): https://docs.openclaw.ai/tools/plugin
Docs (plugin commands + tools): https://docs.openclaw.ai/plugins/agent-tools

Quick question so I can point you at the cleanest option

Do you want the trigger to be:

  1. Explicit: you type /make … (most reliable, easiest), or
  2. Implicit: any message containing “make/build/create …” should route automatically (needs a plugin command + stricter parsing rules)?

hardy sundial
#

Implicit.

merry badgerBOT
#

Implicit is doable, but not via a skill alone.

A skill is “instructions the model can follow”, not a guaranteed message router. OpenClaw currently doesn’t have a built-in “on every inbound message, if it matches regex X then spawn agent Y” hook (the hooks system today mostly covers /new, /reset, and startup, not arbitrary message routing).

The reliable way to make it implicit: route your Telegram chat to code-manager (bindings)

If what you want is: “when I message on Telegram, it behaves like my builder/programming agent automatically”, the cleanest setup is:

  • Keep a main agent for other stuff (Discord, general chat, etc.)
  • Make Telegram → code-manager by default using multi-agent bindings

Docs: https://docs.openclaw.ai/concepts/multi-agent

Conceptually your config ends up like:

{
  agents: {
    list: [
      { id: "main", default: true },
      { id: "code-manager" },
    ],
  },
  bindings: [
    { agentId: "code-manager", match: { channel: "telegram" } },
    { agentId: "main", match: { channel: "discord" } },
  ],
}

That makes it fully implicit: no spawning needed. Every Telegram message lands in the code-manager agent.
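
As a mental model only, the routing in that config can be pictured like the sketch below. It assumes the first matching binding wins and that unmatched channels fall back to the agent marked `default: true`; neither rule is confirmed here, so check the multi-agent docs before relying on it. `resolveAgent` is a made-up name, not an OpenClaw function.

```typescript
// Conceptual model of channel bindings, NOT OpenClaw's actual resolver.
// Assumptions: first matching binding wins; otherwise the default agent handles it.
interface AgentEntry {
  id: string;
  default?: boolean;
}

interface Binding {
  agentId: string;
  match: { channel: string };
}

function resolveAgent(agents: AgentEntry[], bindings: Binding[], channel: string): string {
  for (const binding of bindings) {
    if (binding.match.channel === channel) return binding.agentId;
  }
  for (const agent of agents) {
    if (agent.default === true) return agent.id;
  }
  throw new Error(`no agent for channel ${channel}`);
}
```

The point of the model is that routing happens before any LLM is involved, which is why bindings feel deterministic where skills and instructions do not.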

“Default instructions that always need to pass to code-manager”

#

Put those defaults in code-manager’s workspace (so they’re always present), e.g. in its AGENTS.md and/or in a skill installed into that agent’s workspace skills/.

So you don’t have to “pass them” each time—the agent just is that way.

Docs: https://docs.openclaw.ai/tools/skills

If you really want “same Telegram chat sometimes main, sometimes code-manager”

That’s the hard part to do deterministically without an explicit trigger, because once a message is routed to one agent, the other agent doesn’t automatically get a turn.

In practice people solve this with one of:

  • an explicit command (/make ...) (most reliable), or
  • separate chats/surfaces, or
  • accept that “main routes to code-manager” will be model-driven, not guaranteed.

If your Telegram is only you (no groups), routing the whole Telegram channel to code-manager is usually the simplest option, and it feels exactly like “implicit”.

hardy sundial
#

Can't I message the main agent and ask it to build something, and have it spawn the second agent? Also, how do we give the code-manager agent "instructions", since it needs context every time on how to properly use tools, write to STATUS.md, etc.?

merry badgerBOT
#

Yes, you can message the main agent (“build X”), and have it spawn a code-manager sub-agent to do the work.

There are two separate things to set up: routing/spawning, and persistent instructions.


1) Main agent → spawn code-manager (from the same Telegram chat)

A) Add a code-manager agent

openclaw agents add code-manager

B) Allow cross-agent sub-agent spawning

In ~/.openclaw/openclaw.json (JSON5), allow main to spawn under code-manager:

#
{
  agents: {
    list: [
      { id: "main", default: true, subagents: { allowAgents: ["code-manager"] } },
      { id: "code-manager" },
    ],
  },
}

Then tell main once, in its instructions, how to handle requests like “build X”:

“For any build/coding request: spawn a sub-agent under code-manager and delegate the full task. Do not implement directly.”

That makes it mostly implicit in practice (still model-driven, but it’s the standard way).

#

How does main know what code-manager is doing?

  • When the sub-agent finishes, it announces back and main posts the summary to your Telegram chat automatically.
  • While it’s running, you can use /subagents list + /subagents log … to inspect it (or ask main to do that and summarize).

Docs: https://docs.openclaw.ai/tools/subagents

Continuity note: if you want a single ongoing code-manager run you can continue to talk to, don’t keep spawning fresh ones—spawn one long-running sub-agent and then use /subagents send <id> ... for follow-ups.


2) How to give code-manager “instructions” so it always behaves the same

Each agent has its own workspace (with its own AGENTS.md, skills, etc.). The code-manager agent will automatically “know the rules” every turn if you put them in its workspace files.

Do this:

  1. Find the code-manager workspace path (it’s shown by openclaw agents list; by default it’s something like ~/.openclaw/workspace-code-manager).
  2. Put your durable rules in:

<code-manager-workspace>/AGENTS.md, e.g.

  • Always create/operate in git worktrees for parallel work
  • Always maintain STATUS.md + PLAN.md in the repo (or per-worktree)
  • Always start by reading STATUS.md/PLAN.md before acting
  • Use Codex for backend/infra, Claude Code for UI
  • Single-writer rule for hotspot files (main.go, routers, etc.)
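
If you ever want the single-writer rule enforced by tooling rather than by instructions alone, the core of it is a claim table like the sketch below. This is an assumption-heavy illustration: `claim`, `release`, and the in-memory `owners` map are invented for the example, and a real setup would need to persist claims somewhere all workers can see.

```typescript
// Hypothetical in-memory claim table for hotspot files (e.g. main.go, routers).
const owners = new Map<string, string>();

// Returns true if `worker` now holds the file; false if someone else already does.
function claim(file: string, worker: string): boolean {
  const current = owners.get(file);
  if (current !== undefined && current !== worker) return false;
  owners.set(file, worker);
  return true;
}

// Only the current owner can release a file.
function release(file: string, worker: string): boolean {
  if (owners.get(file) !== worker) return false;
  owners.delete(file);
  return true;
}
```

A worker that fails to claim a hotspot file would queue its change or hand it to the current owner instead of editing in parallel.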

Optionally also install a skill into the code-manager workspace (<workspace>/skills/...) if you want it packaged as a reusable procedure, but for “always-on policy”, AGENTS.md is the workhorse.

Docs (skills + where they load from): https://docs.openclaw.ai/tools/skills


About “implicit”

If you mean “I never want to think about spawning at all”, the most deterministic setup is actually to bind your Telegram DM to code-manager (multi-agent bindings). But if you want to keep “main” as the chat you talk to, then the best you can do is: strong default instructions in main to always delegate coding work to code-manager.

#

If you want, paste what you want your “default instructions” to be (bullet list), and I’ll turn it into a clean AGENTS.md block for both:

  • main (delegation rules)
  • code-manager (execution + STATUS.md/PLAN.md + worktree rules)
hardy sundial
#

So in the main agent's AGENTS.md, I tell it to spawn the sub-agent for any building or coding requests, and then in the code manager's AGENTS.md, I give it all the instructions for writing code.

merry badgerBOT
#

Yes — that’s the right mental model.

main/AGENTS.md (dispatcher / product-owner brain)

Goal: recognize “build/coding” requests and delegate.

Put rules like:

  • For any request that involves writing/modifying code, creating a project, debugging a repo, refactors, CI fixes → spawn a sub-agent under code-manager and pass the full task + constraints.
  • Main stays high-level: clarifies requirements, sets acceptance criteria, asks 1–3 key questions, then delegates.
  • Main summarizes results back to you when the code-manager announces completion.

(You can also add exceptions like “tiny one-liners are OK in main”, but if you want consistency, delegate everything.)

code-manager/AGENTS.md (execution brain)

Goal: actually do the engineering work consistently.

This is where you encode:

  • worktree policy (one worktree per parallel stream)