New to OpenClaw, and have set it up on a Mac Studio that is also running Ollama, both natively, along with some other stuff running in containers. OpenClaw is set up as local-only against Ollama, and Ollama has a handful of LLMs running in it. I've connected Telegram to chat with it. What I'd like to do is have an agent manage my tasks by spawning them off, using the appropriate LLM for each task, and making sure they get done, rather than doing them itself, so it stays available for continued chatting and potential new tasks. I've done some preliminary research and it seems that this can be accomplished via sub-agents or session_spawns. I'm not exactly sure which is better or more appropriate, but it seems sub-agent? Or is there a more appropriate way to do this? Although the docs discuss what each is, I didn't really see which was more appropriate or how I would set this up. Any help, or links about this, would be appreciated.
#multi llm usage with manager agent
Yep, for what you described, I’d use sub-agents.
The clean mental model is:
- sub-agent = the pattern/runtime
- sessions_spawn = the tool the agent uses to launch one
- /subagents spawn = the human-facing command
So it’s not really “sub-agents vs sessions_spawn”. sessions_spawn is how sub-agents get created.
What I’d recommend for your setup:
- Keep one main manager agent on Telegram
- Create specialized worker agents with different default models
- Let the manager spawn those workers for long/heavy tasks
- Have workers announce results back when done
That gives you:
- a responsive main chat
- per-task model choice
- cleaner separation of roles
Best pattern for multi-LLM routing
If you only want “offload long work”, a single agent + subagents is enough.
If you want “use the right model for the right job”, use multi-agent + subagents:
- agents.list[].model for each worker’s default model
- subagents.allowAgents so the manager can target them
- optionally subagents.requireAgentId: true so the manager must choose a named worker explicitly
Minimal idea:
{
  agents: {
    defaults: {
      subagents: {
        maxSpawnDepth: 1,
        runTimeoutSeconds: 900,
      },
    },
    list: [
      {
        id: "main",
        default: true,
        model: "ollama/your-manager-model",
        subagents: {
          allowAgents: ["research", "writer", "planner"],
          requireAgentId: true,
        },
      },
      { id: "research", model: "ollama/your-research-model" },
      { id: "writer", model: "ollama/your-writing-model" },
      { id: "planner", model: "ollama/your-planning-model" },
    ],
  },
}
A couple important gotchas:
- You probably do not need maxSpawnDepth: 2 unless you want a spawned sub-agent to spawn more workers. For most setups, keep it at 1.
- Telegram is fine for background sub-agent completions, but persistent thread-bound subagent sessions are currently Discord-only.
- With Ollama/local models, tool calling can be hit-or-miss. If the manager fails to spawn reliably, that’s often the model, not your config. I’d keep the manager on your most reliable tool-calling model.
Docs:
- https://docs.openclaw.ai/tools/subagents
- https://docs.openclaw.ai/concepts/multi-agent
- https://docs.openclaw.ai/gateway/configuration-reference
- https://docs.openclaw.ai/help/faq
If you want, I can sketch a working openclaw.json example for a Telegram manager + 2 or 3 Ollama workers.
That would be most helpful. Here are some details.
These are the LLMs I have available on Ollama:
- gemma4:latest
- qwen3.5:27b
- qwen3-coder-nest
- gpt-oss:latest
- deepseek-r1:14b
These are the tasks I'm looking to accomplish. They don't all need a separate agent, but the chat manager should always be available and able to spawn a new agent for each new task, up to 3 concurrent tasks. I'd also like to keep the 3 most recently used LLMs always loaded to reduce load times when possible. If that's better done through Ollama config then I can do that. Is there any other info you need?
Here are the tasks:
- Chat Manager - connected to telegram, respond to chats, spawn appropriate agents for the tasks provided
- SWE / Coding assistant - from scripting help to creation of fully functional apps
- News and Market data agent - provide daily reports on macro markets and news related to a watchlist of stocks
- Day Trading Technical and Trade Analysis - connect to TradingView (api) and TradeStation (MCP) and review and analyze past trades, and watch specific stocks in real time for trade setups.
- Travel Agent - trip planning, flight and accommodation finder
- Medical / Health advisor - review labs and health-related data and provide solutions, look into ongoing research and medical journals for possible solutions
- Career Coach - keep an eye on job openings, rewrite resume to match job description, possibly manage LinkedIn
I’m checking the Ollama side too so I don’t hand-wave the keep-loaded bit.
Yep, that’s enough to sketch a solid setup.
For your goal, I’d use:
- one main Telegram-facing manager
- 2 to 3 worker agents
- native sub-agents for delegation
- sessions_spawn under the hood, but you mostly think in terms of sub-agents
Recommended shape
- main: stays chat-available, decides where work goes
- coder: code / file / repo tasks
- reasoner: deeper analysis / planning / multi-step thinking
- fast: lightweight summaries / drafting / triage
For max 3 concurrent tasks, cap both:
- agents.defaults.subagents.maxConcurrent: 3
- agents.defaults.subagents.maxChildrenPerAgent: 3
Config skeleton
If your Ollama connection already works, keep it. The important part is the agents + subagents section.
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://127.0.0.1:11434",
        apiKey: "ollama-local",
        api: "ollama", // important: native Ollama API, not /v1
      },
    },
  },
  agents: {
    defaults: {
      subagents: {
        maxConcurrent: 3,
        maxChildrenPerAgent: 3,
        runTimeoutSeconds: 1800,
        archiveAfterMinutes: 60,
      },
    },
    list: [
      {
        id: "main",
        default: true,
        model: "ollama/<your-most-reliable-tool-calling-model>",
        subagents: {
          allowAgents: ["coder", "reasoner", "fast"],
          requireAgentId: true,
        },
      },
      {
        id: "coder",
        model: "ollama/qwen3-coder-nest",
      },
      {
        id: "reasoner",
        model: "ollama/qwen3.5:27b", // or test deepseek-r1:14b here
      },
      {
        id: "fast",
        model: "ollama/gemma4:latest",
      },
    ],
  },
}
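A quick sanity check for a config shaped like the skeleton above, as a minimal Python sketch. It assumes the config has already been loaded into a plain dict; the `config` literal just mirrors the skeleton, and `undefined_targets` is a hypothetical helper, not part of OpenClaw. It reports any allowAgents entry that has no matching agent id, which is an easy typo to make when renaming workers.

```python
# Hypothetical helper, not an OpenClaw API: finds allowAgents entries
# that do not match any agent id defined in agents.list.
def undefined_targets(config):
    agents = config["agents"]["list"]
    known_ids = {agent["id"] for agent in agents}
    problems = {}
    for agent in agents:
        allowed = agent.get("subagents", {}).get("allowAgents", [])
        # Assumption: "*" is a wildcard entry; verify against the sub-agents docs.
        missing = [t for t in allowed if t != "*" and t not in known_ids]
        if missing:
            problems[agent["id"]] = missing
    return problems


# Mirrors the skeleton above, reduced to the fields the check reads.
config = {
    "agents": {
        "list": [
            {"id": "main", "subagents": {"allowAgents": ["coder", "reasoner", "fast"]}},
            {"id": "coder"},
            {"id": "reasoner"},
            {"id": "fast"},
        ]
    }
}

print(undefined_targets(config))  # an empty dict means every target is defined
```

If you later rename a worker in the list but forget the manager's allowAgents, the returned dict names the manager and the stale target.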
The part people miss: teach the manager to delegate
In the main agent’s AGENTS.md, add something like:
When a task is more than a quick reply, delegate it with a sub-agent instead of doing it inline.
Keep the main chat responsive.
Use:
- `coder` for code, files, debugging, repo changes
- `reasoner` for multi-step analysis, planning, comparison, decision support
- `fast` for lightweight summaries, drafting, extraction, triage
Never exceed 3 active child tasks at once.
When delegating:
- choose an explicit `agentId`
- give the child a concrete outcome
- wait for completion announcements instead of polling
- summarize results back to the user clearly
That policy matters a lot. Without it, the manager may still try to do too much itself.
Ollama side, for “keep 3 warm”
That part is better handled in Ollama, not OpenClaw.
From Ollama’s docs/FAQ:
- models stay loaded for 5 minutes by default
- use keep_alive / OLLAMA_KEEP_ALIVE to keep them loaded longer
- use OLLAMA_MAX_LOADED_MODELS=3 to allow up to 3 loaded at once
- check what’s resident with ollama ps
On macOS, Ollama docs say to set env vars with launchctl, then restart Ollama, for example:
launchctl setenv OLLAMA_KEEP_ALIVE 30m
launchctl setenv OLLAMA_MAX_LOADED_MODELS 3
Small caveat: that’s not quite “pin exactly these 3 forever”. It’s more “allow up to 3 to stay warm”. Exact residency still depends on memory pressure and usage.
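If you want specific models warm right after a restart rather than after first use, Ollama's FAQ describes preloading: send a request with no prompt to /api/generate, and use keep_alive to control how long the model stays resident. A minimal Python sketch, assuming the default local endpoint; `preload_payload` and `preload` are illustrative helper names, not anything OpenClaw or Ollama ships.

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # assumption: default local Ollama endpoint


def preload_payload(model, keep_alive="30m"):
    # A generate request with no prompt only loads the model;
    # keep_alive sets how long it stays resident afterwards.
    return {"model": model, "keep_alive": keep_alive}


def preload(model, keep_alive="30m", base_url=OLLAMA_URL):
    # Sends the preload request; returns True on an HTTP 200 response.
    body = json.dumps(preload_payload(model, keep_alive)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        return response.status == 200
```

Calling `preload("qwen3.5:27b")` once per model you care about, for example from a login script, warms them up front. The same memory-pressure caveat applies: Ollama can still evict a model if it needs the room.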
Two practical notes
- Use native Ollama API, not /v1
  OpenClaw docs are pretty explicit that tool calling is more reliable with api: "ollama" and no /v1.
- Manager model matters a lot
  For local-only setups, the manager should be your most reliable tool-caller, not just your favorite chat model. If delegation feels flaky, that’s often the model.
Docs:
- https://docs.openclaw.ai/tools/subagents
- https://docs.openclaw.ai/concepts/multi-agent
- https://docs.openclaw.ai/providers/ollama
- https://docs.openclaw.ai/gateway/configuration-reference
- Ollama FAQ: https://docs.ollama.com/faq
If you want, next I can turn this into a drop-in openclaw.json example using your exact model names and a manager AGENTS.md block you can paste directly.
Nice, this is enough to make it concrete.
I would not start with one worker per bullet. I’d start with 5 agents total and split later only if one role gets messy.
I’d group your roles like this
- main
  Telegram-facing chat manager, stays available, delegates everything non-trivial
- coder
  SWE / scripting / app-building / debugging
- research
  travel, general research, career drafting, job-search analysis, resume tailoring
- markets
  daily macro/news watchlists, trade review, technical analysis
  important: analysis only, not autonomous trading
- health
  lab review, paper/journal research, summarization
  important: research assistant, not diagnosis/treatment authority
That gives you clean boundaries without turning your setup into agent spaghetti.
Model layout I’d try first
This is the part I’d keep simple:
- main → ollama/gpt-oss:latest
- coder → ollama/qwen3-coder-nest
- research / markets / health → ollama/qwen3.5:27b
That keeps your 3 core hot models pretty sensible:
- gpt-oss:latest
- qwen3-coder-nest
- qwen3.5:27b
Then treat:
- gemma4:latest as optional lightweight draft/summarize worker
- deepseek-r1:14b as experimental, not core, unless it proves more useful in your actual workflow
Drop-in config shape
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://127.0.0.1:11434",
        apiKey: "ollama-local",
        api: "ollama",
      },
    },
  },
  agents: {
    defaults: {
      subagents: {
        maxConcurrent: 3,
        maxChildrenPerAgent: 3,
        runTimeoutSeconds: 1800,
        archiveAfterMinutes: 60,
      },
    },
    list: [
      {
        id: "main",
        default: true,
        model: "ollama/gpt-oss:latest",
        subagents: {
          allowAgents: ["coder", "research", "markets", "health"],
          requireAgentId: true,
        },
      },
      {
        id: "coder",
        model: "ollama/qwen3-coder-nest",
      },
      {
        id: "research",
        model: "ollama/qwen3.5:27b",
      },
      {
        id: "markets",
        model: "ollama/qwen3.5:27b",
      },
      {
        id: "health",
        model: "ollama/qwen3.5:27b",
      },
    ],
  },
}
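The reason five agents still fit a three-model warm budget is that three of them share one model. A small Python sketch of that arithmetic, assuming the agent list above copied into plain dicts; `distinct_models` is a hypothetical helper and `MAX_LOADED` simply mirrors the OLLAMA_MAX_LOADED_MODELS=3 setting.

```python
# Hypothetical check: how many distinct Ollama models must stay warm
# for this agent list, compared with the loaded-model budget?
def distinct_models(agents):
    # Strip the "ollama/" provider prefix and de-duplicate.
    return sorted({agent["model"].split("/", 1)[-1] for agent in agents})


agents = [
    {"id": "main", "model": "ollama/gpt-oss:latest"},
    {"id": "coder", "model": "ollama/qwen3-coder-nest"},
    {"id": "research", "model": "ollama/qwen3.5:27b"},
    {"id": "markets", "model": "ollama/qwen3.5:27b"},
    {"id": "health", "model": "ollama/qwen3.5:27b"},
]

MAX_LOADED = 3  # mirrors OLLAMA_MAX_LOADED_MODELS=3
models = distinct_models(agents)
print(models, "fits" if len(models) <= MAX_LOADED else "exceeds budget")
```

If you later move one worker to gemma4:latest or deepseek-r1:14b, the count goes to four and something will get evicted and reloaded under load.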
What should be sub-agents vs cron
This part matters a lot:
Good fit for sub-agents
- coding help
- app building
- travel planning
- resume tailoring
- one-off market/trade analysis
- one-off health/lab/paper review
Better fit for cron / standing orders
- daily market + news report
- job opening watch
- recurring watchlist scans
- anything that should happen on a schedule without you asking
Not ideal as a long-running sub-agent
- “watch these stocks in real time forever”
That’s better as:
- scheduled polling / cron
- webhook/event-driven integration
- or an external market-data pipeline that feeds OpenClaw
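The scheduled-polling option can be sketched as a short script that makes one pass over the watchlist and exits, so cron (or any scheduler) owns the loop instead of a long-lived sub-agent. Everything here is hypothetical: `fetch_price` stands in for whatever market-data source you wire up, and the "price above a level" rule is only a placeholder for a real setup definition.

```python
from typing import Callable, Dict, List


def scan_once(watchlist: Dict[str, float], fetch_price: Callable[[str], float]) -> List[str]:
    """Return one alert line per symbol trading above its watch level."""
    alerts = []
    for symbol, level in watchlist.items():
        price = fetch_price(symbol)
        if price > level:
            alerts.append(f"{symbol} at {price:.2f} is above {level:.2f}")
    return alerts


# Stub data source for illustration only; replace with a real quote lookup.
def fake_quotes(symbol: str) -> float:
    return {"AAA": 101.5, "BBB": 48.0}[symbol]


print(scan_once({"AAA": 100.0, "BBB": 50.0}, fake_quotes))
```

Each run is stateless and cheap, and only the resulting alert lines need to reach the agent, which keeps the model out of the tight loop.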
Main agent prompt policy
In main/AGENTS.md, I’d add something like:
You are the Telegram-facing manager.
Stay available for chat. Do not spend long turns doing deep work yourself.
When a task is non-trivial, delegate with a sub-agent.
Routing:
- coder: code, scripts, repos, debugging, app building
- research: travel, web research, job search, resume tailoring, general drafting
- markets: macro/news reports, watchlist summaries, technical review, trade post-mortems
- health: labs, paper review, journal search, health-data summarization
Rules:
- never exceed 3 active child tasks
- use explicit agentId when spawning
- give children concrete outcomes
- wait for child completion announcements instead of polling
- summarize results back to the user clearly
Hard limits:
- markets may analyze, but must not place trades or claim certainty
- health may summarize and research, but must not present itself as a doctor or emergency authority
- external profile/application/posting actions require user approval
Two safety/quality tweaks I’d strongly recommend
- Markets: keep it read-only unless you explicitly, really want execution later
- Health: make it cite sources and frame output as research/support, not medical advice
Those two domains get risky fast if you make them too autonomous.
Ollama side
For the “keep 3 warm” part, yes, that’s Ollama-side, not OpenClaw-side.
Per Ollama docs:
- OLLAMA_MAX_LOADED_MODELS=3
- OLLAMA_KEEP_ALIVE=30m (or whatever warm window you want)
- check residency with ollama ps
One nuance: that gives you “keep up to 3 warm”, not “pin exactly these 3 forever”.
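To check the warm set from a script rather than eyeballing ollama ps, the same information is available from Ollama's GET /api/ps endpoint. A minimal Python sketch, assuming the default local endpoint and the three core models suggested earlier; `missing_models` and `with_tag` are illustrative helpers, and the sample response is made up and trimmed to the one field the check reads.

```python
import json
import urllib.request

CORE_MODELS = ["gpt-oss:latest", "qwen3-coder-nest", "qwen3.5:27b"]


def with_tag(name):
    # Assumption: Ollama lists untagged models as "<name>:latest".
    return name if ":" in name else f"{name}:latest"


def missing_models(ps_response, wanted=CORE_MODELS):
    # /api/ps returns {"models": [{"name": ...}, ...]} for loaded models.
    loaded = {entry["name"] for entry in ps_response.get("models", [])}
    return [name for name in wanted if with_tag(name) not in loaded]


def fetch_ps(base_url="http://127.0.0.1:11434"):
    # Live query; not called in the offline example below.
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=10) as response:
        return json.load(response)


# Offline example with a trimmed, made-up response:
sample = {"models": [{"name": "gpt-oss:latest"}, {"name": "qwen3.5:27b"}]}
print(missing_models(sample))
```

Against a live server you would call `missing_models(fetch_ps())`; anything it returns has been evicted or was never loaded.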
My honest recommendation
Start with exactly this: