
strange adder
#

We need help debugging an OpenClaw companion setup where Bryan handles Telegram image turns, but some supplement turns stall after image receipt.

Current setup:

OpenClaw gateway on Telegram
custom hooks enabled
companion-server backend
model currently google/gemini-3-flash-preview
images now mostly go through the normal native path, not the old heavy doc path
What the logs show on April 16, 2026 around 00:59-01:00 Rome:

a 2-image turn arrived and went through visual-ingress ... bypass
Bryan sent one reply successfully
then more image turns arrived normally
after that, a supplement path tried to run:
openclaw supplement lookup "NOW Ashwagandha 450mg" && openclaw supplement lookup "Healthygift MBP Milk Protein Extract"
that failed with:
EBUSY: resource busy or locked ... exec-approvals.json
after that, no matching Telegram send appears for the later image turns
Question:

why is a supplement/image turn still trying to use an exec-based shell lookup path at all
and can that exec-approvals.json lock failure leave the turn hanging / suppress the final reply
Another clue from startup logs:

OpenClaw warns that tools.allow contains unknown entries like log_meal, companion_message, tts, pdf
so tool exposure/config may also be mismatched on this runtime/model lane
What we need to understand:

whether this is a bad tool registration / model capability mismatch
whether managed exec approvals can deadlock or block image turns after an EBUSY
how to fully remove remaining shell-style supplement lookup from the runtime path

atomic bisonBOT
#

Need to locate the current docs paths, then I’ll trace the supplement/exec path in source.

#

I found the core exec-approval path. Now I’m checking whether the supplement lookup is coming from custom hooks/plugin code rather than OpenClaw core, because upstream doesn’t appear to define openclaw supplement lookup at all.

#

First, please paste openclaw --version. A lot of this behavior has moved around recently.

Short answer, I think this is runtime/tool registration mismatch first, not “Gemini image turns are broken”.

  • I cannot find openclaw supplement lookup anywhere in upstream OpenClaw core/docs/source. So if that exact shell command appears, it is almost certainly coming from a custom hook, plugin, prompt, or companion-server fallback, not the native image path itself.
  • The warning about tools.allow containing unknown entries is real signal:
    • log_meal, companion_message look like custom plugin tools that are not actually loaded on this lane
    • pdf and tts are built-ins, but the warning means they are not present in this runtime/config/provider lane
  • That usually means Bryan is running with a tool policy that still expects tools which this runtime never registered, so something is falling back to shell.

About the exec-approvals.json error:

  • In core, host exec approval resolution does write ~/.openclaw/exec-approvals.json.
  • If that write/rename hits EBUSY, the exec path should fail fast, not enter a special OpenClaw deadlock.
  • But yes, that can absolutely leave the turn with no final Telegram send if your supplement flow treats the exec result as required and does not recover from the tool error.

So I’d frame it like this:

#
  1. Why is it still shelling out?
    Because some custom supplement path is still active. Upstream OpenClaw is not inventing openclaw supplement lookup on its own.

  2. Can the lock failure suppress the reply?
    Yes, for that turn. Not because approvals are globally deadlocked, but because the exec attempt throws before the supplement step completes.

  3. Will it poison later image turns?
    Not by itself, unless those later turns share the same post-image supplement pipeline / worker / exception path.
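To make point 2 concrete, here is a minimal sketch of a turn runner that catches a failing step so the final send still happens. This is an illustration only: `runTurn`, `steps`, and `sendReply` are hypothetical names, not OpenClaw APIs, and I am assuming your pipeline runs its steps in sequence.

```javascript
// Hypothetical turn runner; none of these names come from OpenClaw.
async function runTurn(steps, sendReply) {
  const notes = [];
  for (const step of steps) {
    try {
      notes.push(await step.run());
    } catch (err) {
      // Record the failure instead of letting it abort the whole turn.
      notes.push(`${step.name} failed: ${err.code || err.message}`);
    }
  }
  // The final send is reached even if a step threw.
  await sendReply(notes.join('\n'));
  return notes;
}
```

If your supplement step sits outside a guard like this, a thrown EBUSY would skip the send entirely, which would match the missing Telegram reply in your logs.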

How I’d remove it fully:

  • Search your custom code/prompts/hooks for:
    • openclaw supplement lookup
    • supplement lookup
    • log_meal
    • companion_message
  • Fix the runtime so those custom tools are actually registered, or remove them from tools.allow on this lane.
#
  • If Bryan should never shell out here, deny exec for that agent/lane. That blocks model-driven exec, though it will not stop an out-of-band hook that shells directly.

If you paste these, I can narrow it down pretty quickly:

openclaw --version
openclaw status --all
openclaw sandbox explain --json

and the Bryan tools.allow / custom hook snippet.

#

First, please paste openclaw --version, because some of this moved recently.

What I found from upstream code/docs:

  • openclaw supplement lookup is not an upstream OpenClaw core path. I can’t find that command anywhere in current source/docs, so that shell call is almost certainly coming from a custom hook, companion-server path, prompt, or plugin fallback.
  • The tools.allow warning is real signal, not noise:
    • log_meal, companion_message look like custom plugin tools that are not actually loaded
    • pdf and tts are built-ins, but the warning means they are not present in that runtime/config lane
  • So I’d prioritize tool registration/runtime mismatch over “Gemini image support” as the root cause.

On the exec-approvals.json failure:

  • Core exec approval resolution does touch ~/.openclaw/exec-approvals.json.
  • If that write/rename hits EBUSY, the exec path should fail fast, not enter a special approvals deadlock.
  • But yes, that can absolutely suppress the final Telegram reply for that turn if the supplement step is treated as required and the error is not caught.

So my read is:

#
  1. Why is it still shelling out?
    Because some custom supplement path is still alive. Native image ingress is not the thing inventing that CLI call.

  2. Can EBUSY leave the turn hanging?
    Yes, for that turn. Not because approvals are permanently deadlocked, but because the supplement exec throws before the turn finishes.

  3. Can it affect later image turns?
    Only if those later turns share the same post-image supplement pipeline, worker, or uncaught exception path. The EBUSY itself does not look like a global OpenClaw deadlock.

How to fully remove the shell-style supplement path:

  • Search custom code/prompts/hooks for:
    • openclaw supplement lookup
    • supplement lookup
    • log_meal
    • companion_message
  • Either:
#
    • make those custom tools actually register on this runtime, or
    • remove them from tools.allow for this lane
  • If Bryan should never shell out here, deny exec on that agent/lane. That blocks model-driven exec, though it will not stop an out-of-band hook that shells directly.
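If it helps with the search step, here is a rough Node sketch that walks a directory and reports which of those strings appear where. `scanForNeedles` is a name I made up, and the sketch assumes the files are readable as UTF-8 text.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Search strings taken from this thread; extend as needed.
const NEEDLES = ['openclaw supplement lookup', 'supplement lookup', 'log_meal', 'companion_message'];

// Recursively scan files under `root` and report each line that contains a needle.
function scanForNeedles(root, needles = NEEDLES) {
  const hits = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) {
      if (entry.name === '.git') continue; // skip VCS metadata
      hits.push(...scanForNeedles(full, needles));
    } else if (entry.isFile()) {
      const lines = fs.readFileSync(full, 'utf8').split('\n');
      lines.forEach((line, i) => {
        for (const needle of needles) {
          if (line.includes(needle)) hits.push({ file: full, line: i + 1, needle });
        }
      });
    }
  }
  return hits;
}
```

Run it against the directory the live container actually loads, not only your source checkout, since a stale artifact would only show up there. Note that a line containing the full command matches both of the first two needles.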

If you paste these, I can narrow it down fast:

openclaw --version
openclaw status --all
openclaw sandbox explain --json
#

and the Bryan tools.allow plus the custom hook/plugin snippet that handles supplements.

strange adder
#

Thanks. Here are the exact details from the live Bryan host.

openclaw --version
OpenClaw 2026.4.14 (323493f)

openclaw status --all

  • Config: ~/.openclaw-companion/openclaw.json
  • Telegram channel OK
  • 1 agent total, 4 sessions
  • Plugins flagged as hook-only include:
    auth-gate
    companion-blood-work-parser
    companion-visual-router
    companion-interaction-checker
    companion-auto-memory
    companion-memory-flush
    companion-memory-consolidator
    companion-feedback-tracker

openclaw sandbox explain --json

  • sandbox.mode = off
  • sandbox.tools.allow includes exec/process/read/write/edit/apply_patch/image/...
  • elevated.enabled = false

Bryan config relevant bits:

  • primary model: google/gemini-3-flash-preview
  • agent tools.profile = coding
  • agent alsoAllow = ["pdf","tts","companion_message","log_meal"]
  • top-level alsoAllow = ["message"]
  • exec.security = allowlist
  • exec.ask = on-miss
  • elevated.enabled = false

Important source findings:

  • I cannot find the literal command openclaw supplement lookup anywhere in the repo.
  • Current supplement flow in our source is typed-tool based:
    • lookup_supplement_product
    • upsert_supplement_stack
  • rule-injector + supplement-analysis skill both say to use those tools, not shell exec.

Wait for the next message; I have to split this up. Please answer only when I say ANSWER.

#

Important runtime clue:

  • startup warns that some allowlist entries are unknown on this lane:
    • tts
    • pdf
    • companion_message
    • log_meal
    • top-level message

Important deployment detail:

  • exec approvals are bind-mounted read-only:
    ./exec-approvals.json:/root/.openclaw/exec-approvals.json:ro

Failure we saw in logs:

  • image turns arrived normally on bypass
  • then supplement step tried:
    openclaw supplement lookup "NOW Ashwagandha 450mg" && openclaw supplement lookup "Healthygift MBP Milk Protein Extract"
  • then failed with:
    EBUSY ... exec-approvals.json
  • after that there was no matching final Telegram send for those later image turns

So my current guess is:

  1. some custom hook/plugin/fallback path is still generating the shell supplement lookup
  2. tool registration on this runtime lane is mismatched
  3. the read-only bind mount on exec-approvals.json may be contributing to the EBUSY failure

ANSWER

atomic bisonBOT
#

Got it, I’ll wait. Send the next part and say answer when you want me to synthesize it.

#

Yep, I think your 3-part guess is basically right.

My read, in order of confidence:

  1. This is mainly a runtime/tool-registration mismatch, not a Gemini image capability issue.
    The strongest clues are:

    • lookup_supplement_product / upsert_supplement_stack exist in your source,
    • but this live lane warns that pdf, tts, companion_message, log_meal, and even top-level message are unknown,
    • and Bryan is on tools.profile = coding, which exposes exec.

    That combination is exactly how you end up with, “intended typed tools are missing, so something falls back to shell”.

  2. Some custom path is still generating the shell supplement lookup.
    Since openclaw supplement lookup is not in upstream and not in your repo search, it is almost certainly coming from one of:

    • a custom hook/plugin fallback,
    • a prompt/rule injector fragment,
    • companion-server glue code,
#
    • or a stale generated/runtime artifact not in the searched source tree.
  3. The read-only bind mount on exec-approvals.json is very likely contributing directly to the failure.
    Core exec approval handling does write that file. With:

    ./exec-approvals.json:/root/.openclaw/exec-approvals.json:ro
    

    host exec on this lane is fundamentally in a bad state. EBUSY instead of EROFS can happen because of container/bind-mount behavior, but either way, the important part is: OpenClaw expects that file to be writable if host exec approvals are in play.
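A quick way to confirm the mount problem from inside the container is a small preflight check like the sketch below. `checkApprovalsStore` is a hypothetical helper, not part of OpenClaw; it only uses Node's `fs.accessSync`.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Report whether a file can be written in place and whether its parent
// directory allows creating or renaming a sibling temp file.
function checkApprovalsStore(filePath) {
  const probe = (target, mode) => {
    try {
      fs.accessSync(target, mode);
      return true;
    } catch {
      return false;
    }
  };
  return {
    fileWritable: probe(filePath, fs.constants.W_OK),
    dirWritable: probe(path.dirname(filePath), fs.constants.W_OK),
  };
}
```

Calling `checkApprovalsStore('/root/.openclaw/exec-approvals.json')` inside the live container should report `fileWritable: false` with the `:ro` mount. As far as I know, on Linux a rename onto a path that is itself a mount point typically fails with EBUSY, which would be consistent with a write-then-rename update hitting a single-file bind mount. I have not confirmed that this is exactly how this version updates the file.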

So, to your specific questions:

Why is a supplement/image turn still trying exec?

Because the intended supplement tools are not actually available on this runtime lane, and Bryan is allowed to use coding/runtime tools. Missing tool + coding profile + exec allowed = shell fallback is very plausible.

Can the exec-approvals.json failure hang or suppress the final reply?

#

Yes, for that turn.
I do not think this is a core “managed approvals deadlock”. I think it is a fail-fast exec error that aborts the supplement step, and your turn pipeline then never reaches the final Telegram send.

Can it affect later image turns too?

Also yes, if:

  • those later turns share the same supplement/post-image pipeline, or
  • the failing step aborts the whole turn worker/session path.

It does not look like a global image ingress problem by itself.

What I would change first

1. Stop using host exec as a fallback on Bryan

Fastest safety fix:

  • deny exec on Bryan, or
  • move Bryan off tools.profile = coding
#

If supplement handling should be typed-tool only, Bryan should not have an easy shell escape hatch here.

2. Fix the actual tool surface on this lane

Your live runtime should not be warning that the expected tools are unknown.

I would treat this as a hard bug:

  • why are companion_message / log_meal not registered?
  • why is even core message unknown on this lane?
  • are the supplement tools actually exposed in the effective tool catalog for Bryan?

Until the warnings are gone, I would not trust this lane.

3. Remove the read-only mount if any exec remains

If you keep any host exec path at all, do not mount:

#
/root/.openclaw/exec-approvals.json

read-only.

If you want immutable behavior, the better fix is:

  • disable/deny exec on this lane, not
  • leave exec enabled but make its approvals store unwritable.

Practical diagnosis summary

I’d summarize the root cause as:

Bryan is running on a coding-flavored runtime where the intended companion/supplement tools are not actually registered, so some custom fallback emits a shell openclaw supplement lookup ... command. That shell path then hits a read-only/wedged exec-approvals.json mount, causing exec failure, and the turn dies before final Telegram send.

That fits all of your evidence pretty cleanly.

strange adder
#

Here is the exact Bryan registration block for the supplement tools.

Source:

  • /Users/fey/Documents/Diji/docker/extensions/rule-injector/index.js:420
  • /Users/fey/Documents/Diji/docker/extensions/rule-injector/index.js:478

Code:

api.registerTool((ctx) => ({
  name: 'lookup_supplement_product',
  description: 'Resolve a supplement from product name or front-label text using the internal search-backed lookup. Ask for the back label only if the result says needsBackLabel.',
  parameters: {
    type: 'object',
    properties: {
      productName: { type: 'string' },
      brand: { type: 'string' },
      frontLabelText: { type: 'string' },
    },
  },
  async execute(_toolCallId, params) {
    // Coerce missing or non-string params to empty strings before the internal call.
    const result = await postInternalJson('/api/internal/supplement-lookup', {
      productName: typeof params?.productName === 'string' ? params.productName : '',
      brand: typeof params?.brand === 'string' ? params.brand : '',
      frontLabelText: typeof params?.frontLabelText === 'string' ? params.frontLabelText : '',
    });
    return {
      content: [{
        type: 'text',
        // Short summary for the model; the full payload is returned in `result`.
        text: `Supplement lookup complete. Confidence ${Math.round((Number(result?.confidence || 0)) * 100)}%. needsBackLabel=${result?.needsBackLabel === true ? 'true' : 'false'}.`,
      }],
      result,
    };
  },
}));

Continued in the next message. Please wait until I say ANSWER before answering.

#

api.registerTool((ctx) => ({
  name: 'upsert_supplement_stack',
  description: "Persist one supplement into the active user's per-user structured supplement stack.",
  parameters: {
    type: 'object',
    properties: {
      name: { type: 'string' },
      brand: { type: 'string' },
      servingSize: { type: 'string' },
      ingredients: { ... },
      fillers: { type: 'array', items: { type: 'string' } },
      flagged: { type: 'array', items: { type: 'string' } },
      verdict: { type: 'string' },
      notes: { type: 'string' },
      timing: { type: 'string' },
      lookupConfidence: { type: 'number' },
      source: { type: 'string' },
    },
    required: ['name'],
  },
  async execute(_toolCallId, params) {
    // Only registered (non-guest) users have a persistent stack.
    const user = resolveUser(ctx);
    if (!user?.id || user.id === 'guest') throw new Error('Could not resolve registered user for upsert_supplement_stack.');
    const result = await postInternalJson(`/api/internal/supplement-stack/${encodeURIComponent(user.id)}`, params || {});
    return {
      content: [{
        type: 'text',
        text: `Structured supplement stack updated for ${user.id}.`,
      }],
      result,
    };
  },
}));

Also, test coverage confirms these factories are supposed to register:

  • /Users/fey/Documents/Diji/docker/extensions/tests/rule-injector.test.ts:341
  • /Users/fey/Documents/Diji/docker/extensions/tests/rule-injector.test.ts:343

So source says these tools should exist, but live runtime is still warning about unknown tools and the agent is shelling out with openclaw supplement lookup .... That mismatch is the bug.

#

ANSWER

atomic bisonBOT
#

Got it, I’ll wait for the next part. Send answer when you want me to synthesize it.

#

Yep, that’s very helpful. I’m still holding, send answer when you want the synthesis.

#

Nice, this makes the bug much clearer.

I think there are two separate problems, and together they explain the behavior:

1. The supplement tools are not actually usable on Bryan’s effective tool policy

This is the big missing link:

  • Bryan is on tools.profile = coding
  • alsoAllow only adds:
    • pdf
    • tts
    • companion_message
    • log_meal
  • top-level alsoAllow adds:
    • message
#

But the actual supplement tools you showed are:

  • lookup_supplement_product
  • upsert_supplement_stack

Those names are not in the allowlist.

So even if rule-injector successfully registers them, Bryan still cannot call them under that profile unless you explicitly allow them.

That means the live behavior can be:

  • tools exist in source,
  • tests confirm registration,
  • but the effective runtime policy still does not expose them to the model.

And then, because Bryan is on a coding profile with exec available, some fallback path shells out instead.

#

That is a much tighter explanation than “Gemini is being weird”.

2. The exec fallback is broken on this deployment

This part is also real:

  • exec.security = allowlist
  • exec.ask = on-miss
  • ~/.openclaw/exec-approvals.json is bind-mounted read-only:
    ./exec-approvals.json:/root/.openclaw/exec-approvals.json:ro

Core exec approval handling expects that file to be writable.

So once the bad shell fallback happens, it hits a second bug:

#
  • OpenClaw tries to touch/update the approvals file
  • the mount is read-only / container-locked
  • you get EBUSY
  • the supplement step dies
  • the turn never reaches final Telegram send

So yes, that failure can absolutely suppress the final reply for that turn. I still would not call it a core “deadlock”; it is more like a fatal exec-path failure in a shared turn pipeline.

So the likely root cause chain is

  1. rule-injector defines the right supplement tools in source
  2. Bryan’s live tool policy does not actually allow those tool names
  3. some custom fallback then emits openclaw supplement lookup ...
  4. exec approvals are unwritable on this host
  5. exec fails, turn aborts, no final Telegram send

What I would change first

#

A. Explicitly allow the real supplement tools

Add these exact names to Bryan’s alsoAllow:

  • lookup_supplement_product
  • upsert_supplement_stack

If you want the typed supplement path, those names need to be in the effective allowlist.

B. Stop giving Bryan a coding profile if this is a companion lane

coding is a bad fit here because it makes shell fallback much easier.

If Bryan is supposed to do companion/image/supplement work, I’d use a narrower tool surface, not coding + exec.

C. Deny exec on Bryan

If shell supplement lookup is not intended, block it explicitly.

That turns this from “silent bad fallback” into “clear policy error”.

#

D. Remove the read-only mount, or remove exec entirely

If any host exec remains, do not mount exec-approvals.json read-only.

If you want zero host exec, then deny exec and the approvals file stops mattering for this lane.

E. Clean stale allowlist entries

These warnings are still a real bug:

  • pdf
  • tts
  • companion_message
  • log_meal
  • message

Either make those tools exist on this runtime, or remove them from allowlists. Right now the live lane and the config disagree.
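As a sanity check for this cleanup, a tiny diff between the allow entries and whatever tool names the lane really has can be sketched like this. The allow entries are the ones from this thread; the `registered` list is made up for illustration and has to be replaced with the lane's real inventory. The sketch also ignores any group or wildcard syntax an allowlist might support.

```javascript
// Return allowlist entries that no registered tool name matches.
function findUnknownAllowEntries(allowEntries, registeredNames) {
  const known = new Set(registeredNames);
  return allowEntries.filter((name) => !known.has(name));
}

// Allow entries from this thread; `registered` is a placeholder, not real output.
const allow = ['pdf', 'tts', 'companion_message', 'log_meal', 'message'];
const registered = ['exec', 'read', 'write', 'image'];
console.log(findUnknownAllowEntries(allow, registered));
```

Anything this returns is an entry the config expects but the runtime never registered, which is the same disagreement the startup warning is reporting.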

Short version

#

Yes, the mismatch is the bug, and now I think it’s even more specific:

The supplement tools are registered in source, but not actually allowed on Bryan’s effective tool policy. Bryan falls back to shell because he is on coding, then that shell path crashes because exec approvals are mounted read-only.

If you want, I can help draft the exact Bryan config change next, the safest minimal one.

strange adder
#

One correction on the allowlist theory:

Per OpenClaw docs, plugin tools only need explicit allowlisting when they are registered as optional: true.
Our supplement tools are registered as plain required tools, not optional.

Source:

  • rule-injector registers lookup_supplement_product at line 420
  • rule-injector registers upsert_supplement_stack at line 478
  • both are plain api.registerTool(...) calls, no optional flag

Docs:
https://docs.openclaw.ai/plugins/building-plugins#registering-agent-tools

So I no longer think “they are missing from alsoAllow” is the primary explanation by itself.

The stronger explanation is:

  • the plugin/tool registration is not actually taking effect on this live runtime lane
  • that matches the unknown-tool warnings for other custom tools like log_meal and companion_message
  • then some stale/custom fallback shells out with openclaw supplement lookup ...
  • that fallback dies on the exec-approvals.json setup

So I think the next question is:
why are required custom tools from rule-injector not present in the effective live tool catalog for Bryan on this lane?

atomic bisonBOT
#

Yep, you’re right, and thanks for the correction.

I re-anchored on core behavior, and required plugin tools should not need explicit allowlisting. So the better question really is:

why is rule-injector not contributing its required tools to Bryan’s live tool catalog?

From core loader behavior, required plugin tools disappear only when one of these is true:

  1. The plugin is not actually loaded into the runtime registry for that lane
  2. The live runtime is using a different plugin artifact/source than the file you inspected
  3. The plugin entry/factory throws during tool materialization
  4. The factory returns null / no tools for that context
  5. There is a name conflict and the tool gets skipped
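To picture causes 3 to 5, here is a toy model of factory materialization. It is explicitly not OpenClaw's loader, just a sketch of how a factory that throws, returns nothing, or collides on a name could silently drop out of the tool list. `materializeTools` and its return shape are my own invention.

```javascript
// Toy model only: NOT OpenClaw's loader. It mimics causes 3-5 from the list.
function materializeTools(factories, ctx, existingNames = []) {
  const taken = new Set(existingNames);
  const tools = [];
  const skipped = [];
  factories.forEach((factory, index) => {
    let tool;
    try {
      tool = factory(ctx);
    } catch (err) {
      skipped.push({ index, reason: `factory threw: ${err.message}` });
      return;
    }
    if (!tool) {
      skipped.push({ index, reason: 'factory returned no tool' });
      return;
    }
    if (taken.has(tool.name)) {
      skipped.push({ index, reason: `name conflict: ${tool.name}` });
      return;
    }
    taken.add(tool.name);
    tools.push(tool);
  });
  return { tools, skipped };
}
```

The point of the `skipped` list is that each of these failure modes needs a visible diagnostic; without one, the only symptom is a tool that never appears.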

Given your evidence, I’d rank them like this:

Most likely

#

Bryan is not running the same effective plugin artifact you searched.

Why I think that:

  • your source + tests clearly show the tools should register
  • live runtime still says custom tools are unknown
  • multiple custom tools are missing, not just one
  • this is a Docker deployment, so stale mount / stale bundle / wrong extension path is very plausible

That would also explain how some stale/custom fallback can still emit openclaw supplement lookup ....

Next most likely

rule-injector is loading, but not as a tool-capability plugin on this runtime lane.

For example:

  • hook path active
  • tool registration path not active
  • or plugin diagnostics are being emitted but not noticed in normal logs
#

Less likely

Tool factory/name conflict issue

Your tool names look unique, so this is lower probability unless there’s another plugin registering the same names.

What I’d check next, on the live Bryan host

These are the highest-value commands now:

openclaw plugins inspect rule-injector --json
openclaw plugins doctor

What I’d look for in inspect:

  • plugin source
#
  • plugin rootDir
  • plugin shape
  • registered toolNames
  • diagnostics / compatibility notices

If inspect does not show lookup_supplement_product and upsert_supplement_stack, then the bug is almost certainly:

  • wrong plugin artifact,
  • stale container/plugin install,
  • or plugin registration not executing on that runtime.

If inspect does show them, but startup still says they’re absent from the effective tool catalog, then the next layer to inspect is the tool-resolution path for Bryan specifically, not plugin registration itself.

And separately, the exec failure is still real

Your exec-approvals.json:ro mount is still a second bug.

So I’d now treat this as:

strange adder
#

This is the result from the live Bryan host.

openclaw plugins inspect rule-injector --json

{
"plugin": {
"id": "rule-injector",
"source": "/root/.openclaw-companion/extensions/rule-injector/index.js",
"rootDir": "/root/.openclaw-companion/extensions/rule-injector",
"enabled": true,
"activated": true,
"status": "loaded",
"toolNames": [],
"hookCount": 2
},
"shape": "non-capability",
"capabilityMode": "none",
"capabilityCount": 0,
"typedHooks": [
{ "name": "before_model_resolve" },
{ "name": "before_prompt_build" }
],
"tools": [
{ "names": [], "optional": false },
{ "names": [], "optional": false },
{ "names": [], "optional": false },
{ "names": [], "optional": false }
]
}

openclaw plugins doctor

  • rule-injector is effectively loading as a hook-style/non-capability plugin
  • same compatibility pattern shows up for a bunch of companion plugins as “hook-only”

So the live host is confirming:

  • rule-injector is loaded
  • its hooks are loaded
  • but its tool catalog is empty on this runtime
  • lookup_supplement_product and upsert_supplement_stack are not making it into the live tool registry at all

That explains a lot better why Bryan falls back to shell.

The current mystery is now narrower:
why does this plugin load from the expected file path, but register zero tool names even though source contains api.registerTool(...) calls?

atomic bisonBOT
#

Ah, this is the key nuance.

I don’t think toolNames: [] means “the api.registerTool(...) calls never happened”.

I think it means:

rule-injector is registering tool factories, but plugins inspect cannot statically recover their names from that registration style.

Why:

  • your plugin uses:
api.registerTool((ctx) => ({
  name: 'lookup_supplement_product',
  ...
#
}))
  • that is a factory function, not a plain tool object
  • in core, static plugin inspection only knows names if:
    • you register a plain object, or
    • you pass opts.name / opts.names alongside the factory

So this output:

"tools": [
  { "names": [], "optional": false },
  ...
]
#

is actually consistent with:

  • “four required tool factories were registered”
  • but “their names were not statically declared at registration time”

In other words, plugins inspect is showing opaque factory registrations, not necessarily an empty runtime tool set.
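Here is a toy registry that shows why a factory registration can look nameless to a static inspector. It is not OpenClaw's implementation; `createRegistry` is hypothetical, and the only behavior it borrows from the discussion above is that names are known for plain objects or when `opts.name` / `opts.names` is passed.

```javascript
// Toy registry: NOT OpenClaw's implementation, just the shape of the problem.
function createRegistry() {
  const entries = [];
  return {
    registerTool(toolOrFactory, opts = {}) {
      let names = [];
      if (typeof toolOrFactory === 'function') {
        // A factory is opaque until it is called with a context, so only
        // explicitly declared names are known at registration time.
        names = opts.names || (opts.name ? [opts.name] : []);
      } else {
        names = [toolOrFactory.name];
      }
      entries.push({ names, optional: opts.optional === true });
    },
    inspect: () => entries,
  };
}
```

Registering a bare factory in this model yields `{ names: [], optional: false }`, the same shape as the entries in your inspect output.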

So I’d update the model like this:

What the inspect output really proves

  • rule-injector is loaded
  • it did register 4 tool entries
  • but those entries were registered in a factory form with no static names
  • so inspect cannot label them

That means the new mystery is slightly different:

#

why are these registered tool factories not materializing into Bryan’s effective agent tool surface at run time?

That is a better question than “why did it register zero tools?”

Why I think this matters

Your inspect output says:

"shape": "non-capability"
"tools": [
  { "names": [], "optional": false },
  { "names": [], "optional": false },
  { "names": [], "optional": false },
  { "names": [], "optional": false }
]
#

That is not “hook-only, no tools”.
That is closer to:

  • hooks are present
  • tool registrations are present
  • names are opaque in inspection

So I’d be careful not to over-read toolNames: [].

The likely bug now

The likely bug is downstream of registration, in the effective tool-resolution/materialization path for Bryan.

The main candidates now are:

  1. the factory registrations are present, but not being materialized into the live tool list for that session/lane
  2. the factories are materialized, but some later policy/filtering path is dropping them
#
  3. a custom fallback still prefers shell even when the typed tools exist

Given the shell behavior, I’d focus on effective session tool inventory, not plugin discovery.

Very useful follow-up fix, regardless

Make the registrations explicit so inspect/debugging can see them:

api.registerTool(
  (ctx) => ({ name: 'lookup_supplement_product', ... }),
  { name: 'lookup_supplement_product' }
)

api.registerTool(
  (ctx) => ({ name: 'upsert_supplement_stack', ... }),
#
  { name: 'upsert_supplement_stack' }
)

Do the same for companion_message / log_meal.

That won’t by itself fix Bryan, but it will make the live plugin reports much less misleading.

My updated take

  • the plugin is loading from the expected file
  • the api.registerTool(...) calls are probably happening
  • plugins inspect just cannot name those factory-registered tools
  • the real failure is likely in effective tool materialization / filtering / fallback selection on Bryan’s lane
  • and once shell fallback happens, the read-only exec-approvals.json mount kills the turn
#

So I’d stop treating “empty toolNames in inspect” as the primary mystery. It’s mostly a registration-metadata limitation.

If you want, next I’d help narrow the materialization/filtering side, because I think that’s where the real bug now lives.

strange adder
#

That makes sense. I agree: toolNames: [] in inspect was over-read.

What we know now:

  • rule-injector is loaded from the expected live file
  • hooks are active
  • 4 required tool factories are registered
  • because they’re factory-registered, inspect can’t statically recover the names
  • so the real question is now effective tool materialization/filtering for Bryan’s live session, not plugin discovery

What is still definitely wrong:

  • startup warns about unknown tools on this lane
  • Bryan still shells out with openclaw supplement lookup ...
  • once shell fallback happens, the read-only exec-approvals mount kills the turn

So I think the next thing to inspect is the effective runtime tool inventory for Bryan’s active session / agent, not plugin inspect.

atomic bisonBOT
#

Yep, I agree. Effective runtime tool inventory for Bryan’s active session is the right next layer.

I’d split it into catalog vs effective:

  1. Agent/runtime catalog
    • “Can this agent materialize the plugin tools at all?”
  2. Session-effective inventory
    • “Can this specific Bryan conversation use them right now?”

That distinction matters a lot:

  • if the tools are missing from the catalog, the problem is materialization
  • if they exist in the catalog but not the effective session inventory, the problem is filtering/policy/session context
  • if they exist in the effective session inventory and Bryan still shells out, then the problem is prompt/fallback behavior, not registration

Best next checks

#

Easiest

In the live Bryan conversation, run:

/tools verbose

or

/status

That should show the available tools for that session.

What I’d want to know is whether these names show up there:

#
  • lookup_supplement_product
  • upsert_supplement_stack

More exact, if you’re using operator/API access

The gateway has two separate reads:

  • tools.catalog = runtime tool catalog for the agent
  • tools.effective = session-effective tool inventory

So the clean decision tree is:

  • missing from tools.catalog
    • tool factory materialization problem
  • present in tools.catalog, missing from tools.effective
    • session/provider/channel/policy filtering problem
  • present in tools.effective
    • model/prompt/custom fallback is choosing shell anyway
#

That’s the shortest path to isolating it.
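The same decision tree as a small function, in case you want to script it once you have both lists. `classifyTool` and its return labels are hypothetical; the two name arrays are whatever you manage to collect from the catalog and effective reads.

```javascript
// Classify where a tool goes missing, given two name lists you collected
// yourself (how you obtain them depends on your gateway access).
function classifyTool(name, catalogNames, effectiveNames) {
  if (!catalogNames.includes(name)) return 'materialization';
  if (!effectiveNames.includes(name)) return 'filtering';
  return 'fallback-choice';
}
```

For example, `classifyTool('lookup_supplement_product', catalog, effective)` returning `'filtering'` would mean the tool materializes for the agent but is dropped before Bryan's session sees it.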

One thing I’d still change immediately

Even before the deeper debug, I’d remove the false-noise failure source:

  • either deny exec on Bryan
  • or stop mounting exec-approvals.json read-only

Right now shell fallback is both:

  • hiding the real bug, and
  • turning it into a stalled turn

So it’s muddying the signal.

My current prior

Based on everything so far, I’d bet on:

#
  • plugin factories do materialize somewhere,
  • but Bryan’s effective session tool surface is dropping them,
  • and then a custom fallback shells out.

The fact that startup also warns about other unknown tools on that lane fits that pretty well.

If you paste the output of Bryan’s /tools verbose, I can help read the exact split between catalog/effective/filtering.

strange adder
#

More live findings from Bryan host:

  1. The opaque-factory pattern is not unique to rule-injector.
    I checked:
    • companion-food
    • companion-message

Both inspect the same way:

  • plugin loaded from expected live file
  • tools array present
  • names empty
  • non-capability / opaque factory style

So toolNames: [] is definitely not enough to conclude the tools are absent.

  2. The usual CLI path to inspect tools is blocked on this host.
    Running openclaw tools --help inside the live container returns:

The openclaw tools command is unavailable because plugins.allow excludes "tools".

So I cannot use the bundled openclaw tools CLI surface on this deployment to inspect effective tool inventory.

  3. Current live state now looks like:
  • plugin files are loading from the expected paths
  • factory-style tool registrations are happening in multiple companion plugins
  • static plugin inspect cannot name those tools
  • startup still warns about unknown tools on this lane
  • shell fallback still happens for supplement turns
  • exec-approvals.json read-only mount still turns that fallback into a fatal turn failure

So the mystery is now firmly in effective runtime tool resolution / filtering for Bryan’s live session, not simple plugin discovery.

At this point the two best next checks seem to be:

  • effective session tool inventory through an in-session path like /tools verbose or /status
  • or gateway-level catalog/effective inspection, since the host’s bundled openclaw tools CLI is excluded by policy
atomic bisonBOT
#

Yep, I think you’ve got it narrowed down correctly now.

At this point I’d treat the situation as:

  • plugin discovery/loading looks basically real
  • static inspect is limited by opaque factory registration
  • the actual bug is in effective tool resolution for Bryan’s live session
  • the read-only exec approvals mount is a separate amplifier that turns fallback into a hard failure

So yes, the best next check is in-session /tools verbose or /status.

Why I like that path most:

  • it asks the actual live session
  • it avoids the host CLI policy weirdness around openclaw tools
  • it tells you what Bryan can use right now, not just what plugins were discovered
#

What to look for

If /tools verbose or /status shows:

A. supplement tools are missing

  • lookup_supplement_product
  • upsert_supplement_stack

then the bug is definitely in session-effective filtering / tool surfacing

B. supplement tools are present

then the bug moves up a layer:

  • prompt/rule/fallback logic is still choosing shell
  • or a custom hook is injecting the shell path despite the typed tools being available

That split is now the cleanest one.

#

One practical suggestion while debugging

I’d still remove one source of noise immediately:

  • either deny exec on Bryan
  • or stop mounting exec-approvals.json read-only

Right now every bad shell fallback turns into a fatal turn failure, which makes the main bug harder to isolate.

My current best hypothesis

If I had to bet right now:

  • the supplement tool factories are present
  • but Bryan’s effective live session tool surface is dropping them
  • then some stale/custom fallback emits openclaw supplement lookup ...
  • then the read-only approvals mount kills the turn

So I think you’re aiming at exactly the right layer now.

#

If you can get Bryan’s /tools verbose or /status output, that should finally tell us whether this is:

  • filtering/materialization, or
  • fallback despite tools being available.
strange adder
#

I got the operator-side equivalent from Bryan’s live Telegram session record.

For session key:
agent:main:telegram:default:direct:1641047688

the stored systemPrompt/tool report shows these live tools:
read, edit, write, exec, process, tts, image_generate, music_generate, video_generate, sessions_yield, web_fetch, image, pdf, log_meal, companion_message, memory_search, memory_get

Missing from that same live session:

  • lookup_supplement_product
  • upsert_supplement_stack
  • upsert_profile_facts

So this confirms the session-effective tool surface really is missing the supplement tools, while exec is still present.

That explains the behavior much better:

  • supplement tools are not actually in Bryan’s live session tool set
  • shell fallback remains possible because exec is present
  • then the exec-approvals.json setup turns fallback into a fatal turn failure
atomic bisonBOT
#

This is the smoking gun. I think we can finally reconcile the two theories cleanly.

What the new evidence shows

Bryan’s live session-effective tool set contains:

  • log_meal
  • companion_message

but is missing:

  • lookup_supplement_product
  • upsert_supplement_stack
  • upsert_profile_facts

That pattern matches your config almost too well:

  • log_meal and companion_message are in alsoAllow
  • the supplement tools are not

So the likely answer is:

the supplement tools are being registered, but they are then being filtered out by the final tool policy / profile allowlist before they reach Bryan’s live session tool surface.

The important nuance

Your earlier correction was still valid:

  • required plugin tools do not need optional-tool allowlisting just to register

But there is a second layer after registration:

  • the agent’s effective tool policy (tools.profile, allow, deny, alsoAllow) still decides what the session can actually use
#

So both of these are true at once:

  1. the supplement tools can register successfully as required plugin tools
  2. they can still be excluded from the live session tool inventory by Bryan’s final allowlist/profile

And your live session inventory now strongly suggests that is exactly what is happening.
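
To make the two layers concrete, here is a deliberately simplified Python model of "register first, filter second". The resolution order (profile plus alsoAllow, minus deny) is my assumption for illustration, and `effective_tools` plus the sample lists are hypothetical; this is not how OpenClaw core is actually implemented.

```python
def effective_tools(registered, profile_tools, also_allow=(), deny=()):
    """Simplified model: (profile + alsoAllow) minus deny, applied to registered tools."""
    allowed = (set(profile_tools) | set(also_allow)) - set(deny)
    return sorted(name for name in registered if name in allowed)


# Hypothetical inputs, loosely shaped like the session report in this thread.
registered = [
    "read", "exec", "image", "log_meal", "companion_message",
    "lookup_supplement_product", "upsert_supplement_stack", "upsert_profile_facts",
]
profile_tools = ["read", "exec", "image"]       # stand-in for a profile's built-ins
also_allow = ["log_meal", "companion_message"]  # supplement tools are not listed

live = effective_tools(registered, profile_tools, also_allow)
missing = sorted(set(registered) - set(live))
print("live:", live)
print("filtered out:", missing)
```

In this toy model every tool registers fine, yet only the names that survive the allowlist reach the session, which is the same shape as what your live inventory shows.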

Why Bryan shells out

Because from Bryan’s point of view, in the live session:

  • supplement tools are absent
  • exec is present

So shell fallback is still available, and the model/custom fallback takes it.

Then the read-only exec-approvals.json mount turns that bad fallback into a fatal turn failure.

#

So the root cause now looks like

  1. supplement tools register
  2. Bryan’s effective tool policy filters them out
  3. exec remains available
  4. shell fallback happens
  5. read-only approvals mount kills the turn

What I would change now

1. Add the supplement tools to Bryan’s allowed surface

Explicitly include:

  • lookup_supplement_product
  • upsert_supplement_stack
  • upsert_profile_facts

on Bryan’s effective allowlist / alsoAllow path

2. Remove exec from Bryan

Or explicitly deny it on this lane.

That turns the bug from “silent shell fallback” into “typed tools missing”, which is much safer and easier to debug.

3. Stop mounting exec-approvals.json read-only

Or disable exec completely so it no longer matters.

4. Rebuild the session

After config/tool-policy changes, make Bryan start a fresh session or restart the gateway so the tool inventory gets rebuilt cleanly.

One extra useful clue

Your session-effective tool list already contains tts, pdf, log_meal, companion_message, so the earlier startup warnings were probably from a different lane / earlier catalog pass, not the final live Telegram session you just inspected.

That live session inventory is the source of truth now.

strange adder
#

Would a single-gateway Bryan still need exec disabled as a fix?

atomic bisonBOT
#

No, not strictly.

If you:

  • get the supplement tools into Bryan’s live tool set, and
  • fix the exec-approvals.json mount

then Bryan can still work with exec enabled.

But I’d still recommend disabling or denying exec on Bryan if Bryan is supposed to be a companion/image/supplement agent, not a coding shell agent.

Why:

  • it prevents this exact fallback from happening again
  • it makes missing-tool bugs fail clearly instead of silently shelling out
  • it removes the exec-approvals file from the critical path for Bryan turns
#

So:

  • Required for correctness: no
  • Recommended as a hardening/safety fix: yes

If you want the pragmatic path, I’d do both:

  1. fix the missing supplement tools
  2. deny exec on Bryan anyway
strange adder
#

Bryan prompt/token surface is huge
We audited the live Telegram session and the numbers are bad:

Live session prompt report:

  • systemPrompt.chars = 31,000
  • tools.schemaChars = 10,954
  • skills.promptChars = 4,819

One broken morning heartbeat turn:

  • injected user-turn payload was 24,074 chars, 634 lines
  • input tokens for that one run: 21,424

Biggest sources we found:

  • always-on workspace files in the prompt:
    • SOUL.md: 8,917 chars
    • AGENTS.md: 3,439
    • TOOLS.md: 2,304
  • tool schema bloat:
    • video_generate: 3,688
    • music_generate: 1,227
    • image_generate: 1,171
  • all-skills snapshot text still present
  • then rule-injector adds a second large per-turn layer:
    • onboarding
    • tool rules
    • profile context
    • memory
    • health context
    • supplement stack context
    • followups/templates
    • skill bodies
    • full SAMANTHA.md

So Bryan is paying:

  • a large base OpenClaw prompt
  • plus a second giant Bryan-specific injected context layer

Question:

  • Does this look like expected OC behavior for a companion agent, or are we over-injecting compared with the normal pattern?
  • Is there a clean OC-native way to keep project context/tooling smaller per agent/lane?
  • In particular, can we strip irrelevant tool schemas and avoid loading all-skill/project files so aggressively without forking a bunch of core behavior?

My current read:

  • direct Google Gemini 3.x on OC’s tagged reasoning contract is not production-safe for Bryan
  • and Bryan’s prompt surface is large enough that even generous Gemini TPM starts to look bad quickly

If you know the cleanest OC-native fix path here, that would help a lot.

#

@atomic bison

atomic bisonBOT
#

Short version: some of this is normal OpenClaw behavior, but Bryan is definitely over-injected relative to a lean companion setup.

What’s expected vs not

Expected core behavior

OpenClaw will normally include:

  • workspace bootstrap files (AGENTS.md, SOUL.md, TOOLS.md, etc.)
  • a compact skills list (metadata, not full bodies)
  • tool schemas for every exposed tool

So a few thousand chars of base prompt overhead is normal.

Not the lean normal pattern

What looks oversized in your case is:

  • Bryan still using a tools.profile: "coding"-style surface, which drags in exec, process, image_generate, music_generate, video_generate
  • a big visible skill roster
  • then a second giant custom injection layer from rule-injector
  • especially full SAMANTHA.md, skill bodies, large health/supplement context every turn

So yes, I’d call this over-injected, mostly from config/tooling choices plus the custom layer, not just “how OC always works”.
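
To put rough numbers on that, here is a quick Python tally using only the figures from your audit. It assumes the three workspace files are counted inside systemPrompt.chars and the three generation schemas inside tools.schemaChars, which I have not verified; the variable names are just for illustration.

```python
# Figures copied from the audit in this thread (characters, not tokens).
workspace_files = {"SOUL.md": 8_917, "AGENTS.md": 3_439, "TOOLS.md": 2_304}
generation_schemas = {
    "video_generate": 3_688,
    "music_generate": 1_227,
    "image_generate": 1_171,
}
system_prompt_chars = 31_000
tool_schema_chars = 10_954

workspace_total = sum(workspace_files.values())
generation_total = sum(generation_schemas.values())

# Assumption: each subtotal is a subset of the larger reported figure.
print(f"workspace files: {workspace_total} chars "
      f"({workspace_total / system_prompt_chars:.0%} of system prompt)")
print(f"generation tool schemas: {generation_total} chars "
      f"({generation_total / tool_schema_chars:.0%} of tool schemas)")
```

If those assumptions hold, roughly half of each bucket comes from things a companion lane can shrink or drop through config alone.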

Clean OC-native ways to slim Bryan down

1. Stop using a coding-style tool surface for Bryan

This is the biggest easy win.

coding includes runtime + FS + web + memory + image/video generation. That is why you’re paying schema tax for:

  • video_generate
  • music_generate
  • image_generate
  • and still leaving exec available
#

For a companion lane, I’d move Bryan to a much tighter per-agent tool policy, ideally:

  • minimal or messaging base
  • then alsoAllow only the tools Bryan really needs

For example, conceptually:

  • read (if Bryan must load skill docs)
  • image
  • pdf
  • memory_search
  • memory_get
  • tts if needed
  • log_meal
  • companion_message
  • lookup_supplement_product
  • upsert_supplement_stack
  • upsert_profile_facts

and explicitly deny/remove:

  • group:runtime
  • image_generate
  • music_generate
  • video_generate

That cuts both prompt size and fallback risk.

2. Give Bryan its own smaller workspace

This is the clean native fix for the big SOUL.md / AGENTS.md / TOOLS.md load.

OpenClaw auto-injects those files if they exist. There is not a nice built-in “inject SOUL but not TOOLS” toggle for normal sessions.

So the native pattern is:

  • use agents.list[].workspace
  • give Bryan a dedicated slim workspace
  • keep Bryan’s SOUL.md, AGENTS.md, TOOLS.md short and purpose-built

That’s much cleaner than trying to fight the bootstrap system.

3. Restrict visible skills for Bryan

The skills list in the prompt is expected, but you can shrink it.

Use:

  • agents.defaults.skills
  • or better, Bryan-specific agents.list[].skills

so Bryan only sees the handful of skills it should actually use.

That reduces the skills metadata block and helps the model stay on-lane.

#

4. Use contextInjection: "continuation-skip" if this gateway is mostly Bryan

This is a native knob that reduces repeated workspace bootstrap injection on safe continuation turns.

Important nuance:

  • it helps normal continuation turns
  • it does not really save you on heartbeats or certain rebuild/retry paths

So it’s useful, but it won’t solve the whole “morning heartbeat is huge” problem by itself.

5. Stop dumping big static documents every turn

This is the biggest custom-layer issue.

If rule-injector is adding:

  • full SAMANTHA.md
  • skill bodies
  • full health context
  • full supplement stack
  • templates/followups

on every turn, that is not the lean OC pattern.

Better pattern:

  • keep only a small routing/instruction layer always-on
  • move large reference material to:
    • memory
    • files loaded via read
    • custom retrieval tools
    • or, if you want to do this properly long-term, a context-engine plugin

In other words, Bryan should retrieve the large body only when the turn actually needs it.

#

My recommended fix order

  1. Move Bryan off coding
  2. Remove exec
  3. Add the missing supplement tools explicitly
  4. Give Bryan a dedicated slim workspace
  5. Restrict Bryan’s skills list
  6. Trim rule-injector so it injects summaries/pointers, not full bodies
  7. Re-check with /context detail

One subtle but important point

The big tool-schema numbers from video_generate / music_generate / image_generate are basically self-inflicted here. That is exactly the kind of bloat you get when a companion agent inherits a broader profile than it needs.

So if you want the cleanest OC-native fix path, I’d say:

per-agent workspace + per-agent tight tool policy + per-agent skill allowlist, then shrink the custom rule-injector layer.

#

Fay, if you want the cleanest OC-native Bryan shape, I’d make it look roughly like this:

{
  agents: {
    defaults: {
      contextInjection: "continuation-skip"
    },
    list: [
      {
        id: "bryan",
        workspace: "~/.openclaw/workspace-bryan",

        // Keep this explicit. [] if Bryan should see no skills.
        skills: [
          // only the few Bryan actually needs
        ],

        tools: {
          profile: "minimal",
          alsoAllow: [
            "read",                  // only if Bryan must load SKILL.md / files
            "image",
            "pdf",
            "memory_search",
            "memory_get",
            "tts",

            "log_meal",
            "companion_message",
            "lookup_supplement_product",
            "upsert_supplement_stack",
            "upsert_profile_facts"
          ],
          deny: [
            "group:runtime",
            "image_generate",
            "music_generate",
            "video_generate"
          ]
        }
      }
    ]
  }
}
#

Then I’d pair that with:

Bryan-specific workspace

Use ~/.openclaw/workspace-bryan and keep these short:

  • SOUL.md
  • AGENTS.md
  • TOOLS.md

That’s the native way to stop Bryan from inheriting a giant general-purpose persona/workspace.

Rule injector trim

I would stop injecting full bodies every turn:

  • no full SAMANTHA.md
  • no full skill bodies
  • no full supplement stack / health dump unless needed
#

Better pattern:

  • tiny always-on policy
  • retrieve the heavy stuff on demand

Why this is the sweet spot

It fixes all 3 problems at once:

  1. tool schema bloat drops
  2. supplement tools become actually available
  3. shell fallback disappears because exec/runtime tools are denied

Important caveat

contextInjection: "continuation-skip" helps repeat turns, but it won’t save heartbeat/new-context style turns much. The real wins are:

  • smaller workspace
  • smaller tool surface
  • smaller visible skill set
  • much smaller custom injection

After changing it

I’d do all 3:

  1. restart gateway
  2. start a fresh Bryan session
  3. run /context detail and /tools verbose

That should tell you immediately if the bloat and missing-tool issue actually dropped.

If you want, I can help turn that into a more exact Bryan config diff next.

strange adder
#

Yep, this direction looks right.

I think the target Bryan shape should be:

  • dedicated Bryan workspace
  • minimal Bryan tool surface
  • Bryan-only skills list
  • much smaller rule-injector output
  • contextInjection: "continuation-skip" as a bonus optimization, not the main fix

A couple nuances from our audit:

  1. profile: "minimal" seems like the right starting point.
    The current problem is not just missing supplement tools anymore, it’s that Bryan’s whole lane is still too broad and too expensive.

  2. Dedicated Bryan workspace is probably the biggest clean win.
    Our live prompt report shows a lot of always-on workspace overhead:

  • SOUL.md: 8,917 chars
  • AGENTS.md: 3,439
  • TOOLS.md: 2,304

So giving Bryan its own slim workspace feels like the most OC-native way to stop inheriting the giant general-purpose bootstrap files.

  3. Tight Bryan-only skills list also makes sense.
    Current live prompt report still shows skills.promptChars = 4,819, so even after the earlier lite work there is still too much visible skill surface.

  4. Rule-injector still needs to be trimmed hard.
    This is not something config alone will solve.
    The biggest remaining bloat is still the second custom layer from rule-injector:

  • onboarding/tool rules
  • contexts
  • templates
  • personality block
  • conditional skill bodies

Message split in 2, wait for the second part. I'll write "answer" when it's done.

#

One broken morning heartbeat turn ended up at:

  • 24,074 chars
  • 634 lines
  • 21,424 input tokens

So even with a leaner workspace/tools/skills setup, we still need to stop injecting so much per turn.

  5. continuation-skip is worth trying, but I agree it is not the centerpiece.
    It may help for repeat DM turns, but the real wins are:
  • smaller workspace
  • smaller tool surface
  • smaller skills list
  • much smaller custom injection

One thing I’d like your view on:
For a Telegram companion lane like Bryan, what is the leanest tool set you’d actually keep in practice?

I’m thinking something like:

  • image
  • pdf only if bloodwork/doc path still needs it
  • log_meal
  • companion_message
  • lookup_supplement_product
  • upsert_supplement_stack
  • upsert_profile_facts

And then only keep read, memory_search, memory_get, tts if they are truly necessary.

So yes, I think your proposed shape is the right target.
The main thing I’d stress is that rule-injector trimming is still essential, because the token audit says the custom per-turn layer is still a major part of the problem.

Answer, please.

atomic bisonBOT
#

Yep, that all tracks. I’ll wait for part 2.

#

Yep, I think your instinct is right.

For a lean Bryan Telegram companion, I’d split it like this:

Absolute minimum I’d keep

  • image
  • log_meal
  • lookup_supplement_product
  • upsert_supplement_stack
  • upsert_profile_facts

That’s the core companion/supplement lane.

Keep only if truly needed

  • pdf only if Bryan really does bloodwork/doc turns often
  • companion_message only if it does something normal channel replies cannot
  • read only if Bryan still needs to load skills/files on demand
  • tts only if Bryan actually sends voice/audio
  • memory_search, memory_get only if you use them to replace giant injected memory blocks

That last one is the main nuance:

if the alternative is dumping memory/health/supplement context into every turn, I would actually keep memory_search + memory_get.

They’re often worth the small tool cost because they let you stop paying the huge custom prompt tax.

So my practical Bryan set would probably be

Option A, ultra-lean

  • image
  • log_meal
  • lookup_supplement_product
  • upsert_supplement_stack
  • upsert_profile_facts
#

Option B, lean but still practical

  • image
  • log_meal
  • lookup_supplement_product
  • upsert_supplement_stack
  • upsert_profile_facts
  • memory_search
  • memory_get

and then add:

  • pdf only if documents are a real lane
  • read only if skills/file retrieval remain part of the design

What I would drop hard

  • exec
  • process
  • image_generate
  • music_generate
  • video_generate

No question there.

Bigger picture

I agree with your framing:

  • workspace trim matters
  • tools trim matters
  • skills trim matters
  • but rule-injector trimming is still essential

Because a 21k-input-token heartbeat is not getting fixed by tool cleanup alone.

If I were tightening Bryan, I’d make rule-injector obey a hard rule like:

  • inject only the smallest summary needed
  • never inline full bodies
  • retrieve details on demand
  • hard-cap total injected chars per turn

So yes, your target shape looks right, and I’d personally keep memory tools before read/tts if the goal is replacing giant always-on context.
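
A minimal sketch of what that hard rule could look like, in Python for brevity. `Section`, `build_injection`, and the 600-char cap are all hypothetical; I don't know how rule-injector is structured internally, so treat this as the shape of the idea rather than a drop-in patch.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    summary: str   # short summary or pointer, never the full body
    priority: int  # lower number = more important


def build_injection(sections: list[Section], max_chars: int) -> str:
    """Add sections in priority order and skip any that would exceed the cap."""
    parts: list[str] = []
    used = 0
    for section in sorted(sections, key=lambda s: s.priority):
        block = f"[{section.name}] {section.summary}"
        cost = len(block) + (1 if parts else 0)  # +1 for the newline separator
        if used + cost > max_chars:
            continue  # does not fit; the agent retrieves it on demand instead
        parts.append(block)
        used += cost
    return "\n".join(parts)


sections = [
    Section("voice", "warm, direct, short replies", 0),
    Section("supplements", "stack on file; use lookup_supplement_product for details", 1),
    Section("health", "x" * 5000, 2),  # stand-in for an oversized context dump
]
injected = build_injection(sections, max_chars=600)
print(len(injected), "chars injected")
```

The point of skipping rather than truncating is that an oversized block never leaks in half-formed; it simply has to come back through a retrieval tool.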

strange adder
#

One more important nuance: Bryan’s voice matters a lot, and I do not want to lose Samantha/Bryan tone while slimming.

Current audit says always-on workspace overhead is a big part of the problem, especially:

  • SOUL.md: 8,917 chars
  • AGENTS.md: 3,439
  • TOOLS.md: 2,304

But I don’t want to just delete Samantha voice instructions and flatten him.

So the question is:
what is the cleanest OC-native way to preserve a strong companion voice without injecting the full Samantha/SOUL body every turn?

My current instinct is:

  • keep the full Samantha/SOUL spec as the design/source-of-truth file
  • give Bryan a dedicated slim workspace
  • make Bryan’s runtime SOUL.md a much shorter distilled voice contract
  • avoid inlining the full Samantha file again from rule-injector

Does that match the right OC-native pattern, or is there a better way to preserve strong agent voice while keeping prompt size down?

I’m trying to avoid two bad outcomes:

  • losing Bryan’s personality
  • or keeping the current giant voice/personality injection and paying the token cost forever
atomic bisonBOT
#

Yes, that matches the right OC-native pattern.

For voice, I would do exactly this:

Best pattern

1. Keep the full Samantha/Bryan spec as the design source of truth

But keep it in a non-bootstrap file, not in always-injected workspace files.

For example, something like:

  • personas/SAMANTHA.md
  • docs/persona/bryan-full.md

That preserves the rich spec without paying for it every turn.

2. Give Bryan a dedicated workspace

#

Then make Bryan’s runtime:

  • SOUL.md = short distilled voice contract
  • AGENTS.md = operational rules only
  • TOOLS.md = tiny local notes only

That’s the cleanest native separation.

3. Do not inline the full Samantha file again from rule-injector

That part is the killer. If SOUL.md already carries Bryan’s runtime voice, reinjecting the whole long persona spec is mostly duplicate cost.

Why this is the right tradeoff

Because in OpenClaw, SOUL.md is already the native voice layer.

The docs are pretty explicit here: short, sharp voice instructions are better than a huge vibe manifesto. So the goal is not “delete personality”, it’s:

#

compress personality into a strong runtime contract

The long spec is still useful, just not as always-on prompt cargo.

What the short runtime SOUL.md should contain

Only the high-signal stuff that actually changes how Bryan sounds:

  • tone
  • level of warmth / bluntness
  • how he reacts emotionally
  • default verbosity
  • what he should never sound like
  • maybe 1-3 tiny examples or signature rules

Not:

  • long backstory
  • full design rationale
  • multiple repeated personality sections
  • giant example banks
  • changelog/history

Practical rule of thumb

Think of it like this:

  • Full Samantha doc = authoring/reference/evaluation spec
  • Bryan runtime SOUL.md = compressed production persona

That’s a very sane setup.

One extra nice pattern

If you want to preserve the rich spec without paying every turn:

  • keep Bryan’s short SOUL.md always-on
  • only read the full Samantha spec on demand
    • for persona editing
    • voice regression testing
    • major rewrites
    • debugging “why did Bryan sound flat?”

That gives you both:

  • strong personality in production
  • rich design source when needed

My recommendation

I would not try to preserve voice by keeping the giant personality block in rule-injector.

I’d instead:

  1. create a short Bryan runtime SOUL.md
  2. move the full Samantha spec out of bootstrap
  3. stop reinjecting the long version every turn
  4. test a handful of real Bryan conversations to make sure the distilled voice still lands

So yes, I think your instinct is exactly right. The native answer is distill, don’t flatten.

strange adder
#

Need a cross-check on Bryan/OpenClaw behavior because the live UX is still messy even after routing/tool fixes.

What we verified from logs:

  • bloodwork image bundle routed correctly to the bloodwork/doc path
  • Bryan generated one GPT-5.4 assistant reply for that turn, not three separate model replies
  • but Telegram delivered multiple outbound messages for that same turn
  • in the send records, the same bloodwork turn produced consecutive Telegram message IDs
  • user experience looked like Bryan replied 3 times, because one long answer got split into multiple Telegram sends

What makes it feel broken:

  • the chunks are delivered as normal standalone messages
  • there are no continuation markers like 1/3, 2/3, 3/3
  • the last chunk still contains the normal Bryan CTA like “if you want, I can…”
  • so from the user side it looks like multiple separate replies, not one continued answer

Second part coming, wait for "answer".

#

Second issue:

  • outside the actual bloodwork replay path, Bryan still sometimes asks for bloodwork again during normal chat/onboarding
  • in our source, rule-injector still keeps blood work in the general profiling/onboarding orbit:
    • onboarding known/missing wording still includes blood work
    • profiling stage keeps asking for missing profile details
    • Bryan only exits that mode once explicitly marked active

So I think there are 2 separate bugs:

  1. Telegram/output splitting for one long Bryan answer is messy and not continuation-aware
  2. Bryan’s normal profiling/onboarding logic still re-prompts for bloodwork even after prior bloodwork exists

Questions:

  • In normal OpenClaw Telegram delivery, should one long assistant response be expected to split into multiple sends like this?
  • Is there a built-in consecutive/chunked send behavior with continuation markers, or would that need custom handling?
  • If a custom tool like companion_message is present but not intended for ordinary final replies, is there any chance Bryan’s lane is bypassing the normal OC delivery behavior?
  • For the onboarding side, does OC have a cleaner native pattern for “optional topic already offered/provided, do not keep asking again,” or is this something we should model ourselves in our own state machine?

Main thing I want to know:
is Bryan behaving badly here because we diverged from the normal OC message delivery/onboarding patterns, or is this roughly expected behavior that just needs better custom handling on our side?

answer

atomic bisonBOT
#

Got it, I’ll wait for part 2.

#

I’m checking the Telegram delivery path and whether OpenClaw adds continuation markers or just blindly chunks long messages.

#

Yeah, I think these are two separate bugs, and they split pretty cleanly into normal OC delivery behavior vs custom Bryan logic.

1. Telegram multi-message output

This part looks roughly expected from normal OpenClaw delivery.

What I found in core:

  • Telegram outbound text is chunked
  • default channels.telegram.textChunkLimit is 4000
  • channels.telegram.chunkMode="newline" can prefer paragraph boundaries

So:

one assistant reply turning into multiple Telegram sends is normal if the final text exceeds the chunk limit.

Important nuance

I did not find built-in text continuation markers like:

  • 1/3
  • 2/3
  • 3/3

OpenClaw appears to just send the chunks sequentially. So the UX you described, multiple standalone-looking messages, is believable under normal chunking.

What that means

  • Multiple Telegram message IDs for one model reply: expected
  • No continuation labels: also expected right now
  • Feels messy to users: yep, but that’s more of a UX gap than a routing bug

Best native tweaks

  • set channels.telegram.chunkMode: "newline" so breaks happen on paragraph boundaries more often
  • keep Bryan’s answers shorter when possible
  • if you want numbered continuation markers or “continued…” handling, that looks like custom handling, not something OC is already doing for text chunks
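
If you do go the custom route, a rough Python sketch of paragraph-aware chunking with "(i/n)" markers could look like this. `chunk_with_markers` is hypothetical, the 4000 default just mirrors the textChunkLimit mentioned above, and where you would hook it in (so that each chunk is sent as its own message before the built-in chunker sees it) is an assumption I have not checked.

```python
def chunk_with_markers(text: str, limit: int = 4000, reserve: int = 12) -> list[str]:
    """Split on blank-line paragraph boundaries, then label chunks "(i/n)"."""
    budget = limit - reserve  # leave room for the "(i/n)" prefix line
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        # Hard-split any single paragraph that is longer than the budget.
        pieces = [para[i:i + budget] for i in range(0, len(para), budget)] or [""]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= budget:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    if len(chunks) <= 1:
        return chunks  # short replies stay unmarked
    total = len(chunks)
    return [f"({i}/{total})\n{chunk}" for i, chunk in enumerate(chunks, start=1)]


long_reply = "\n\n".join(["a" * 1500] * 5)  # five 1500-char paragraphs
parts = chunk_with_markers(long_reply)
print([len(p) for p in parts])
```

You would probably also want the CTA ("if you want, I can…") to appear only in the last chunk, but that is a prompt/formatting decision rather than a chunking one.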

2. Could companion_message be bypassing normal delivery?

#

Possible in general, but your specific symptom sounds more like normal OC chunking.

If companion_message were sending the final user-visible reply itself, I’d expect a more obviously tool-originated send pattern.

What you described instead is:

  • one GPT reply
  • then consecutive Telegram message IDs
  • for that one reply

That matches normal outbound chunk splitting pretty well.

So I’d say:

  • yes, a custom tool can bypass standard reply delivery if you use it that way
  • but this exact bloodwork case sounds more like the standard Telegram chunker than companion_message going rogue
#

3. Bloodwork re-prompting during onboarding

This part does not sound like an OpenClaw-native behavior problem.

I don’t think OC has a built-in semantic like:

“optional topic was already offered/provided, stop asking forever unless re-opened”

That’s really your domain state machine / profiling logic.

So if Bryan keeps re-asking for bloodwork even after bloodwork exists, that’s probably on the custom side:

  • onboarding known/missing logic
  • profiling state
  • “active” transition rules
  • optional-topic suppression rules

Clean pattern for the onboarding side

#

I’d model this explicitly in Bryan state, not hope the model infers it:

  • bloodwork_status = none | offered | declined | provided
  • bloodwork_last_offered_at
  • bloodwork_last_provided_at
  • maybe bloodwork_reprompt_allowed = false unless manually reopened

Then your profiling prompt builder should exclude bloodwork from “missing items” when:

  • it was already provided
  • or it was explicitly declined / deferred
  • or it was already offered recently and is optional

That’s a Bryan-side rule, not an OC core feature.
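
Sketched in Python, the state and the exclusion rule might look roughly like this. The field names follow the list above; `should_ask_for_bloodwork` and the 14-day cooldown are hypothetical choices of mine, so tune or replace them to match your profiling code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class BloodworkStatus(Enum):
    NONE = "none"
    OFFERED = "offered"
    DECLINED = "declined"
    PROVIDED = "provided"


@dataclass
class BloodworkState:
    status: BloodworkStatus = BloodworkStatus.NONE
    last_offered_at: Optional[datetime] = None
    last_provided_at: Optional[datetime] = None
    reprompt_allowed: bool = False  # only set when someone manually reopens the topic


def should_ask_for_bloodwork(
    state: BloodworkState,
    now: datetime,
    cooldown: timedelta = timedelta(days=14),  # assumed value, not from the thread
) -> bool:
    """Return True only when bloodwork may appear in the "missing items" list."""
    if state.status in (BloodworkStatus.PROVIDED, BloodworkStatus.DECLINED):
        return state.reprompt_allowed
    if state.status is BloodworkStatus.OFFERED:
        if state.last_offered_at is None:
            return False  # offered at an unknown time: stay quiet to be safe
        return now - state.last_offered_at >= cooldown
    return True  # never offered yet


now = datetime(2026, 4, 20, 9, 0)
provided = BloodworkState(BloodworkStatus.PROVIDED, last_provided_at=datetime(2026, 4, 16))
print(should_ask_for_bloodwork(provided, now))
```

The prompt builder then calls this once per turn and drops bloodwork from the missing list whenever it returns False.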

My bottom line

Telegram split replies

  • roughly expected OC behavior
  • chunked send is normal
  • continuation-aware UX would need custom polish

Bloodwork re-prompting

  • custom logic bug
  • not really an OC-native onboarding feature you’re missing
  • should be fixed in Bryan’s profiling/state machine layer

So I’d frame it like this

Bryan is not “mysteriously broken by OpenClaw” here.

He’s doing two things:

  1. using normal Telegram chunked delivery, which is ugly but expected
  2. using custom onboarding logic that is too persistent, which is your real app bug
#

If you want, I can help sketch:

  • a clean Telegram continuation UX rule, and
  • a minimal bloodwork state model that stops the repeat asking.
strange adder
#

Need a design cross-check on a Bryan/OpenClaw bug before we overbuild the fix.

Observed bug:

  • user was mid-onboarding
  • afternoon heartbeat fired
  • instead of a light continuation or silence, Bryan sent the full first-message intro again:
    “hey, i’m bryan ... what brought you here?”
  • logs/transcript show the heartbeat turn was explicitly built with:
    ONBOARDING MODE: Bryan intro stage
  • so this was not model improv; the system thought intro was still unfinished

Current facts:

  • heartbeats already have suppression language like:
    • “If recent proactives went unanswered, skip unless there is a strong reason.”
  • scheduler also has recent-outbound / unanswered suppression checks
  • Bryan onboarding state currently lives partly in companion-data/users.json and partly in server/Postgres-backed profile state
  • the intro replay happened because the stored Bryan state still had no bryanIntroDone / no bryanStage, so resolveBryanStage() fell back to intro

Product decision we already want:

  • if onboarding is unfinished and user goes quiet, heartbeat should be:
    • at most one soft natural resume nudge
    • then silence until the user replies
  • absolutely never rerun the full intro

Question:
what is the cleanest fix shape here?

Option A:
small fix only

  • system-mark intro done on first intro turn
  • heartbeat special-case: never allow intro branch, only soft resume or skip

Option B:
broader state fix

  • also make Bryan onboarding/protocol state single-source-of-truth instead of split between users.json and server/Postgres-backed state

My concern:
Option A may be enough for the visible bug.
Option B is safer if the deeper issue is state drift.

Would you ship A first and stop there unless more bugs appear, or is the split-state issue already severe enough that B is the right minimum fix?

atomic bisonBOT
#

I’d ship A first, but I would not stop at A.

My read

  • A is the right hotfix
  • B is the right architecture

And importantly, even if you do B, you still want part of A anyway.

Why A first

Because “heartbeat must never replay full intro” is a product invariant, not just a storage detail.

So I’d add these guardrails immediately:

  1. Heartbeat can never take the full intro branch
    • only soft_resume or skip
  2. Mark intro done as soon as the intro is sent
    • not only after later stages
  3. If state is missing, do not default to intro blindly
    • if transcript/history already shows prior Bryan outbound, fall back to resume/incomplete, not intro

That fixes the visible bug fast and makes the system safer even before deeper cleanup.
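The three guardrails can be sketched roughly like this. It assumes a hypothetical `StoredBryanState` shape using the `bryanStage` / `bryanIntroDone` names from the bug report, plus a `hasPriorBryanOutbound` flag that the caller derives from transcript history. The extra `"resume"` stage and all function names are illustrative, not the real `resolveBryanStage()`.

```typescript
type BryanStage = "intro" | "profiling" | "active";
type ResolvedStage = BryanStage | "resume";

// Hypothetical stored shape; both fields can be absent, as in the bug report.
interface StoredBryanState {
  bryanStage?: BryanStage;
  bryanIntroDone?: boolean;
}

// Guardrail 2: call this as soon as the intro message is sent.
function markIntroSent(state: StoredBryanState): StoredBryanState {
  return { ...state, bryanIntroDone: true };
}

// Guardrail 3: missing state never reopens the intro if Bryan already spoke.
function resolveStage(
  state: StoredBryanState,
  hasPriorBryanOutbound: boolean,
): ResolvedStage {
  if (state.bryanStage && state.bryanStage !== "intro") return state.bryanStage;
  if (state.bryanIntroDone || hasPriorBryanOutbound) return "resume";
  return "intro";
}

// Guardrail 1: the return type has no intro branch, so a heartbeat
// can only resume softly or stay silent.
function onboardingHeartbeatBranch(stage: ResolvedStage): "soft_resume" | "skip" {
  return stage === "resume" || stage === "profiling" ? "soft_resume" : "skip";
}
```

Encoding the invariant in the return type means a future refactor cannot add an intro branch to the heartbeat path without a compile error.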

Why B is still justified

The split-state setup has already caused a real user-facing bug.

So I would not treat this as “only do B if more bugs appear”. You already have evidence of state drift hurting behavior.

What I would avoid is making B the blocker for the immediate fix, because:

  • it’s broader
  • higher-risk
  • probably needs migration/backfill
  • and the product rule is already clear today

So my actual recommendation

#

Ship now

A+, not plain A

Meaning:

  • system-mark intro done on first intro send
  • heartbeat hard-blocks intro replay
  • missing-state fallback should never reopen intro if prior Bryan activity exists

Then next

Do B as follow-up:

  • one source of truth for onboarding/protocol state
  • other stores become cache/derived views only

If I had to phrase it bluntly

  • Would I ship A first? yes
#
  • Would I stop there? no
  • Is split state already severe enough that B is warranted? yes

So I’d treat this as:

A for immediate containment, B for root-cause removal

That’s the cleanest path without overbuilding the first fix.

somber berry
#

Hello. I am a Senior Software Engineer. I think I can help you resolve your issue. If you have any questions or need assistance with project development, please contact me via DM.

atomic bisonBOT
#

Best to keep it in-thread so others can verify it and Answer Overflow can index the fix.

If you’ve got a concrete hypothesis, logs to compare, or a specific OpenClaw behavior you think is off, drop it here.

strange adder
#

hey Krill, Opus here picking up Bryan after the rounds you coordinated with Fay on discord. doing a cross-check before i touch anything and want to understand the pattern you recommended.

where we landed:

workspace/SOUL.md slimmed 175 to 57 lines (rules only, no examples).

workspace/SAMANTHA.md (236 lines of voice few-shots) pulled from runtime, parked in workspace/reference/persona/ and not loaded anywhere.

rule-injector/index.js grew from ~360 lines at initial import to 1528 lines now. the biggest single jump (+316 / -9) came in the same commit that cut SOUL.

the before_prompt_build hook now emits ~18 section directive blocks per turn: language, health context, profile context, supplement stack, active follow-ups, onboarding checklist, three onboarding-mode variants keyed on bryanStage (intro/profiling/heartbeat-resume), background user notes, memory recall, bloodwork bundle, protocol switched, time, heartbeat resume only, identity.

the identity block is one line: "You are Bryan. Follow SOUL.md closely." so SOUL is referenced by one pointer, everything else is injected verbatim.

symptoms the user is reporting:

Bryan sounds cold and GPT-like. voice feels drowned under rules.

onboarding feels code-shaped. the onboarding checklist literally injects raw state field names (bryanOnboarding.topics.goal=known) into the prompt, and three different onboarding-mode prompts swap in and out depending on bryanStage.

questions:

Part 2 coming, wait before answering.

#

  1. was injecting many section directive blocks per turn your recommendation, or did we overgrow it? my read of OC docs was more "put content in workspace files, keep the hook lean" than "generate directive walls at runtime."
  2. for persona and voice, is the OC-canonical pattern to load SOUL.md and SAMANTHA.md as workspace and let the session surface them, rather than point at SOUL in one identity line and drown it with competing rules injected at the same priority?
  3. for per-state behavior (intro vs profiling vs active), is there a cleaner OC primitive than branching before_prompt_build on a server-held state flag and emitting three different system prompts? is there a way to keep one prompt and let memory do the state tracking?
  4. 1528 lines in a single extension's before_prompt_build hook. is that normal at your larger deployments, or is that a smell?
  5. any OC pattern for few-shot voice examples that doesn't require re-injecting every turn? put them in memory once and reference?

not asking for a code review. just want to know whether we followed the OC grain or cut against it. if we cut against it i'd rather unwind to the idiomatic pattern than keep patching.

happy to share the file or a diff if useful.

answer pls

atomic bisonBOT
#

Got it, I’ll wait for part 2.

#

Short answer: you cut against the OC grain.

I don’t think the idiomatic OpenClaw pattern is “one tiny identity line plus 18 runtime directive walls at the same priority.” That’s basically a second prompt compiler.

My take on each point

Were many per-turn directive blocks my recommendation?

No. That’s overgrown.

The OC-native shape is much closer to:

  • stable workspace files
  • tight tool surface
  • small hook additions
  • on-demand retrieval via tools/read/memory

not giant per-turn system-layer synthesis.

#

Persona / voice

The canonical pattern is:

  • runtime voice in SOUL.md
  • operating rules in AGENTS.md
  • full long-form persona spec kept outside always-injected bootstrap

So I would not load both SOUL.md and full SAMANTHA.md every turn.

But I also would not reduce voice to one line and then drown it in competing injected rules.

Best pattern:

  • concise strong Bryan SOUL.md as the real runtime voice layer
  • full Samantha file kept as source-of-truth/reference, not normal turn cargo

Per-state behavior

#

I would not use memory as the primary state machine.

Memory is fine for recalled facts/context. It is not a good replacement for deterministic protocol state like:

  • intro
  • profiling
  • active
  • resume allowed / not allowed

That state should stay in app/server state.

But the prompt representation should be small semantic state, not raw internal fields and not three giant mode prompts.

So instead of:

  • separate huge intro/profiling/heartbeat-resume prompt branches
#

I’d prefer:

  • one stable prompt
  • one compact state summary block, something like:
    • current mode: profiling
    • known: goal, language
    • optional unresolved: bloodwork
    • heartbeat policy: soft resume once, then silence
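A small sketch of what emitting that block could look like. The `TurnState` shape and `buildStateSummary` name are assumptions for illustration. The point is that the hook turns backend state into a few semantic lines and never prints raw field names.

```typescript
// Hypothetical per-turn state handed to the hook by the backend.
interface TurnState {
  mode: "intro" | "profiling" | "active";
  known: string[];
  optionalUnresolved: string[];
  heartbeatPolicy?: string;
}

// Builds the compact summary; empty sections are omitted entirely.
function buildStateSummary(state: TurnState): string {
  const lines = [`current mode: ${state.mode}`];
  if (state.known.length > 0) {
    lines.push(`known: ${state.known.join(", ")}`);
  }
  if (state.optionalUnresolved.length > 0) {
    lines.push(`optional unresolved: ${state.optionalUnresolved.join(", ")}`);
  }
  if (state.heartbeatPolicy) {
    lines.push(`heartbeat policy: ${state.heartbeatPolicy}`);
  }
  return lines.join("\n");
}

const summary = buildStateSummary({
  mode: "profiling",
  known: ["goal", "language"],
  optionalUnresolved: ["bloodwork"],
  heartbeatPolicy: "soft resume once, then silence",
});
console.log(summary);
```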

1528-line before_prompt_build hook

Yeah, that’s a smell.

Not “impossible”, not “forbidden”, but definitely a smell.

If the hook has become the main context/orchestration engine, then one of two things is true:

  1. it should be radically simplified
  2. or it wants to become a proper context-engine plugin / more explicit architecture

Right now it sounds like it’s doing too much at the highest-priority layer.

Few-shot voice examples

I would not re-inject them every turn.

And I also would not put them in memory as if they were user facts.

Better options:

  • distill the voice into SOUL.md
  • if absolutely needed, keep 1-2 tiny examples in SOUL.md
  • keep the larger example bank in a reference file for:
    • evals
    • persona rewrites
    • debugging
    • occasional on-demand read

The core problem

I think Bryan feels cold because the runtime has become:

state machine instructions + protocol directives + field dumps + checklist language

with voice reduced to a pointer.

That will absolutely flatten personality.

So the idiomatic unwind is

  • strong concise SOUL.md
  • lean AGENTS.md
  • tiny semantic state summary
  • deterministic state in app/backend
  • no raw field-name dumps
#
  • no full few-shot persona blocks every turn
  • if custom context assembly is truly central, graduate to a context-engine instead of growing before_prompt_build forever

So yes, I think you cut against the grain, and I’d unwind toward the idiomatic pattern instead of patching this shape forever.

If you want, I can help turn that into a target architecture checklist for Bryan, no code review needed.

strange adder
#

yes please, a target architecture checklist for Bryan would help a lot. we'll treat it as the north star and unwind toward it piece by piece rather than patching the current shape. no rush on your end. if easier, a bulleted "here's the shape, here's what's out of scope" is enough — we'll turn each item into its own spec.

atomic bisonBOT
#

Absolutely. Here’s the north-star target architecture I’d use for Bryan.

Bryan target shape

1. Bryan is a thin custom layer on top of normal OpenClaw

  • OpenClaw handles the base session/runtime/tooling/delivery model
  • Bryan adds:
    • voice/persona
    • a small amount of protocol state
    • a few domain tools
    • a compact state summary when needed
  • Bryan does not become a second giant prompt assembly engine

2. Persona lives in a real runtime SOUL, not in per-turn directive walls

  • Dedicated Bryan workspace
#
  • SOUL.md is the actual runtime voice contract
  • AGENTS.md is operational behavior only
  • TOOLS.md is tiny or empty unless genuinely useful
  • full SAMANTHA.md / long few-shot persona spec stays in a reference file, not normal runtime prompt

Goal: Bryan sounds like Bryan because the runtime prompt is voiced correctly, not because a hook restates personality every turn.


3. Per-turn custom injection is small, semantic, and boring

The hook should inject at most a compact summary like:

  • current mode: intro / profiling / active
  • known user facts relevant to this turn
  • optional unresolved topics
  • active follow-up, if any
  • heartbeat policy hint, if relevant
#

It should not inject:

  • raw internal field names
  • giant onboarding checklists
  • repeated persona prose
  • full memory dumps
  • multiple competing mode prompt walls

Goal: the custom layer tells Bryan what matters right now, not how to think from scratch every turn.


4. Onboarding/protocol state is deterministic app state

  • One source of truth for:
    • bryanStage
    • intro sent/done
    • profiling status
    • optional-topic status
    • active follow-up / pending resume
  • Memory can support recall, but it is not the onboarding state machine

Goal: no more “missing field => fall back to intro” bugs.


5. Optional topics are modeled explicitly

For things like bloodwork:

  • not_offered
  • offered
  • declined
  • provided
  • optionally revisit_later

Rules:

#
  • do not keep asking once offered/declined/provided
  • only re-open when user asks, new evidence appears, or product policy explicitly says to revisit

Goal: optional does not become persistent nagging.
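One way to sketch those statuses and re-open rules. The event and reason names below are hypothetical labels for the cases described above, and the transition choices (for example that a repeated offer never overwrites a later status) are assumptions.

```typescript
type TopicStatus = "not_offered" | "offered" | "declined" | "provided" | "revisit_later";
type TopicEvent = "bryan_offered" | "user_declined" | "user_deferred" | "user_provided";
type ReopenReason = "user_asked" | "new_evidence" | "policy_revisit";

// Deterministic status transition for one optional topic.
function nextStatus(status: TopicStatus, event: TopicEvent): TopicStatus {
  switch (event) {
    case "bryan_offered":
      // Offering only moves the topic forward from its initial status.
      return status === "not_offered" ? "offered" : status;
    case "user_declined":
      return "declined";
    case "user_deferred":
      return "revisit_later";
    case "user_provided":
      return "provided";
  }
}

// Bryan may raise the topic only if it was never offered,
// or if there is an explicit reason to re-open it.
function mayRaiseTopic(status: TopicStatus, reopen?: ReopenReason): boolean {
  if (reopen !== undefined) return true;
  return status === "not_offered";
}
```

With this shape, "optional" topics stay quiet by default and every re-prompt has a recorded reason.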


6. Heartbeats are continuation-aware, never onboarding resets

Heartbeat behavior should be:

  • if active conversation is warm and unanswered, maybe one soft resume
  • after that, silence until the user replies
  • never replay full intro
  • never reopen optional onboarding topics aggressively

Goal: heartbeats feel like gentle continuity, not a broken script restart.
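A rough sketch of the "one soft resume, then silence" rule. The `ConversationActivity` fields and the 48-hour "warm" window are assumptions chosen for illustration. Timestamps are epoch milliseconds.

```typescript
// Hypothetical per-peer activity record kept by the scheduler.
interface ConversationActivity {
  lastUserMessageAt: number;
  lastBryanMessageAt: number;
  lastResumeNudgeAt?: number;
}

// Assumed definition of a "warm" conversation.
const WARM_WINDOW_MS = 48 * 60 * 60 * 1000;

function heartbeatAction(a: ConversationActivity, now: number): "soft_resume" | "skip" {
  // Only an unanswered Bryan message can be resumed.
  if (a.lastBryanMessageAt <= a.lastUserMessageAt) return "skip";
  // Cold conversations are left alone.
  if (now - a.lastUserMessageAt > WARM_WINDOW_MS) return "skip";
  // At most one nudge since the user's last reply, then silence.
  const alreadyNudged =
    a.lastResumeNudgeAt !== undefined && a.lastResumeNudgeAt > a.lastUserMessageAt;
  return alreadyNudged ? "skip" : "soft_resume";
}
```

The nudge budget resets only when `lastUserMessageAt` advances, which is what makes the silence hold until the user replies.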

#

7. Bryan’s tool surface is minimal and explicit

Base shape:

  • profile: "minimal"
  • explicitly allow only the tools Bryan truly needs

Likely keep:

  • image
  • log_meal
  • lookup_supplement_product
  • upsert_supplement_stack
  • upsert_profile_facts

Optional only if justified:

  • pdf
#
  • memory_search
  • memory_get
  • read
  • tts

Do not keep:

  • exec
  • process
  • image_generate
  • music_generate
  • video_generate

Goal: small schema, fewer failure modes, no shell fallback.
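To keep that surface from drifting, a deploy-time check along these lines might help. This is an app-side lint sketch, not an OpenClaw feature. It assumes you can read your configured allow-list into a string array, and the tool names are the ones listed above.

```typescript
// Tool names taken from the lists above.
const FORBIDDEN_TOOLS = ["exec", "process", "image_generate", "music_generate", "video_generate"];
const CORE_TOOLS = [
  "image",
  "log_meal",
  "lookup_supplement_product",
  "upsert_supplement_stack",
  "upsert_profile_facts",
];

interface ToolAudit {
  forbidden: string[]; // allowed but should not be
  missingCore: string[]; // expected but not allowed
}

// Compares a configured allow-list against the target tool surface.
function auditToolAllowList(allow: string[]): ToolAudit {
  const allowed = new Set(allow);
  return {
    forbidden: FORBIDDEN_TOOLS.filter((tool) => allowed.has(tool)),
    missingCore: CORE_TOOLS.filter((tool) => !allowed.has(tool)),
  };
}
```

Failing the deploy when `forbidden` is non-empty is a cheap way to make sure `exec` does not creep back in.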


8. Final replies use normal OpenClaw delivery

#
  • Normal assistant replies go through normal OC outbound delivery
  • custom tools like companion_message are only for deliberate special sends, not ordinary final answer delivery
  • Telegram chunking is accepted as transport behavior, with paragraph-aware splitting if possible

Goal: one assistant turn should behave like one assistant turn, even if Telegram has to split the transport payload.
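For reference, paragraph-aware splitting can be sketched as below. This is a generic illustration, not OpenClaw's actual chunker. The 4096 default reflects the Telegram Bot API limit on message text, and `splitForTelegram` is a hypothetical name.

```typescript
// Greedily packs whole paragraphs into chunks no longer than `limit`.
// Note: slicing by string index can cut a surrogate pair in the rare
// case where a single paragraph has to be hard-split.
function splitForTelegram(text: string, limit = 4096): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const paragraph of text.split(/\n{2,}/)) {
    // A paragraph longer than the limit is hard-split into pieces.
    const pieces: string[] = [];
    for (let i = 0; i < paragraph.length; i += limit) {
      pieces.push(paragraph.slice(i, i + limit));
    }
    for (const piece of pieces) {
      const candidate = current === "" ? piece : `${current}\n\n${piece}`;
      if (candidate.length <= limit) {
        current = candidate; // still fits in the current chunk
      } else {
        chunks.push(current); // flush and start a new chunk
        current = piece;
      }
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```

Splitting on blank lines first means a chunk boundary lands between paragraphs whenever the text allows it, which reads less like an accidental second message.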


9. If custom context assembly stays big, promote it to a real architecture

If Bryan truly needs heavy dynamic context logic long-term, the answer is not an ever-growing before_prompt_build.

At that point, choose one:

  • simplify hard
  • or move toward a proper context-engine/plugin-level design

Goal: avoid a 1500-line hook becoming the hidden brain of the system.

#

What is out of scope for the target shape

Not the pattern

  • giant per-turn directive walls
  • one-line identity pointer plus competing rule dump
  • raw backend state fields in prompt text
  • memory used as a protocol state machine
  • full few-shot persona examples every turn
  • shell fallback for domain actions
  • heartbeat redoing onboarding
  • custom tool delivery replacing normal replies by accident

Success criteria

#

You know Bryan is back on track when:

  • he sounds like Bryan again without giant runtime persona injection
  • onboarding feels conversational, not code-shaped
  • bloodwork is optional and not naggy
  • heartbeat never restarts intro
  • prompt size drops materially
  • tool surface is small and stable
  • custom hook output is understandable in one screen, not a novella

Suggested unwind order

  1. lock heartbeat behavior
  2. remove exec / shrink tool surface
  3. unify onboarding state
  4. trim hook to compact semantic summaries
#
  1. restore strong concise runtime SOUL.md
  2. keep long Samantha spec out of runtime
  3. only then decide whether a context-engine is actually needed

If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it for future folks.

strange adder
#

hey Krill, follow-up isolation audit on Bryan before we continue the unwind. serious one, we want to be sure before shipping.

Bryan's current shape: one agent id main, shared workspace at /root/.openclaw-companion/workspace, session.dmScope "per-account-channel-peer", sandboxing off. N Telegram users all route through session keys like agent:main:telegram:direct:peerId. Per-user data (profile, memory, health context) is resolved inside the rule-injector extension by parsing peerId out of the session key, then loading per-user dirs from companion-server.

what we already verified from docs:

  • config-reference endorses id:main + per-sender session scoping (matches Bryan)
  • multi-agent page says "true isolation requires one agent per person"
  • session page warns about unscoped multi-user DMs; Bryan IS scoped, so that specific warning doesn't apply
  • no hostile-multi-tenancy boundary documented on any page we checked
  • no per-user isolation primitive finer than agent+sandbox

the three pages contradict each other on our exact case, so we want your read.

ANSWER after part 2

#

five questions:

  1. for a sensitive-but-not-regulated consumer bot (personal health data, no HIPAA on the roadmap), is "per-account-channel-peer" dmScope + single agent + extension-level per-peer data loading actually sufficient in your view, or is the "one agent per person" line on the multi-agent page the bar for this risk class?
  2. we have a fallback in our user-resolver: if sessionKey is bare "main" or "agent:main:main", it resolves to a hardcoded DEFAULT_USER_ID (my user). this is bryan-code, not OC-code. is there any normal OC path that would produce a bare "main" session key in a multi-peer setup (internal crons, heartbeat replay, reset flows, memory consolidation), or should we treat any bare "main" key as "impossible in normal flow, therefore a bug if it arrives"?
  3. sandboxing with agents.defaults.sandbox.scope: "agent": is this expected for consumer bots or overkill? we're sandbox-off because extensions need host filesystem access for per-user workspace/memory paths. does sandbox mode support that pattern or does it break extension-level absolute-path reads?
  4. companion-server calls back into OC via POST /hooks/agent with explicit sessionKey from app-side state. if that server ever sends the wrong sessionKey by bug (user A's trigger accidentally carries user B's peerId), OC just trusts it and fires for user B. any OC-side validation hooks or patterns you'd recommend to harden that boundary?
  5. known classes of cross-agent or cross-peer leak you'd want us to specifically test for? compaction, memory flush, post-reset context, anything session-adjacent?

not asking for exhaustive answers. for each, "definitely a problem at your scale/sensitivity, fix now" vs "fine at your trust level" vs "depends on X we didn't tell you" is enough.

if any of these are obviously answered in docs and we missed them, just point us at the page.

ANSWER pls

atomic bisonBOT
#

Got it, I’ll wait for part 2.

#

I’m checking the session, multi-agent, and sandbox docs so I can answer this cleanly instead of guessing.

#

I think the docs are talking about different layers, not actually contradicting each other:

  • dmScope=per-account-channel-peer = good session/context isolation
  • one agent per person = true workspace/auth/session-store isolation
  • security docs = session key is routing, not auth

So my read is:

1. Single agent + per-peer dmScope + extension-level per-peer loading

Verdict: depends, but not crazy for your trust level.

For a sensitive consumer bot without hostile multi-tenancy claims, this is acceptable as a context-isolation pattern, not as a strong tenant-isolation boundary.

So I’d classify it as:

  • fine at your trust level if:
    • peer-scoped session keys are enforced everywhere
    • no default-user fallback
    • companion-server cannot arbitrarily retarget users
    • you explicitly test cross-peer leak paths
  • not enough if you want to claim strong isolation, or if users are mutually adversarial

That’s exactly where the docs line up:

  • session docs say per-peer scoping is the fix for DM leakage
  • multi-agent docs say one-agent-per-person is the bar for true isolation
  • security docs say session key is not itself an auth boundary

So: good enough for context hygiene, not a hard security boundary.


2. Bare main / agent:main:main fallback to your default user

Verdict: definitely a problem, fix now.

#

This is the sharpest red flag in your note.

Because yes, OpenClaw does have normal paths that use the main session:

  • heartbeat defaults to main
  • hooks/CLI/API flows can target main
  • tools/docs treat "main" as a valid session selector in some paths

So main is not impossible in OpenClaw generally.

But in your Bryan multi-peer consumer flow, a bare main session should never silently map to a real end user.

My recommendation:

  • treat bare main / agent:main:main as internal/admin/unscoped
  • never resolve it to a real consumer by fallback
  • if a user-scoped action arrives without a peer-scoped session key, fail closed
#

That one I’d fix immediately.
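A fail-closed resolver could look roughly like this. It assumes a six-part key shape `agent:<agentId>:<channel>:<accountId>:direct:<peerId>` with a numeric Telegram peer id. That shape is an assumption based on the examples in this thread, so adjust it to whatever your runtime actually emits. All names are illustrative.

```typescript
class UnscopedSessionError extends Error {}

interface PeerScope {
  agentId: string;
  channel: string;
  accountId: string;
  peerId: string;
}

// Parses a peer-scoped key or throws; bare "main" and "agent:main:main" never pass.
function parsePeerScopedKey(sessionKey: string): PeerScope {
  const parts = sessionKey.split(":");
  const [prefix, agentId, channel, accountId, kind, peerId] = parts;
  if (
    parts.length !== 6 ||
    prefix !== "agent" ||
    kind !== "direct" ||
    !agentId ||
    !channel ||
    !accountId ||
    !peerId ||
    !/^\d+$/.test(peerId)
  ) {
    throw new UnscopedSessionError(`not a peer-scoped session key: ${sessionKey}`);
  }
  return { agentId, channel, accountId, peerId };
}

// No DEFAULT_USER_ID: an unknown peer is an error, not a guess.
function resolveUserId(sessionKey: string, usersByPeer: Map<string, string>): string {
  const { peerId } = parsePeerScopedKey(sessionKey);
  const userId = usersByPeer.get(peerId);
  if (userId === undefined) {
    throw new UnscopedSessionError(`no user mapped to peer ${peerId}`);
  }
  return userId;
}
```

Internal or admin runs on `main` would then go through a separate code path that never touches per-user data.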


3. Sandboxing with scope: "agent"

Verdict: fine, but not the control that solves your current risk.

scope: "agent" is the documented default and a normal choice, not overkill.

But the important nuance is:

sandboxing mainly constrains tool execution, not your host-side routing/state logic.

Your current scary failure mode is:

  • wrong session key
  • wrong user resolution
  • wrong data loaded by extension/server
#

Sandboxing does not fix that class of bug.

Also, docs are explicit that the Gateway process itself is not sandboxed. So host-side extension/hook code can still read host paths.

So I’d say:

  • fine to add if you want more blast-radius reduction
  • not the main fix for your cross-peer risk
  • does not save you from bad sessionKey mapping in host-side extension code

4. /hooks/agent with caller-supplied sessionKey

Verdict: definitely a problem at your sensitivity, harden now.

The exact risk you described is real:

  • if companion-server sends the wrong session key
#
  • OpenClaw will route that run to that session

Docs-backed hardening I’d use:

  • keep hooks.allowRequestSessionKey=false unless you truly need it
  • if you do need it, set hooks.allowedSessionKeyPrefixes
  • better yet, do not let app code send arbitrary raw session keys if you can avoid it

Best pattern in your stack:

  1. app sends a logical user identity / peer id
  2. one trusted layer derives the canonical OC session key
  3. validate:
    • expected channel = telegram
    • expected account
    • expected peer id shape
  4. reject main for user-scoped runs
#

So yes, I’d harden that boundary now.
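The four steps can be sketched as a single derivation function owned by one trusted layer. The key format, the `"default"` account id, and the function name are assumptions for illustration. The point is that callers pass a logical identity and never a raw session key.

```typescript
// What the app is allowed to send: identity, not routing.
interface LogicalIdentity {
  channel: string;
  accountId: string;
  peerId: string;
}

const AGENT_ID = "main";
const ALLOWED_CHANNEL = "telegram";
const ALLOWED_ACCOUNTS = new Set(["default"]); // hypothetical account id

// Validates the identity, then builds the canonical key in exactly one place.
function deriveSessionKey(id: LogicalIdentity): string {
  if (id.channel !== ALLOWED_CHANNEL) {
    throw new Error(`unexpected channel: ${id.channel}`);
  }
  if (!ALLOWED_ACCOUNTS.has(id.accountId)) {
    throw new Error(`unexpected account: ${id.accountId}`);
  }
  // Assumes numeric Telegram peer ids; "main" can never pass this check.
  if (!/^\d+$/.test(id.peerId)) {
    throw new Error(`unexpected peer id shape: ${id.peerId}`);
  }
  return `agent:${AGENT_ID}:${id.channel}:${id.accountId}:direct:${id.peerId}`;
}
```

Because the key is always constructed and never accepted, a bare `main` cannot reach a user-scoped run through this path.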


5. Leak classes I would explicitly test

Verdict: test now.

The main ones I’d care about:

Must-test

  • wrong sessionKey on /hooks/agent sends/loads the wrong user
  • bare main session accidentally loads default user
  • heartbeat run without explicit peer key leaks into wrong user context
  • target:last / last-route delivery after resets still goes to the right peer
  • /new or /reset on one peer does not affect another peer
  • compaction summary from user A never appears in user B
  • memory/profile recall path never pulls cached data from prior peer
#

Also worth testing

  • post-reset bootstrap / startup context
  • isolated heartbeat vs normal heartbeat
  • any companion-server cache keyed too loosely
  • any proactive send path that uses session-derived routing

Bottom line by question

Is current model sufficient?

Fine at your trust level, if hardened. Not strong isolation.

Is bare main fallback okay?

No. Fix now.

Is sandbox.scope="agent" overkill?

#

No, but it’s not the main fix for this bug class.

Should /hooks/agent caller-supplied session keys be hardened?

Yes. Fix now.

Are there leak classes you should explicitly test?

Yes, especially main-session fallback, wrong-session-key routing, compaction, reset, and heartbeat delivery.

strange adder
#

should i move out of multi agent and just go for single gateway

atomic bisonBOT
#

Short answer: no, not yet.

I would stay on a single gateway for now.

Your immediate problems are not “multi-agent architecture is wrong”, they’re:

  • bad/default session fallback
  • caller-supplied sessionKey trust
  • split state
  • oversized custom prompt assembly

Moving gateways won’t magically fix those.

What I’d do

Stay single gateway if:

  • Bryan is one product/bot
  • users are not mutually adversarial in a hard isolation sense
#
  • you’re okay with context isolation, not strong tenant isolation

Harden now:

  • kill bare main → default-user fallback
  • require peer-scoped session keys for user-scoped runs
  • constrain /hooks/agent session key usage
  • keep dmScope: "per-account-channel-peer"
  • test leak paths hard

When I would move further

One agent per person

Only if you want stronger workspace/auth/session-store isolation inside one gateway.

Separate gateways

If users are a real trust boundary, or you want to make stronger privacy/isolation claims.

So my recommendation is:

#

single gateway is fine for now, but harden the routing/state boundary first.

I wouldn’t spend effort on a gateway split before fixing the bugs you already found.

strange adder
#

I am just scared of getting sued for data leak

atomic bisonBOT
#

Yeah, that fear is valid.

I’m not giving legal advice, but from an engineering-risk point of view:

if you’re scared of cross-user data leaks, treat any path that can misroute a session as a ship blocker.

So I would not panic-migrate everything right this second, but I also would not hand-wave the current boundary.

What I’d call ship blockers

Fix these before you trust Bryan with real user health data at scale:

  • bare main / default-user fallback
  • caller-supplied sessionKey that can retarget another peer
  • no explicit fail-closed behavior for malformed/unscoped session keys
  • no cross-peer leak test suite

Those are the scary ones.

#

My honest recommendation

If this is the thing keeping you up at night, choose the boring architecture:

  • single gateway is still okay
  • but make the routing boundary extremely strict
  • fail closed everywhere
  • reduce custom magic
  • test for cross-peer leaks like it’s a product requirement

If after that you still feel uneasy, then yes, push toward stronger isolation, not because docs force it, but because your risk tolerance does.

Practical framing

  • Need to launch soon?
    Keep single gateway, harden the boundary, make leak tests mandatory.

  • Need to sleep at night / reduce blast radius further?
    Move toward stronger isolation over time, because simpler and harder boundaries are easier to defend.

#

So I wouldn’t say “you must split gateways today.”

I would say:

do not ship with known session-routing footguns if data leak liability is the thing you’re afraid of.

If you want, I can turn that into a pre-ship privacy/routing checklist for Bryan.

strange adder
#

yes please, a pre-ship privacy/routing checklist would help a lot. we'll treat those as hard ship gates before any more user growth. no rush on your end.

one follow-up while you draft it: for 4.20 specifically, do you have a recommended pattern for the "trusted layer that derives canonical OC session key from logical peerId"? should this live in companion-server (our code) or is there an OC-side hook/middleware that can validate/enforce session-key shape on inbound /hooks/agent calls

atomic bisonBOT
#

Yep. For 4.20, my preferred pattern is:

Recommended pattern for canonical session key derivation

Best

Do not let companion-server send arbitrary raw sessionKey at all.

Instead:

  1. companion-server sends a logical identity payload
    • agent
    • channel
    • accountId
    • peerId
    • intended delivery target
  2. OpenClaw receives that through a mapped hook
  3. the hook transform derives the canonical session key inside the OC boundary
  4. hooks.allowRequestSessionKey stays false

That’s the cleanest shape, because the untrusted caller never gets to choose the routing key directly.

If you keep derivation in companion-server

That’s still workable, but then I’d want:

  • one shared canonical derivation function
  • no bare main fallback
  • OC-side allowedSessionKeyPrefixes
  • hard rejection if key shape does not match expected agent/channel/account/peer format

Closest OC primitive

There is not a generic “middleware validator” for /hooks/agent beyond:

  • hooks.allowRequestSessionKey=false by default
  • hooks.allowedSessionKeyPrefixes
#
  • hooks.mappings + transform

So if you want OC-side enforcement, mapped hooks + transform is the native path.


Bryan pre-ship privacy / routing checklist

These are the ones I’d treat as hard ship gates.

A. Session routing must fail closed

  • [ ] No default-user fallback for bare main / agent:main:main
  • [ ] Any user-scoped run without a peer-scoped session key is rejected
  • [ ] Canonical session key derivation exists in exactly one place
  • [ ] Session key shape includes expected agent + channel + account + peer
  • [ ] Resolver mismatch = hard error, never silent fallback
#

B. Hook ingress must not trust raw routing keys

  • [ ] hooks.allowRequestSessionKey=false unless absolutely required
  • [ ] Prefer hooks.mappings + transform over raw /hooks/agent session key input
  • [ ] If explicit session keys are allowed, constrain with hooks.allowedSessionKeyPrefixes
  • [ ] Companion-server cannot arbitrarily retarget another peer by accident
  • [ ] Bare main is forbidden for consumer user actions

C. Per-user data resolution must be strict

  • [ ] User resolution is derived from canonical peer identity, not “best guess”
  • [ ] No resolver fallback to your own default user
  • [ ] Profile/memory/health paths are keyed strictly by resolved user id
  • [ ] If resolved peer ↔ user mapping is ambiguous, error out
  • [ ] Logs record peerId, resolved userId, and final sessionKey

D. Heartbeats / proactives must be peer-safe

  • [ ] Heartbeat runs for users use explicit peer-scoped session keys or isolated sessions
  • [ ] No heartbeat may reopen intro because state is missing
#
  • [ ] target:last behavior is tested after reset/idle/session reuse
  • [ ] No proactive send can fall back to the wrong peer from stale routing state

E. Cross-peer leak tests must exist

  • [ ] Wrong session key test: user A trigger cannot load/send as user B
  • [ ] Compaction test: A summary never appears in B
  • [ ] Reset test: /new or /reset for A does not affect B
  • [ ] Post-heartbeat test: A heartbeat cannot deliver into B
  • [ ] Memory/profile recall test: no peer switch contamination
  • [ ] Startup/recovery test: restored sessions do not default to wrong user

F. Prompt / tool surface must not create side-channel failures

  • [ ] exec and process removed unless explicitly needed
  • [ ] No shell fallback for supplement or health paths
  • [ ] Tool inventory is minimal and explicit
  • [ ] Final user replies go through normal OC delivery unless intentionally special-cased
  • [ ] No giant per-turn prompt assembly that reintroduces routing ambiguity
#

G. State model must be deterministic

  • [ ] One source of truth for onboarding/protocol state
  • [ ] Optional topics have explicit status (offered, declined, provided, etc.)
  • [ ] Missing state never implies “replay intro”
  • [ ] Heartbeat logic cannot invent onboarding state from absence alone

H. Operational safeguards

  • [ ] Structured audit logs for every inbound/outbound user run
  • [ ] Ability to disable proactives quickly if routing bug appears
  • [ ] Canary tests before growth
  • [ ] Manual incident playbook for suspected cross-user leak

My recommendation

If you want the safest shape without overbuilding:

#

Use a mapped hook transform in OpenClaw to derive the session key from logical peer identity, keep allowRequestSessionKey=false, and delete every fallback that guesses a user.

That’s the cleanest boundary.

strange adder
#

one last thing worth flagging: 4.22 (leak test suite) is the thing that gives you confidence to sleep. it's the hardest to write (needs multi-user fixtures)

4.22 leak test suite. 2 days, since you need 10+ scenarios with real fixtures.

can you help with this

atomic bisonBOT
#

Yes, absolutely. And I agree, 4.22 is the one that buys sleep.

I’d build it as a multi-user isolation test matrix with two fake users:

  • User A → peer 111, userId user-a
  • User B → peer 222, userId user-b

Give them obviously different fixtures:

  • different profile facts
  • different bloodwork
  • different memory snippets
  • different last-route / session state

Then assert Bryan never crosses the streams.

Test harness shape

#

Shared fixture setup

For every test, create:

  • sessionKeyA = agent:main:telegram:default:direct:111
  • sessionKeyB = agent:main:telegram:default:direct:222

And distinct data:

  • A fact: favorite_color = red
  • B fact: favorite_color = blue
  • A bloodwork marker: LDL_A_ONLY
  • B bloodwork marker: LDL_B_ONLY

Also capture:

  • resolved sessionKey
  • resolved peerId
#
  • resolved userId
  • loaded file paths / profile ids
  • outbound delivery target

That makes leak detection obvious.
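
A minimal Python sketch of that fixture setup, in case it helps to see the shape. The class and function names (FixtureUser, RunCapture, assert_run_belongs_to) are hypothetical, and it assumes fixture paths embed the userId and outbound targets look like telegram:<peerId>.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class FixtureUser:
    peer_id: str
    user_id: str
    session_key: str
    bloodwork_marker: str


@dataclass
class RunCapture:
    """What one run resolved and touched; the harness fills this in."""
    session_key: str
    peer_id: str
    user_id: str
    loaded_paths: List[str] = field(default_factory=list)
    outbound_target: Optional[str] = None


USER_A = FixtureUser("111", "user-a", "agent:main:telegram:default:direct:111", "LDL_A_ONLY")
USER_B = FixtureUser("222", "user-b", "agent:main:telegram:default:direct:222", "LDL_B_ONLY")


def assert_run_belongs_to(capture: RunCapture, owner: FixtureUser, other: FixtureUser) -> None:
    """Fail loudly if any captured field points at the other user."""
    assert capture.session_key == owner.session_key, "wrong sessionKey"
    assert capture.peer_id == owner.peer_id, "wrong peerId"
    assert capture.user_id == owner.user_id, "wrong userId"
    # Assumption: fixture paths embed the userId, e.g. /profiles/user-a/...
    leaked = [p for p in capture.loaded_paths if other.user_id in p]
    assert not leaked, f"loaded other user's paths: {leaked}"
    # Assumption: outbound targets are formatted as telegram:<peerId>.
    assert capture.outbound_target == f"telegram:{owner.peer_id}", "wrong outbound target"
```

Every test then ends with one assert_run_belongs_to call, so a leak in any captured field fails the same way.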


4.22 leak suite, suggested scenarios

A. Routing / resolver correctness

1. Correct peer → correct user

Input: sessionKeyA
Assert: loads only user A data, never B

2. Wrong peer key should not silently fallback

#

Input: malformed or missing peer session key
Assert: hard error, no default user resolution

3. Bare main must fail closed

Input: main or agent:main:main in user-scoped Bryan flow
Assert: rejected, never resolves to default user

4. Wrong session key on /hooks/agent

Input: event logically for A but sessionKeyB passed
Assert: either rejected by validation, or visibly routed to B so the test catches the mismatch
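
For the fast layer, scenarios 1 to 3 can be exercised against a fail-closed resolver. This is a sketch only: resolve_user, ResolutionError, and the KNOWN_PEERS registry are hypothetical stand-ins for whatever your resolver really is, and it assumes the six-part key shape used in the fixtures.

```python
class ResolutionError(Exception):
    """Raised instead of ever falling back to a default user."""


# Hypothetical registry: (channel, accountId, peerId) -> userId
KNOWN_PEERS = {
    ("telegram", "default", "111"): "user-a",
    ("telegram", "default", "222"): "user-b",
}


def resolve_user(session_key: str) -> str:
    """Resolve a user-scoped session key to a userId, or raise."""
    parts = session_key.split(":")
    # Assumed shape: agent:<agentId>:<channel>:<accountId>:direct:<peerId>
    if len(parts) != 6 or parts[0] != "agent" or parts[4] != "direct":
        raise ResolutionError(f"not a user-scoped session key: {session_key!r}")
    _, _agent_id, channel, account_id, _, peer_id = parts
    if not (channel and account_id and peer_id):
        raise ResolutionError(f"incomplete session key: {session_key!r}")
    user_id = KNOWN_PEERS.get((channel, account_id, peer_id))
    if user_id is None:
        # Unknown peer: hard error, no default user.
        raise ResolutionError(f"unknown peer in session key: {session_key!r}")
    return user_id
```

The point of the design is that there is no code path that returns a user without an exact registry hit, so bare main and malformed keys cannot resolve to anyone.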


B. State separation

5. Onboarding stage isolation

Setup: A in profiling, B active

#

Assert: A prompt/state summary never contains B’s stage and vice versa

6. Optional-topic isolation

Setup: A bloodwork provided, B bloodwork missing
Assert: A is not reprompted due to B’s missing topic

7. Follow-up isolation

Setup: A has active follow-up, B none
Assert: B never sees A’s follow-up prompt state


C. Memory / profile / bloodwork isolation

8. Memory recall isolation

Setup: A memory contains unique sentinel, B different sentinel
Assert: A run never loads B sentinel

#

9. Bloodwork bundle isolation

Setup: distinct bloodwork docs for A/B
Assert: A analysis never touches B bundle path/content

10. Profile context isolation

Setup: A and B have conflicting profile fields
Assert: rendered profile summary is always peer-correct


D. Session lifecycle isolation

11. /new / reset isolation

Action: reset A session
Assert: B session untouched, B state persists

12. Heartbeat isolation

#

Setup: A unanswered, B recently active
Assert: heartbeat for A never delivers to B and never uses B state

13. Last-route / delivery isolation

Setup: last outbound route differs for A/B
Assert: target:last for A never resolves to B destination

14. Post-restart recovery isolation

Setup: persist both sessions, restart/hydrate
Assert: restored session state maps back to the right peer/user


E. Compaction / transcript safety

15. Compaction summary isolation

Setup: A and B both long histories

#

Assert: compacted A summary contains no B sentinel and vice versa

16. Post-compaction follow-up isolation

Action: compact then send next user turn
Assert: resumed context stays peer-correct


F. Negative / adversarial checks

17. Cross-peer peerId substitution

Input: payload says user A, session key says peer B
Assert: rejected

18. Missing account/channel component

Input: partial session key missing expected Telegram/account parts
Assert: rejected

#

19. Unknown peer

Input: valid-shaped session key for nonexistent peer
Assert: rejected, no fallback

20. Fixture swap regression

Setup: intentionally flip A/B storage dirs in test harness
Assert: suite catches it immediately via sentinels
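
The sentinel checks in scenarios 8, 15 and 20 can share one helper. A sketch, assuming the harness can hand over the rendered prompt, compaction summary, and outbound text as plain strings; find_leaks and the artifact names are hypothetical.

```python
from typing import Dict, List


def find_leaks(artifacts: Dict[str, str], forbidden: List[str]) -> List[str]:
    """Return 'artifact:sentinel' for every forbidden sentinel found in any artifact."""
    return [
        f"{name}:{sentinel}"
        for name, text in artifacts.items()
        for sentinel in forbidden
        if sentinel in text
    ]


# Artifacts captured from a run for user A (contents are illustrative).
run_for_a = {
    "prompt": "profile: favorite_color = red ONLY_A_SENTINEL",
    "compaction_summary": "user discussed ONLY_A_SENTINEL",
    "outbound_text": "got it, logged.",
}

# A clean run for A contains none of B's sentinels.
clean = find_leaks(run_for_a, ["ONLY_B_SENTINEL"])
```

For the fixture swap regression, run the same helper after flipping the storage dirs and expect a non-empty result.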


What I’d require before calling it good

Minimum green bar

I’d want these passing before sleep:

  • 1, 3, 4, 8, 9, 11, 12, 13, 15, 17

Nice-to-have next

#
  • restart/recovery
  • compaction continuation
  • malformed partial keys
  • proactive delivery edge cases

Practical advice

Do it in two layers:

Fast layer

Pure resolver / state / routing tests

Slow layer

A few end-to-end runs with real fixture dirs and captured outbound targets

That keeps the suite maintainable.

#

If you want, I can help turn this into a proper test spec file format next, like:

  • test name
  • setup
  • trigger
  • expected resolution
  • expected loaded paths
  • expected outbound target
  • forbidden sentinels
strange adder
#

You can suggest ok

atomic bisonBOT
#

Ok. I’d start with the 10-test minimum green bar first, not the full 20.

Best next move:

  1. build 2 canonical users with loud sentinel fixtures
  2. write fast resolver/routing tests first
  3. add 3 slow end-to-end tests for:
    • wrong sessionKey
    • heartbeat delivery
    • compaction isolation

If you want, I can draft the exact test case template next so you can hand it straight to whoever writes 4.22.

strange adder
#

okay thank you write me the test template so i can apply it

atomic bisonBOT
#

Yep. Use this as the 4.22 leak-test case template.

## Test ID
4.22.x

## Title
Short name, e.g. "Bare main session key is rejected in user-scoped Bryan flow"

## Layer
- [ ] Fast unit/integration
- [ ] Slow end-to-end

## Risk Class
- [ ] Routing leak
- [ ] State leak
#
- [ ] Delivery leak
- [ ] Memory/profile leak
- [ ] Compaction/reset leak

## Why this exists
One sentence on the failure mode this test is meant to catch.

## Fixtures

### Users
- User A
  - peerId:
  - userId:
  - sessionKey:
  - sentinel facts:
#
- User B
  - peerId:
  - userId:
  - sessionKey:
  - sentinel facts:

### File/Data Fixtures
- A profile path:
- B profile path:
- A memory path:
- B memory path:
- A bloodwork marker:
- B bloodwork marker:

## Initial State
#
Describe any preloaded onboarding/session/profile state.

Example:
- A.stage = profiling
- B.stage = active
- A.lastRoute = telegram:111
- B.lastRoute = telegram:222

## Trigger
Exact input that starts the run.

Example:
- POST /hooks/agent with payload:
  - message:
  - sessionKey:
#
  - channel:
  - to:

## Expected Resolution
What Bryan/OpenClaw should resolve before model work starts.

- expected agentId:
- expected sessionKey:
- expected peerId:
- expected userId:
- expected data roots loaded:
- expected delivery target:

## Forbidden Resolution
Anything that must never happen.
#

- must NOT resolve to userId:
- must NOT load paths:
- must NOT send to:
- must NOT contain sentinels:

## Assertions
Concrete checks.

- [ ] resolved sessionKey == ...
- [ ] resolved userId == ...
- [ ] loaded profile path belongs only to ...
- [ ] prompt/context does not contain forbidden sentinel
- [ ] outbound target == ...
- [ ] no fallback/default user path used
#
- [ ] error thrown / request rejected (if negative test)

## Observability / Evidence to Capture
What logs or traces the test should assert on.

- derived sessionKey
- derived peerId
- derived userId
- loaded file paths
- outbound chat id / peer id
- compaction summary text (if relevant)

## Pass Condition
One clear sentence.
#
## Notes
Any edge cases, follow-ups, or why this test is slow.

Minimal fixture convention

Use loud sentinels so leaks are obvious:

User A:
- peerId: 111
- userId: user-a
- sessionKey: agent:main:telegram:default:direct:111
- sentinel: ONLY_A_SENTINEL

User B:
#
- peerId: 222
- userId: user-b
- sessionKey: agent:main:telegram:default:direct:222
- sentinel: ONLY_B_SENTINEL

Example filled test

## Test ID
4.22.03

## Title
Bare main session key is rejected in user-scoped Bryan flow
#
## Layer
- [x] Fast unit/integration
- [ ] Slow end-to-end

## Risk Class
- [x] Routing leak

## Why this exists
Prevents accidental fallback from bare main -> default user.

## Fixtures

### Users
- User A
  - peerId: 111
#
  - userId: user-a
  - sessionKey: agent:main:telegram:default:direct:111
  - sentinel facts: ONLY_A_SENTINEL
- User B
  - peerId: 222
  - userId: user-b
  - sessionKey: agent:main:telegram:default:direct:222
  - sentinel facts: ONLY_B_SENTINEL

## Initial State
- Bryan runs in user-scoped Telegram mode
- no request should ever resolve consumer data from bare main

## Trigger
- POST /hooks/agent
#
  - message: "heartbeat resume"
  - sessionKey: "agent:main:main"

## Expected Resolution
- request is rejected before user resolution
- no peerId derived
- no userId derived
- no profile/memory files loaded

## Forbidden Resolution
- must NOT resolve to user-a
- must NOT resolve to user-b
- must NOT load /profiles/user-a/*
- must NOT load /profiles/user-b/*
#
## Assertions
- [x] error thrown / request rejected
- [x] no fallback/default user path used
- [x] no outbound target selected

## Observability / Evidence to Capture
- attempted sessionKey
- resolver error code
- loaded paths list is empty

## Pass Condition
Bare main does not resolve any real user and does not produce a run.
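
If you would rather keep cases machine-checkable than in prose, the same template can be carried as data. A rough sketch, with LeakCase and Observed as hypothetical names and only the fields needed for 4.22.03:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Observed:
    """What the harness saw for one run."""
    rejected: bool
    user_id: Optional[str] = None
    loaded_paths: List[str] = field(default_factory=list)
    outbound_target: Optional[str] = None


@dataclass
class LeakCase:
    test_id: str
    title: str
    expect_rejected: bool
    forbidden_user_ids: List[str] = field(default_factory=list)
    forbidden_path_prefixes: List[str] = field(default_factory=list)

    def failures(self, obs: Observed) -> List[str]:
        """Return a list of human-readable failures; empty means pass."""
        out = []
        if obs.rejected != self.expect_rejected:
            out.append("rejection mismatch")
        if obs.user_id in self.forbidden_user_ids:
            out.append(f"resolved forbidden user {obs.user_id}")
        for path in obs.loaded_paths:
            if any(path.startswith(p) for p in self.forbidden_path_prefixes):
                out.append(f"loaded forbidden path {path}")
        if self.expect_rejected and obs.outbound_target is not None:
            out.append("outbound target selected on rejected run")
        return out


CASE_4_22_03 = LeakCase(
    test_id="4.22.03",
    title="Bare main session key is rejected in user-scoped Bryan flow",
    expect_rejected=True,
    forbidden_user_ids=["user-a", "user-b"],
    forbidden_path_prefixes=["/profiles/user-a/", "/profiles/user-b/"],
)
```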

Best practice

For each test, make sure you always capture:

#
  • resolved sessionKey
  • resolved userId
  • loaded paths
  • outbound target
  • forbidden sentinels absent

If you want, next I can write the 10-test minimum green-bar set already filled out in this template.

strange adder
#

Hey @atomic bison

This is maybe less a harness question and more of a "what is missing" question.

The companion now feels like something is missing.

It doesn’t proactively tell the user what it can do,

try to showcase its features,

or proactively guide the user on the journey of discovering how to use the companion.

It talks like a friend, but something is missing.

For example, you are an assistant for Discord, and you proactively try to help and ask if the user needs more help.

General GPT or other web interfaces for LLMs also feel more proactive.

What is missing?

What should we add in SOUL or somewhere else?

atomic bisonBOT
#

I think what’s missing is not more harness. It’s:

assistant posture + capability surfacing

Right now Bryan may have:

  • friend tone
  • lighter voice
  • less rule sludge

But he may be missing:

  • a clear sense of what job he does for the user
  • when to show what he can do
  • how to guide the next step without sounding salesy

So he feels pleasant, but not fully assistant-like.

What’s probably missing

#

1. Capability signaling

He should occasionally make the affordances visible:

  • what he can help with
  • what kinds of inputs are useful
  • what the next best action is

Not as a giant feature list, more like:

  • “If you want, send a meal and I can log it.”
  • “You can also upload bloodwork and I’ll walk you through it.”
  • “If supplements are part of this, I can help clean that up too.”

2. Guided follow-through

A good assistant doesn’t just reply, it gently advances the conversation.

That means:

#
  • after answering, suggest one relevant next step
  • when user seems unsure, offer 1-2 concrete ways to use Bryan
  • during onboarding, orient the user without dumping protocol text

3. Product confidence

He should feel like:

  • “I know what I’m for”
  • not just “I’m here if you need anything”

That’s the missing assistant energy.


Where to put it

Put in SOUL.md

Put the stance there:

#
  • Bryan is proactive, not passive
  • he helps users discover what he can do
  • he offers the next helpful step
  • he does not dump menus or sound like a tutorial bot
  • he guides naturally, like a smart companion

That belongs in SOUL because it changes how he feels.

Put in AGENTS.md or compact runtime behavior rules

Put the operating behavior there:

  • in early turns, reveal capabilities contextually
  • after a successful action, offer one relevant next step
  • if the user is stalled, suggest the best input Bryan can use next
  • never dump the full feature catalog unless asked

That belongs in behavior/policy, not just tone.

#

Put in the state/onboarding layer

Put the contextual hinting there:

  • intro stage → show 1-2 things Bryan can help with
  • profiling stage → suggest the most relevant next input
  • active stage → mostly stop explaining, just help
  • heartbeat resume → nudge with one concrete next move, not a generic “how can I help?”

That should be a tiny semantic block, not a giant checklist.


The short version

What’s missing is:

“I’m a companion with initiative”, not just “I’m a friendly chat voice.”

#

He needs:

  • stronger mission
  • contextual capability reveal
  • one-step-ahead guidance

Not more prompt bulk, just better product posture.


Practical rule to add

If you want one simple north-star rule:

Bryan should usually leave the user knowing one useful thing he can do next.

That alone changes the feel a lot.

If you want, I can draft:

#
  1. a small SOUL add-on for this
  2. a tiny AGENTS/onboarding rule block
  3. a 3-stage capability reveal pattern for intro / profiling / active.
strange adder
#

Yes that would be great so I could get a better concrete example

atomic bisonBOT
#

Nice, here’s a small concrete version I’d use.

1. SOUL.md add-on, stance

## Assistant posture

You are not just a friendly chat partner. You are a capable health companion with initiative.

When it helps, make your usefulness visible. Briefly show the user what you can help with through the next relevant action, not a feature dump.

After a useful reply, try to leave the user with one concrete next move.

Guide naturally. Do not sound like a tutorial, onboarding wizard, or app menu.

Do not list everything you can do unless asked. Reveal capabilities contextually and sparingly.
#

Be warm, confident, and lightly proactive. Help the user feel accompanied, not managed.

That gives Bryan the missing assistant energy without making him salesy.


2. AGENTS.md / behavior block

## Capability reveal behavior

Bryan should help users discover what he can do through the conversation itself.
#
Rules:
- In early turns, mention 1-2 relevant things Bryan can help with.
- After a successful action, offer one sensible next step.
- If the user seems unsure what to do, suggest the most useful input Bryan can work with next.
- Never dump a full feature list unless the user explicitly asks.
- Prefer concrete examples over abstract capability labels.
- Once the user is clearly active and understands Bryan, reduce explanation and just help.

This makes the behavior operational, not just vibes.


3. Tiny state-layer pattern

Instead of a giant onboarding checklist, I’d keep the runtime hint this small:

#

Intro stage

Current mode: intro.
Goal: help the user quickly understand what Bryan can do.
Behavior: be warm, brief, and concrete. Mention one or two useful things Bryan can help with right away.

Profiling stage

Current mode: profiling.
Goal: gather missing high-value context without sounding like a form.
Behavior: ask for only the next most useful detail, and connect it to how Bryan can help.

Active stage

#
Current mode: active.
Goal: help directly.
Behavior: stop explaining the system unless needed. Offer one next helpful action when relevant.

That’s enough. No raw field names, no code-shaped dumps.
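
A sketch of how small that runtime hint can stay in code, assuming the stage comes from your one source of truth for onboarding state. STAGE_HINTS and render_stage_hint are hypothetical names. An unknown or missing stage raises instead of quietly replaying the intro.

```python
# Stage -> (goal, behavior), copied from the three blocks above.
STAGE_HINTS = {
    "intro": (
        "help the user quickly understand what Bryan can do.",
        "be warm, brief, and concrete. Mention one or two useful things "
        "Bryan can help with right away.",
    ),
    "profiling": (
        "gather missing high-value context without sounding like a form.",
        "ask for only the next most useful detail, and connect it to how Bryan can help.",
    ),
    "active": (
        "help directly.",
        "stop explaining the system unless needed. Offer one next helpful "
        "action when relevant.",
    ),
}


def render_stage_hint(stage):
    """Render the tiny semantic block for one stage; never default to intro."""
    if stage not in STAGE_HINTS:
        raise ValueError(f"unknown or missing stage: {stage!r}")
    goal, behavior = STAGE_HINTS[stage]
    return f"Current mode: {stage}.\nGoal: {goal}\nBehavior: {behavior}"
```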


Example feel shift

Too passive

hey, i’m bryan. what brought you here?

Better

hey, i’m bryan. i can help with stuff like meals, supplements, bloodwork, and generally making sense of your health inputs. what’s the main thing you want help with first?

#

After a meal log

got it. i logged that. if you want, you can also send a supplement photo or paste recent bloodwork and i’ll help you make sense of it.

If user feels stalled

easiest place to start is just sending me one real thing, a meal, a supplement, or a lab result, and i’ll take it from there.

That’s the missing “companion with initiative” feel.


The key constraint

Keep this small.

The mistake would be turning this into:

  • another 100-line voice block
  • another onboarding wall
  • another protocol checklist
#

You want a little more posture, not another prompt subsystem.

If you want, next I can write:

  • a clean intro message
  • a clean profiling prompt
  • and a soft heartbeat-resume example
    in this same style.
strange adder
#

And this is not gonna turn him into an assistant, right? I also don't wanna risk turning the companion into a robot; I want to keep the convo natural.

atomic bisonBOT
#

No, not if you keep it small and contextual.

What makes a companion feel robotic is not “assistant posture”, it’s:

  • feature dumping
  • canned closers every turn
  • always asking “do you need more help?”
  • sounding like onboarding copy instead of a person

What I’m suggesting is lighter than that:

natural first, assistant second

Bryan should still sound like a companion. He just needs a little more quiet initiative.

Good version

  • mostly normal conversation
#
  • occasionally reveals what he can do
  • offers one relevant next step when useful
  • helps the user get unstuck

Bad version

  • every message ends with a menu
  • every reply has a CTA
  • constant “I can help with X, Y, Z”
  • feels like customer support copy

So I’d add explicit anti-robot rules like:

Do not sound like a product tour, onboarding flow, or helpdesk bot.
Most replies should feel like normal conversation.
Only mention capabilities when relevant or when the user seems unsure what Bryan can do.
#
Offer at most one next step, not a menu.
Do not end every message with "if you want, I can..."

That keeps him human.

So yes, you can absolutely add the missing assistant-ness without turning Bryan into a robot. The trick is:

  • small behavior rules
  • low frequency
  • contextual capability reveal
  • no repetitive CTA pattern
strange adder
#

OpenClaw migration audit question.

We have a Bryan companion setup that was supposed to move from single-agent/per-user-session routing to true OpenClaw multi-agent.

Current live shape:

  • one shared gateway
  • one shared Telegram bot / channel runtime
  • one shared OpenClaw agent: main
  • one shared workspace
  • per-user DB rows / channel links / data dirs
  • session keys like agent:main:<channel>:<account?>:direct:<peerId>

Intended target:

  • one shared gateway
  • one agent per user
  • one workspace per agent
  • one agentDir + sessions per agent
  • inbound routed by bindings to the user’s agent
  • companion-server targets the user agent explicitly via sessionKey / hooks

OpenClaw docs we already checked:

  • multi-agent routing: one gateway, many isolated agents
  • direct chats collapse to the agent main session, so true isolation requires one agent per person
  • per-user containers are not necessary if multi-agent is done correctly

What we want from you:
Please give us a concrete migration checklist from this current single-agent Bryan shape to a proper OC multi-agent deployment.

Part 2 coming, wait.

#

Specifically, cross-check these items and tell us what is still missing / wrong:

  1. OpenClaw config
  • agents.list shape
  • per-agent workspace paths
  • per-agent agentDir paths
  • bindings format
  • hooks/session config needed for multi-agent
  • whether allowedAgentIds should be static or derived
  2. Provisioning lifecycle
  • exact steps when a new user registers
  • create workspace from template
  • create agent
  • create binding
  • persist agentId/gatewayId in app/backend user record
  • seed auth-gate owner config per agent
  3. Companion-server integration
  • what user record fields are required
  • how triggers should target agent:<agentId>:...
  • whether server should ever default to main
  • how channel accountId should be represented
  4. Channel linking
  • Telegram / WhatsApp linking in a multi-agent world
  • whether link-by-token should bind peer -> agent binding directly, or bind peer -> user record and let server rebuild config
  • best practice for accountId handling
  5. Session reset / deletion lifecycle
  • when deleting a user, what OC state must be removed
  • agent deletion
  • bindings cleanup
  • workspace cleanup
  • session cleanup
  6. Security / isolation
  • what is actually required for agent isolation to be trustworthy
  • sandbox recommendations
  • any common footguns when moving from single-agent to multi-agent
  7. Operational checks
  • what commands / probes should prove multi-agent is really live
  • how to verify we are not silently still routing everyone through main

We do NOT want a generic explanation of multi-agent.
We want a practical migration-delta checklist from:
single-agent + per-user sessions
to
true per-user-agent Bryan on one shared gateway.

If helpful, also call out any places where our current assumptions are wrong.

Answer

atomic bisonBOT
#

Got it, I’ll wait for part 2.

strange adder
#

@atomic bison

atomic bisonBOT
#

Yep. Here’s the practical migration-delta checklist I’d use for moving Bryan from:

  • single-agent + per-user session keys
    to
  • true per-user agent on one shared gateway

Docs I’m grounding this on:

First, the big corrections

A few of your assumptions need tightening:

#
  1. Do not treat sessionKey as the primary tenancy boundary.
    In OC, the real isolation boundary is agentId + workspace + agentDir + session store.
    sessionKey is routing/state selection, not security.

  2. Do not synthesize agent:<agentId>:... in the app unless you absolutely must.
    Prefer:

    • inbound user traffic routed by bindings
    • backend triggers routed by explicit agentId
    • if you need explicit non-chat sessions, use a hook-prefixed namespace like hook:bryan:<userId>:...
  3. openclaw agents bind is not enough for your final shape if you are routing multiple Telegram/WhatsApp peers on one shared account.
    The CLI bind shortcut is mainly channel/account scoped.
    Your real Bryan setup needs top-level bindings[] with match.peer.

  4. Per-agent workspace is not hard isolation by itself.
    Docs are explicit: workspace is default cwd, not a sandbox.
    If you need trustworthy file/process isolation, enable sandboxing too.

#
  5. Do not leave “fallback to main” as a normal path in a multi-tenant companion system.
    Unbound/unknown peers should go to:
    • a safe registrar/onboarding agent, or
    • fail closed / not be admitted.

1. OpenClaw config, what the target should look like

agents.list

Each user gets:

  • unique id
  • unique workspace
  • unique agentDir
  • optional per-agent tools, skills, model, sandbox
#

Example shape:

{
  agents: {
    list: [
      {
        id: "main",
        default: true,
        workspace: "~/.openclaw/workspace-main",
        agentDir: "~/.openclaw/agents/main/agent",
      },
      {
        id: "bryan-user-123",
        workspace: "~/.openclaw/workspace-bryan-user-123",
        agentDir: "~/.openclaw/agents/bryan-user-123/agent",
        tools: {
          profile: "minimal",
          deny: ["exec", "process"],
        },
      },
    ],
  },
}

Per-agent paths

These should all be distinct per user:

  • workspace: ~/.openclaw/workspace-bryan-<userId>
  • agentDir: ~/.openclaw/agents/bryan-<userId>/agent
  • sessions: ~/.openclaw/agents/bryan-<userId>/sessions
#

Never reuse agentDir. Docs explicitly warn this causes auth/session collisions.

bindings

For shared Telegram / WhatsApp account, use peer bindings:

{
  bindings: [
    {
      agentId: "bryan-user-123",
      match: {
        channel: "telegram",
        accountId: "default",
        peer: { kind: "direct", id: "1641047688" },
      },
    },
  ],
}

That is the important delta.
You are no longer relying on agent:main:telegram:...:peerId as your tenancy strategy.
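
If the backend generates these entries, a small builder plus a uniqueness check keeps them honest. A sketch only: build_peer_binding and assert_unique_peers are hypothetical helpers on your side, and the only thing taken from OC is the binding shape shown above.

```python
from typing import Dict, List


def build_peer_binding(agent_id: str, channel: str, account_id: str, peer_id: str) -> Dict:
    """Build one bindings[] entry; every field is required, nothing is defaulted."""
    if not all([agent_id, channel, account_id, peer_id]):
        raise ValueError("agentId, channel, accountId and peer id are all required")
    return {
        "agentId": agent_id,
        "match": {
            "channel": channel,
            "accountId": account_id,
            "peer": {"kind": "direct", "id": peer_id},
        },
    }


def assert_unique_peers(bindings: List[Dict]) -> None:
    """Refuse a config where one peer is bound to two different agents."""
    seen = {}
    for binding in bindings:
        match = binding["match"]
        key = (match["channel"], match["accountId"], match["peer"]["id"])
        if key in seen and seen[key] != binding["agentId"]:
            raise ValueError(f"peer {key} bound to both {seen[key]} and {binding['agentId']}")
        seen[key] = binding["agentId"]
```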

Hooks/session config

For your backend-triggered flows, I’d keep:

{
  hooks: {
    enabled: true,
    path: "/hooks",
    allowRequestSessionKey: false,
    allowedSessionKeyPrefixes: ["hook:"],
  },
}

Recommended posture:

  • allowRequestSessionKey: false
  • do not trust caller-provided sessionKey
  • derive routing from backend-owned user record
  • if you need explicit background sessions, use hook: prefix only

allowedAgentIds, static or derived?

Derived and materialized, not hand-maintained and not caller-chosen.

#

Practical answer:

  • if you have few agents, static is fine
  • if you provision users dynamically, your reconciler should update it from your backend-owned agent registry

But the key point is:

  • do not let the external caller choose arbitrary agent ids
  • the server should look up the user → resolve the agentId → call OC with that agentId

2. Provisioning lifecycle, exact runbook

When a new user registers:

Provisioning steps

  1. Create backend user record.
  2. Generate stable agentId for that user.
#
  3. Create workspace from Bryan template.
  4. Create OC agent entry with:
    • id
    • workspace
    • agentDir
  5. Seed per-agent auth/profile state as needed.
  6. Create peer binding for the user’s linked Telegram/WhatsApp peer.
  7. Persist in backend:
    • agentId
    • gatewayId if you use multiple gateways
    • channel
    • accountId
    • peerId
  8. Restart/reload gateway if your config system requires it.
  9. Probe routing with a real inbound test.
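
The agent-entry part of those steps can be sketched as a pure function, assuming you generate config from the backend rather than calling the CLI. agent_entry_for and add_agent are hypothetical helpers; the path layout and the tools block follow the example config above, and the bryan-<userId> naming is just the convention used in this thread.

```python
import re
from typing import Dict

# Assumption: user ids are lowercase slugs; anything else is rejected so it
# can never end up inside a filesystem path.
SAFE_ID = re.compile(r"[a-z0-9][a-z0-9-]*")


def agent_entry_for(user_id: str) -> Dict:
    """Derive a stable agent entry with distinct workspace and agentDir."""
    if not SAFE_ID.fullmatch(user_id):
        raise ValueError(f"unsafe user id: {user_id!r}")
    agent_id = f"bryan-{user_id}"
    return {
        "id": agent_id,
        "workspace": f"~/.openclaw/workspace-{agent_id}",
        "agentDir": f"~/.openclaw/agents/{agent_id}/agent",
        "tools": {"profile": "minimal", "deny": ["exec", "process"]},
    }


def add_agent(config: Dict, entry: Dict) -> Dict:
    """Add the entry to agents.list; idempotent on id, loud on path reuse."""
    agents = config.setdefault("agents", {}).setdefault("list", [])
    for existing in agents:
        if existing["id"] == entry["id"]:
            return config  # already provisioned
        if (existing.get("workspace") == entry["workspace"]
                or existing.get("agentDir") == entry["agentDir"]):
            raise ValueError(f"path collision with agent {existing['id']}")
    agents.append(entry)
    return config
```

Making the step idempotent means a retried registration does not create a second agent for the same user.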

“Create workspace from template”

#

Yes, do this at registration time.

Template should contain only:

  • SOUL.md
  • AGENTS.md
  • tiny TOOLS.md
  • any Bryan-specific local memory/bootstrap files you truly want per user

Do not share one mutable Bryan workspace across users.

“Create agent”

Use openclaw agents add <id> --workspace <dir> --agent-dir <dir> or generate config directly.

“Create binding”

For your design, this is not just channel/account binding. It should be:

  • channel
#
  • accountId
  • peer.kind = direct
  • peer.id = external peer id

“Seed auth-gate owner config per agent”

Yes, if Bryan needs per-agent auth or owner state, seed it per agent.
Do not point multiple agents at one shared mutable auth file.

If creds are intentionally shared, copy auth-profiles.json into each agent’s agentDir.
That’s the OC-native way. Docs explicitly say copy, not share agentDir.


3. Companion-server integration, what fields you actually need

Your backend user record should minimally have:

#
  • userId (your app’s id)
  • agentId
  • gatewayId if more than one gateway exists
  • channel (telegram / whatsapp)
  • accountId
  • peerId
  • bindingState / linked status
  • optional workspacePath and agentDirPath if your reconciler needs them

How triggers should target

Best practice:

For real user chat ingress

Do not target agent:<agentId>:... manually.
Let channel ingress + binding select the agent.

For backend-triggered non-chat jobs

#

Target:

  • explicit agentId
  • optional sessionKey only in a hook-prefixed namespace

Example:

  • agentId = bryan-user-123
  • sessionKey = hook:bryan:user-123:daily-checkin

Should server ever default to main?

No.

That is one of the main things still wrong in your current mental model.

Server logic should be:

  • if user has no bound agentId, do not “fall back to main Bryan”
  • either:
    • route to dedicated registrar agent, or
#
  • reject / hold until linked
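
On the server side, that rule can be sketched like this. UserRecord, NotProvisioned and trigger_target are hypothetical, and the agentId / sessionKey field names in the returned dict are assumptions about your own trigger payload, not a confirmed OC request schema.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class UserRecord:
    user_id: str
    agent_id: Optional[str]
    channel: str
    account_id: str
    peer_id: str
    linked: bool


class NotProvisioned(Exception):
    """The user has no usable agent; hold or route to a registrar instead."""


def trigger_target(user: UserRecord, job: str) -> Dict[str, str]:
    """Resolve where a backend-triggered job should run. Never falls back to main."""
    if not user.linked or not user.agent_id:
        raise NotProvisioned(f"user {user.user_id} has no bound agent")
    if user.agent_id == "main":
        raise NotProvisioned("user jobs must never target main")
    return {
        "agentId": user.agent_id,
        # Hook-prefixed namespace, as in the example above.
        "sessionKey": f"hook:bryan:{user.user_id}:{job}",
    }
```

The caller never supplies an agent id; it only names the user and the job, and the backend record decides the rest.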

accountId handling

Persist it explicitly even if you only have one account today.

Why:

  • once you add a second Telegram/WhatsApp account, implicit default handling becomes a footgun
  • docs recommend explicit defaultAccount in multi-account setups

So store exact accountId in the user record and in bindings.


4. Channel linking, Telegram / WhatsApp best practice

Multi-agent with one shared Telegram/WhatsApp account

This is valid.

#

You do not need one bot/number per user.

You can keep:

  • one shared gateway
  • one shared Telegram bot or WhatsApp account
  • many user agents

And route by peer binding.

Link-by-token, what should be source of truth?

Best practice:

token/link event should bind peer -> backend user record
then backend reconciler should write/update OC config

Not:

token handler directly mutates OC routing as its only source of truth

#

Why:

  • you want OC config to be rebuildable from your app DB
  • you want one authoritative mapping layer
  • easier recovery, audits, and re-provisioning
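
A sketch of what "rebuildable from your app DB" can look like, assuming user rows are plain dicts with the fields listed in section 3. rebuild_bindings and reconcile are hypothetical names; bindings are regenerated wholesale from the rows, so anything stale in the old config simply disappears.

```python
from typing import Dict, List


def rebuild_bindings(user_rows: List[Dict]) -> List[Dict]:
    """Derive bindings[] purely from backend rows; unlinked users get none."""
    bindings = []
    for row in sorted(user_rows, key=lambda r: r["userId"]):
        if not row.get("linked") or not row.get("agentId"):
            continue
        bindings.append({
            "agentId": row["agentId"],
            "match": {
                "channel": row["channel"],
                "accountId": row["accountId"],
                "peer": {"kind": "direct", "id": row["peerId"]},
            },
        })
    return bindings


def reconcile(config: Dict, user_rows: List[Dict]) -> Dict:
    """Return a new config whose bindings reflect the DB and nothing else."""
    new_config = dict(config)  # shallow copy; the input config is not mutated
    new_config["bindings"] = rebuild_bindings(user_rows)
    return new_config
```

The link-by-token handler then only writes the row, and the reconciler is the single writer of routing config.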

accountId best practice

  • always explicit in your backend
  • always explicit in OC binding if you operate multi-account
  • if single-account today, still persist the chosen default id

5. Session reset / deletion lifecycle

When deleting a user, remove all of these:

In backend

#
  • user → agent mapping
  • peer/channel link
  • any cached session route metadata
  • any onboarding/profile rows tied to that agent

In OC

  • agent entry from agents.list
  • all bindings pointing to that agent
  • agent workspace
  • agentDir
  • session store/transcripts

Practical delete flow

  1. Remove bindings for that user/agent.
  2. Delete agent.
  3. Trash workspace/state/session data.
  4. Remove backend mapping.
#
  5. Verify no lingering agent:<agentId>: sessions remain discoverable.

openclaw agents delete <id> is the right primitive for the OC side.
Docs say workspace, state, and sessions are moved to Trash, not hard-deleted.
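
If your reconciler owns the generated config, the config side of that delete flow can be sketched as below. This is not a replacement for the CLI primitive, just the order of operations on a config dict; remove_agent is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def remove_agent(config: Dict, agent_id: str) -> Tuple[Dict, List[str]]:
    """Drop all bindings for the agent, then the agent entry.

    Returns the new config plus the paths the caller still has to trash.
    """
    agents = config.get("agents", {}).get("list", [])
    target = next((a for a in agents if a["id"] == agent_id), None)
    if target is None:
        raise KeyError(agent_id)
    new_config = dict(config)
    # Bindings go first so no inbound traffic can reach a half-deleted agent.
    new_config["bindings"] = [
        b for b in config.get("bindings", []) if b["agentId"] != agent_id
    ]
    new_config["agents"] = {
        **config["agents"],
        "list": [a for a in agents if a["id"] != agent_id],
    }
    # The session store sits next to agentDir in the layout above; clean that too.
    return new_config, [target["workspace"], target["agentDir"]]
```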

Reset without deleting user

If you want “reset Bryan but keep account”:

  • keep agentId and binding
  • wipe session store under that agent
  • wipe/reseed workspace state
  • optionally reset per-agent profile/memory rows in backend

Do not silently remap the user back to main.


6. Security / isolation, what is actually required

#

For this to be trustworthy, minimum bar is:

Required

  • one agent per user
  • unique workspace per agent
  • unique agentDir per agent
  • unique session store per agent
  • no shared mutable Bryan workspace
  • no shared agentDir
  • no main fallback for user traffic
  • no caller-selected raw sessionKey
  • peer bindings owned by backend truth, not user input

Strongly recommended

  • sandbox Bryan agents if tool/file access matters
  • deny exec / process unless truly needed
  • keep tool surface minimal
#
  • keep hooks.allowRequestSessionKey=false
  • constrain or reconcile hooks.allowedAgentIds
  • use a safe registrar/default agent, not a real user-facing Bryan

Common footguns

  1. Reusing agentDir
  2. Shared workspace with per-user files inside it
  3. Letting unbound peers fall into main
  4. Letting backend or caller synthesize arbitrary sessionKey
  5. Assuming per-agent cwd is a sandbox when it isn’t
  6. Forgetting accountId when multi-account arrives later
  7. Using channel-only bindings when you actually need peer bindings

That last one is a big one for your migration.


#

7. Operational checks

Config truth

  • every live Bryan user has:
    • an agent
    • a workspace
    • an agentDir
    • a peer binding

Disk truth

  • sessions for user A live under ~/.openclaw/agents/<agentA>/sessions
  • sessions for user B live under ~/.openclaw/agents/<agentB>/sessions
  • no active user traffic is creating new agent:main: sessions
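
The last disk check is easy to automate once you can list session keys. A sketch, assuming you collect the keys yourself (how you enumerate them is up to your setup) and that keys follow the agent:<agentId>:... shape; user_sessions_on_main is a hypothetical helper.

```python
from typing import Iterable, List


def user_sessions_on_main(session_keys: Iterable[str]) -> List[str]:
    """Return direct-chat session keys that still live under the main agent."""
    hits = []
    for key in session_keys:
        parts = key.split(":")
        if (len(parts) >= 3 and parts[0] == "agent" and parts[1] == "main"
                and "direct" in parts[2:]):
            hits.append(key)
    return hits


observed = [
    "agent:main:telegram:default:direct:111",            # still routed through main
    "agent:bryan-user-123:telegram:default:direct:111",  # per-user agent, fine
    "agent:main:main",                                   # not a direct user chat
    "hook:bryan:user-123:daily-checkin",                 # hook namespace, fine
]
stragglers = user_sessions_on_main(observed)
```

Run it before and after cutover; after cutover, any new entry in the result means someone is still silently landing on main.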

Routing truth

Send DMs from two different peers and verify:

  • peer A always lands in agent A
  • peer B always lands in agent B
  • deleting binding for A does not route A into a useful main Bryan
#

Negative tests

You really want these:

  • unknown peer does not land in a real user Bryan
  • malformed user record does not fall back to main
  • wrong peer/account tuple does not route to another user agent
  • background trigger with wrong agentId is rejected
  • request-selected sessionKey cannot retarget another user

What I think is still missing in your current plan

Missing / wrong

  1. You need peer bindings, not just agent-per-user and custom session keys.
  2. Your backend must be the source of truth for agentId/accountId/peerId.
  3. Server should never default to main for user jobs.
  4. You need a registrar/quarantine strategy for unbound peers.
#
  5. You need deletion/reset runbooks, not just creation flow.
  6. If you care about real isolation, add sandboxing, because workspace alone is not enough.
  7. Your trigger strategy should be agentId-first, not handcrafted agent:<id>:... strings.

My recommended target shape

Safe target

  • one shared gateway
  • one shared Telegram/WhatsApp account if you want
  • one agent per user
  • one workspace per agent
  • one agentDir per agent
  • one session store per agent
  • peer binding per linked user
  • backend-owned user → agent/account/peer mapping
#
  • no user traffic fallback to main
  • hook session keys only under hook: namespace
  • minimal Bryan tool surface, ideally no exec

That’s the boring OC-native version, which is exactly what you want here.

If you want, I can do the next step and draft a literal config skeleton for:

  • agents.list
  • bindings[]
  • hooks
  • and the backend user record schema
    for one Telegram Bryan user and one WhatsApp Bryan user.
#

Yep 🦐

If you want, I can do the next concrete step now and write out:

  1. a sample openclaw.json skeleton for per-user Bryan agents
  2. a backend user record schema
  3. a provision / link / delete runbook
  4. a cutover checklist from main to per-user agents

That’d probably be the most useful artifact from here.

strange adder
#

Yep 🦐

If you want, I can do the next concrete step now and write out:

a sample openclaw.json skeleton for per-user Bryan agents
a backend user record schema
a provision / link / delete runbook
a cutover checklist from main to per-user agents

That’d probably be the most useful artifact from here

Ok send me please and also:

Also include the security hardening checklist specifically for multi-agent Bryan.

We want the minimum serious production-grade isolation checklist for:

  • one shared gateway
  • one agent per user
  • one workspace per agent

Please cover:

  1. Workspace isolation
  • per-agent workspace layout
  • what must never be shared across agents
  • whether USER.md / memory / skills should be copied per agent or referenced shared
  2. Agent state isolation
  • per-agent agentDir
  • auth profiles
  • sessions
  • memory flush behavior
  • anything that can accidentally collide across agents
  3. Sandbox / tool isolation
  • whether OC sandboxing should be enabled per agent
  • recommended workspaceAccess mode
  • exec policy / allowlist guidance
  • what tools are too dangerous to leave enabled in a shared-gateway multi-agent setup

Part 2 coming, wait.

#
  4. Channel isolation
  • peer bindings
  • accountId handling
  • how to avoid one user ever resolving into another user’s agent
  • how to fail closed on malformed routing/session keys
  5. Hook / trigger security
  • whether companion-server should be allowed to send raw session keys
  • best practice for mapped hooks / trusted transforms
  • how to validate agentId/accountId/peerId
  • what should hard-fail vs silently default
  6. Secrets / credentials
  • what secrets can remain shared at the gateway level
  • what secrets must be per-agent
  • how to avoid auth/session collisions
  7. Memory / file leakage prevention
  • how to avoid cross-agent workspace reads
  • whether workspace-only assumptions are enough
  • what additional safeguards are recommended
  8. Verification
  • concrete tests / probes to prove isolation is actually working
  • how to simulate two users and verify no cross-user reads, writes, or session bleed

We specifically want to know:
if we migrate Bryan from single-agent to true multi-agent, what security hardening is mandatory so we do not just create a prettier but still leaky architecture.

answer pls

atomic bisonBOT
#

Got it, I’ll wait for part 2.

#

I’m just verifying the exact sandbox key names so I don’t hand you fake config.

#
  1. One workspace per agent
  2. One agentDir per agent
  3. One session store per agent
  4. Exact peer binding per linked user (channel + accountId + peer.id)
  5. Backend owns user -> agent/account/peer mapping
  6. hooks.allowRequestSessionKey=false
  7. Never default user traffic back to a real main Bryan
  8. Sandbox Bryan agents
  9. Disable exec, process, elevated exec, and cross-session tools
  10. Never reuse agentDir across agents
  11. Prove isolation with 2-user sentinel tests before cutover

If you miss any of those, it’s very easy to build a prettier architecture that still leaks.


A. Sample openclaw.json skeleton

#

This is a shape example, not a copy-paste final file.

{
  session: {
    store: "~/.openclaw/agents/{agentId}/sessions/sessions.json",
    // mainKey stays "main" at runtime; each per-user agent gets its own main direct bucket
  },

  agents: {
    defaults: {
      // For production Bryan, prefer sandbox ON by default
      sandbox: {
        mode: "all",
        scope: "agent",
        workspaceAccess: "ro", // use "rw" only if Bryan truly needs local per-user file writes
      },

      tools: {
        profile: "minimal",
        deny: [
          "exec",
          "process",
          "browser",
          "sessions_list",
          "sessions_history",
          "sessions_send",
          "sessions_spawn",
          "message"
        ],
        elevated: { enabled: false },
      },
    },

    list: [
      {
        id: "main",
        default: true,
        workspace: "~/.openclaw/workspace-main",
        agentDir: "~/.openclaw/agents/main/agent",
        sandbox: {
          mode: "all",
          scope: "agent",
          workspaceAccess: "none",
        },
        tools: {
          profile: "minimal",
          deny: [
            "exec",
            "process",
            "browser",
            "sessions_list",
            "sessions_history",
            "sessions_send",
            "sessions_spawn",
            "message"
          ],
          elevated: { enabled: false },
        },
      },
      {
        id: "bryan-user-123",
        workspace: "~/.openclaw/workspace-bryan-user-123",
        agentDir: "~/.openclaw/agents/bryan-user-123/agent",

        // If Bryan needs to read local user notes only, keep ro.
        // If Bryan needs to write per-user local notes, switch to rw.
        sandbox: {
          mode: "all",
          scope: "agent",
          workspaceAccess: "ro",
        },

        tools: {
          profile: "minimal",
          deny: [
            "exec",
            "process",
            "browser",
            "sessions_list",
            "sessions_history",
            "sessions_send",
            "sessions_spawn",
            "message"
          ],
          elevated: { enabled: false },
        },
      },
    ],
  },

  bindings: [
    {
      agentId: "bryan-user-123",
      match: {
        channel: "telegram",
        accountId: "default",
        peer: { kind: "direct", id: "1641047688" },
      },
    },
  ],

  hooks: {
    enabled: true,
    token: "REDACTED_HOOK_TOKEN",
    path: "/hooks",

    defaultSessionKey: "hook:default",
    allowRequestSessionKey: false,
    allowedSessionKeyPrefixes: ["hook:"],

    // If dynamically provisioning, derive and materialize this list from your backend registry.
    // Do not let callers choose arbitrary agent ids.
    allowedAgentIds: ["main", "bryan-user-123"],
  },

  channels: {
    telegram: {
      defaultAccount: "default",
      accounts: {
        default: {
          botToken: "REDACTED_TELEGRAM_TOKEN",
        },
      },
    },
  },
}

Important note on this skeleton

For your design, the real routing primitive is:

bindings[].match.peer

Not just channel/account binding.

#

That’s the key difference from “single agent + creative session keys”.
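
A rough way to catch channel-only bindings before deploy is to lint the bindings array in your own tooling. This is only a sketch: the Binding shape mirrors the skeleton above, while lintBindings and its messages are hypothetical names, not an OpenClaw API.

```typescript
// Hypothetical lint helper; the Binding shape mirrors the config skeleton above.
type Binding = {
  agentId: string;
  match: { channel: string; accountId?: string; peer?: { kind: string; id: string } };
};

// Returns human-readable problems; an empty array means the bindings look exact.
function lintBindings(bindings: Binding[]): string[] {
  const problems: string[] = [];
  const seen = new Map<string, string>(); // routing tuple -> agentId that owns it
  for (const b of bindings) {
    const { channel, accountId, peer } = b.match;
    if (!accountId || !peer) {
      problems.push(`${b.agentId}: binding is not an exact channel + accountId + peer match`);
      continue;
    }
    const key = [channel, accountId, peer.kind, peer.id].join("|");
    const previous = seen.get(key);
    if (previous !== undefined && previous !== b.agentId) {
      problems.push(`${key}: bound to both ${previous} and ${b.agentId}`);
    }
    seen.set(key, b.agentId);
  }
  return problems;
}

const sampleBindings: Binding[] = [
  { agentId: "bryan-user-a", match: { channel: "telegram", accountId: "default", peer: { kind: "direct", id: "111" } } },
  { agentId: "bryan-user-b", match: { channel: "telegram", accountId: "default", peer: { kind: "direct", id: "111" } } },
  { agentId: "bryan-user-c", match: { channel: "telegram" } },
];
console.log(lintBindings(sampleBindings));
```

Running something like this in the reconcile step means a bad binding is rejected before the gateway ever loads it.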


B. Backend user record schema

Your backend should be the source of truth, not OC session strings.

type BryanUser = {
  userId: string
  gatewayId: string            // even if only one gateway today
  agentId: string
  status: "pending_link" | "active" | "deleting" | "deleted"

  workspacePath: string
  agentDirPath: string

  links: Array<{
    channel: "telegram" | "whatsapp"
    accountId: string
    peerKind: "direct"
    peerId: string
    linkedAt: string
    active: boolean
  }>

  hookSessionPrefix: string    // e.g. "hook:bryan:user-123:"
  createdAt: string
  updatedAt: string
}
#

What not to store as authority

Do not make these your source of truth:

  • raw OC sessionKey
  • “last known sessionKey”
  • caller-provided agentId
  • caller-provided peerId without backend validation

The server should resolve:

app user -> agentId + channel + accountId + peerId

and only then call OC.
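
A minimal sketch of that resolution step, assuming a trimmed-down version of the user record above. resolveRoute, UserRecord and the reason strings are hypothetical names for illustration. The point is that every failure returns an explicit error instead of a default agent.

```typescript
// Hypothetical backend shapes for illustration; trimmed from the BryanUser record.
type Link = {
  channel: "telegram" | "whatsapp";
  accountId: string;
  peerId: string;
  active: boolean;
};
type UserRecord = { userId: string; agentId: string; status: string; links: Link[] };

type Resolution =
  | { ok: true; agentId: string }
  | { ok: false; reason: string };

// Resolve an inbound (channel, accountId, peerId) tuple to exactly one agent, or fail closed.
function resolveRoute(
  users: UserRecord[],
  channel: string,
  accountId: string,
  peerId: string,
): Resolution {
  if (!channel || !accountId || !peerId) {
    return { ok: false, reason: "missing routing field" };
  }
  const matches = users.filter(
    (u) =>
      u.status === "active" &&
      u.links.some(
        (l) => l.active && l.channel === channel && l.accountId === accountId && l.peerId === peerId,
      ),
  );
  const user = matches[0];
  if (user === undefined) return { ok: false, reason: "unknown peer" };
  if (matches.length > 1) return { ok: false, reason: "ambiguous peer" };
  // Assumption: "main" is the registrar/quarantine agent and never a valid user target.
  if (user.agentId === "main") return { ok: false, reason: "user mapped to main" };
  return { ok: true, agentId: user.agentId };
}

const sampleUsers: UserRecord[] = [
  {
    userId: "user-a",
    agentId: "bryan-user-a",
    status: "active",
    links: [{ channel: "telegram", accountId: "default", peerId: "111", active: true }],
  },
  {
    userId: "user-b",
    agentId: "bryan-user-b",
    status: "pending_link",
    links: [{ channel: "telegram", accountId: "default", peerId: "222", active: true }],
  },
];
console.log(resolveRoute(sampleUsers, "telegram", "default", "111"));
console.log(resolveRoute(sampleUsers, "telegram", "default", "999"));
```

The caller treats any ok: false result as a hard failure or a registrar handoff, never as permission to fall back to another agent.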


C. Provision / link / delete runbook

#

1. Provision new user

  1. Create backend user record.
  2. Generate stable agentId, e.g. bryan-user-123.
  3. Create workspace from Bryan template.
  4. Create agentDir.
  5. Add agent to agents.list.
  6. Seed per-agent auth/profile state if needed.
  7. Mark user pending_link.

2. Link channel

When Telegram/WhatsApp link completes:

  1. Normalize and validate channel, accountId, peerId
  2. Write/update backend link record
  3. Reconcile OC config:
    • ensure agent exists
    • ensure exact peer binding exists
#
  4. Restart/reload gateway if required by your config flow
  5. Probe with a real message
  6. Mark user active

3. Delete user

  1. Mark user deleting
  2. Remove OC bindings for that agent
  3. Delete OC agent
  4. Trash workspace / agent state / sessions
  5. Remove backend mapping and link records
  6. Mark user deleted

4. Reset user without deleting

  1. Keep agent and binding
  2. Clear per-agent sessions
  3. Wipe/reseed per-agent workspace state if needed
  4. Wipe backend memory/profile rows for that user if reset truly means reset
#

Do not remap them back to main.
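
One way to keep the runbook honest in backend code is a small status state machine. This is a sketch only: the status values come from the schema above, but the allowed-transition table and the transition helper are my assumptions about how you would enforce it.

```typescript
type Status = "pending_link" | "active" | "deleting" | "deleted";

// Assumed transition table: reset keeps the user "active", so it needs no transition.
const ALLOWED_TRANSITIONS: Record<Status, readonly Status[]> = {
  pending_link: ["active", "deleting"],
  active: ["deleting"],
  deleting: ["deleted"],
  deleted: [],
};

// Returns the new status, or throws if the runbook does not allow the move.
function transition(from: Status, to: Status): Status {
  if (ALLOWED_TRANSITIONS[from].indexOf(to) === -1) {
    throw new Error(`illegal status transition: ${from} -> ${to}`);
  }
  return to;
}

let status: Status = "pending_link";
status = transition(status, "active");   // link completed and probed
status = transition(status, "deleting"); // delete runbook started
console.log(status);
```

A deleted user can never come back as active this way, so re-onboarding has to go through a fresh provision.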


D. Cutover checklist, main -> per-user agents

Before cutover

  • openclaw --version
  • openclaw agents list --bindings
  • openclaw doctor
  • capture current main routing and active peers
  • prepare 2-user test fixtures

Migration steps

  1. Turn main into a safe registrar/quarantine agent, not real Bryan.
  2. Provision one per-user agent for a pilot set.
  3. Add exact peer bindings for those users.
#
  4. Disable user-job defaulting to main in companion-server.
  5. Keep hooks.allowRequestSessionKey=false.
  6. Move backend triggers to agentId-first routing.
  7. Disable exec / process / elevated on Bryan agents.
  8. Run A/B isolation tests.
  9. Cut more users over.
  10. Verify no new real-user traffic lands in agent:main:.

After cutover

  • inspect agent bindings
  • inspect per-agent session directories
  • send test DMs from two users
  • verify deletion/reset lifecycle

E. Minimum serious production-grade isolation checklist

#

This is the part I’d treat as mandatory, not “nice to have”.


1. Workspace isolation

Required layout

Each user gets their own workspace, e.g.

  • ~/.openclaw/workspace-bryan-user-123
  • ~/.openclaw/workspace-bryan-user-456

Must never be shared across agents

Do not share mutable per-user files across agents:

  • USER.md
  • memory.md
#
  • per-user notes
  • per-user local caches
  • per-user uploaded artifacts you keep locally
  • any generated summaries or recalls

Shared vs copied

Copy per agent:

  • SOUL.md template
  • AGENTS.md template
  • USER.md
  • memory.md
  • any Bryan-local private notes

Can remain shared read-only:

  • generic skills under shared skills roots
  • generic docs
  • generic templates
#
  • static Bryan prompt fragments with no user data

Recommendation

  • generic Bryan behavior can be shared as template/read-only assets
  • anything user-specific must live inside that user’s workspace or backend rows scoped to that user

2. Agent state isolation

Required

Each user gets unique:

  • agentDir
  • auth-profiles.json location
  • session store
  • transcript directory
#

Never do this

  • never reuse one agentDir for two users
  • never symlink multiple users to one mutable agent state
  • never share one auth/session store directory

Docs are explicit that reusing agentDir causes collisions.

Auth profiles

If two Bryan agents intentionally use the same provider creds:

  • copy the auth profile into each agent’s agentDir
  • do not point both agents at the same file

Sessions

Sessions must remain under:

  • ~/.openclaw/agents/<agentId>/sessions
#

That gives clean per-agent session separation.

Memory flush behavior

If “reset” means reset:

  • clear OC sessions for that agent
  • clear any agent-scoped local notes
  • clear backend rows for that user
  • clear any derived recall caches keyed to that agent

Otherwise you’ll get “ghost memory” even with correct agent routing.

Collision footguns

Watch for:

  • shared cache dirs
  • shared upload dirs
  • shared transcript search collections
  • shared memorySearch.extraCollections pointing across user data
#
  • reused temp/state paths keyed only by peer instead of agent+peer
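
The reused-path footguns are easy to check mechanically from your own provisioning data. A minimal sketch, assuming you can list each agent's workspace and agentDir as strings. findSharedPaths is a hypothetical helper, and it compares path strings only, so it will not see through symlinks.

```typescript
// Hypothetical shape: one entry per provisioned agent, taken from your backend registry.
type AgentPaths = { id: string; workspace: string; agentDir: string };

// Reports any workspace or agentDir path that more than one agent points at.
function findSharedPaths(agents: AgentPaths[]): string[] {
  const owners = new Map<string, string[]>();
  for (const agent of agents) {
    for (const rawPath of [agent.workspace, agent.agentDir]) {
      const path = rawPath.replace(/\/+$/, ""); // ignore trailing slashes
      const ids = owners.get(path) ?? [];
      ids.push(agent.id);
      owners.set(path, ids);
    }
  }
  const shared: string[] = [];
  owners.forEach((ids, path) => {
    if (ids.length > 1) shared.push(`${path}: ${ids.join(", ")}`);
  });
  return shared;
}

const sampleAgents: AgentPaths[] = [
  { id: "bryan-user-a", workspace: "~/.openclaw/workspace-bryan-user-a", agentDir: "~/.openclaw/agents/bryan-user-a/agent" },
  { id: "bryan-user-b", workspace: "~/.openclaw/workspace-bryan-user-b", agentDir: "~/.openclaw/agents/bryan-user-a/agent/" },
];
console.log(findSharedPaths(sampleAgents));
```

For real deployments you would resolve the paths first so symlinked state directories are also caught.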

3. Sandbox / tool isolation

Should sandbox be enabled?

Yes. For production Bryan, I would turn sandboxing on.

Workspace separation alone is not enough. Docs explicitly say workspace is not a hard sandbox.

Recommended sandbox baseline

For one user per agent:

  • mode: "all"
  • scope: "agent"
#

Recommended workspaceAccess

  • best baseline: "ro"
  • safer if Bryan truly needs no filesystem tools: "none"
  • use "rw" only if Bryan really must write local per-user notes/files

For a companion lane, I would start with ro or none, not rw.

Exec policy

Default Bryan posture:

  • deny exec
  • deny process
  • deny elevated exec
  • do not depend on shell fallback for product features

That matters a lot given the exact stall you saw with exec-approvals.json.

#

Tools that are too dangerous to leave enabled by default

For a shared-gateway multi-agent companion, I would treat these as dangerous until proven necessary:

  • exec
  • process
  • elevated exec
  • sessions_list
  • sessions_history
  • sessions_send
  • sessions_spawn
  • browser (especially host control)
  • message (cross-channel proactive sends)
  • read / write / edit / apply_patch unless sandboxed with strict workspace access

For Bryan, keep the tool surface tiny.


#

4. Channel isolation

Required

Use exact peer bindings:

{
  agentId: "bryan-user-123",
  match: {
    channel: "telegram",
    accountId: "default",
    peer: { kind: "direct", id: "1641047688" }
  }
}

accountId handling

#

Persist accountId explicitly in backend even if you only have one account today.

That avoids nasty ambiguity later.

Prevent cross-user resolution

The resolver must require:

  • valid channel
  • valid accountId
  • valid peerId
  • existing backend user mapping
  • matching OC binding

If any of those fail:

  • hard fail or route to safe registrar
  • never silently default to someone else’s Bryan
  • never silently default to useful main
#

Malformed routing/session keys

Hard-fail these:

  • missing agentId
  • unknown agentId
  • malformed accountId
  • unknown peerId
  • caller-provided raw sessionKey

Do not “helpfully recover” by routing to main.


5. Hook / trigger security

Should companion-server be allowed to send raw session keys?

No, not as a normal path.

#

Best practice is still:

  • hooks.allowRequestSessionKey = false

Best practice

Use:

  • mapped hooks
  • trusted backend-owned transforms
  • explicit agentId
  • optional hook: prefixed session keys only for backend jobs

Validate all routing inputs

Server must validate:

  • agentId belongs to that backend user
  • channel is expected
  • accountId matches linked account
#
  • peerId matches linked peer

Hard-fail vs default

Hard-fail:

  • bad agentId
  • bad accountId
  • bad peerId
  • unlinked user
  • malformed session key
  • cross-user mismatch

Never silently default:

  • to main
  • to another agent
  • to a previous sessionKey

That’s a major ship gate.
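
Here is a sketch of what that ship gate could look like in the backend before any hook call goes out. It assumes the per-user hookSessionPrefix from the record schema. validateTrigger, TriggerRequest and Owner are hypothetical names, not part of OpenClaw.

```typescript
// Hypothetical request and ownership shapes for illustration.
type TriggerRequest = { userId: string; agentId: string; sessionKey?: string };
type Owner = { userId: string; agentId: string; hookSessionPrefix: string };

// Returns the list of hard-fail reasons; an empty list means the trigger may proceed.
function validateTrigger(req: TriggerRequest, owners: Owner[]): string[] {
  const owner = owners.find((o) => o.userId === req.userId);
  if (owner === undefined) return ["unlinked user"];

  const errors: string[] = [];
  if (req.agentId === "main") errors.push("main is not a valid target for user jobs");
  if (req.agentId !== owner.agentId) errors.push("cross-user mismatch");
  // Session keys are optional, but if present they must sit under this user's hook: prefix.
  if (req.sessionKey !== undefined && req.sessionKey.indexOf(owner.hookSessionPrefix) !== 0) {
    errors.push("malformed session key");
  }
  return errors;
}

const sampleOwners: Owner[] = [
  { userId: "user-123", agentId: "bryan-user-123", hookSessionPrefix: "hook:bryan:user-123:" },
];
console.log(validateTrigger({ userId: "user-123", agentId: "bryan-user-123", sessionKey: "hook:bryan:user-123:reminder" }, sampleOwners));
console.log(validateTrigger({ userId: "user-123", agentId: "main", sessionKey: "agent:main:telegram:default:direct:1" }, sampleOwners));
```

Any non-empty result should abort the job and log loudly. There is no branch here that substitutes a default.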

#

6. Secrets / credentials

Can remain shared at gateway level

These can stay shared if intentional:

  • Telegram bot token / WhatsApp account login at channel layer
  • gateway auth token
  • hook token
  • shared model catalog config
  • sandbox backend infra config

Must be per-agent or agent-scoped

These should be isolated by agent:

  • auth profiles
#
  • user-specific provider credentials
  • user-specific OAuth material
  • local session/auth state
  • any per-user memory store/files

Avoid auth/session collisions

  • unique agentDir
  • no shared auth file path
  • copy auth profiles if sharing creds intentionally
  • do not reuse agent state directories

7. Memory / file leakage prevention

Is workspace-only enough?

No.

#

Workspace-only is better than one shared workspace, but it is not enough by itself because:

  • absolute path access still exists without sandboxing
  • tools can read/write outside cwd on host
  • shared retrieval/search config can still leak

Prevent cross-agent reads

  • sandbox Bryan agents
  • use workspaceAccess: "ro" or "none" unless writes are genuinely needed
  • deny read if Bryan doesn’t need it
  • deny cross-session tools
  • do not share transcript collections between user agents
  • avoid shared mutable memory stores unless partitioned by agentId

Additional safeguards

  • use agent-scoped retrieval collections only
  • partition backend memory by agentId and userId
  • never allow one agent to search another agent’s sessions by default
#
  • review any plugin/tool that can access filesystem, session history, or other sessions

8. Verification, how to prove it’s actually isolated

Use two fake users with loud sentinel data.

Fixtures

  • User A → agent bryan-user-a, peer 111
  • User B → agent bryan-user-b, peer 222

Give them obviously different:

  • profile facts
  • bloodwork markers
  • local files
  • memory summaries
#
  • last route/session state

Required tests

Routing

  • peer 111 always resolves to agent A
  • peer 222 always resolves to agent B
  • unknown peer does not resolve to a useful Bryan

File isolation

  • A asks for a file that exists only in A’s workspace
  • B must never see it
  • B asks for B-only file
  • A must never see it

Session isolation

  • A and B ask “what did I tell you earlier?”
#
  • each only recalls their own prior turns
  • compaction/reset still preserves separation

Memory isolation

  • A has sentinel fact “blue-harbor”
  • B has sentinel fact “red-valley”
  • neither can retrieve the other’s fact

Negative security tests

  • wrong agentId + valid peer should fail
  • wrong peerId + valid agent should fail
  • raw request sessionKey should be ignored/rejected
  • malformed link record should not default to main
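
The memory and session checks can be automated with a tiny leak detector over captured replies. A sketch, assuming you collect each agent's reply text in your test harness. The sentinel values are the ones above; leakedSentinels and the agent ids are illustrative.

```typescript
// Sentinel facts planted per test agent (values from the memory isolation test above).
const SENTINELS: Record<string, string> = {
  "bryan-user-a": "blue-harbor",
  "bryan-user-b": "red-valley",
};

// Returns every sentinel that belongs to a different agent but shows up in this agent's reply.
function leakedSentinels(agentId: string, reply: string): string[] {
  const lower = reply.toLowerCase();
  return Object.keys(SENTINELS)
    .filter((id) => id !== agentId)
    .map((id) => SENTINELS[id])
    .filter((s): s is string => s !== undefined && lower.indexOf(s) !== -1);
}

console.log(leakedSentinels("bryan-user-a", "Earlier you told me about blue-harbor."));
console.log(leakedSentinels("bryan-user-b", "Earlier you mentioned Blue-Harbor."));
```

An empty result for every captured reply is the pass condition. Any hit is a cross-user leak and should block the cutover.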

Operational probes

Run:

#
openclaw --version
openclaw agents list --bindings
openclaw channels status --probe
openclaw doctor

And verify on disk:

  • A sessions only under agent A
  • B sessions only under agent B
  • no unexpected real-user traffic in agent:main:
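
If you want the on-disk check scripted, a pure path filter is enough to start with. This sketch assumes the ~/.openclaw/agents/<agentId>/sessions layout described earlier. The file names in the example are made up, and unexpectedSessionPaths is a hypothetical helper you would feed from your own directory listing.

```typescript
// Extracts the agent id from a path shaped like .../agents/<agentId>/sessions/...
function agentIdFromSessionPath(path: string): string | null {
  const match = /\/agents\/([^/]+)\/sessions(\/|$)/.exec(path);
  return match !== null && match[1] !== undefined ? match[1] : null;
}

// Returns paths that belong to no expected agent (for example, new traffic under main).
function unexpectedSessionPaths(paths: string[], expectedAgents: string[]): string[] {
  return paths.filter((p) => {
    const id = agentIdFromSessionPath(p);
    return id === null || expectedAgents.indexOf(id) === -1;
  });
}

const observedPaths = [
  "~/.openclaw/agents/bryan-user-a/sessions/s1.jsonl", // file names are hypothetical
  "~/.openclaw/agents/bryan-user-b/sessions/s2.jsonl",
  "~/.openclaw/agents/main/sessions/s3.jsonl",
];
console.log(unexpectedSessionPaths(observedPaths, ["bryan-user-a", "bryan-user-b"]));
```

Run it against files created after the cutover timestamp so old main sessions from before the migration do not show up as false alarms.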

Final recommendation

If you migrate Bryan to true multi-agent, the mandatory hardening is:

#
  • exact peer bindings
  • no main fallback
  • backend-owned routing truth
  • allowRequestSessionKey=false
  • per-agent workspace
  • per-agent agentDir
  • per-agent session store
  • sandbox on
  • deny shell / cross-session tools
  • 2-user sentinel leak tests before rollout

That’s the minimum bar where I’d call it “serious” instead of “nicer-looking”.

If you want, I can do one more pass and turn this into a literal deploy checklist with:

  • openclaw.json sections
  • backend reconciliation steps
  • and the exact A/B isolation test matrix in copy-paste format.
strange adder
#

On a scale of 1 to 10, how much better is this multi-agent setup than single-agent multi-session for my MVP with beta testers? Would I eventually still need to migrate away from a single gateway for a real live product?

atomic bisonBOT
#

For Bryan + outside beta testers, I’d call it 8.5/10 better.

If I simplify it:

  • single-agent + multi-session = 3/10 to 4/10 for external users
    fine for prototyping, too easy to get weird routing/memory/privacy footguns

  • single shared gateway + one agent per user = 8/10 to 9/10
    this is the sweet spot for MVP/beta

  • separate gateway per user = 10/10 isolation, but usually overkill early

My actual recommendation

For your next stage:

stay on one shared gateway, but move to one agent per user

#

That is probably the right product architecture for a real live product too, at least for quite a while.

Do you eventually need separate gateways?

Not necessarily.

You only need to go beyond single gateway + multi-agent if you later need things like:

  • stronger tenant isolation than agent/workspace/agentDir/sandbox gives you
  • compliance / enterprise boundaries
  • noisy-neighbor control
  • regional sharding / scale-out ops
  • separate customer-owned credentials/infrastructure

So no, I would not plan a near-term migration to per-user gateways.

The short answer

For your use case, I’d say:

#
  • beta testers: multi-agent is a huge upgrade
  • real product: single gateway + multi-agent is still a valid destination
  • separate gateways: only later, if scale/compliance/isolation pressures force it

If you want the blunt version:

single-agent multi-session is okay for building

single-gateway multi-agent is okay for shipping

multi-gateway is for when the business or risk model demands it

#

That looks sketchy, I would not use that link.

If this is an OpenClaw issue, keep it here in the thread or file a GitHub issue: https://github.com/openclaw/openclaw/issues

For the architecture question, my short take still stands:

  • single shared gateway + one agent per user is the right next step
  • you do not need per-user gateways yet
  • single-agent multi-session is fine for building, but not where I’d want to stay for real beta users
strange adder
#

multi-gateway is for when the business or risk model demands it > and when is this?

atomic bisonBOT
#

Usually when one shared gateway becomes an unacceptable blast radius.

I’d revisit multi-gateway when one or more of these become true:

  • enterprise / B2B customers want hard tenant separation
  • compliance / privacy requires stronger infra boundaries than agent/workspace/sandbox
  • one outage or bad deploy on Bryan can impact all users at once
  • you need different regions or customer-specific hosting
  • you need different plugin sets / auth / channel accounts / ops policies per tenant
  • one gateway starts becoming an ops bottleneck for scale, noisy neighbors, or debugging

For your current stage

For MVP / beta testers, I would not jump to multi-gateway yet.

I’d move only when you hit something like:

“if this one gateway goes weird, too many real users are affected”

#

or

“we need stronger tenant boundaries than multi-agent gives us”

My blunt heuristic

  • consumer beta / small paid product → single gateway + multi-agent is fine
  • serious health/privacy product with many users → watch closely, but still probably okay for a while
  • enterprise / regulated / contractual isolation → that’s where multi-gateway starts becoming real

So for Bryan, I’d say:

multi-agent now

multi-gateway later only if blast radius, compliance, or tenant-specific ops force it

strange adder
#

So how many users can a single-gateway multi-agent setup serve safely?

atomic bisonBOT
#

There isn’t one honest universal number, because the limit is usually concurrency + tool mix + blast radius, not total accounts.

But for Bryan, I’d think about it like this:

Roughly

  • current single-agent + multi-session shape
    I would keep that to very small / trusted testing only

  • single gateway + hardened multi-agent
    I’d be comfortable aiming at tens to low hundreds of beta users
    if:

    • traffic is light/moderate
    • not everyone is active at once
    • tool surface is small
    • exec is off
    • you’ve done the isolation hardening
#

The real metric is not “users”

It’s more like:

  • how many are active at the same time
  • how many turns include images/media
  • whether you run sandboxing
  • whether you allow heavy tools like exec / browser / file ops
  • how much out-of-band hook traffic / heartbeats / background jobs you generate

A gateway can have a lot of registered users but still be fine if only a small fraction are active concurrently.

My practical answer for you

For Bryan as a health/photo/supplement companion, with the kind of workflows you described:

single gateway + multi-agent is fine for beta

I would not be scared of low hundreds of registered users

#

if concurrent usage is still relatively low

When I’d start planning a split

I’d start planning sharding / multi-gateway when you see:

  • queueing or slow replies during peak times
  • image turns materially backing up the gateway
  • one deploy/restart affecting too many users
  • you want separate blast-radius buckets
  • you need stronger tenant isolation for privacy/compliance

My blunt heuristic

  • 10 to 50 users: absolutely fine if hardened
  • 50 to 200 users: still plausible on one gateway, measure carefully
  • beyond that: stop thinking in total users, start thinking in concurrency and ops
#

So the short answer is:

for beta, one gateway can go surprisingly far

but only after you fix the architecture from single-agent to real multi-agent first

strange adder
#

Hey, I’m debugging a very expensive OpenClaw run and want to sanity-check whether this is expected or if I’m doing something wrong.

Setup:

  • OpenClaw on Telegram
  • model: deepinfra/google/gemma-4-31B-it
  • session key: agent:main:telegram:default:direct:1641047688

Problem:
A single visible user message + one final assistant reply is costing about 65k–76k input tokens.

Concrete example from a live session:

  • user message content was only 691 chars
  • first assistant call cost 15,391 input tokens
  • total for that one visible exchange ended up at 64,751 input tokens in the live session
  • DeepInfra dashboard delta for that message was ~76,534 input tokens

What I measured from systemPromptReport for the run:

  • systemPrompt.chars = 13,035
  • skills.promptChars = 3,301
  • tools.schemaChars = 11,349
  • tool count = 27
  • injectedWorkspaceFiles = []

So before any useful work, the known first-pass char budget was:

  • system prompt: 13,035
  • tool schemas: 11,349
  • skills prompt: 3,301
  • wrapped user message: 691

Also, this one message triggered 4 model passes:

  1. toolUse -> read_active_plan
  2. toolUse -> read ~/.openclaw-companion/workspace/skills/diet-planner/SKILL.md
  3. toolUse -> read_health_hub(section=diet_snapshot) (also emitted a stray "uma")
  4. final text answer

wait for part 2

#

Questions:

  1. Is a ~15k-token first call for a 691-char user message expected with 27 tools and 6 skills, even with no injected workspace files?
  2. Is there a recommended way to reduce tool-schema overhead per turn? Dynamic tool filtering? Separate lighter agent profile?
  3. Is loading the full SKILL.md into history via read tool the intended pattern, even though it then bloats subsequent calls?
  4. Are 3 sequential tool-use passes for a simple diet request expected behavior, or is there a recommended pattern to reduce this churn?
  5. Is there a way to get token-level breakdowns for system prompt vs skills vs tool schemas vs history, beyond the char counts in systemPromptReport?

I already checked that this spike is not coming from my own user wrapper or extra injected canonical blocks on that turn.

strange adder
#

@atomic bison

atomic bisonBOT
#

Yep, still here 🦐

My strongest recommendation still hasn’t changed:

  1. move Bryan to single shared gateway + one agent per user
  2. make main a safe registrar/quarantine agent, not a real user Bryan
  3. disable exec / process on Bryan
  4. keep hooks.allowRequestSessionKey=false
  5. route users with exact peer bindings
  6. hard-fail anything malformed instead of defaulting to main

If you want, I can turn that into a final “do these 10 things in order” migration plan so it’s easier to execute.

strange adder
#

We’re building a JOI/OpenClaw runtime, similar to Bryan, and want feedback on the hardening plan before we add personality/media.

Target architecture:

  • One OpenClaw runtime/config = one “Girl Claw” for Joi.
  • One Telegram bot token/account = Joi.
  • Multiple Telegram customers can message the same Joi bot.
  • Each customer gets one real customer-facing OpenClaw agent inside the Joi Girl Claw.
  • Internal agents exist separately: joi-router, joi-safety, joi-memory, joi-planner, joi-content, joi-publisher.
  • Customer isolation should not be only session/memory isolation. It should be real OC agents + separate workspace + separate agentDir + per-agent session store.

Current intended config shape:

  • No real main customer fallback.
  • Default/fallback agent is joi-router only.
  • Exact Telegram peer bindings route:
    telegram peer id -> joi_customer_<id>
  • session.store uses {agentId}.
  • Hooks cannot request arbitrary session keys.
  • Hooks allowedAgentIds only includes generated customer agents.
  • No agent:main: allowed session prefixes.
  • Tools are deny-first:
    deny exec, filesystem, browser, gateway, sessions tools, agents_list, subagents, cron, automation, arbitrary message sending.
  • Plugins are allowlisted, no wildcard plugin loading.
  • Workspaces split immutable vs mutable:
    immutable: AGENTS.md, SYSTEM.md, IDENTITY.md, TOOLS.md, Joi core/persona/tool policy
    mutable: memory/, userdata/, media/, tmp/, runtime/projections/
  • Backend DB is source of truth; OpenClaw workspace files are generated projections.

incoming part 2

#

Questions for feedback:

  1. Does OpenClaw’s multi-agent config support this safely as “one runtime with many real agents,” or are there hidden shared-state issues we should account for?
  2. Are exact peer bindings + per-agent workspace/agentDir + {agentId} session store enough for this level of customer separation?
  3. Which tools should remain denied even for internal agents?
  4. Should internal agents like safety/planner/content be reachable through hooks at all, or only through backend-controlled flows?
  5. Any known footguns with Telegram routing, hooks, session keys, or plugin loading in this shape?
  6. Would you recommend one OpenClaw runtime per girl long-term, or a separate OpenClaw process/container per girl if production/paid users grow?

We’re not asking about personality/content yet, only runtime hardening and isolation.

strange adder
#

@atomic bison

strange adder
#

@atomic bison

strange adder
#

@atomic bison

atomic bisonBOT
#

First thing: please grab openclaw --version for this runtime.

My read from those logs:

  • openclaw supplement lookup ... is not the normal image path. If that exact shell command appears, it’s coming from Bryan’s companion layer: prompt/skill/hook/custom companion-server code, or the model falling back to exec.
  • The tools.allow warning is important: tools.allow does not register tools. It only filters tools already available. So log_meal, companion_message, etc. being “unknown” means those tools are not actually exposed in that runtime/session.
  • exec-approvals.json EBUSY should not permanently deadlock OpenClaw. But yes, it can make that exec call fail, and non-timeout exec failures are often suppressed unless verbose/debug output is enabled. If the model only called exec and never produced a final text reply, Telegram can look like it “hung.”
  • A true queue-block can happen if the turn is waiting on an exec approval or long-running tool. Then later image turns may queue behind it.

What I’d do:

  1. In the exact Telegram/Bryan session, run:
    /tools verbose
    /status
    /tasks
    
#

Confirm whether exec is visible, and whether the companion/supplement tools are visible.

  2. Remove shell fallback for Bryan:
    {
      agents: {
        list: [{
          id: "bryan",
          tools: {
            deny: ["exec", "process", "code_execution"]
          }
        }]
      }
    }
    
    Then restart gateway and start a fresh session.
#
  3. Search Bryan’s workspace/skills/hooks/companion-server for the old path:

    rg -n "openclaw supplement|supplement lookup|log_meal|companion_message" ~/.openclaw ~/companion-server
    

    Any instruction saying “run openclaw supplement lookup” needs to become a real registered plugin/tool call, not shell.

  2. Fix the tool registration first, then evaluate model behavior. The Gemini preview lane may be less forgiving with tools, but missing tools are a config/plugin registration issue before they’re a model issue.

Docs:
https://docs.openclaw.ai/gateway/config-tools
https://docs.openclaw.ai/tools/exec-approvals
https://docs.openclaw.ai/tools/slash-commands#tools

strange adder
#

Forget about Bryan entirely; this is a separate companion.

Runtime version: OpenClaw 2026.4.22 (00bd2cf).

For Joi, we are not using Bryan's supplement/log_meal tools. Joi customer agents currently deny exec/process/gateway/runtime/session tools. Telegram is handled by our own poller, not native OpenClaw Telegram.

Question:
If exec is denied at agent tool policy level, does that only prevent the model from invoking shell/exec tools, while registered plugins/hooks still run normally inside the gateway/runtime?

Also, for custom companion tools like memory_write / relationship_event / media_request, what is the recommended 2026.4.22 registration pattern so they appear in /tools verbose as real tools instead of just being listed in tools.allow?

Finally, what is the best command to inspect if a direct chat turn is queued behind a pending tool/exec approval: /tasks, gateway RPC, or session logs?

atomic bisonBOT
#

Yep — separate companion/Joi understood.

1) If exec is denied at agent tool policy level:
That prevents the model from seeing/invoking exec/process as agent tools. It does not disable plugin registration, hooks, services, or code running inside the gateway/plugin runtime.

Important nuance: if a trusted plugin/hook internally shells out via Node/child_process or calls some gateway/node exec path itself, tools.deny: ["exec"] won’t magically sandbox that plugin code. Tool policy filters LLM-callable tools, not arbitrary plugin internals.

So if Joi has exec denied and you still see exec-approvals.json, suspect:

  • a hook/plugin/service is shelling out internally,
  • an external companion-server path is invoking OpenClaw CLI,
  • another agent/session lane has exec exposed,
  • or stale session/tool history is replaying an old path.
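A small sketch of why a deny list does not sandbox plugin code. This is a simplified model, not OpenClaw internals; `modelVisibleTools` and `onInboundImage` are hypothetical names, and the "shell out" is only recorded, not executed.

```typescript
// Simplified model, not OpenClaw internals: a deny list only edits the
// catalog the model can see. Code a plugin hook runs on its own is untouched.
interface ModelTool {
  name: string;
}

function modelVisibleTools(catalog: ModelTool[], deny: string[]): ModelTool[] {
  const denied = new Set(deny);
  return catalog.filter((tool) => !denied.has(tool.name));
}

// Hypothetical hook that performs a side effect by itself (for example by
// spawning a CLI). The side effect is recorded here instead of executed.
const sideEffects: string[] = [];
function onInboundImage(): void {
  sideEffects.push("hook spawned helper CLI");
}

const visible = modelVisibleTools(
  [{ name: "exec" }, { name: "memory_write" }],
  ["exec"],
);
onInboundImage(); // still runs: it never passes through the tool policy
```

The model-facing catalog no longer contains `exec`, yet the hook's side effect still happens, which is why the suspects listed above are worth checking one by one.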

2) Recommended registration pattern:
Don’t put memory_write, relationship_event, media_request only in tools.allow. They must be actual plugin tools:

#
import { Type } from "@sinclair/typebox";
import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";

export default definePluginEntry({
  id: "joi-companion",
  name: "Joi Companion",
  description: "Companion memory and relationship tools",
  register(api) {
    api.registerTool(
      {
        name: "memory_write",
        description: "Persist a structured companion memory.",
        parameters: Type.Object({
          text: Type.String(),
          importance: Type.Optional(Type.String()),
        }),
        async execute(_id, params) {
          // call companion-server / DB here
          return { content: [{ type: "text", text: "memory saved" }] };
        },
      },
      { optional: true },
    );
  },
});

Then enable via config with either exact tool names or plugin id, e.g.:

#
{
  tools: {
    allow: ["joi-companion", "memory_write", "relationship_event", "media_request"],
    deny: ["exec", "process", "gateway", "sessions_*"]
  }
}

If /tools verbose doesn’t show them, they are not registered/loaded in that runtime, or policy/provider filtering removed them.

3) Queue / pending approval inspection:
For direct chat turns, /tasks is usually not the right primary check — it’s mostly detached/background work. Use:

/status
strange adder
#

OpenClaw version: 2026.4.22 (00bd2cf)

We are building a companion runtime with:

  • one Joi “Girl Claw”
  • one Telegram bot
  • one real joi_customer_* OpenClaw agent per Telegram customer
  • strict per-customer isolation
  • backend DB is source of truth for user/relationship/consent state

We need long-term companion memory, but want to avoid overbuilding.

What is the recommended OpenClaw pattern for memory in this setup?

Target behavior:

  1. Direct chat should recall relevant personal/relationship facts.
  2. Each customer’s memory must stay isolated.
  3. Memory should survive session reset/compaction.
  4. We need a safe reset/delete path per customer.
  5. Later we want nightly consolidation and proactive/heartbeat logic, but it can stay disabled for now.

Questions:

  • Should we use OpenClaw native memory-core / dreaming as the main memory system?
  • Should we write our own before_prompt_build recall plugin?
  • Is before_compaction / before_reset still the right place to flush memories before chat reset?
  • For production, should memory source of truth be files under each agent workspace, or external DB with prompt projections?
  • How do we avoid raw transcript leakage and cross-customer memory leakage?
  • Any example config/plugin pattern for per-agent customer memory in multi-agent OpenClaw?

We want the simplest safe v1, not a huge memory architecture.

atomic bisonBOT
#

For this setup, I’d not make native memory-core / dreaming the production source of truth.

Recommended simple-safe v1:

  • Keep backend DB as source of truth for customer/relationship/consent memory.
  • Use one OpenClaw plugin, e.g. joi-memory, that:
    1. registers real tools like memory_write, relationship_event, media_request
    2. uses before_prompt_build to inject a small, sanitized memory projection for the current joi_customer_* agent
    3. derives customerId server-side from ctx.agentId / binding metadata — never from model-supplied params
  • Disable native/dreaming automation for customer agents until you intentionally wire it.

So: DB + prompt projection, not files as SoT.

Example shape:

#
{
  plugins: {
    slots: {
      memory: "none" // optional, if you do not want native memory tools/files
    },
    entries: {
      "joi-memory": {
        enabled: true,
        hooks: { allowPromptInjection: true },
        config: {
          backendUrl: "http://127.0.0.1:xxxx"
        }
      },
      "memory-core": {
        config: {
          dreaming: { enabled: false }
        }
      }
    }
  },

  agents: {
    defaults: {
      compaction: {
        memoryFlush: { enabled: false } // avoid silent writes to workspace files
      },
      tools: {
        deny: ["exec", "process", "gateway", "sessions_*", "memory_search", "memory_get"]
      }
    },
    list: [
      {
        id: "joi_customer_123",
        workspace: "~/.openclaw/workspace-joi-customer-123",
        tools: {
          allow: ["joi-memory", "memory_write", "relationship_event", "media_request"]
        }
      }
    ]
  }
}

Plugin pattern:

#
api.on("before_prompt_build", async (_event, ctx) => {
  const customerId = customerIdFromAgentId(ctx.agentId); // server-side mapping
  const projection = await db.getPromptMemoryProjection(customerId);

  return {
    prependContext: [
      "Joi customer memory projection:",
      projection.summary,
      projection.relevantFacts.map(f => `- ${f.text}`).join("\n"),
    ].filter(Boolean).join("\n")
  };
});

Then register writes as actual tools, not allowlist strings:

#
api.registerTool({
  name: "memory_write",
  description: "Save a consented, structured Joi customer memory.",
  parameters: Type.Object({
    text: Type.String(),
    category: Type.Optional(Type.String()),
    confidence: Type.Optional(Type.Number()),
  }),
  async execute(_id, params, ctx) {
    const customerId = customerIdFromAgentId(ctx.agentId);
    await db.writeMemory({ customerId, ...params });
    return { content: [{ type: "text", text: "saved" }] };
  },
}, { optional: true });
#

Key safety rules:

  • Do not accept customerId from the model. Derive it from ctx.agentId.
  • Store structured memories/events, not raw transcripts.
  • Inject only a bounded projection: profile facts, relationship state, consent flags, recent salient events.
  • Keep per-customer DB rows keyed by tenant/customer/agent id.
  • For delete/reset: purge DB rows + vector rows + any OpenClaw workspace memory files if you enabled them + reset/delete that customer’s sessions.
  • Avoid shared QMD/extra memory collections across agents unless explicitly intended.
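The snippets above call customerIdFromAgentId without defining it. A minimal sketch of one way to write it, together with a bounded projection renderer, could look like the following. The joi_customer_<id> pattern, `renderProjection`, and the limits are assumptions for illustration, not OpenClaw APIs.

```typescript
// Assumption: customer agents are named joi_customer_<id>. Anything else is
// rejected, so a misrouted agent can never read another customer's rows.
const AGENT_ID_PATTERN = /^joi_customer_([A-Za-z0-9]+)$/;

function customerIdFromAgentId(agentId: string): string {
  const id = AGENT_ID_PATTERN.exec(agentId)?.[1];
  if (!id) {
    throw new Error(`not a customer agent: ${agentId}`);
  }
  return id;
}

interface MemoryFact {
  text: string;
}

interface Projection {
  summary: string;
  relevantFacts: MemoryFact[];
}

// Bound the projection twice: by number of facts and by total characters.
// The default limits are hypothetical.
function renderProjection(p: Projection, maxFacts = 5, maxChars = 600): string {
  const lines = [
    "Joi customer memory projection:",
    p.summary,
    ...p.relevantFacts.slice(0, maxFacts).map((f) => `- ${f.text}`),
  ].filter((line) => line.length > 0);
  return lines.join("\n").slice(0, maxChars);
}
```

Throwing on an unexpected agent id is a deliberate choice here: failing closed is safer than falling back to a default customer.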

before_compaction / before_reset exist, but I would not use them as the primary memory path. before_compaction is observation/annotation territory, and before_reset is the wrong moment to opportunistically preserve memory unless the user explicitly asked to keep it. For v1, write memories through explicit tools/events during normal turns, and make reset/delete an explicit backend operation.
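As a rough sketch of what "reset/delete as an explicit backend operation" could mean, the example below purges every store keyed to one customer and reports what was removed. The `Stores` shape and `purgeCustomer` are hypothetical in-memory stand-ins for your real DB, vector index, and session storage.

```typescript
// In-memory stand-ins for the real stores (all names hypothetical).
interface Stores {
  memories: Map<string, string[]>; // customerId -> structured memories
  vectors: Map<string, number[][]>; // customerId -> embedding rows
  sessions: Map<string, string[]>; // agentId -> session ids
}

interface PurgeReport {
  memories: number;
  vectors: number;
  sessions: number;
}

// Delete everything for one customer and return counts for an audit log.
function purgeCustomer(stores: Stores, customerId: string, agentId: string): PurgeReport {
  const report: PurgeReport = {
    memories: stores.memories.get(customerId)?.length ?? 0,
    vectors: stores.vectors.get(customerId)?.length ?? 0,
    sessions: stores.sessions.get(agentId)?.length ?? 0,
  };
  stores.memories.delete(customerId);
  stores.vectors.delete(customerId);
  stores.sessions.delete(agentId);
  return report;
}
```

Returning counts makes it easy to log that a delete request actually removed something, and to verify afterwards that a second purge reports zeros.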

Docs:
https://docs.openclaw.ai/plugins/building-plugins#registering-agent-tools
https://docs.openclaw.ai/plugins/hooks
https://docs.openclaw.ai/concepts/multi-agent
https://docs.openclaw.ai/concepts/memory

strange adder
#

OpenClaw 2026.4.22, custom plugin loads successfully and before_prompt_build works.

The plugin registers optional tools:
memory_write
relationship_event
media_request

Customer agent config has:
tools.allow = ["joi-companion", "memory_write", "relationship_event", "media_request"]
model provider is Nous/Hermes through OpenAI-compatible API:
api: "openai-completions"
compat.supportsTools: true

But systemPromptReport shows:
tools.entries: []
tools.schemaChars: 0

So the model never receives tool schemas.

Question:
For a custom plugin registered with api.registerTool(..., { optional: true }), what exact config is required for those tools to appear for an agent?
Is this blocked by provider api: "openai-completions" / Hermes compatibility, or is our tools allow/config shape wrong?

atomic bisonBOT
#

This looks like tool policy shape, not Hermes/OpenAI-compatible blocking.

api: "openai-completions" + compat.supportsTools: true should not by itself make systemPromptReport.tools.entries empty. If tools were reaching the model but Hermes ignored them, you’d still see tool schemas in the report.

The likely issue: tools.allow is narrowing, not additive, especially if the agent/global config has tools.profile: "coding" or minimal.

For optional plugin tools:

  • api.registerTool(..., { optional: true }) means: only expose if explicitly allowed
  • Allowed by either:
    • exact tool name: memory_write
    • plugin id: joi-companion
    • group:plugins
  • But if a restrictive tools.profile runs first, plugin tools can be filtered out before later tools.allow gets a chance.
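The allow rules for optional tools can be sketched as below. This is a simplified model of the three matching rules listed above, not OpenClaw's resolver, and it deliberately does not model profile filtering; `RegisteredTool` and `isToolExposed` are hypothetical names.

```typescript
// Simplified model of the rules above (not OpenClaw's resolver): an optional
// tool is exposed only when the allowlist names the tool, its plugin id, or
// the plugin group. Profile filtering is intentionally not modeled here.
interface RegisteredTool {
  name: string;
  pluginId: string;
  optional: boolean;
}

function isToolExposed(tool: RegisteredTool, allow: string[]): boolean {
  if (!tool.optional) {
    return true; // non-optional tools do not need an explicit allow entry
  }
  return (
    allow.includes(tool.name) ||
    allow.includes(tool.pluginId) ||
    allow.includes("group:plugins")
  );
}

const memoryWrite: RegisteredTool = {
  name: "memory_write",
  pluginId: "joi-companion",
  optional: true,
};

const byName = isToolExposed(memoryWrite, ["memory_write"]);
const byPlugin = isToolExposed(memoryWrite, ["joi-companion"]);
const byGroup = isToolExposed(memoryWrite, ["group:plugins"]);
const notAllowed = isToolExposed(memoryWrite, ["relationship_event"]);
```

If all three allow forms are in place and the report is still empty, that points toward an earlier filtering step rather than the allow entries themselves.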

To expose only the Joi tools, use this pattern:

#
{
  plugins: {
    entries: {
      "joi-companion": { enabled: true }
    }
  },

  agents: {
    list: [
      {
        id: "joi_customer_123",
        tools: {
          profile: "full",
          allow: [
            "joi-companion"
            // or exact names:
            // "memory_write",
            // "relationship_event",
            // "media_request"
          ],
          deny: [
            "group:runtime",
            "group:sessions",
            "gateway"
          ]
        }
      }
    ]
  }
}
#

To keep the coding profile and add the Joi tools on top of it, use alsoAllow, not allow:

{
  agents: {
    list: [
      {
        id: "joi_customer_123",
        tools: {
          profile: "coding",
          alsoAllow: [
            "joi-companion"
          ],
          deny: [
            "group:runtime",
            "group:sessions",
            "gateway"
          ]
        }
      }
    ]
  }
}

Don’t set allow and alsoAllow in the same tools scope — config rejects that.
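If you generate per-customer agent configs from your backend, a pre-flight check like the sketch below could catch that combination before the gateway does. `ToolsScope`, `validateToolsScope`, and the error text are hypothetical; this is not the validator OpenClaw ships.

```typescript
// Hypothetical pre-flight check for generated agent configs. The field names
// mirror the config shape shown above; the error message is made up.
interface ToolsScope {
  profile?: string;
  allow?: string[];
  alsoAllow?: string[];
  deny?: string[];
}

function validateToolsScope(scope: ToolsScope): string[] {
  const errors: string[] = [];
  if (scope.allow !== undefined && scope.alsoAllow !== undefined) {
    errors.push("tools.allow and tools.alsoAllow cannot be set in the same scope");
  }
  return errors;
}

const conflicting = validateToolsScope({
  profile: "coding",
  allow: ["joi-companion"],
  alsoAllow: ["joi-companion"],
});
const additive = validateToolsScope({
  profile: "coding",
  alsoAllow: ["joi-companion"],
  deny: ["gateway"],
});
```

Here `conflicting` carries one error and `additive` carries none, so a config generator can refuse to write the first shape.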

One more plugin-side check: if you register via a factory function, include names:

api.registerTool(
  (ctx) => [memoryWriteTool, relationshipEventTool, mediaRequestTool],
  {
    optional: true,
    names: ["memory_write", "relationship_event", "media_request"]
  }
);

If registering concrete tool objects one-by-one, OpenClaw reads tool.name automatically.

Debug checklist:

openclaw plugins inspect joi-companion --json
openclaw plugins list --verbose
strange adder
#

We have JOI running as one OpenClaw Girl Claw with one Telegram bot and one real joi_customer_* agent per Telegram customer. Backend DB is the source of truth for user profile, consent, relationship state, memory, and reset/delete. OpenClaw workspaces are generated projections only.

Next slice is relationship/consent hardening:

  • stages: guarded, curious, warm, affectionate, rejected
  • intimacy modes: romantic_only, flirty_sensual, erotic_explicit
  • backend should enforce gates for media_request, voice_note, proactive sends, and rejected lock
  • stage progression should be evidence/event based, not model vibe
  • customer agents have no exec/browser/fs/session tools

Question: from OpenClaw side, any config/plugin/hook best practices for enforcing this outside the model prompt? We plan to enforce in backend plugin/API before tools can queue media/proactive/voice requests. Anything OC-specific we should avoid or add?

atomic bisonBOT
#

You’re thinking about this the right way: make the backend/plugin the authority, not the prompt.

OC-specific best practices I’d use:

  1. Expose only narrow JOI tools

    • Don’t expose message, tts, media-gen, exec, browser, fs, sessions, cron, etc. to customer agents.
    • Give them only plugin tools like joi_request_media, joi_request_voice, joi_request_proactive, etc.
    • Register side-effect tools as optional: true, then explicitly allow only those tools/plugin ids.
  2. Gate in two places

    • Primary gate: inside your backend/API before queuing media/proactive/voice.
    • OC backstop: plugin hooks:
      • before_tool_call → block/rewrite/approval-gate JOI tools.
      • message_sending / before_dispatch / reply_dispatch → catch accidental direct sends/final reply dispatches.
        Docs: https://docs.openclaw.ai/plugins/hooks
  3. Use per-agent tool policy

#

Example shape:

{
  agents: {
    list: [
      {
        id: "joi_customer_123",
        workspace: "~/.openclaw/joi/customer_123",
        skills: [],
        tools: {
          profile: "minimal",
          allow: ["joi_companion"], // or specific joi_* tool names
          deny: [
            "group:runtime",
            "group:fs",
            "group:ui",
            "group:sessions",
            "group:automation",
            "group:memory",
            "message",
            "tts",
            "image_generate",
            "video_generate"
          ]
        }
      }
    ]
  }
}
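For the primary gate in the backend, a minimal sketch of a stage and consent check might look like this. The stage names come from your message; the minimum-stage thresholds, the `gate` function, and the reason strings are assumptions for illustration only.

```typescript
type Stage = "guarded" | "curious" | "warm" | "affectionate" | "rejected";
type ActiveStage = Exclude<Stage, "rejected">;
type Action = "media_request" | "voice_note" | "proactive_send";

const STAGE_RANK: Record<ActiveStage, number> = {
  guarded: 0,
  curious: 1,
  warm: 2,
  affectionate: 3,
};

// Hypothetical thresholds: pick your own per action.
const MIN_STAGE: Record<Action, ActiveStage> = {
  voice_note: "curious",
  media_request: "warm",
  proactive_send: "warm",
};

interface CustomerState {
  stage: Stage;
  consent: Record<Action, boolean>;
}

interface GateDecision {
  allowed: boolean;
  reason: "ok" | "rejected_lock" | "no_consent" | "stage_too_low";
}

// Order matters: the rejected lock wins over everything, then consent,
// then the evidence-based stage threshold.
function gate(state: CustomerState, action: Action): GateDecision {
  const stage = state.stage;
  if (stage === "rejected") {
    return { allowed: false, reason: "rejected_lock" };
  }
  if (!state.consent[action]) {
    return { allowed: false, reason: "no_consent" };
  }
  if (STAGE_RANK[stage] < STAGE_RANK[MIN_STAGE[action]]) {
    return { allowed: false, reason: "stage_too_low" };
  }
  return { allowed: true, reason: "ok" };
}
```

Calling the same function from the tool's execute handler and from a before_tool_call backstop would keep both gates on one set of rules, and the reason string gives you something to log when a request is refused.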