#OpenClaw Gateway: Do Agent Changes Require Restart or Support Hot Reload?

dusk rover
#

Hi, I have a question regarding OpenClaw's multi-agent and configuration behavior.

From the documentation, I understand that OpenClaw Gateway supports some level of hot reload. However, I'm a bit unclear about the exact boundaries.

My questions are:

  1. Does OpenClaw support dynamically adding or modifying agents (e.g., updating the agents section in openclaw.json) without restarting the gateway?

  2. For agent binding or routing (e.g., mapping users/accounts to different agents), can this be changed dynamically at runtime, or does it require an openclaw gateway restart?

  3. Which types of configuration changes explicitly require a gateway restart? For example:

    • Channel/account configuration
    • Network/port settings
    • Agent definitions
  4. Is there a recommended best practice for implementing dynamic multi-agent routing (e.g., one dispatcher agent vs static config binding)?

I’m trying to design a scalable multi-user system with dynamic agent allocation, so understanding the runtime behavior is very important.

Thanks in advance!

dusk rover
#

Hi, I have some questions about OpenClaw’s runtime behavior in a multi-agent setup, especially regarding hot reload.

What I'm trying to do
I'm building a multi-user system that requires:

  • Dynamically adding or modifying agents
  • Assigning agents per user (potentially with load balancing later)
  • Avoiding frequent gateway restarts

What I'm unsure about
From the documentation, it seems that OpenClaw supports some level of hot reload, but the boundaries are unclear. Specifically:

  1. If I modify the agents section in openclaw.json (e.g., add or update an agent), will it take effect automatically?

    • Or is openclaw gateway restart required?
  2. For user/account → agent routing:

    • Can this mapping be changed dynamically at runtime?
    • Or is it tied to static configuration that requires a restart?
  3. Which configurations explicitly require a Gateway restart?

    • Agent definitions
    • Skills
    • Channel/account configurations (e.g., WeChat, Feishu)
    • Network settings (ports, host, etc.)
  4. Best practice for multi-agent architecture:

    • Is static config binding recommended?
    • Or using a dispatcher/router agent for dynamic routing?

Environment

  • Deployment: Docker (Ubuntu)
  • OpenClaw version: 2026.4.9

Any clarification or real-world experience would be greatly appreciated 🙏

livid mulchBOT
#

I checked the current docs plus the reload plan code.

Short version:

  1. Agents: yes. Editing agents/agents.list is live in normal watcher mode. With the default gateway.reload.mode: "hybrid", you do not need a manual openclaw gateway restart for agent definition changes. If reload mode is "off", then changes wait until the next restart.

  2. Bindings/routing: yes. Top-level bindings[] are dynamic too, and openclaw agents bind / unbind just updates config. New inbound messages use the new route after reload. One important nuance: history is not migrated. Session keys are agent-scoped, so if you rebind a conversation to a different agent, future turns land in that other agent’s session store.

  3. What needs restart:

    • No full gateway restart: channels.*, agents, bindings, models, messages, session, tools, skills, logging, ui
    • Full gateway restart: gateway.* (port, bind, auth, tailscale, TLS, HTTP), plugins, discovery, canvasHost
    • Nuance: channel/account config is hot-applied, but the affected channel adapter may reconnect/restart in process
    • Exception: gateway.reload and gateway.remote do not force restart
  4. Best practice: use static deterministic bindings for tenant/user/account ownership, then do dynamic work allocation inside that agent via delegation/subagents/ACP if needed. I would not use binding rewrites as a per-message load balancer. OpenClaw routing is deterministic, first-match-wins, not a scheduler.

So for a scalable multi-user design, I’d recommend:

#

Yep, same core answer, with a couple more precise bits for your case:

On current OpenClaw behavior, direct config-file edits are watched.
With the default gateway.reload.mode: "hybrid":

  • agents: live, no manual gateway restart
  • bindings / user→agent routing: live, no manual gateway restart
  • skills: live, no manual gateway restart, and skill snapshots refresh on the next turn
  • channels.* (including bundled/extension channels): no full gateway restart, but the affected channel adapter may restart/reconnect in-process
  • gateway.* (port, bind, auth, TLS, HTTP, tailscale): gateway restart required
  • plugins, discovery, canvasHost: gateway restart required

Two small exceptions under gateway.*:

  • gateway.reload
  • gateway.remote

Those do not force a restart.
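
For anyone who wants the restart rules above in a form they can reason about, here is a minimal sketch. It is not OpenClaw's reload planner: `classifyChange`, `planReload`, and the constants are hypothetical names, and the path lists are copied only from what this thread states. It also assumes the default "hybrid" reload mode and does not model mode "off".

```typescript
// Hypothetical classifier mirroring the restart rules quoted in this thread.
// None of these names are OpenClaw APIs.
type ReloadAction = "hot" | "restart";

// gateway.* paths that the thread says do NOT force a restart.
const GATEWAY_EXCEPTIONS = ["gateway.reload", "gateway.remote"];

// Top-level sections that the thread says require a full gateway restart.
const RESTART_SECTIONS = ["gateway", "plugins", "discovery", "canvasHost"];

// True if `path` is `prefix` itself or a dotted child of it.
function isUnder(path: string, prefix: string): boolean {
  return path === prefix || path.startsWith(prefix + ".");
}

function classifyChange(path: string): ReloadAction {
  // Exceptions are checked first so gateway.reload.* is not caught by "gateway".
  if (GATEWAY_EXCEPTIONS.some((p) => isUnder(path, p))) return "hot";
  if (RESTART_SECTIONS.some((p) => isUnder(path, p))) return "restart";
  // Everything else listed (channels.*, agents, bindings, skills, ...) is hot-applied;
  // a channel adapter may still reconnect in-process, which this sketch ignores.
  return "hot";
}

// A set of changed paths needs a restart if any single path does.
function planReload(changedPaths: string[]): ReloadAction {
  return changedPaths.some((p) => classifyChange(p) === "restart") ? "restart" : "hot";
}
```

With this sketch, `planReload(["agents.list", "bindings"])` yields "hot", while adding `"gateway.port"` to the list yields "restart".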

#

A couple of practical nuances:

  1. Routing changes affect future turns, not past state
    If you rebind a user from agent A to agent B, new messages go to B, but history is not migrated. Session isolation is agent-scoped.

  2. Docker/Ubuntu does not change the reload model
    As long as the gateway process can see the updated openclaw.json, behavior is the same.

  3. If gateway.reload.mode: "off", then none of this hot reload applies until manual restart.

For architecture, my recommendation is still:

  • static bindings at ingress for tenant/user/account ownership
  • one stable agent per tenant/persona/org
  • do dynamic work allocation inside that agent via delegation/subagents/ACP

I would avoid using live binding rewrites as a load balancer unless you really need it, because it fragments session continuity.

dusk rover
#

Thanks for the detailed explanation — this helps a lot.

However, I’m encountering a situation that seems inconsistent with the expected hot reload behavior:

What I did

  • I updated the bindings[] configuration in openclaw.json to route a user to a different agent.

Logs show successful reload

[reload] config change detected; evaluating reload (bindings)
[reload] config change applied (dynamic reads: bindings)

What I expected

  • New incoming messages should be routed to the new agent.

What actually happens

  • The chat channel still routes messages to the previous agent
  • It does not switch to the new binding as expected

Additional context

  • This is happening on an existing conversation/session
  • The channel is already active (no reconnect observed)

My question
Is this expected behavior due to session-level caching or channel adapter behavior?

Specifically:

  • Are bindings only applied to new sessions instead of ongoing ones?
  • Does the channel adapter need to reconnect/reinitialize for routing changes to take effect?
  • Or is there some internal routing cache that delays the binding switch?

Any clarification would be greatly appreciated 🙏

livid mulchBOT
#

This is not the expected behavior for plain configured bindings[].

A few important details:

  • Configured bindings are evaluated on each inbound message
  • They are not “new session only”
  • Channel reconnect is not required
  • Agent routing happens before session lookup, so an old session should not pin the agent by itself

There is an in-memory route cache, but I checked the resolver and it is invalidated when cfg.bindings changes, so a delayed switch because of normal route caching is unlikely.
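
To make the cache point concrete, here is a rough sketch of one way such invalidation can work. It is an assumption for illustration, not the actual resolver: `RouteCache` is a hypothetical class, and comparing the bindings array by reference is just one possible way to detect that cfg.bindings changed.

```typescript
// Hypothetical route cache that is dropped whenever the bindings array is replaced.
interface Binding {
  agentId: string;
  channel: string;
  accountId?: string;
}

class RouteCache {
  private lastBindings: readonly Binding[] | undefined = undefined;
  private readonly cache = new Map<string, string>();

  // `key` identifies the inbound route (channel/account/peer); `compute` does the real lookup.
  resolve(bindings: readonly Binding[], key: string, compute: () => string): string {
    // Assumption: a reload swaps in a new array, so identity comparison detects the change.
    if (bindings !== this.lastBindings) {
      this.cache.clear();
      this.lastBindings = bindings;
    }
    const hit = this.cache.get(key);
    if (hit !== undefined) return hit;
    const agentId = compute();
    this.cache.set(key, agentId);
    return agentId;
  }
}
```

Under this model, a cached route can only survive a reload if the code doing the lookup is still holding the old bindings array.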

What usually causes the symptom you described:

  1. A more specific binding is still winning
    Precedence is:
    peer → parentPeer → guild+roles → guild → team → account → channel → default

So if you changed a channel/account binding, but an older peer-specific binding still matches, it will keep winning.

  2. A runtime conversation/thread binding is active
    Things like:

    • /focus
    • ACP --bind here
    • thread-bound subagent/ACP sessions

    can pin the conversation to a session/agent independently of top-level bindings[].

  3. You are looking at configured bindings, but the conversation is using a runtime bind
    openclaw agents bindings only shows the configured routing rules, not temporary/current-conversation binds.
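
The precedence order quoted in this message can be sketched as a tiered lookup. This is a simplified illustration, not OpenClaw's resolver: `pickRoute` and `Candidate` are hypothetical, and the `binding.<tier>` label format is extrapolated from the `matchedBy=binding.account` and `matchedBy=default` values mentioned in this thread.

```typescript
// Tier order as quoted in the thread; an earlier entry beats a later one.
const TIERS = ["peer", "parentPeer", "guild+roles", "guild", "team", "account", "channel"] as const;
type Tier = (typeof TIERS)[number];

// A binding that already matched the inbound message, tagged with the tier it matched at.
interface Candidate {
  tier: Tier;
  agentId: string;
}

interface RouteResult {
  matchedBy: string;
  agentId: string;
}

// Hypothetical helper: pick the winner among matching bindings, else fall back to the default agent.
function pickRoute(matches: Candidate[], defaultAgentId: string): RouteResult {
  for (const tier of TIERS) {
    const hit = matches.find((m) => m.tier === tier);
    if (hit) return { matchedBy: `binding.${tier}`, agentId: hit.agentId };
  }
  return { matchedBy: "default", agentId: defaultAgentId };
}
```

So if both an account-level and a peer-level binding match, the peer-level one wins regardless of their order in the list, and an empty match list falls through to the default agent.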

So the short answer is:

  • No, bindings are not only for new sessions
  • No, the channel adapter should not need reconnect for normal binding changes
  • No, ordinary session caching should not keep the old agent
  • If it still sticks, it’s usually precedence or a runtime conversation binding

Best next check:

  • enable verbose/debug logging and look for the next inbound:
    • [routing] resolveAgentRoute: ...
    • [routing] match: matchedBy=... agentId=...

That will tell you exactly which rule is winning.
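
If you are scanning a lot of log output, a small helper like the one below can pull out the winning rule. It is only a sketch: `parseRouteMatch` is a hypothetical name, and the regex assumes the `[routing] match: matchedBy=... agentId=...` line shape shown in this thread.

```typescript
// Result of parsing one "[routing] match:" log line.
interface RouteMatch {
  matchedBy: string;
  agentId: string;
}

// Returns the parsed fields, or null if the line is not a routing match line.
function parseRouteMatch(line: string): RouteMatch | null {
  const m = /\[routing\] match: matchedBy=(\S+) agentId=(\S+)/.exec(line);
  if (m === null) return null;
  // Both groups are mandatory in the pattern, so the fallbacks are only for type safety.
  return { matchedBy: m[1] ?? "", agentId: m[2] ?? "" };
}
```

Filtering the gateway log through this and checking `matchedBy` tells you whether a binding tier or the default fallback chose the agent.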

If you’re on a surface with thread bindings, also check whether that conversation was previously focused/bound. On Discord, /agents and /unfocus can help if thread bindings are enabled.

If none of that applies, then honestly I’d start suspecting a bug, because plain bindings[] hot reload should affect the next inbound turn.

Docs:
https://docs.openclaw.ai/channels/channel-routing

dusk rover
#

Here is a minimal reproducible log showing the issue where bindings hot reload is applied, but routing still goes to the old agent:


Reload log (bindings change detected)

[reload] config change detected; evaluating reload (meta.lastTouchedAt, agents.list, bindings)
[reload] config hot reload applied (agents.list)

Next inbound message (after reload)

Received message from user: 480562E9913A985D4A79822A643E27B6
content: "你好" ("Hello")

Session info

sessionId=816ef99a-eacc-4e3c-acb9-55ee5ceff725
sessionKey=agent:study:qqbot:direct:480562e9913a985d4a79822a643e27b6

Observed behavior

  • The message is still routed to agent: study
  • This is the old agent before updating bindings[]
  • Expected: should route to the new agent defined in updated bindings[]

Important observations

  • This is the same ongoing conversation/session
  • No channel reconnect occurred
  • Logs do NOT show [routing] resolveAgentRoute details (can enable if needed)

Question
Given that:

  • bindings[] are hot-reloaded
  • routing should be evaluated per inbound message

Why is the message still routed to the previous agent?

Could this be caused by:

  • sessionKey already bound to agent (agent:study:...)?
  • missing routing re-evaluation step?
  • or a runtime binding overriding config?

I can provide:

  • verbose [routing] logs
  • old vs new bindings[] config

if that helps further debugging.

livid mulchBOT
#

This log actually narrows it down a lot.

A few key points:

  1. bindings not showing up in config hot reload applied (agents.list) is normal
    bindings are treated as dynamic reads, so when the same edit also changes agents.list, the log only prints the hot-applied part (agents.list). That does not mean bindings were ignored.

  2. sessionKey=agent:study:... is the result, not the cause
    Inbound messages do not arrive with a session key that pins the old agent. OpenClaw resolves the agent first, then builds the session key.
    So “same ongoing session” by itself should not force study.

  3. Given that, the most likely explanation is:

    • the new binding did not match this inbound QQBot message, so routing fell back to the default agent, and study is probably your default
    • or the binding matched, but the target agent id did not resolve from the updated agents.list, so it again fell back to default
    • or a more specific binding still wins
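
Point 2 above (agent resolved first, session key derived afterwards) can be illustrated with a tiny sketch. The key shape `agent:<agentId>:<channel>:<peerKind>:<peerId>` and the lowercased peer ID are inferred from the single session key logged in this thread; `buildSessionKey` is a hypothetical helper, not an OpenClaw function.

```typescript
// Hypothetical: derive a session key from an already-resolved agent plus the inbound peer.
// Assumption: the peer ID is lowercased in the key, as in the logged example.
function buildSessionKey(agentId: string, channel: string, peerKind: string, peerId: string): string {
  return ["agent", agentId, channel, peerKind, peerId.toLowerCase()].join(":");
}

// The same user produces a different key under a different agent,
// which is why rebinding starts a separate session instead of reusing history.
const peerId = "480562E9913A985D4A79822A643E27B6";
const keyUnderStudy = buildSessionKey("study", "qqbot", "direct", peerId);
const keyUnderTutor = buildSessionKey("tutor_" + peerId, "qqbot", "direct", peerId);
```

Because the agent ID is an input to the key, seeing `agent:study:...` in the log only tells you that routing already picked study for that message.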
#

A very common gotcha here:

  • a binding with no accountId only matches the default account
  • it does not mean “all accounts”
  • for a channel-wide fallback across accounts, use accountId: "*"

So from what you posted, I would not blame session stickiness first.
I would first suspect default fallback or precedence mismatch.
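
The accountId gotcha above boils down to three cases, sketched here. `accountMatches` is a hypothetical helper written from the rules stated in this message, and the account name "default" used for the default account is a placeholder.

```typescript
// Hypothetical matcher for a binding's accountId against the inbound account.
function accountMatches(
  pattern: string | undefined,
  inboundAccountId: string,
  defaultAccountId: string,
): boolean {
  // No accountId on the binding: matches the default account only, not every account.
  if (pattern === undefined) return inboundAccountId === defaultAccountId;
  // Explicit wildcard: channel-wide fallback across all accounts.
  if (pattern === "*") return true;
  // Otherwise an exact account match is required.
  return pattern === inboundAccountId;
}
```

For an inbound message on account "study", a binding without accountId does not match, while `accountId: "study"` or `accountId: "*"` does.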

The fastest way to prove it is one verbose routing block. You want to see:

  • [routing] resolveAgentRoute: ...
  • [routing] match: matchedBy=... agentId=...

That will tell us immediately whether this is:

  • matchedBy=default → new binding did not match, or target agent was invalid
  • matchedBy=binding.* → some binding still matches, and we can see which tier won

If you can share these 3 things, we can pin it down pretty fast:

  1. whether study is the default agent
  2. the old vs new binding entry
  3. openclaw agents list --bindings output

Docs:
https://docs.openclaw.ai/channels/channel-routing
https://docs.openclaw.ai/concepts/multi-agent
https://docs.openclaw.ai/cli/agents

If the routing log comes back as matchedBy=default while the new binding should clearly match, then I’d start treating that as a real bug.

dusk rover
#

Hi, I’m debugging an issue with bindings hot reload and agent routing, and I’ve narrowed it down with logs.

What I’m trying to do

I want to dynamically route a specific user (QQ direct message) to a dedicated agent using a peer-level binding, while keeping a fallback account-level binding.

Current bindings config

{
  "agentId": "tutor_480562E9913A985D4A79822A643E27B6",
  "match": {
    "channel": "qqbot",
    "accountId": "study",
    "peer": {
      "kind": "direct",
      "id": "480562E9913A985D4A79822A643E27B6"
    }
  }
},
{
  "agentId": "study",
  "match": {
    "channel": "qqbot",
    "accountId": "study"
  }
}

Expected behavior

The first binding (peer-level) should take precedence over the account-level binding and route this user to the tutor_* agent.

Actual behavior

Routing always matches the account-level binding:

[routing] resolveAgentRoute: channel=qqbot accountId=study peer=direct:480562E9913A985D4A79822A643E27B6
[routing] match: matchedBy=binding.account agentId=study

So it seems the peer-level binding is not matching.

My questions

  1. Is the peer: { kind, id } format correct for matching direct messages in QQBot?
  2. Are there any normalization rules (e.g., case sensitivity, formatting) that could cause this mismatch?
  3. Is peer matching expected to work for QQBot direct messages, or are there known limitations?
  4. Could this be a bug where peer-level bindings are not being matched correctly?

Summary

  • bindings hot reload is working
  • account-level binding matches correctly
  • peer-level binding does not match even though peer appears correct in logs

Any guidance would be greatly appreciated 🙏

livid mulchBOT
#

This looks like a peer-match issue, not a hot reload issue.

What the code says:

  • For QQBot C2C/private chat, OpenClaw routes with:
    • peer.kind: "direct"
    • peer.id = event.author.user_openid (raw OPENID)
  • For QQBot, peer.id is kept as the raw peer identifier. It is not converted to qqbot:c2c:... for routing.
  • peer.kind: "dm" also normalizes to direct, but for guild DMs the ID source is different (author.id, not user_openid).

So for your case, if this is a C2C/private chat, this shape is correct:

"peer": {
  "kind": "direct",
  "id": "480562E9913A985D4A79822A643E27B6"
}

A couple of important normalization details:

  • peer.kind is normalized (dm and direct both end up as direct)
  • peer.id is basically trimmed only
  • there is no lowercase normalization for peer.id

So yes, peer ID matching is effectively case-sensitive apart from whitespace.
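
Those normalization rules can be sketched like this. It is an illustration of the behavior described in this message, not the resolver's code; `normalizePeer` and `peerMatches` are hypothetical names.

```typescript
interface Peer {
  kind: string;
  id: string;
}

// Hypothetical normalization: "dm" is treated as "direct"; the ID is trimmed but its case is kept.
function normalizePeer(peer: Peer): Peer {
  return {
    kind: peer.kind === "dm" ? "direct" : peer.kind,
    id: peer.id.trim(),
  };
}

// A binding peer matches an inbound peer only if kind and ID are equal after normalization.
function peerMatches(bindingPeer: Peer, inboundPeer: Peer): boolean {
  const a = normalizePeer(bindingPeer);
  const b = normalizePeer(inboundPeer);
  return a.kind === b.kind && a.id === b.id;
}
```

Under these rules, stray whitespace around the ID is harmless, but a lowercased copy of the OPENID would not match the uppercase value delivered by QQBot.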

Also, the core resolver explicitly prefers:

  1. exact peer binding
  2. then account binding

and there are tests for “peer binding wins over account binding”. QQBot feeds into that same resolver, so matchedBy=binding.account here is not the expected result if the peer binding is really present in the evaluated bindings.

So my read is:

  • your binding format looks correct
  • QQBot direct peer matching is supposed to work
  • if the full binding is loaded exactly as shown, this is quite possibly a bug

The next log I’d want is the full debug block, especially the [routing] binding: lines. With debug on, OpenClaw logs each binding it evaluated before the final match. That will tell us whether the peer binding is:

  • missing entirely from the live config
  • present but normalized unexpectedly
  • present and still skipped, which would strongly point to a bug

Please include:

  1. openclaw --version
  2. the full [routing] binding: lines for that message
  3. whether this was C2C private chat or guild DM

One extra nuance: if this is actually a guild DM and not C2C, then the right ID would be the QQ DM author.id, not the C2C user_openid.

Docs:
https://docs.openclaw.ai/channels/channel-routing
https://docs.openclaw.ai/channels/qqbot

If those binding log lines show your peer binding exactly and it still falls through to binding.account, I’d file a GitHub issue, because that would smell like a real routing bug.

dusk rover
#

❗ Binding inconsistency between CLI and runtime

I’m seeing a mismatch between CLI bindings and runtime routing.

CLI shows:

openclaw agents list --bindings

Includes peer binding:

qqbot accountId=study peer=direct:480562E9913A985D4A79822A643E27B6 → tutor_xxx

But runtime log shows only 1 binding:

[routing] resolveAgentRoute ... bindings=1
[routing] binding: agentId=study accountPattern=study peer=none
[routing] match: matchedBy=binding.account agentId=study

👉 The peer binding is not evaluated at all, even though it exists in CLI.

Expected:
peer binding should win over account binding.

Context:

  • QQBot C2C (user_openid)
  • peer.id matches exactly (case-sensitive checked)
  • OpenClaw version: 2026.4.15

Question:
Why would a binding appear in agents list --bindings but not be included in [routing] evaluation?
Is this a known issue or config edge case?

livid mulchBOT
#

I think I found the reason. This looks like a real QQBot config snapshot bug, not a peer-format issue.

Why CLI and runtime disagree:

  • openclaw agents list --bindings reads the current config
  • but the QQBot gateway appears to route with the cfg captured when the channel started

In extensions/qqbot/src/gateway.ts, startGateway(ctx) grabs cfg once, and later each inbound message does:

resolveAgentRoute({ cfg, channel: "qqbot", ... })

So if you change bindings after QQBot is already running, the channel can keep using the old config snapshot.

That lines up perfectly with your logs:

  • CLI sees the new peer binding
  • runtime says bindings=1 and only shows the old account binding
  • bindings changes do not restart the QQBot channel, because in the reload plan bindings are treated as dynamic/no-restart

So the core routing logic is fine, but QQBot is likely not re-reading the live config on inbound.

Practical answer

For 2026.4.15, I would treat this as:

  • not expected behavior
  • very likely a bug
  • workaround is openclaw gateway restart after changing QQBot bindings

Why this happens

The bad combo is:

  1. bindings are marked no-restart/dynamic
  2. QQBot inbound routing uses a startup-time cfg snapshot

So runtime and CLI can drift apart.
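
Here is a small, self-contained sketch of that drift. It is not the QQBot extension code and not a proposed patch: `ConfigStore`, `startWithSnapshot`, and `startWithLiveRead` are hypothetical names that only model "capture cfg once at start" versus "read the current cfg on every inbound message".

```typescript
interface Binding {
  agentId: string;
}

interface Config {
  bindings: Binding[];
}

// Hypothetical config holder whose current value is swapped on hot reload.
class ConfigStore {
  private current: Config;

  constructor(initial: Config) {
    this.current = initial;
  }

  get(): Config {
    return this.current;
  }

  reload(next: Config): void {
    this.current = next;
  }
}

// Pattern described in the thread: cfg is grabbed once when the channel starts,
// so every later inbound message sees the startup-time bindings.
function startWithSnapshot(store: ConfigStore): () => number {
  const cfg = store.get();
  return () => cfg.bindings.length;
}

// Alternative: look up the live config on each inbound message.
function startWithLiveRead(store: ConfigStore): () => number {
  return () => store.get().bindings.length;
}
```

If the store starts with one binding and a reload swaps in two, the snapshot handler keeps reporting 1 (like the `bindings=1` log line), while the live-read handler reports 2.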

What I’d report

I’d open a GitHub issue with:

  • version: 2026.4.15
  • QQBot C2C repro
  • the binding snippet
  • CLI output showing the peer binding
  • runtime logs showing bindings=1

Issue tracker:
https://github.com/openclaw/openclaw/issues