#trello webhook issue


delicate tide
#


Question for Krill about Trello webhook issue:

Current situation:

Webhook server running on localhost:3003 ✓
Ngrok tunnel forwarding to localhost:3003 ✓
Webhook registered with Trello successfully ✓
But ZERO webhook events received from Trello when Austin moved card to "Ideas" ✗

What's happening:
Austin moved a card to the "Ideas" list in Trello. The webhook server logs show it started up and is listening, but NO POST requests from Trello are being received.

Possible causes Krill should investigate:

Trello webhook not configured to send events to https://fakeurl-fakeurl-fakeurl.ngrok-free.dev/webhook/trello?
Trello webhook is configured but disabled/inactive?
Events being blocked or filtered before reaching the webhook?
Webhook registration succeeded but Trello rejected it silently?
Ngrok tunnel not actually forwarding traffic to webhook endpoint?
Question:
What's preventing Trello from sending webhook events to the registered webhook URL? The webhook server is running and healthy, but receiving zero Trello POST requests.

Files that might help:

Webhook server: /home/yucky/.openclaw/workspace/crm/scripts/webhook-server.js
Webhook debug log: /home/yucky/.openclaw/workspace/crm/scripts/webhook-debug.log
Webhook registration: /home/yucky/.openclaw/workspace/crm/scripts/register-webhook.js
Agent configs in Idea Pipeline workspace (all have same webhook issue)

hot gobletBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

delicate tide
#

krilllll helpppppp

craggy tigerBOT
#

Most common Trello-webhook “registered but no events” causes are super boring: wrong idModel, webhook pointing at an old ngrok URL, Trello is sending but your server rejects (and you’re only logging after a signature check), or the webhook got disabled after failures.

Here’s how to pinpoint it fast:

1) Is Trello hitting ngrok at all?

Open ngrok’s inspect UI while you move a card:

If you see requests there (even 400/500) but nothing in your app logs, your server is receiving traffic but dropping it before your current logging.

If you see zero requests in ngrok, then it’s either (a) Trello isn’t sending, or (b) Trello is sending to a different URL than your current tunnel.

2) Verify Trello’s stored callbackURL matches your current ngrok URL

Ngrok free URLs change whenever you restart it. If you registered the webhook, then restarted ngrok, Trello is still posting to the old domain.

Run (redact key/token before pasting):

curl "https://api.trello.com/1/tokens/$TRELLO_TOKEN/webhooks?key=$TRELLO_KEY"

Check the callbackURL exactly.

3) Confirm you registered the webhook on the right model

Webhooks only fire for the idModel you registered (usually a board id). If you registered it on a single card, moving a different card won’t trigger anything.

In your register-webhook.js, what is idModel set to? Paste that snippet.
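
Checks 2 and 3 can be combined into one pass over the "list webhooks" response. This is a minimal sketch, assuming the response is an array of objects carrying id, callbackURL, idModel and active fields; the function name and the sample values are hypothetical.

```javascript
// Sketch: flag registered webhooks that cannot deliver to the current tunnel.
// `webhooks` is assumed to be the parsed JSON array from the "list webhooks" call.
function auditWebhooks(webhooks, currentCallbackURL, expectedModelId) {
  return webhooks.map((hook) => {
    const problems = [];
    if (hook.callbackURL !== currentCallbackURL) problems.push("stale callbackURL");
    if (hook.idModel !== expectedModelId) problems.push("unexpected idModel");
    if (hook.active === false) problems.push("inactive");
    return { id: hook.id, problems };
  });
}

// Hypothetical sample data; real IDs and URLs come from your own account.
const report = auditWebhooks(
  [
    { id: "hook-1", callbackURL: "https://old.example/webhook/trello", idModel: "board-1", active: true },
    { id: "hook-2", callbackURL: "https://new.example/webhook/trello", idModel: "board-1", active: true },
  ],
  "https://new.example/webhook/trello",
  "board-1"
);
console.log(report);
```

Any entry with a non-empty problems list is a candidate for deletion and re-registration.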

4) Make sure your endpoint returns 200 fast (and handles HEAD)

Trello validates webhooks and will stop delivering if your endpoint errors or times out repeatedly.

From your machine:

curl -I "https://<your-ngrok>.ngrok-free.dev/webhook/trello"
curl -i -X POST "https://<your-ngrok>.ngrok-free.dev/webhook/trello" \
  -H "Content-Type: application/json" \
  -d '{"ping":true}'

If either of those isn’t a quick 200/2xx, fix that first (even temporarily).
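
One way to get a quick 200 for both HEAD and POST is to acknowledge first and defer the real work. This is a rough sketch using plain Node request/response objects rather than your actual webhook-server.js; trelloHandler and onEvent are hypothetical names.

```javascript
// Sketch: answer Trello immediately, then hand the body to the slow path.
function trelloHandler(req, res, onEvent) {
  if (req.method === "HEAD") {
    res.writeHead(200); // verification request: nothing to process
    res.end();
    return;
  }
  if (req.method !== "POST") {
    res.writeHead(405);
    res.end();
    return;
  }
  const chunks = [];
  req.on("data", (chunk) => chunks.push(chunk));
  req.on("end", () => {
    res.writeHead(200); // acknowledge before doing any slow work
    res.end();
    setImmediate(() => onEvent(Buffer.concat(chunks).toString("utf8")));
  });
}

// Possible wiring (port taken from the thread):
// require("node:http").createServer((req, res) => trelloHandler(req, res, console.log)).listen(3003);
```

The same ordering applies in Express: send the status first, then start the agent spawn.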

5) If ngrok shows requests but your app logs don’t

Add a top-level request logger before any Trello signature verification / JSON parsing assumptions. Trello signature validation often fails if you don’t use the raw body.
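
To illustrate why the raw body matters, here is a hedged sketch of the signature check. It assumes Trello's documented scheme (base64 HMAC-SHA1 over the raw body concatenated with the callbackURL, keyed with the app secret, sent in the x-trello-webhook header); verifyTrelloSignature is a hypothetical helper, not something from your server.

```javascript
const crypto = require("node:crypto");

// Sketch: compare the header value against a digest of the *unparsed* body.
// Re-serializing parsed JSON can change whitespace or key order and break the match.
function verifyTrelloSignature(rawBody, callbackURL, secret, headerValue) {
  const expected = crypto
    .createHmac("sha1", secret)
    .update(rawBody + callbackURL)
    .digest("base64");
  const a = Buffer.from(expected);
  const b = Buffer.from(String(headerValue || ""));
  // timingSafeEqual throws on length mismatch, so check length first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

Log the request before calling a check like this, so a failed signature still leaves a trace.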


If you paste:

  1. the output (sanitized) of the “list webhooks” curl (just the callbackURL + idModel parts),
  2. your register-webhook.js webhook creation call,
  3. and whether ngrok’s inspector shows any hits when you move a card,

…I can tell you exactly which branch this is in.

delicate tide
#

Update for Krill:

Webhook server is running on localhost:3003. Ngrok is forwarding. Trello webhook appears registered.

Issue: Zero incoming Trello POST requests in logs.

Added request logging to debug. The webhook server now logs all incoming requests to webhooks-request.log. This will show exactly what Trello sends and when.

#

Please help! We had this working and then edited a bunch of the agents.md files and now it's not working and my agent is convinced it's on trello's end - I'm not.

craggy tigerBOT
#

Nice — that request log is exactly what you need. Two key “tell me immediately” checks now:

1) Do you see any request right after webhook creation?

When you register/create a Trello webhook, Trello should hit your callbackURL pretty much immediately (often a HEAD request for verification).

So after you run register-webhook.js, check both the ngrok inspector and webhooks-request.log:

If you see nothing in both places: Trello is not reaching the URL you think it is (almost always: wrong/stale ngrok URL stored in Trello, or webhook created against a different callback than the one you’re watching).

If you see it in ngrok but not in your file log: your server is receiving it but your logging isn’t early enough (e.g., middleware order, crash before logger, different process/port).

2) Confirm Trello’s stored callbackURL matches your current ngrok URL

Ngrok free URLs change on restart. Trello will keep posting to the old one forever until you recreate/update the webhook.


Run:

curl "https://api.trello.com/1/tokens/$TRELLO_TOKEN/webhooks?key=$TRELLO_KEY"

Then compare the returned callbackURL to the ngrok URL you currently have open.

Quick paste request

Can you paste (redact key/token if present):

  1. the first ~20 lines of webhooks-request.log after you re-register the webhook (or say “file is empty”), and
  2. the callbackURL + idModel from that curl .../webhooks output?

With just those two, we can narrow it to “Trello never contacted you” vs “Trello contacted you but you’re rejecting/ignoring it” in one step.

delicate tide
#

Question for Krill:

I'm helping configure Gateway hooks to spawn agents from a Trello webhook server. The webhook POSTs to /hooks/agent with this payload:

{
  "sessionKey": "main",
  "agentId": "delivery-agent",
  "message": "task description here...",
  "wakeMode": "now"
}

I've added this hook mapping:

{
  "id": "trello-agent-spawn",
  "match": {"path": "/agent"},
  "action": "agent",
  "sessionKey": "main",
  "timeoutSeconds": 3600
}

Question: Will the Gateway automatically extract agentId from the webhook payload to spawn the correct agent (delivery-agent, planning-agent, etc.)? Or do I need to add a transform or configure it differently to dynamically select the agent based on the payload's agentId field?

The goal is to have one hook endpoint that can spawn different agents (planning, review, coding, delivery) based on what the webhook server sends in the payload.

craggy tigerBOT
#

If you’re POSTing to /hooks/agent, then yes: the Gateway will read agentId from the JSON payload and route the run to that agent (subject to hooks.allowedAgentIds). You don’t need a transform for that.

Key gotchas:

  1. hooks.allowedAgentIds can block dynamic routing
    If you set it, make sure it includes the agents you want (or omit it / include "*").

  2. sessionKey: "main" doesn’t mean “run in the main session”
    /hooks/agent runs an isolated agent turn and then posts a summary into the main session.
    Also, request-provided sessionKey is rejected by default unless:
      • hooks.allowRequestSessionKey: true and
      • (if set) it matches hooks.allowedSessionKeyPrefixes (recommended ["hook:"])

  3. Your mapping is likely unnecessary (and possibly ineffective)
      • /hooks/agent is a built-in endpoint; hooks.mappings are for POST /hooks/<name> (like /hooks/trello), not for overriding the built-in routes.
      • Also: in mappings, agentId is not templated from payload (it’s fixed). If you want a mapped endpoint that chooses agent dynamically from the request body, you must use a transform module.

Recommended simplest approach

Just call /hooks/agent directly with:

{
  "agentId": "delivery-agent",
  "message": "task description here...",
  "wakeMode": "now"
}

And config like:

{
hooks: {
enabled: true,
token: "…",

delicate tide
#

Question for Krill - Trello Webhook Pipeline Setup

Current Setup

We have a Trello board with an "Idea Pipeline" workflow:

Ideas → Planning Agent
Review → Review Agent
Plans → Coding Agent
Delivered → Delivery Agent

Architecture:

Trello webhooks → ngrok tunnel → webhook server (Node.js on port 3003)
Webhook server POSTs to Gateway /hooks/agent with payload:
{
  "sessionKey": "main",
  "agentId": "planning-agent",
  "message": "PLANNING TASK: [card details]...",
  "wakeMode": "now"
}
Gateway hooks are enabled with a mapping (though you said /hooks/agent is built-in)

The Issue

When we move a card to "Ideas", the webhook is received (confirmed in logs), but the planning-agent doesn't spawn.

Testing: Manual curl to /hooks/agent returns HTTP 202 with a runId, but no agent session appears.

The Confusion
Our agents' AGENTS.md files say they're triggered by Telegram messages (e.g., "You receive Telegram trigger messages when a card moves into the Ideas list").

But the webhook server is trying to spawn them directly via Gateway hooks.

Questions:

Should we be spawning agents directly via /hooks/agent, or should the webhook server send Telegram messages to trigger the agents?

If direct spawning is correct, why isn't the agent session appearing after the 202 response? Do we need allowedAgentIds or other hook configuration?

If Telegram triggering is correct, what's the recommended pattern for programmatically sending messages to specific agent-bound groups from a webhook server?

What's the "right way" to build an automated pipeline where Trello card movements spawn specific agents to perform workflow actions?

craggy tigerBOT
#

Direct spawning via POST /hooks/agent is the right pattern for this. Telegram-triggering can work, but it’s a sideways hack (you’re depending on chat routing + “last channel” state) instead of using the purpose-built ingress.

1) First: your payload is (probably) being rejected, or routed somewhere you’re not looking

Two big gotchas in what you pasted:

A) sessionKey: "main"

  • In current OpenClaw, sessionKey in /hooks/agent requests is rejected by default unless you explicitly set:
      • hooks.allowRequestSessionKey: true
  • And if you also set hooks.allowedSessionKeyPrefixes, the key must match (recommended prefix is "hook:").

So if you’re sending sessionKey:"main", the gateway will return HTTP 400 with an error mentioning hooks.allowRequestSessionKey.

Action: update your webhook server to log the Gateway response status + body for every POST (don’t just assume it worked because you “sent it”).

B) agentId only routes to that agent if it’s a known agent
/hooks/agent does this:

  • if agentId is known → routes to it
  • if agentId is unknown → silently falls back to the default agent

So if planning-agent etc aren’t actually in agents.list, you’ll never see those agents run.

Action: run:

openclaw agents list

and confirm you literally see planning-agent, review-agent, coding-agent, delivery-agent listed.

2) Why you got 202 + runId but “no session appears”

A 202 only means “accepted; job kicked off async”. The actual run happens in the background and failures get written to logs and also get summarized into the main session.

Do these two checks right after you trigger a card move:

openclaw logs --follow
openclaw sessions --active 60

You should see a new sessionKey (often hook:<uuid> unless you configured a default), and logs will show if the run failed (missing model creds, agent config, tool allowlist, etc.).

3) Recommended “Trello pipeline” pattern (clean + reliable)

Trello webhook server decides the stage → calls /hooks/agent with the right agentId.

Minimal payload (don’t include sessionKey until you’ve deliberately enabled it):

{
  "agentId": "planning-agent",
  "name": "Trello",
  "message": "PLANNING TASK: ...",
  "wakeMode": "now"
}
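
The "server decides the stage" step could look roughly like this. The list-to-agent table comes from the pipeline described in the thread; buildHookPayload and the message wording are hypothetical.

```javascript
// Sketch: map the destination list to an agent and build the minimal payload.
const AGENT_BY_LIST = new Map([
  ["Ideas", "planning-agent"],
  ["Review", "review-agent"],
  ["Plans", "coding-agent"],
  ["Delivered", "delivery-agent"],
]);

function buildHookPayload(listName, cardName) {
  const agentId = AGENT_BY_LIST.get(listName);
  if (!agentId) return null; // list is not part of the pipeline: do nothing
  return {
    agentId,
    name: "Trello",
    message: `Card "${cardName}" moved to ${listName}`,
    wakeMode: "now",
    // sessionKey deliberately omitted (rejected by default, per the note above)
  };
}

console.log(buildHookPayload("Plans", "Card X"));
```

Returning null for unknown lists keeps moves into unrelated lists from spawning the default agent by accident.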

Config hardening (recommended):

  • set hooks.allowedAgentIds to only the agents you want callable via webhooks
  • set hooks.defaultSessionKey: "hook:trello" so runs group in a predictable session (optional)

Docs: https://docs.openclaw.ai/automation/webhook

4) When Telegram-triggering is actually useful

Only if you want humans-in-the-loop approvals / visibility, or you don’t want to expose hook ingress. Otherwise it adds complexity (and you still have to solve routing).

If you tell me what openclaw agents list shows + what HTTP status/body your webhook server gets back from /hooks/agent, I can point at the exact failure mode in one hop.

delicate tide
#

MESSAGE 1 of 3: Question for Krill - Webhook Deduplication Race Condition
Problem Summary
Trello sends duplicate webhooks (same card move event) within milliseconds of each other. Our deduplication logic fails due to race conditions.

Current Setup
Webhook server (Node.js/Express):

// In-memory deduplication cache
const processedEvents = new Map();
const DEDUP_WINDOW_MS = 30000;

function isDuplicateEvent(cardId, listId) {
  const key = `${cardId}:${listId}`;
  const now = Date.now();
  const lastProcessed = processedEvents.get(key);

  // Clean old entries
  for (const [k, v] of processedEvents.entries()) {
    if (now - v > DEDUP_WINDOW_MS) processedEvents.delete(k);
  }

  if (lastProcessed && (now - lastProcessed < DEDUP_WINDOW_MS)) {
    return true; // Duplicate detected
  }

  processedEvents.set(key, now);
  return false;
}

app.post('/webhook/trello', (req, res) => {
  const { card, listAfter } = extractData(req.body);

  if (isDuplicateEvent(card.id, listAfter.id)) {
    console.log('Duplicate detected, skipping');
    return res.status(200).end();
  }

  // Spawn agent via Gateway hooks
  // (agentType and task are derived from the card and list; omitted in this excerpt)
  spawnAgent(agentType, task);
  res.status(200).end();
});

#

MESSAGE 2 of 3: What We Observed
Logs showing the race condition:

=== WEBHOOK RECEIVED === (Event 1)
📋 Card "X" moved to list: Delivered
🤖 Spawning delivery-agent...
HTTP Status: 202
Response: {"ok":true,"runId":"12b79750-..."}

=== WEBHOOK RECEIVED === (Event 2 - 4ms later)
📋 Card "X" moved to list: Delivered
🤖 Spawning delivery-agent...
HTTP Status: 202
Response: {"ok":true,"runId":"d1fbec03-..."}

Result: Two delivery-agents spawned for the same card move.

Root Cause
Both requests hit isDuplicateEvent() before either could write to the Map:

Time 0ms: Request 1 → checks Map (empty) → returns false
Time 1ms: Request 2 → checks Map (empty) → returns false
Time 2ms: Request 1 → writes to Map
Time 3ms: Request 2 → writes to Map (overwrites)

Questions

  1. What's the recommended pattern for webhook deduplication in OpenClaw?
      • Should we handle this at the webhook server level?
      • Or should Gateway hooks have built-in idempotency keys?

  2. Is there a way to make Gateway /hooks/agent idempotent?
      • Could we pass an idempotencyKey in the payload?
      • Would Gateway reject duplicate runs with the same key?

  3. Should we move deduplication to the agent level?
      • Have agents check if another instance is already processing the same card?
      • Use file-based locks or similar?

  4. What's the simplest robust solution?
      • File-based persistence with atomic writes?
      • SQLite with unique constraints?
      • Redis (probably overkill)?

Context

  • Single VPS deployment (no container orchestration)
  • Node.js webhook server
  • Gateway hooks spawning agents
  • Trello webhooks for card movements

Redacted: Specific file paths, tokens, and IDs removed.

#

MESSAGE 3 of 3: Looking at the draft, I see:

I mention "Trello" which is fine (it's the service name)
I mention "OpenClaw" and "Gateway" which are fine
I use generic card names like "Card X"
I use runId examples but those are just UUIDs from logs, not sensitive credentials
I don't include any actual API keys, tokens, or passwords
However, I should double-check if there's anything else that might be considered sensitive. The user might be concerned about:

The specific runIds (but these are just execution IDs, not credentials)
The fact that they're using Trello for an "Idea Pipeline"
The specific card content
Let me confirm with the user what they consider sensitive. The draft looks clean to me - no API keys, no tokens, no passwords, no personal information. Just technical details about the deduplication problem.

Actually, I should be more careful about the runIds. While they're not credentials, they are specific to the user's instance. But they're also just random UUIDs that don't reveal anything sensitive.

Let me provide a clear answer to the user.

✅ Yes, all sensitive information is redacted.

What's in the draft:

✅ Generic code examples (no file paths)
✅ No API keys, tokens, or passwords
✅ No Telegram IDs or chat IDs
✅ No Trello board IDs or card IDs
✅ Generic card name ("Card X")
✅ Run IDs shown as truncated (12b79750-...)

What's NOT in the draft:

❌ Actual webhook URLs
❌ Gateway tokens
❌ Telegram bot tokens
❌ Trello API credentials
❌ Specific file system paths
❌ Your username or home directory
The only potentially identifying info is:

Mention of "Trello" (the service you're using)
Mention of "OpenClaw" (the platf

craggy tigerBOT
#

Yep, that dedupe is racy: two requests can both pass the lastProcessed check before either one writes processedEvents.set(key, now) (Node is single-threaded, but your handler awaits somewhere in spawnAgent / network I/O, so multiple requests interleave across the event loop).

Fix 1 (best, simplest): “claim first”, then do work

Make the “mark processed” atomic before any async work, and don’t do cache cleanup inside the hot path.

const processedEvents = new Map(); // key -> expiresAt
const DEDUP_WINDOW_MS = 30_000;

function claimEvent(key) {
  const now = Date.now();
  const expiresAt = processedEvents.get(key);
  if (expiresAt && expiresAt > now) return false; // already claimed recently
  processedEvents.set(key, now + DEDUP_WINDOW_MS);
  return true;
}

// periodic cleanup
setInterval(() => {
  const now = Date.now();
  for (const [k, expiresAt] of processedEvents) {
    if (expiresAt <= now) processedEvents.delete(k);
  }
}, 10_000).unref();

app.post("/webhook/trello", async (req, res) => {
  const { card, listAfter, actionId } = extractData(req.body);

  // Use the most-specific id you have (see Fix 2). Fallback shown:
  const key = actionId ?? `${card.id}:${listAfter.id}`;
  if (!claimEvent(key)) return res.sendStatus(200);

  // fire-and-forget so you return 200 quickly
  spawnAgent(/*...*/).catch(err => console.error("spawnAgent failed", err));
  return res.sendStatus(200);
});

That removes the race entirely in a single Node process.

Fix 2 (more correct): dedupe by Trello action ID, not (cardId:listId)

cardId:listId will incorrectly dedupe legitimate moves (move card into Ideas, then out, then back within 30s). Trello payloads typically include an action identifier (often action.id or similar depending on what you store). Best key is:

  • action.id (ideal)
  • else action.date + action.type + card.id + listAfter.id
  • else your current fallback
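
That key preference order can be written as one small helper. This is a sketch that assumes the action object carries id, type, date and data.card / data.listAfter as discussed in this thread; dedupeKey is a hypothetical name.

```javascript
// Sketch: pick the most specific dedupe key available on a Trello action.
function dedupeKey(action) {
  if (action?.id) return `action:${action.id}`;
  const cardId = action?.data?.card?.id ?? "unknown-card";
  const listId = action?.data?.listAfter?.id ?? "unknown-list";
  if (action?.date && action?.type) {
    return `${action.type}:${action.date}:${cardId}:${listId}`;
  }
  return `${cardId}:${listId}`; // last resort: same as the original cardId:listId key
}

console.log(dedupeKey({ id: "a1", type: "updateCard" })); // action:a1
```

The resulting string can be passed straight to claimEvent(key) from Fix 1.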

Fix 3 (production): make it durable across restarts / multiple instances

If there’s any chance you’ll run >1 webhook server instance (or you restart often), in-memory Map will fail. Use Redis with SET key value NX PX 30000 as a true atomic dedupe.
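
For the single-VPS case from the question, a file-based claim is one possible alternative to Redis. Note this swaps in a different technique from the SET NX PX command above: exclusive file creation with the "wx" flag, which fails if the file already exists. Unlike the Redis command it has no expiry, so old claim files would need separate cleanup. claimOnDisk and the temp directory are hypothetical.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Sketch: claim a key by creating a file exclusively; survives process restarts.
function claimOnDisk(dir, key) {
  const file = path.join(dir, encodeURIComponent(key) + ".claim");
  try {
    fs.closeSync(fs.openSync(file, "wx")); // "wx": create, fail if it exists
    return true;
  } catch (err) {
    if (err.code === "EEXIST") return false; // someone already claimed it
    throw err;
  }
}

const claimDir = fs.mkdtempSync(path.join(os.tmpdir(), "trello-claims-"));
console.log(claimOnDisk(claimDir, "action:abc")); // true (first delivery)
console.log(claimOnDisk(claimDir, "action:abc")); // false (duplicate)
```

This only helps when every webhook server instance shares the same directory; across machines, Redis or a database constraint is still the safer choice.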

One more Trello-specific note

Trello will retry deliveries if your endpoint is slow or errors. Keep your webhook handler fast:

  • return 200 immediately
  • do agent spawning async after the response

If you paste a real Trello webhook body (redacted) I can point to the exact action.id field to use for the dedupe key.

#

Your draft/redaction checklist looks solid.

A few extra things that often sneak in (worth scanning for before you post):

  • Ngrok / public webhook URLs (even if “not a secret”, it’s a live ingress endpoint)
  • Gateway hostnames / LAN IPs / Tailscale MagicDNS names
  • Session keys if you ever use chat-derived ones (they can embed channel IDs, phone numbers, etc.)
  • Trello IDs: board/card/list IDs aren’t “credentials”, but they’re still instance-identifying (and can make it easier to correlate screenshots/logs)
  • Webhook payload content: card description/comments can include real names, emails, customer info, internal links
  • Headers: sometimes people paste request logs that include Authorization / x-openclaw-token by accident

About runId specifically: it’s low sensitivity. It won’t let anyone access your system. But I’d still either truncate it (12b79750-…) or replace with RUN_ID just to avoid correlation across logs/screenshots.

If you want, paste the draft here as you plan to post it and I’ll do a quick “last pass” for accidental identifiers.

delicate tide
#

Draft Questions for Krill

  1. Trello Action ID Field Location
    You mentioned using action.id from the Trello payload. Looking at the structure:

payload = {
  action: {
    id: "...", // Is this the right field?
    type: "updateCard",
    data: { ... }
  }
}

Is payload.action.id the correct field? Or is there a more specific ID nested deeper?

  2. Browser Tool Retry Limits
    The coding-agent got stuck in a loop trying to open localhost:8080:

page.goto: net::ERR_CONNECTION_REFUSED at http://localhost:8080/

Is there a way to configure retry limits for the browser tool? Or should agents implement their own timeout logic?

  3. Tracking Hook-Spawned Agents
    Hook-spawned agents show sessionKey: unknown in logs. Is there a way to:

  • Query agents spawned by a specific hook?
  • Set a custom session key (without using the rejected sessionKey field)?
  • Track which hook-spawned agents are currently running?

  4. Agent Error Notification Pattern
    We added Telegram error notifications to delivery-agent. Should this be a standard pattern for all workflow agents? Or is there a recommended global error handling approach?

Want me to send these to krill, or do you have other questions to add?

#


Hello???? Krillll help please

craggy tigerBOT
#

These are good questions — and the draft is clean (no secrets). Answers inline:

1) Trello Action ID field location

Yes: payload.action.id is the right unique identifier for the webhook event/action. That’s the best dedupe key if you have it.

For robustness, I usually key on:

  • action.id (primary)
  • optionally include action.type if you’re mixing event types

Fallbacks (if you ever see payloads without action.id):

  • action.date + action.type + card.id + listAfter.id

2) Browser tool retry limits (ERR_CONNECTION_REFUSED loops)

There’s no “global retry limit” knob for the browser tool that will stop an agent from repeatedly calling page.goto—that loop is coming from the agent logic, not the tool.

Recommended pattern:

  • Treat net::ERR_CONNECTION_REFUSED as “service is down”, not “retry forever”
  • Implement max attempts (e.g. 1–3) + overall deadline
  • Before using browser, do a cheap readiness check (curl http://localhost:8080 / health endpoint) and only then navigate
  • If it’s supposed to be started by the agent, start it + browser wait --timeout-ms … (or equivalent wait logic) instead of spamming goto
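
If the agent drives the browser from a script, the "max attempts + overall deadline" idea might be sketched like this. withRetryBudget and its option names are hypothetical; `attempt` stands in for whatever readiness check or navigation call you use.

```javascript
// Sketch: bounded retries with an overall deadline, instead of retrying forever.
async function withRetryBudget(attempt, { maxAttempts = 3, deadlineMs = 30000, delayMs = 1000 } = {}) {
  const startedAt = Date.now();
  let lastError;
  for (let i = 1; i <= maxAttempts; i++) {
    if (Date.now() - startedAt > deadlineMs) break; // overall deadline hit
    try {
      return await attempt(i);
    } catch (err) {
      lastError = err;
      if (i < maxAttempts) await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  const reason = lastError ? lastError.message : "deadline exceeded";
  throw new Error(`gave up after retry budget: ${reason}`);
}
```

When the budget runs out the error should surface to the operator (log or notification) rather than trigger another loop.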

3) Tracking hook-spawned agents + sessionKey control

A few clarifications (this one trips people up):

  • /hooks/agent runs an isolated agent turn and then posts a summary into the main session (“Hook <name>: …”).
  • Request sessionKey is rejected by default unless you enable:
  • hooks.allowRequestSessionKey: true
  • and (recommended) hooks.allowedSessionKeyPrefixes: ["hook:"]

If you want a stable, non-“unknown” session key without allowing request overrides, set:

  • hooks.defaultSessionKey: "hook:trello" (everything lands in that one session)

If you want per-card/per-action sessions, you have two sane options:

  • Enable request session keys (restricted to hook:) and send e.g. hook:trello:action:<actionId>
  • Or use POST /hooks/<name> with a mapping that sets sessionKey: "hook:trello:{{payload.action.id}}" (best if you want Gateway to own routing/templates)

To see what’s running / what ran:

  • openclaw sessions --active 60
  • openclaw logs --follow (this is the real source of truth for hook runs)

4) “Agent error notification” best practice

For a workflow pipeline: yes, having an ops/error delivery path is worth standardizing, but don’t wire it ad-hoc into each agent if you can avoid it.

Two cleaner patterns:

  • Use hook delivery controls (deliver, channel, to) so every hook-triggered run reports back to one place (your “pipeline ops” chat/channel).
  • Keep per-agent custom notifications only for truly exceptional cases (to avoid spam).

If you want to send these as a single “Krill checklist” message, I’d tweak #3 to explicitly mention the hooks.allowRequestSessionKey default rejection + the hooks.defaultSessionKey escape hatch. That’s the #1 confusion point.

#

Draft looks clean (no secrets). Answers:

1) Trello Action ID field location

Yes: payload.action.id is the right “event/action id” to dedupe on. There usually isn’t a “more specific” nested ID that’s better for dedupe than the action itself.

If you want to be extra safe, key on something like:

  • action.id (best)
  • fallback: ${action.type}:${action.date}:${action.data?.card?.id}:${action.data?.listAfter?.id}

2) Browser tool retry limits (ERR_CONNECTION_REFUSED loop)

There isn’t a Gateway/browser-tool setting that will stop an agent from retrying goto forever; that loop is in the agent’s own logic.

Recommended pattern:

  • Implement max attempts + overall deadline in the agent (e.g. 3 tries over 30s)
  • Treat ERR_CONNECTION_REFUSED as “service not up” → do a quick health check (curl/health endpoint) before calling browser.goto
  • If the agent is responsible for starting the service, start it then do an explicit wait with a timeout (instead of repeated goto)

3) Tracking hook-spawned agents (sessionKey “unknown”, querying, custom keys)

A few ground truths about /hooks/agent:

  • Hook runs are async and run as an isolated agent turn (then a summary is posted into the main session).
  • There is no first-class “list hook runs by hook name” query right now. Use:
  • openclaw logs --follow (best)
  • openclaw sessions --active 60 (see newly created sessions)

Custom session keys:

  • By default, request sessionKey is rejected unless you enable hooks.allowRequestSessionKey: true.
  • If you want a stable session key without allowing callers to set it, set:
  • hooks.defaultSessionKey: "hook:trello" (everything goes into that session)
  • If you want per-card/per-action sessions, the “safe” way is:
  • hooks.allowRequestSessionKey: true plus hooks.allowedSessionKeyPrefixes: ["hook:"], then send sessionKey: "hook:trello:action:<actionId>"
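
On the webhook-server side, building a per-action key and sanity-checking it against the prefix rule could look like this. It only mirrors the prefix rule as described in this reply; the Gateway's actual validation may differ, and both function names are hypothetical.

```javascript
// Sketch: build a per-action session key and check it against allowed prefixes.
function sessionKeyForAction(actionId) {
  return `hook:trello:action:${actionId}`;
}

function isAllowedSessionKey(key, allowedPrefixes) {
  return allowedPrefixes.some((prefix) => key.startsWith(prefix));
}

const prefixes = ["hook:"]; // same value as hooks.allowedSessionKeyPrefixes above
console.log(isAllowedSessionKey(sessionKeyForAction("abc123"), prefixes)); // true
console.log(isAllowedSessionKey("main", prefixes)); // false
```

The second line shows why a request sending sessionKey: "main" would not pass a "hook:" prefix restriction.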

Docs: https://docs.openclaw.ai/automation/webhook

#

4) Agent error notification pattern

Good idea to standardize, but the cleanest approach is usually:

  • Centralize delivery for hook runs (set deliver/channel/to so all pipeline failures land in one “ops” chat)
  • Keep per-agent Telegram “panic” notifications only for truly critical workflows (otherwise it gets noisy fast)

If you paste the exact /hooks/agent JSON you’re sending (redacted token/URL), I can tell you which fields to add (name, deliver, channel, to, and whether to omit/keep sessionKey).

delicate tide
#

How do I update model credentials in the config? My API key for the Kimi model got deleted somehow. When I try to select it in the Control UI and save, it just says "Error: invalid config". I ran the command openclaw config and reapplied my API key, but the error persists, even after restarting the gateway.

craggy tigerBOT
#

For Kimi there are two common providers in OpenClaw, and they use different env vars — “invalid config” usually means the model you selected references an env var that the running Gateway doesn’t have.

First, can you run these and paste the output (safe to share):

openclaw --version
openclaw gateway status

In particular, gateway status will tell us if Config (cli) and Config (service) differ (super common). If they differ, your openclaw config set … edits may be going to a different config than the running service. Fix is typically:

openclaw gateway install --force

(from the same profile/env you want the service to use)
Source: https://docs.openclaw.ai/help/faq#why-does-openclaw-gateway-status-show-config-cli-and-config-service-different

Step 1: confirm which “Kimi” you’re selecting

In the Control UI, does the model ref start with:

  • moonshot/... → Moonshot (Kimi) → needs MOONSHOT_API_KEY
  • kimi-coding/... → Kimi Coding → needs KIMI_API_KEY
  • openrouter/... → Kimi via OpenRouter → needs OPENROUTER_API_KEY

(Just the prefix is enough; don’t paste any keys.)
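
The prefix-to-key mapping above can be checked with a few lines of Node. This is only a sketch: the model ref in the example is made up, and process.env here is the environment of the shell you run it in, which may differ from what the Gateway service sees.

```javascript
// Sketch: which env var does a model ref need, and is it set in this environment?
const ENV_VAR_BY_PREFIX = new Map([
  ["moonshot", "MOONSHOT_API_KEY"],
  ["kimi-coding", "KIMI_API_KEY"],
  ["openrouter", "OPENROUTER_API_KEY"],
]);

function checkProviderKey(modelRef, env) {
  const prefix = modelRef.split("/")[0];
  const envVar = ENV_VAR_BY_PREFIX.get(prefix) ?? null;
  return { prefix, envVar, present: envVar !== null && Boolean(env[envVar]) };
}

// "moonshot/example-model" is a placeholder, not a real model id.
console.log(checkProviderKey("moonshot/example-model", process.env));
```

It prints only the variable name and a boolean, never the key itself.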

Step 2: set the key in a place the Gateway service actually reads

If you run the Gateway as a service, shell env vars often don’t make it in. The most reliable options:

Option A (recommended): put it in ~/.openclaw/.env

Edit/create ~/.openclaw/.env and add one line (pick the right one):

  • MOONSHOT_API_KEY=sk-...
  • KIMI_API_KEY=sk-...
  • OPENROUTER_API_KEY=sk-...

Then restart:

openclaw gateway restart

Docs on env loading order: https://docs.openclaw.ai/help/environment

Option B: set it inline in config (works, but stores the secret in config)

openclaw config set env.MOONSHOT_API_KEY "sk-..." # OR env.KIMI_API_KEY / env.OPENROUTER_API_KEY
openclaw gateway restart

CLI docs: https://docs.openclaw.ai/cli/config
Provider docs (Moonshot/Kimi): https://docs.openclaw.ai/providers/moonshot

Step 3: get the real “invalid config” reason

Right after the UI fails to save, run:

openclaw logs --limit 200

Look for lines mentioning config validation or missing env vars (it’ll usually literally say which ${VAR} is missing).

If you paste openclaw gateway status + which prefix the model has (moonshot/ vs kimi-coding/ vs openrouter/) + the relevant log error (no secrets), I can tell you exactly what’s mismatched.

delicate tide
#

Got it working - how do I set my default agent?

craggy tigerBOT
#

Default agent is set in config on the agent entry itself:

  • agents.list[].default: true = the fallback agent when no binding matches.
  • If multiple agents have default: true, the first one wins (and you’ll get a warning). If none are set, the first agent in the list is the default. (If there’s no list, it’s effectively main.)
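
As an illustration of that precedence rule (not the Gateway's actual code), here is a sketch that assumes each entry in agents.list has an id field; resolveDefaultAgentId is a hypothetical name.

```javascript
// Sketch of the rule: first default:true wins, else first entry, else "main".
function resolveDefaultAgentId(agentsList) {
  if (!Array.isArray(agentsList) || agentsList.length === 0) return "main";
  const flagged = agentsList.find((agent) => agent.default === true);
  return (flagged ?? agentsList[0]).id;
}

console.log(resolveDefaultAgentId([
  { id: "planning-agent" },
  { id: "coding-agent", default: true },
  { id: "delivery-agent", default: true },
])); // coding-agent
```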

Do it via CLI

  1. Find your agents + indices:
    openclaw config get agents.list
  2. Set the one you want (example: make index 2 the default):
    openclaw config set agents.list[2].default true --json
  3. Unset/disable the old default(s) (example: index 0):

delicate tide
#

Hey Krill,

So I've been building this Agent Forge v2.1 project (WoW-inspired agent builder) and it's been genuinely fun — the Character Hall iteration just delivered with particle effects and everything.

But here's the thing: the only reason I started this project was to stress-test and refine our Idea Pipeline workflow itself. And I just hit a blocker that's exactly what we need to fix.

The Issue:

Card "Agent Forge v2.1 — Hero Selection" moved to Plans at 08:53 UTC
Webhook fired correctly (HTTP 202 from Gateway)
Coding agent spawn request sent
But the coding agent never actually started
No error message, no failure notification, just... silence
Card is now stuck in Plans (47+ minutes)

What I've verified:

Webhook logs show successful trigger
HTTP 202 response from Gateway
No coding-agent session exists
Telegram group for coding agent is empty
Other agents (planning, review, delivery) all working fine

My hunch: Either there's an agent ID mismatch between the webhook (coding-agent) and the Gateway binding, or the spawn is failing silently after accepting the request.

Can you help me diagnose why coding agent spawns are failing? I want to get the pipeline rock solid so future iterations (and other projects) flow smoothly.

craggy tigerBOT
#

That HTTP 202 is the first clue: sessions_spawn is non‑blocking — it returns “accepted” immediately, before the sub-agent actually starts/runs. So the debug move is: “accepted” → find the runId / childSessionKey → see why it never progressed / never announced.

Here are the most common “accepted → nothing happens” causes for one specific agent:

1) Agent ID / allowlist mismatch (subagents.allowAgents)

If you’re spawning under a different agent id (e.g. "coding-agent"), that only works if the requester agent allows it:

  • Config key: agents.list[].subagents.allowAgents
  • Default behavior: only the requester agent is allowed (so cross-agent spawns get blocked)

Docs confirm this allowlist (and that sessions_spawn returns accepted immediately):
subagents.allowAgents: allowlist of agent ids for sessions_spawn (["*"] = any; default: same agent only).
Source: docs/gateway/configuration-reference.md
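
As a mental model, the allowlist rule quoted above behaves roughly like this TypeScript sketch. This is my paraphrase of the documented behavior, not the Gateway's real implementation; isSpawnAllowed and its parameters are hypothetical names.

```typescript
// Decides whether `requesterId` may spawn a session under `targetId`.
// `allowAgents` mirrors agents.list[].subagents.allowAgents on the requester.
function isSpawnAllowed(
  requesterId: string,
  targetId: string,
  allowAgents?: string[]
): boolean {
  // Spawning under the requester's own id is the default case and needs no allowlist.
  if (targetId === requesterId) {
    return true;
  }
  // No allowlist configured: default is "same agent only", so cross-agent is blocked.
  if (!allowAgents) {
    return false;
  }
  // ["*"] allows any agent id; otherwise the target must be listed exactly.
  return allowAgents.indexOf("*") !== -1 || allowAgents.indexOf(targetId) !== -1;
}
```

Note the exact-match comparison: under this model, "coding-agent" and "coding_agent" are different ids.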

What to check:

openclaw agents list --bindings

Make sure the target agent id is exactly what you’re passing (coding-agent vs coding vs coding_agent, etc).

If it’s missing from the allowlist, you’ll need something like (example):

agents: {
  list: [
    {
      id: "main",
      subagents: { allowAgents: ["coding-agent"] } // or ["*"] to test
    }
  ]
}

2) Nested spawning (sub-agents can’t spawn sub-agents)

If the “spawn coding agent” call is happening from inside another spawned agent (planning/review/etc), it will be denied. Sub-agents are not allowed to call sessions_spawn (no “sub-agent → sub-agent” spawning).
Source: docs/concepts/session-tool.md

So if your pipeline does: webhook → planning subagent → planning tries to spawn coding → (blocked) … that matches your symptoms perfectly.

3) You’re expecting output in the coding agent’s Telegram group (but sessions_spawn won’t do that)

sessions_spawn runs the child session with deliver:false and then posts an announce back to the requester chat, not to the child agent’s bound channel/group.

So “Telegram group is empty” is not a reliable signal that the spawn didn’t happen — you need to look for:

  • the child session existing, and/or
  • the announce message back in the originating chat

What I need from you (quick, high-signal)

Run these and paste the output (redact tokens if any appear):

openclaw --version
openclaw status --all
openclaw agents list --bindings
openclaw sessions --active 180
openclaw logs --limit 500

Also paste the exact 202 response body your webhook sees (redacting auth). Ideally it includes:

  • runId
  • childSessionKey

Once we have that, we can pinpoint whether this is:

  • “agentId not allowed” (subagents.allowAgents)
  • “nested spawn blocked”
  • “spawn created but announce can’t be delivered / got skipped”
  • “spawn created then failed immediately” (will show in logs around that timestamp)
delicate tide
#

The fix is straightforward, but I want to confirm:

Should I add subagents.allowAgents: ["*"] to the main agent (allow any cross-agent spawn)?
Or whitelist specific agents only (["coding-agent", "planning-agent", "review-agent", "delivery-agent"])?
Do I need to restart the Gateway after updating the config, or does it hot-reload?

craggy tigerBOT
#

Whitelist specific agents in subagents.allowAgents by default.

  • ["*"] is great as a temporary “unblock me” / debugging setting, but it’s also the easiest way to accidentally allow a compromised prompt / workflow to spawn any agent (especially if any agent has broader tools, different channel bindings, etc).
  • A tight allowlist like ["coding-agent", "planning-agent", "review-agent", "delivery-agent"] follows least-privilege and is what you want for a “rock solid pipeline.”

Two nuance points that bite people:

  1. Put subagents.allowAgents on the agent that is making the sessions_spawn call (i.e. the agent bound to the Trello/webhook route). It doesn’t have to be main unless main is the one receiving the webhook / orchestrating.

  2. This setting only controls cross-agent spawn via the agentId parameter. If you can spawn everything under the same agent id (and just use label), you may not need cross-agent spawning at all.

Restart / hot-reload

Config can be hot-reloaded, but whether a given change applies without a restart depends on the reload mode.

From the FAQ:

  • gateway.reload.mode: "hybrid" (default): hot-apply safe changes, restart for critical ones
  • also supports hot, restart, off

So:

  • You often won’t need a manual restart, but some changes may trigger an automatic restart under hybrid.
  • If you want to be 100% sure you’re testing the new config immediately, do a manual restart anyway:
    openclaw gateway restart
    

If after whitelisting it still “202 → silence”, the next suspect is nested spawning (a spawned planning sub-agent trying to spawn coding), which allowAgents won’t fix.

delicate tide
#

how to enable telegram agents to respond in tts automatically

craggy tigerBOT
#

Enable auto-TTS globally, and Telegram will receive replies as voice notes (the round bubble).

1) Turn on auto‑TTS in config

In openclaw.json:

{
  messages: {
    tts: {
      auto: "always" // off | always | inbound | tagged
    }
  }
}

Provider options:

  • No API keys: use Edge TTS (works once TTS is on; you can also set provider: "edge" explicitly)
  • With keys: set provider: "openai" or "elevenlabs" and add the relevant messages.tts.openai / messages.tts.elevenlabs config.

(Full examples: ~/.npm-global/lib/node_modules/openclaw/docs/tts.md)

2) Test from Telegram

In the Telegram chat, run:

  • /tts status (see if it’s enabled)
  • /tts always (enables it for that session if your config is still off)

Notes / gotchas

  • Telegram output format is special-cased: OpenClaw sends Opus voice notes so you get the round voice bubble.
  • messages.tts.auto is global (not Telegram-only). If you only want TTS sometimes, use:
      • auto: "inbound" (only speak after inbound voice notes), or
      • auto: "tagged" and only speak when replies include [[tts]].
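
To make the four modes concrete, here's a small TypeScript sketch of how I'd describe the decision. It's a paraphrase of the mode descriptions above, not OpenClaw's code; shouldSpeak and its parameters are hypothetical names.

```typescript
type TtsAuto = "off" | "always" | "inbound" | "tagged";

// Decides whether a reply gets spoken under a given auto mode.
function shouldSpeak(
  mode: TtsAuto,
  inboundWasVoiceNote: boolean,
  replyText: string
): boolean {
  switch (mode) {
    case "always":
      return true; // every reply is spoken
    case "inbound":
      return inboundWasVoiceNote; // only answer voice with voice
    case "tagged":
      return replyText.indexOf("[[tts]]") !== -1; // only replies carrying the tag
    default:
      return false; // "off"
  }
}
```

For example, under "inbound" a typed message gets a text reply, while a voice note gets a spoken one.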

If you tell me which provider you want (edge/openai/elevenlabs) and whether you want Telegram-only behavior, I’ll suggest the cleanest config.

delicate tide
#

lets use openai, can you make it so it just enables for the session if I type /tts always

craggy tigerBOT
#

Yes — set TTS up to use OpenAI, but keep auto‑TTS off by default, so it only turns on when you run /tts always in that chat/session.

1) Config: OpenAI provider, but default auto: "off"

In openclaw.json:

{
  messages: {
    tts: {
      auto: "off",          // default OFF globally
      provider: "openai",   // when you turn it on, use OpenAI
      openai: {
        // prefer env var OPENAI_API_KEY if you don't want it in the file
        apiKey: "YOUR_OPENAI_API_KEY",
        model: "gpt-4o-mini-tts",
        voice: "alloy",
      },
    },
  },
}

(You can omit apiKey here and just set OPENAI_API_KEY in the Gateway’s environment.)

2) In Telegram: enable for this session only

In the Telegram chat/thread you want spoken replies:

  • /tts always → enables auto‑TTS for that session
  • /tts off → disables it again
  • /tts status → confirm what’s currently active

This does not flip global config; it’s a per-session toggle.
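
Conceptually, the per-session toggle layers on top of the global default, roughly like this TypeScript sketch. The SessionOverrides map and function names are hypothetical; I'm only illustrating the "session override wins, config stays untouched" idea.

```typescript
type TtsAutoMode = "off" | "always" | "inbound" | "tagged";

// Per-session overrides keyed by a session identifier (hypothetical structure).
type SessionOverrides = { [sessionKey: string]: TtsAutoMode | undefined };

// What "/tts always" or "/tts off" does in this model: record an override for one session.
function setSessionTts(
  overrides: SessionOverrides,
  sessionKey: string,
  mode: TtsAutoMode
): void {
  overrides[sessionKey] = mode;
}

// The mode actually used for a session: its override if present, else the global config value.
function effectiveTts(
  globalAuto: TtsAutoMode,
  overrides: SessionOverrides,
  sessionKey: string
): TtsAutoMode {
  const override = overrides[sessionKey];
  return override !== undefined ? override : globalAuto;
}
```

With global auto: "off", only the chat where you ran /tts always resolves to "always"; every other session still resolves to "off".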

If /tts always seems ignored

That almost always means commands aren’t authorized for your sender. Make sure your Telegram user is allowlisted (and that commands are enabled), then retry /tts status.