#OpenClaw feels degraded: chats dull, Codex/ACP jobs timing out, context pruning maybe too aggressive

novel gorge
#

I’m trying to diagnose an OpenClaw regression. Gateway is healthy, but normal chats feel much worse than before: less initiative, weak tool use, poor
continuity after a few messages, and background Codex/ACP jobs often die or timeout.

Verified locally:

  • openclaw status --deep --json: healthy, task audit 0
  • openclaw doctor --non-interactive: bootstrap truncation, active-memory issues, Codex auth warning
  • task registry shows recent ACP disconnects/timeouts and cron timeouts
  • gateway logs show active-memory timeouts and session cleanup warnings

Config differences from an older “good” backup:

  • current thinkingDefault=adaptive; old was high
  • current context pruning TTL 3m; old was 1h
  • current keepLastAssistants=1
  • current compaction reserve 110000; old was 40000
  • active-memory now enabled and timing out
  • current agent boot files are huge/truncated

Question: could these pruning/compaction/active-memory settings explain normal chats feeling lobotomized? Any recommended baseline config for keeping regular
chats sharp while still using Supabase as task/state truth?

acoustic grail BOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

slate vortex BOT
#

Yes — that combination can absolutely make regular chats feel “lobotomized,” even if the Gateway health checks are green.

First quick sanity check: what openclaw --version is this on? Version mismatch / beta drift is a common cause of weird context behavior.

What stands out from your diff:

  • thinkingDefault: adaptive vs old high: very plausible loss of initiative/tool persistence, depending on provider. If the old good setup used high, restore that first.
  • contextPruning.ttl: 3m: too aggressive for normal chat. Pruning only trims old tool results, not normal conversation text, but with short TTL it can repeatedly strip useful evidence from the model’s working context. Docs default/example is more like 5m; your old 1h is much safer for “sharp chat”.
  • keepLastAssistants: 1: I’d raise this. Docs example uses 3; keeping only one assistant turn makes follow-up continuity brittle, especially after tool-heavy turns.
  • compaction.reserveTokens: 110000: this is the giant red flag. Reserve tokens are headroom; higher reserve means compaction triggers much earlier (contextTokens > contextWindow - reserveTokens). If the old good value was 40000, go back near that unless you have a specific model/window reason.
  • active-memory timing out: yes, disable or narrow it while debugging. A memory layer that often returns timeout/empty can add latency/noise without giving useful recall. Keep Supabase as durable task/state truth; don’t require active-memory to be on for normal chats.
  • huge/truncated agent boot files: also very plausible. If boot context is being truncated, the assistant loses operating instructions before the conversation even starts. Shrink AGENTS/SOUL/etc. or use lighter per-agent boot files.
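
To make the reserveTokens point concrete, here is the arithmetic as a small Python sketch. The 200,000-token context window is an assumption for illustration only (your model's window may differ); the trigger rule is the one quoted in the reserveTokens bullet.

```python
def compaction_threshold(context_window: int, reserve_tokens: int) -> int:
    """Token count above which compaction triggers:
    contextTokens > contextWindow - reserveTokens."""
    return context_window - reserve_tokens


# ASSUMPTION: a 200k-token context window, chosen only for illustration.
CONTEXT_WINDOW = 200_000

current = compaction_threshold(CONTEXT_WINDOW, 110_000)   # current reserve
old_good = compaction_threshold(CONTEXT_WINDOW, 40_000)   # old "good" reserve

print(f"reserve=110000 -> compaction once context exceeds {current} tokens")
print(f"reserve=40000  -> compaction once context exceeds {old_good} tokens")
```

Under that assumed window, compaction would start at 90,000 tokens with the current reserve versus 160,000 with the old one, so the current setting leaves far less working history before summarization kicks in.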

Baseline I’d test:

{
  agents: {
    defaults: {
      thinkingDefault: "high",
      contextPruning: {
        mode: "cache-ttl",
        ttl: "1h",
        keepLastAssistants: 3
      },
      compaction: {
        reserveTokensFloor: 20000,
        keepRecentTokens: 50000,
        // if you explicitly set Pi reserveTokens somewhere, put it back near old-good ~40000
        truncateAfterCompaction: true,
        notifyUser: true
      }
    }
  }
}

novel gorge
#
  1. Agency OS architecture
    What is the best-practice OpenClaw architecture for a modular agency OS with separate agents for COO/state, website builds, Google Ads, outreach, and optional ACP/Codex
    implementation?
  2. Supabase state model
    Should Supabase be used as the canonical CRM/PM ledger for accounts, contacts, deals, projects, tasks, interactions, deliverables, blockers, and receipts, while live tools
    remain authoritative for current external state?
  3. Context strategy
    What is the recommended pattern for keeping agents sharp: tiny AGENTS.md boot files plus on-demand PROTOCOL.md files, versus large all-in-one bootstrap files?
  4. Cron design
    How should OpenClaw crons be structured for reliability: deterministic sync jobs, monitor jobs, agent review jobs, and reminders? Any best practices for avoiding overlap/
    timeouts?
  5. Multi-agent website builds
    What is the best OpenClaw pattern for a high-quality website build swarm: Morgan for scope/client truth, Kit for implementation, Mika for keyword/conversion intelligence, QA/
    design/content subagents, and artifact/deploy receipts?
slate vortex BOT
#

I’d build this as a hub-and-spoke system, not one giant “agency brain”. I checked the current docs shape before answering; the important pages are multi-agent routing, workspace/context, cron, subagents, and ACP:
https://docs.openclaw.ai/concepts/multi-agent
https://docs.openclaw.ai/concepts/agent-workspace
https://docs.openclaw.ai/concepts/context
https://docs.openclaw.ai/automation/cron-jobs
https://docs.openclaw.ai/tools/subagents
https://docs.openclaw.ai/tools/acp-agents

  1. Agency OS architecture
    Best pattern: one “COO/state” agent as the orchestrator + narrow worker agents.
  • COO / Morgan: owns client truth, priorities, policies, Supabase writes, task assignment, receipts, escalation.
  • Website / Kit: owns repo implementation, deploy previews, build fixes, PR/diff summaries.
  • Ads / Mika: owns keyword/ad/conversion intelligence, campaign recommendations, landing page feedback.
  • Outreach agent: owns lead/account/contact workflows, drafts, sequencing, inbox/status sync.
  • ACP/Codex: treat as an implementation harness, not the boss. Use it for repo work, long coding sessions, tests/builds, and persistent coding threads.

In OpenClaw terms: use separate configured agents when they need separate workspaces/auth/session history. Each agent gets its own workspace and agentDir; don’t reuse agentDir between agents. Use subagents for bounded parallel investigations, and ACP when you specifically want Codex/Claude/Gemini-style coding harness behavior.

  2. Supabase state model
    Yes — I’d use Supabase as the canonical internal CRM/PM ledger, with one caveat:

Supabase should be canonical for “what the agency believes / decided / promised / delivered”:

  • accounts, contacts, deals
  • projects, tasks, blockers
  • interactions and notes
  • deliverables and artifacts
  • receipts / run logs / deploy IDs
  • ownership, statuses, SLAs, next actions

But live tools remain authoritative for live external state:

  • Google Ads for actual campaign/ad/account state
  • GitHub/Vercel/Netlify/etc for code/deploy reality
  • email/CRM APIs for message delivery/open/reply status
  • Stripe/accounting tools for actual payments/invoices

Store external IDs, last synced snapshots, hashes, timestamps, and receipts in Supabase. Don’t pretend Supabase is the source of truth for something that can change externally without you.
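
As a rough sketch of what "external IDs, last synced snapshots, hashes, timestamps" could look like in practice, the Python below builds one snapshot row and checks whether the live object has drifted from it. The field names (`snapshot_hash`, `synced_at`) and the helper names are hypothetical, not part of OpenClaw or Supabase.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_snapshot(external_system: str, external_id: str, payload: dict) -> dict:
    """Build a ledger row for one synced external object (hypothetical shape)."""
    # Canonical JSON so the same payload always produces the same hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return {
        "external_system": external_system,
        "external_id": external_id,
        "snapshot": payload,
        "snapshot_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "synced_at": datetime.now(timezone.utc).isoformat(),
    }


def has_drifted(stored_row: dict, live_payload: dict) -> bool:
    """True when the live object no longer matches the last synced snapshot."""
    fresh = make_snapshot(
        stored_row["external_system"], stored_row["external_id"], live_payload
    )
    return fresh["snapshot_hash"] != stored_row["snapshot_hash"]
```

The point of the hash is cheap drift detection: the ledger keeps what was believed at `synced_at`, and a mismatch against the live payload is a signal to re-sync rather than trust the stored row.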

  3. Context strategy
    Strong recommendation: tiny boot files + on-demand protocols.

Use AGENTS.md for:

  • identity of the agent
  • hard rules/boundaries
  • where state lives
  • what files/protocols to read for each workflow
  • when to ask vs act

Use PROTOCOL.md / workflow files for:

  • website build protocol
  • Google Ads protocol
  • outreach protocol
  • receipt schema
  • QA checklist
  • Supabase schema notes

Avoid giant all-in-one bootstrap files. OpenClaw injects workspace files into context at session start, large files get truncated, and tool schemas/history already consume context. Big boot files make agents dull and less precise. Use /context list and /context detail to see what’s actually being injected.

  4. Cron design
    I’d split cron into four classes:
  • Deterministic sync jobs: pull external state into Supabase. Should be idempotent, small, and preferably script/tool-driven. No creative agent reasoning unless needed.
  • Monitor jobs: check for anomalies, stale blockers, failed deploys, budget spikes, overdue replies.
  • Agent review jobs: isolated agent runs that summarize, reason, and propose actions from already-synced state.
  • Reminder jobs: lightweight main-session/system-event reminders.

Reliability rules:

  • Use isolated cron for reports/background jobs; use main-session cron for simple reminders.
  • Use --light-context and restricted --tools for deterministic jobs.
  • Set explicit timeouts for jobs that can hang.
  • Make every job idempotent with a job_run_id and receipt row.
  • Use Supabase leases/locks to avoid overlap: if the previous run is still active, skip and write a skipped receipt.
  • Stagger schedules. OpenClaw already staggers recurring top-of-hour jobs unless --exact is used, but I’d still avoid stacking all agency jobs at 0 * * * *.
  • Separate sync from review: sync first, then review the ledger. Don’t have a model both discover state and make strategic decisions in one fragile cron.

  5. Multi-agent website build swarm
    Good pattern:
  • Morgan owns scope/client truth: brief, acceptance criteria, brand constraints, project status, client commitments.
  • Mika does keyword/conversion intelligence: audience, SERP intent, offer hierarchy, CTA strategy, objections, page structure.
  • Kit implements: repo changes, components, content integration, tests/build, deploy preview.
  • QA/design/content subagents do bounded passes: visual QA, copy QA, mobile/responsive, SEO metadata, accessibility, performance.
  • ACP/Codex handles heavy repo work when you want persistent coding state or harness-level implementation.
  • Everything writes receipts: brief version, repo commit, build output, deploy URL, QA findings, unresolved blockers, final acceptance notes.

The key is that Morgan should not “remember” project truth from chat. Morgan should read/write the project ledger. Kit should not invent scope. Mika should not silently change implementation. Every agent gets a narrow role, a protocol, and a receipt format.

If this is for a real agency OS, I’d start with Morgan + Supabase + Kit/ACP first, then add Mika/outreach/ads once the receipt model is boring and reliable. Boring state beats clever swarms. 🦞

novel gorge
#
  1. For the hub-and-spoke model, what should Morgan own exactly versus what should be delegated immediately to Kit/Mika/Rex? Where is the line between orchestration and doing the
    work?
  2. What should the minimum Supabase schema be for a reliable agency OS: keep the 6-table CRM/PM core, or add first-class deliverables, blockers, state_receipts, agent_runs, and
    artifacts now?
  3. What is the best receipt format for OpenClaw agents so receipts are useful for dashboards/debugging but don’t make normal chat feel robotic?
  4. For cron jobs, what should run as deterministic scripts versus isolated agent review jobs? How should OpenClaw structure locks/leases/job_run_ids in Supabase?
  5. For multi-agent website builds, what exact handoff contract should exist between Morgan, Kit, Mika, QA/design/content subagents, and ACP/Codex?
  6. When should OpenClaw use native subagents versus ACP/Codex for website/build work? What should be the default and what are the failure modes?
  7. What should the ideal tiny AGENTS.md look like for each agent, and what should move into PROTOCOL.md files?
  8. How should context cards be designed so agents stay up to date without stale Supabase tasks poisoning normal chat?
  9. What should the approval matrix be for an agency OS: safe local changes, Supabase writes, ad changes, deploys, client messages, budget changes, destructive actions?
  10. What is the best dashboard read model: direct table reads, SQL views, materialized views, or generated summary artifacts?
  11. For Mika/Google Ads, what should be canonical in Supabase versus always checked live through vorti-ads/Google Ads API?
  12. What eval suite should prove the system is fixed before adding more features: website build, Google Ads audit/action loop, life ops day plan, cron failure recovery, and state
    receipt audit?
slate vortex BOT
#

My opinionated version: Morgan should be the agency’s operating system, not a universal doer. The mistake to avoid is letting Morgan become “the agent that does everything because it has the context.” That turns into stale context soup fast.

  1. Morgan vs Kit/Mika/Rex

Morgan owns:

  • client/account/project truth
  • priorities and sequencing
  • acceptance criteria
  • approval state
  • who should do what next
  • receipts and final status
  • escalation when reality conflicts with plan

Morgan should delegate immediately when the task requires domain execution:

  • Kit: repo/build/design-system/frontend implementation
  • Mika: Google Ads, keywords, landing-page conversion intelligence
  • Rex: outreach/sales ops/inbox/lead motion, assuming Rex is the outreach agent
  • QA/content/design subagents: bounded review passes
  • ACP/Codex: heavy repo implementation or persistent coding sessions

Line: Morgan decides “what/why/when/done”; workers decide “how” inside their lane. Morgan can inspect outputs, reject bad work, and update state. Morgan should not quietly edit campaign budgets, rewrite code, or send outreach just because it can.

  2. Minimum Supabase schema

Don’t stop at the 6-table CRM/PM core. Add operational tables now, but keep them boring.

Minimum I’d ship:

  • accounts
  • contacts
  • deals
  • projects
  • tasks
  • interactions
  • deliverables
  • blockers
  • artifacts
  • agent_runs
  • state_receipts
  • job_locks or job_leases

Why: without deliverables/artifacts/receipts/runs, you’ll have pretty CRM rows but no audit trail. The whole agency OS becomes hard to debug.

Keep the schema thin, though. Don’t over-model every subtype on day one. Use metadata JSON for edge cases until patterns stabilize.

  3. Receipt format

Use two layers:

Human chat receipt: short and natural.
Example:
“Done — Kit pushed the landing page update, build passed, preview is here: <url>. Blocker: mobile hero copy still needs approval.”

Machine receipt: structured row in state_receipts.
Fields I’d include:

  • id
  • job_run_id
  • agent_run_id
  • actor_agent
  • receipt_type
  • account_id / project_id / task_id nullable refs
  • external_system
  • external_id
  • action
  • before_snapshot JSON
  • after_snapshot JSON
  • status: proposed | completed | failed | skipped | needs_approval
  • confidence
  • artifact_ids
  • error
  • created_at

Normal chat should not dump JSON unless asked. Dashboards/debugging should read the structured receipt.
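
Here is one way the two layers could fit together, as a minimal Python sketch. It covers only a subset of the fields listed above, and the `Receipt` class, `LABELS` mapping, and `chat_summary` helper are illustrative names, not an OpenClaw API.

```python
from dataclasses import dataclass, field
from typing import Optional

# Status values from the field list above, mapped to short chat labels.
LABELS = {
    "proposed": "Proposed",
    "completed": "Done",
    "failed": "Failed",
    "skipped": "Skipped",
    "needs_approval": "Needs approval",
}


@dataclass
class Receipt:
    """Machine receipt: the structured row written to state_receipts."""
    actor_agent: str
    receipt_type: str
    action: str
    status: str
    job_run_id: Optional[str] = None
    external_system: Optional[str] = None
    external_id: Optional[str] = None
    artifact_ids: list = field(default_factory=list)
    error: Optional[str] = None

    def __post_init__(self):
        # Reject statuses outside the agreed vocabulary before anything is stored.
        if self.status not in LABELS:
            raise ValueError(f"unknown receipt status: {self.status!r}")


def chat_summary(receipt: Receipt) -> str:
    """Human receipt: one short line for chat; the JSON row stays out of band."""
    line = f"{LABELS[receipt.status]}: {receipt.actor_agent}, {receipt.action}"
    if receipt.error:
        line += f" (error: {receipt.error})"
    return line + "."
```

For example, `chat_summary(Receipt("Kit", "deploy", "pushed the landing page update", "completed"))` gives a one-line summary for chat, while the full row goes to the table for dashboards and debugging.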

  4. Cron: scripts vs isolated agent review

Deterministic scripts:

  • sync Google Ads stats/campaign metadata
  • sync GitHub/Vercel deploy status
  • sync inbox/outreach delivery state
  • refresh account/contact/project/task snapshots
  • detect overdue/stale rows
  • write receipts

Isolated agent reviews:

  • summarize weekly project health
  • explain campaign anomalies
  • propose next actions
  • review stale blockers
  • inspect failed deploy/build context
  • produce client-ready summaries

Supabase lock pattern:

  • each cron starts with a job_run_id
  • acquire lease row by job_key
  • lease has owner, job_run_id, locked_until, heartbeat_at, status
  • if locked_until is in future, skip and write a skipped receipt
  • job heartbeats during long work
  • release on success/failure
  • stale leases expire automatically

Every external write should include the same job_run_id in its receipt chain.
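
The lease pattern above can be sketched in Python with an in-memory dict standing in for the lease table. This is only an illustration of the logic: the function names are hypothetical, and against real Postgres/Supabase the check-and-write would have to be a single atomic statement or transaction, which this sketch does not attempt.

```python
import uuid
from datetime import datetime, timedelta
from typing import Optional

# Stand-in for a job_leases table keyed by job_key (ASSUMPTION: not concurrency-safe).
leases: dict = {}


def try_acquire(job_key: str, owner: str, ttl: timedelta, now: datetime) -> Optional[str]:
    """Return a new job_run_id if the lease was taken, or None to skip this run."""
    lease = leases.get(job_key)
    if lease and lease["status"] == "active" and lease["locked_until"] > now:
        return None  # previous run still holds the lease: skip and write a skipped receipt
    job_run_id = str(uuid.uuid4())
    leases[job_key] = {
        "owner": owner,
        "job_run_id": job_run_id,
        "locked_until": now + ttl,
        "heartbeat_at": now,
        "status": "active",
    }
    return job_run_id


def heartbeat(job_key: str, job_run_id: str, ttl: timedelta, now: datetime) -> bool:
    """Extend the lease during long work; False means the lease was lost."""
    lease = leases.get(job_key)
    if not lease or lease["job_run_id"] != job_run_id:
        return False
    lease["heartbeat_at"] = now
    lease["locked_until"] = now + ttl
    return True


def release(job_key: str, job_run_id: str, status: str) -> None:
    """Release on success or failure; only the holder may release."""
    lease = leases.get(job_key)
    if lease and lease["job_run_id"] == job_run_id:
        lease["status"] = status
```

Stale leases expire on their own here because acquisition only honors a lease whose `locked_until` is still in the future; a crashed job simply stops heartbeating and the next run takes over.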

  5. Website build handoff contract

Morgan → Mika:

  • account/project id
  • business goal
  • audience
  • offer
  • known constraints
  • competitors/keywords if known
  • required output: conversion brief + page strategy + SEO/meta recommendations

Mika → Morgan/Kit:

  • target intent
  • page sections
  • CTA hierarchy
  • keyword/theme map
  • objections to answer
  • copy angles
  • confidence + unknowns

Morgan → Kit:

  • scoped build brief
  • acceptance criteria
  • source of truth links
  • required artifacts
  • out-of-scope list
  • deploy target
  • approval requirements

Kit → ACP/Codex if needed:

  • repo path
  • exact implementation task
  • files/components likely involved
  • test/build command
  • constraints
  • expected artifact: diff summary + build result

Kit → QA/design/content:

  • preview URL
  • acceptance criteria
  • changed areas
  • known risks
  • exact review checklist

QA/design/content → Morgan/Kit:

  • pass/fail
  • severity
  • screenshots/URLs if relevant
  • required fixes
  • “ship/no ship” recommendation

Morgan closes:

  • final artifact/deploy receipt
  • client-facing summary
  • blockers/next steps
  • task status update

  6. Native subagents vs ACP/Codex

Default:

  • native subagents for research, QA, content review, strategy, summarization, audits
  • ACP/Codex for codebase mutation, long implementation, persistent repo context, test/build loops

Use native subagents when the work is bounded and can finish with a written answer.
Use ACP/Codex when the work needs a coding harness, repo memory, tool-heavy iteration, or a persistent thread.

Failure modes:

  • native subagents: shallow output, missed repo nuance, too many parallel opinions, context fork bloat
  • ACP/Codex: timeouts, harness auth/env problems, stuck builds, over-editing, persistent session drift

Guardrail: Morgan/Kit should always require artifacts: diff, build log, preview URL, QA result, receipt. No artifact = not done.

  7. Tiny AGENTS.md vs PROTOCOL.md

Each agent’s AGENTS.md should be tiny:

  • who you are
  • what you own
  • what you must not do
  • where canonical state lives
  • when to read which protocol
  • approval rules
  • receipt requirement

Move into PROTOCOL.md:

  • website build process
  • Google Ads audit/action process
  • outreach process
  • cron sync/review process
  • receipt schema
  • handoff contracts
  • QA checklists
  • dashboard/read-model notes

Tiny AGENTS.md keeps the model sharp. Protocol files are read on demand.

  8. Context cards

Use context cards as fresh summaries, not task dumps.

A good card has:

  • account/project id
  • current objective
  • active deliverables
  • top 3 open blockers
  • latest relevant receipts
  • explicit stale timestamp
  • “do not rely past” freshness boundary
  • links/ids for deeper fetches

Avoid injecting every open task into every chat. That poisons normal conversation. Instead: Morgan fetches the relevant project card when the conversation is about that project, and agents pull details on demand.
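
A context card with an explicit freshness boundary might look roughly like this in Python. The `ContextCard` class and `card_for_chat` helper are hypothetical names; the fields follow the list above, and the top-3 blocker cap is enforced in code so the card cannot grow into a task dump.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ContextCard:
    project_id: str
    objective: str
    generated_at: datetime                 # explicit stale timestamp
    max_age: timedelta                     # "do not rely past" boundary
    active_deliverables: list = field(default_factory=list)
    open_blockers: list = field(default_factory=list)
    latest_receipt_ids: list = field(default_factory=list)

    def __post_init__(self):
        # Keep the card a summary: only the top 3 open blockers survive.
        self.open_blockers = self.open_blockers[:3]

    def is_stale(self, now: datetime) -> bool:
        return now > self.generated_at + self.max_age


def card_for_chat(card: ContextCard, now: datetime) -> Optional[ContextCard]:
    """Return the card only while it is fresh; None tells the caller to refetch."""
    return None if card.is_stale(now) else card
```

Returning `None` for a stale card is a deliberate choice in this sketch: it forces a refetch from the ledger instead of letting an old summary leak into normal chat.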

  9. Approval matrix

Safe without approval:

  • local read-only inspection
  • draft generation
  • internal summaries
  • non-destructive local edits in a branch/worktree
  • Supabase draft/proposed rows

Usually allowed with receipt:

  • Supabase status/task updates
  • internal artifacts
  • deterministic sync writes
  • QA findings

Needs human approval:

  • client-visible messages
  • deploy to production
  • ad campaign changes
  • budget/bid changes
  • outreach send/start sequence
  • deleting/archiving client records
  • destructive repo operations
  • anything irreversible or spend-affecting

Emergency rule: if it spends money, contacts a human, changes production, or deletes state, ask first.
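
The matrix can be encoded as a small lookup with the emergency rule checked first. This is a sketch under assumptions: the action-kind strings and the `approval_tier` function are made up for illustration, and unknown kinds fall back to the strictest tier.

```python
from enum import Enum


class Tier(Enum):
    AUTO = "auto"
    AUTO_WITH_RECEIPT = "auto_with_receipt"
    NEEDS_APPROVAL = "needs_approval"


# Hypothetical action kinds, grouped to mirror the lists above.
AUTO_KINDS = {
    "read_only_inspection", "draft", "internal_summary",
    "branch_edit", "supabase_proposed_row",
}
RECEIPT_KINDS = {
    "supabase_status_update", "internal_artifact",
    "deterministic_sync_write", "qa_finding",
}


def approval_tier(kind: str, *, spends_money: bool = False, contacts_human: bool = False,
                  changes_production: bool = False, deletes_state: bool = False) -> Tier:
    # Emergency rule first: any of these flags forces approval regardless of kind.
    if spends_money or contacts_human or changes_production or deletes_state:
        return Tier.NEEDS_APPROVAL
    if kind in AUTO_KINDS:
        return Tier.AUTO
    if kind in RECEIPT_KINDS:
        return Tier.AUTO_WITH_RECEIPT
    # Anything not explicitly listed defaults to the safest tier.
    return Tier.NEEDS_APPROVAL
```

Defaulting unknown kinds to approval keeps new action types from slipping through as "safe" until someone has classified them.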

  10. Dashboard read model

Use layered reads:

  • direct tables for admin/debug
  • SQL views for normal dashboard pages
  • materialized views for expensive rollups
  • generated summary artifacts for narrative/client-facing reports

Don’t make the dashboard parse chat transcripts. Dashboards should read tables/views/receipts. Narrative summaries are artifacts with provenance, not the primary database.

  11. Mika / Google Ads canonical split

Canonical in Supabase:

  • account/client mapping
  • campaign intent/strategy
  • proposed changes
  • approvals
  • budget policy/guardrails
  • historical synced snapshots
  • audit findings
  • action receipts
  • experiment notes
  • landing page recommendations

Always check live via vorti-ads / Google Ads API:

  • current campaign/ad group/ad status
  • live budgets/bids
  • spend/conversions/CPA/ROAS
  • disapprovals/policy status
  • search terms/performance freshness
  • whether a proposed change is still valid

Supabase is the ledger. Google Ads is the live machine.

  12. Eval suite before more features

Yes, prove these five before expanding:

Website build eval:

  • Morgan scopes
  • Mika gives conversion brief
  • Kit implements
  • ACP/Codex builds/tests if needed
  • QA catches issues
  • receipts/artifacts/deploy URL are correct

Google Ads audit/action loop:

  • sync live state
  • Mika audits
  • proposes action
  • approval gate triggers
  • approved change writes live
  • receipt captures before/after

Life ops/day plan:

  • Morgan turns messy tasks/calendar/context into a sane plan
  • no stale task hallucinations
  • clear next actions

Cron failure recovery:

  • lock prevents overlap
  • timeout writes failed receipt
  • next run recovers/skips safely
  • dashboard shows accurate status

State receipt audit:

  • for any dashboard claim, you can trace back to receipts/artifacts/external IDs
  • failed/skipped/needs approval states are visible
  • no “agent said it happened” without a receipt

If those pass, then add features. If they don’t, more agents will just create more expensive confusion. Boring ledger, crisp protocols, narrow agents — that’s the architecture. 🦞

novel gorge
#
  1. Normal Chat Intelligence
    How should OpenClaw be configured so regular chats stay sharp, autonomous, tool-using, and context-aware without turning every message into Supabase bookkeeping or nonstop back-and-forth prompt wrestling?
  2. Automatic Ingestion
    What is the best pattern for automatically ingesting useful chat events into Supabase or memory without poisoning context with raw transcripts, stale tasks, or low-
    value noise?
  3. Memory Pulling
    How should agents decide when to pull memory, skills, protocol files, Supabase context cards, live APIs, or repo files? What should be automatic vs explicit?
  4. Autonomous Execution
    What is the best-practice loop for letting Morgan autonomously inspect, plan, execute, verify, and write receipts without asking David for every obvious safe step?
  5. Self-Audit
    How should OpenClaw audit itself daily: failed jobs, stale tasks, missing receipts, broken crons, stale context cards, old blockers, auth expiry, bad memory, dashboard drift,
    and task truth mismatch?
slate vortex BOT
#

The trick is to separate chat intelligence from state hygiene. Normal chat should feel like a sharp assistant with tools, not like every message is being notarized into Supabase.

Docs worth keeping open while designing this:
https://docs.openclaw.ai/concepts/context
https://docs.openclaw.ai/concepts/memory
https://docs.openclaw.ai/concepts/agent-workspace
https://docs.openclaw.ai/automation
https://docs.openclaw.ai/automation/cron-jobs

  1. Normal chat intelligence

Default chat loop should be:

  • answer naturally
  • use tools when the answer depends on live state
  • pull only relevant context
  • write durable state only when something actually changed
  • keep receipts out-of-band

Do not make Morgan run a Supabase ceremony on every message. That’s how you get dull, hesitant chat.

I’d put this in Morgan’s tiny AGENTS.md:

In normal chat, optimize for useful forward motion. Do not write Supabase/memory for every message. Write state only for decisions, commitments, completed work, changed task/project status, approvals, blockers, or durable user preferences. For casual discussion, answer directly.

Also: keep boot files small. OpenClaw injects workspace files into context; large boot files get truncated and can make the model mushy. Use /context list and /context detail to inspect what’s actually being loaded.

  2. Automatic ingestion

Use an ingestion filter, not raw transcript sync.

Good ingestion candidates:

  • explicit decisions
  • approvals/denials
  • client commitments
  • new tasks/projects/blockers
  • status changes
  • deliverable links
  • external IDs
  • durable preferences
  • “remember this” moments
  • errors/failures that affect operations

Bad ingestion candidates:

  • every chat line
  • brainstorming without commitment
  • transient opinions
  • duplicate status chatter
  • raw tool logs
  • stale task dumps

Pattern:

  • chat happens normally
  • lightweight classifier/standing instruction marks “state-worthy” events
  • event becomes a compact structured candidate
  • Morgan or a deterministic hook writes only the candidate to Supabase/memory
  • raw transcript remains in session history, not the operating ledger

If unsure, write to a state_candidates/inbox table for review instead of poisoning canonical state.
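
A minimal Python sketch of that filter, assuming events have already been tagged with a `kind` and a `confidence` by some upstream classifier. The kind names, the 0.8 threshold, and the `route_event` / `to_candidate` helpers are all assumptions for illustration.

```python
# Hypothetical event kinds mirroring the good/bad candidate lists above.
STATE_WORTHY = {
    "decision", "approval", "denial", "client_commitment", "new_task",
    "new_blocker", "status_change", "deliverable_link", "external_id",
    "durable_preference", "remember_this", "operational_error",
}
NOISE = {
    "chat_line", "brainstorm", "transient_opinion",
    "duplicate_status", "raw_tool_log", "stale_task_dump",
}

CONFIDENCE_THRESHOLD = 0.8  # ASSUMPTION: tune against real traffic


def route_event(event: dict) -> str:
    """Return 'ledger', 'inbox', or 'drop' for one classified chat event."""
    kind = event.get("kind")
    if kind in NOISE:
        return "drop"
    if kind in STATE_WORTHY and event.get("confidence", 0.0) >= CONFIDENCE_THRESHOLD:
        return "ledger"
    # Low confidence or an unrecognized kind: park it for review
    # instead of writing it into canonical state.
    return "inbox"


def to_candidate(event: dict) -> dict:
    """Compact structured candidate; the raw transcript is deliberately left out."""
    return {
        "kind": event["kind"],
        "summary": event["summary"][:200],
        "refs": event.get("refs", {}),
    }
```

Only the compact candidate is ever written, so the ledger stays small even when the underlying chat is long.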

  3. Memory/context pulling

Use a retrieval ladder:

  • Skills: automatic only when the task clearly matches a skill description. Otherwise don’t read them.
  • Protocol files: read when entering a workflow: website build, ads audit, outreach, cron repair, deploy, QA.
  • Memory search: pull when the question depends on prior preferences, decisions, people, recurring projects, or “what did we decide before?”
  • Supabase context card: pull when discussing a known account/project/task/deal, or before changing state.
  • Live APIs: pull before claiming current external reality: ads status, deploy status, inbox state, repo state, payments, etc.
  • Repo files: pull before implementation or code review. Don’t rely on memory for code.

Automatic should be lightweight: memory search/context card/live check when obviously needed. Explicit should be required for big workflows, writes, production changes, ad changes, client sends, and destructive actions.

  4. Autonomous Morgan loop

Morgan should be allowed to do safe obvious steps without asking David every time.

Best loop:

  1. Inspect: fetch relevant context card, memory, live state, repo/API status as needed.
  2. Classify: safe local/internal vs approval-required.
  3. Plan briefly: only enough to avoid thrashing.
  4. Execute safe steps: draft, inspect, sync, update internal task status, create proposed rows, run read-only checks.
  5. Verify: test/build/API re-read/dashboard query.
  6. Receipt: structured receipt for state-changing work.
  7. Report naturally: short human summary, not JSON.

Approval boundary:

  • safe to do: reads, drafts, internal proposed rows, deterministic syncs, non-destructive branch/worktree edits, status cleanup with receipt
  • ask first: client-visible messages, live ad changes, production deploys, budget changes, destructive deletes, irreversible actions, anything that spends money

The prompt should say: “Do not ask for permission for reversible internal housekeeping. Do ask before external/irreversible/spend/client-visible actions.”

  5. Daily self-audit

Make this an isolated cron review job, fed mostly by deterministic checks.

Deterministic audit scripts should check:

  • failed/stuck cron runs
  • stale job leases
  • missing receipts for completed tasks
  • tasks with no owner/next action
  • blockers older than threshold
  • context cards older than source receipts
  • dashboard rows that disagree with canonical tables
  • auth expiry / provider failures
  • deploy/build statuses
  • Google Ads sync freshness
  • agent_runs that timed out or never wrote a terminal receipt

Then an isolated Morgan review job should summarize:

  • what is broken
  • what was auto-fixed
  • what needs approval
  • what needs human attention
  • top 3 operational risks

Write audit receipts too. The audit itself should be auditable.
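
Two of the deterministic checks, plus the audit's own receipt, could be sketched like this in Python. The row shapes (plain dicts with `id`, `status`, `task_id`, `created_at`, `resolved_at`) and the 7-day blocker threshold are assumptions, not a real schema.

```python
from datetime import datetime, timedelta


def completed_tasks_missing_receipts(tasks: list, receipts: list) -> list:
    """IDs of tasks marked completed that have no completed receipt."""
    receipted = {
        r["task_id"] for r in receipts
        if r.get("task_id") and r["status"] == "completed"
    }
    return [t["id"] for t in tasks
            if t["status"] == "completed" and t["id"] not in receipted]


def old_blockers(blockers: list, now: datetime, max_age: timedelta) -> list:
    """IDs of unresolved blockers older than the threshold."""
    return [b["id"] for b in blockers
            if b["resolved_at"] is None and now - b["created_at"] > max_age]


def run_audit(tasks: list, receipts: list, blockers: list, now: datetime,
              blocker_max_age: timedelta = timedelta(days=7)):
    findings = {
        "missing_receipts": completed_tasks_missing_receipts(tasks, receipts),
        "old_blockers": old_blockers(blockers, now, blocker_max_age),
    }
    # The audit writes its own receipt so the audit run is itself auditable.
    audit_receipt = {
        "receipt_type": "self_audit",
        "status": "completed",
        "finding_counts": {name: len(ids) for name, ids in findings.items()},
        "created_at": now.isoformat(),
    }
    return findings, audit_receipt
```

The review agent would then read `findings` and the receipt rather than rediscovering state on its own, which keeps the reasoning step cheap and reproducible.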

Recommended cadence:

  • deterministic syncs: frequent, small, idempotent
  • self-audit: daily isolated cron
  • weekly deeper review: isolated agent report
  • heartbeat: only lightweight awareness/reminders, not full agency bookkeeping

My strong recommendation: make normal chat fast and context-aware, make ingestion selective, and make the ledger boring. If every message becomes state work, Morgan will feel like paperwork. If every state change gets a receipt, the dashboard will still be trustworthy. That’s the balance. 🦞

novel gorge
#
  1. Self-Healing
    Which failures should OpenClaw auto-fix without approval, and which should only create a blocker? Example: retry failed sync, refresh context card, restart stuck job, clean
    stale lock, repair missing receipt, re-auth needed.
  2. Overnight Opportunity Finder
    How should an overnight agent scan Supabase, crons, repos, client accounts, Google Ads, Gmail, and artifacts to find automations or process improvements it can create the next
    day?
  3. Automation Backlog
    What schema/process should capture “this should be automated” ideas so the system can rank them by ROI, frequency, risk, and implementation difficulty?
  4. Skill Usage
    How should agents discover and use skills automatically without bloating bootstrap context or forgetting domain protocols?
  5. Context Cards
    What should a perfect context card contain so Morgan can stay current without injecting every task into every conversation?
  6. Safe Autonomy
    What permission model lets the system feel powerful while staying safe: auto-run local edits/tests/scripts/read-only API calls, but require approval for sends, spend,
    destructive changes, and production deploys?
  7. Agent Self-Improvement
    How should agents update their own protocols/lessons after mistakes without creating instruction bloat or contradictory rules?
  8. Quality Gates
    What automated evals should run nightly to prove Morgan, Kit, Mika, Rex, crons, memory, and Supabase are still working correctly?
  9. Client Website Swarm
    How should OpenClaw autonomously run a website build swarm overnight or in the background: scope, design, copy, implementation, QA, deploy preview, receipt, and next-day human
    review?
  10. Google Ads Autopilot
    What should Mika be allowed to do autonomously overnight: read-only audits, negative keyword candidates, landing page recommendations, budget pacing checks, conversion tracking
    checks, report drafts?
slate vortex BOT
#
  1. Self-healing

Auto-fix without approval when the action is internal, reversible, idempotent, and does not contact clients/spend money/change production:

  • retry failed read/sync with backoff
  • refresh stale context card from canonical state
  • clean expired/stale lock after locked_until
  • mark a stuck job as timed out if runtime evidence says it is dead
  • regenerate missing dashboard summary from receipts
  • repair a missing receipt only if the underlying evidence is deterministic: run log, external ID, deploy ID, commit SHA, API response
  • reopen/annotate stale tasks/blockers based on rules

Create blocker / ask human when:

  • re-auth is needed
  • external API permissions changed
  • Google Ads write failed or policy state changed
  • receipt cannot be reconstructed from evidence
  • production deploy is stuck/ambiguous
  • data conflict exists between Supabase and live source
  • repeated retry failure crosses threshold
  • anything requires spend, send, delete, production mutation, or client-visible action

Rule: auto-heal mechanics; escalate judgment.
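
"Retry with backoff, then escalate once a threshold is crossed" can be sketched as a single helper. This is illustrative Python, not an OpenClaw feature: `retry_then_escalate` is a hypothetical name, the attempt count and delays are assumptions, and it should only wrap internal, idempotent reads or syncs.

```python
import time
from typing import Callable


def retry_then_escalate(op: Callable[[], object], *, max_attempts: int = 3,
                        base_delay: float = 1.0,
                        sleep: Callable[[float], None] = time.sleep) -> dict:
    """Retry an idempotent operation; after max_attempts failures return a blocker."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return {"status": "completed", "result": op(), "attempts": attempt + 1}
        except Exception as exc:  # broad on purpose: any failure counts toward the threshold
            last_error = exc
            if attempt < max_attempts - 1:
                # Exponential backoff: base_delay, then 2x, then 4x, ...
                sleep(base_delay * 2 ** attempt)
    # Threshold crossed: stop auto-healing and hand the problem to a human.
    return {"status": "blocker", "error": str(last_error), "attempts": max_attempts}
```

The `sleep` parameter is injectable so the helper can be tested without real waiting; either outcome would normally be written as a receipt by the caller.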

  2. Overnight opportunity finder

Run it as an isolated cron, not heartbeat. Feed it mostly structured state, not raw chats.

Inputs:

  • Supabase: stale tasks, repeated blockers, repeated manual actions, missing receipts
  • crons: failed/skipped/slow jobs, noisy jobs, jobs with frequent manual intervention
  • repos: TODOs, flaky tests, repeated build failures, deploy friction
  • clients/projects: recurring deliverable types, overdue approvals, common handoff delays
#
  • Google Ads: repeated audit findings, pacing issues, missing negatives, tracking gaps
  • Gmail/outreach: repeated reply patterns, unanswered threads, manual follow-up opportunities
  • artifacts: reports that are manually recreated often

Output should be automation candidates, not code changes:

  • problem observed
  • evidence links/ids
  • proposed automation
  • expected ROI
  • frequency
  • risk
  • implementation difficulty
  • suggested owner
  • approval needed?
  • next smallest safe step
  3. Automation backlog schema/process
#

Create automation_candidates:

  • id
  • title
  • description
  • source: cron | chat | repo | ads | gmail | manual | audit
  • evidence JSON: receipt ids, task ids, URLs, logs
  • affected_account_id / project_id nullable
  • frequency_score
  • time_saved_score
  • revenue_impact_score
  • risk_score
  • implementation_difficulty
  • confidence
  • status: candidate | approved | in_progress | shipped | rejected | snoozed
  • owner_agent
  • proposed_protocol_change nullable
#
  • created_by_agent_run_id
  • created_at / updated_at

Ranking formula can be simple:
priority = (frequency_score + time_saved_score + revenue_impact_score + confidence) - (risk_score + implementation_difficulty)
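
As a rough sketch, the ranking formula above could be computed like this. The `Candidate` class is a hypothetical in-memory stand-in for an automation_candidates row, and the 1 to 5 scoring scale is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical subset of an automation_candidates row (scores assumed 1-5)."""
    title: str
    frequency_score: int
    time_saved_score: int
    revenue_impact_score: int
    confidence: int
    risk_score: int
    implementation_difficulty: int


def priority(c: Candidate) -> int:
    """Benefit scores minus cost scores, as in the formula above."""
    benefit = (
        c.frequency_score + c.time_saved_score
        + c.revenue_impact_score + c.confidence
    )
    cost = c.risk_score + c.implementation_difficulty
    return benefit - cost


candidates = [
    Candidate("Auto-apply negative keywords", 4, 3, 4, 3, 5, 3),
    Candidate("Auto-regenerate weekly report", 5, 4, 2, 4, 1, 2),
]
ranked = sorted(candidates, key=priority, reverse=True)
for c in ranked:
    print(priority(c), c.title)
# 12 Auto-regenerate weekly report
# 6 Auto-apply negative keywords
```

The low-risk report job outranks the higher-revenue Ads change because risk and difficulty are subtracted directly.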

Process:

  • nightly finder creates candidates
  • Morgan dedupes/merges
  • human approves high-risk candidates
  • Kit/ACP implements only approved ones
  • shipped automation gets receipts + eval coverage
  4. Skill usage

Use skills like lazy-loaded playbooks.

#

Bootstrap should only say:

  • which skills/protocols exist
  • when to read them
  • never paste the whole protocol into AGENTS.md

Automatic discovery:

  • if task clearly matches a skill description, read exactly that skill
  • if multiple match, pick the most specific
  • if none match, don’t read one

Domain protocols should live beside the agent workspace, e.g.:

  • protocols/website-build.md
  • protocols/google-ads.md
  • protocols/outreach.md
  • protocols/receipts.md
  • protocols/self-audit.md
#

Tiny AGENTS.md should contain routing rules: “For Google Ads work, read protocols/google-ads.md before acting.” Not the whole Google Ads protocol.

  5. Perfect context card

A context card should be a fresh, compact operational snapshot:

  • card id/version
  • account/project/deal ids
  • freshness timestamp + source receipt ids
  • current objective
  • current phase/status
  • owner + responsible agents
  • active deliverables, max 3-5
  • active blockers, max 3-5, with age/severity
  • recent decisions/approvals
  • latest artifacts: deploy URL, report, doc, campaign id, commit SHA
  • next recommended action
#
  • known risks
  • “must check live before” fields: ads budget, deploy status, inbox, repo state
  • stale-after timestamp

Do not include every task. Include pointers for deeper fetch. Cards should prevent stale context, not become another stale context dump.
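
A small sketch of how a card could enforce its own freshness and size limits. The `ContextCard` class, its field names, and the 12-hour window are illustrative assumptions, not an OpenClaw or Supabase schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class ContextCard:
    """Hypothetical in-memory shape of a context card; field names are illustrative."""
    card_id: str
    project_id: str
    objective: str
    refreshed_at: datetime
    stale_after: datetime
    blockers: list = field(default_factory=list)
    must_check_live: list = field(default_factory=list)

    def is_stale(self, now: datetime) -> bool:
        # A card at or past its stale-after timestamp must be refreshed before use.
        return now >= self.stale_after

    def top_blockers(self, limit: int = 5) -> list:
        # Cap injected blockers so the card stays a snapshot, not a dump.
        return self.blockers[:limit]


refreshed = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
card = ContextCard(
    card_id="card-1",
    project_id="proj-1",
    objective="Ship preview of /services page",
    refreshed_at=refreshed,
    stale_after=refreshed + timedelta(hours=12),  # assumed freshness window
    blockers=[f"blocker-{i}" for i in range(8)],
    must_check_live=["deploy status", "repo state"],
)

print(card.is_stale(refreshed + timedelta(hours=1)))   # False
print(card.is_stale(refreshed + timedelta(hours=13)))  # True
print(len(card.top_blockers()))                        # 5
```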

  6. Safe autonomy permission model

Four tiers:

Green / auto-run:

  • read-only API calls
  • local inspection
  • tests/builds/lints
  • draft generation
  • branch/worktree edits
  • Supabase proposed/internal rows
#
  • deterministic syncs
  • refresh context cards

Yellow / auto-run with receipt:

  • internal task/status updates
  • blocker creation/resolution when evidence is clear
  • regenerate artifacts/summaries
  • retry jobs
  • non-production preview deploys if already authorized for that project

Orange / approval required:

  • client-visible sends
  • outreach sends/sequences
  • Google Ads changes
  • budget/bid changes
  • production deploys
  • changing acceptance criteria
#
  • deleting/archiving canonical records

Red / never autonomous:

  • destructive external deletes
  • bypassing policy/moderation/approval
  • credential exfiltration
  • disabling safety checks to “get unstuck”
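
The four tiers above can be sketched as a lookup plus a single authorization check. The action names and the `authorize` helper are hypothetical, and defaulting unknown actions to Orange is an assumption chosen to fail safe.

```python
from enum import Enum


class Tier(Enum):
    GREEN = "auto-run"
    YELLOW = "auto-run with receipt"
    ORANGE = "approval required"
    RED = "never autonomous"


# Hypothetical action names mapped to the tiers listed above.
ACTION_TIERS = {
    "read_only_api_call": Tier.GREEN,
    "run_tests": Tier.GREEN,
    "retry_job": Tier.YELLOW,
    "regenerate_summary": Tier.YELLOW,
    "production_deploy": Tier.ORANGE,
    "google_ads_change": Tier.ORANGE,
    "destructive_external_delete": Tier.RED,
}


def authorize(action: str, approved: bool = False) -> tuple:
    """Return (allowed, receipt_required). Unknown actions default to ORANGE."""
    tier = ACTION_TIERS.get(action, Tier.ORANGE)
    if tier is Tier.RED:
        return (False, False)   # never runs, even with approval
    if tier is Tier.ORANGE:
        return (approved, True)  # runs only after human approval
    if tier is Tier.YELLOW:
        return (True, True)      # runs, but must leave a receipt
    return (True, False)         # green: runs freely


print(authorize("run_tests"))                          # (True, False)
print(authorize("retry_job"))                          # (True, True)
print(authorize("production_deploy"))                  # (False, True)
print(authorize("production_deploy", approved=True))   # (True, True)
print(authorize("destructive_external_delete", True))  # (False, False)
```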
  7. Agent self-improvement

Use a lessons pipeline, not direct prompt accretion.

After a mistake:

  • write a lesson_candidate with evidence
  • classify: bug, protocol gap, missing eval, memory issue, bad tool assumption
  • propose smallest protocol change
  • run/attach eval if possible
#
  • human approval for AGENTS.md/SOUL-level changes
  • merge into protocol only if it prevents recurrence

Keep AGENTS.md tiny. Put lessons in protocol changelogs or LESSONS.md, then periodically distill. Delete/merge old contradictory rules. Instruction bloat is real; every new rule should earn its place.

  8. Nightly quality gates

Run a deterministic + agent eval suite:

Morgan:

  • can fetch correct context card
  • does not hallucinate stale tasks
  • writes receipts for state changes
  • respects approval matrix

Kit:

  • can build/test sample repo
#
  • produces diff/build/deploy receipts
  • does not mutate outside scope

Mika:

  • live-read Ads state
  • produces audit without writing live changes
  • flags approval-required actions
  • validates conversion tracking/pacing checks

Rex:

  • drafts outreach without sending
  • respects suppression/approval rules
  • logs interaction candidates correctly

Crons:

  • lock/lease behavior
  • timeout behavior
#
  • skipped-run receipt
  • stale lock cleanup
  • failure notification

Memory/context:

  • relevant memory is retrieved
  • irrelevant stale tasks are not injected
  • context cards expire correctly

Supabase:

  • dashboard read model matches receipts
  • no completed task without terminal receipt
  • no artifact without owner/source
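
The Supabase checks are the easiest ones to make deterministic. Here is a sketch of the "no completed task without terminal receipt" gate, assuming rows have already been fetched as plain dicts. The keys `id`, `status`, `task_id`, and `terminal` are hypothetical column names.

```python
def completed_without_terminal_receipt(tasks: list, receipts: list) -> list:
    """Return ids of completed tasks that have no terminal receipt."""
    # Collect every task id that has at least one terminal receipt.
    closed = {r["task_id"] for r in receipts if r.get("terminal")}
    return [
        t["id"] for t in tasks
        if t["status"] == "completed" and t["id"] not in closed
    ]


tasks = [
    {"id": "t1", "status": "completed"},
    {"id": "t2", "status": "completed"},
    {"id": "t3", "status": "in_progress"},
]
receipts = [
    {"task_id": "t1", "terminal": True},
    {"task_id": "t2", "terminal": False},  # progress receipt only
]
print(completed_without_terminal_receipt(tasks, receipts))  # ['t2']
```

A nightly gate can fail the run whenever this list is non-empty.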
  9. Client website swarm

Overnight/background flow:

#
  1. Morgan selects approved project only.
  2. Morgan refreshes project context card and confirms scope/approval boundary.
  3. Mika produces conversion/keyword/page brief.
  4. Content subagent drafts page copy/structure.
  5. Design subagent creates design direction/checklist.
  6. Kit plans implementation and either edits directly or spawns ACP/Codex.
  7. ACP/Codex implements in branch/worktree, runs build/tests.
  8. Kit creates preview deploy only if preview deploy is pre-authorized.
  9. QA subagents run mobile/accessibility/content/SEO checks.
  10. Kit fixes safe issues.
  11. Morgan writes receipts and next-day review summary.

Overnight should stop at preview + recommendation, not production/client send, unless there is explicit standing approval.

Next-day human review should include:

  • preview URL
  • changed files/commit SHA
#
  • build/test result
  • QA summary
  • unresolved blockers
  • approval buttons/next actions if available
  10. Google Ads autopilot

Mika can autonomously do read-only and proposal work overnight:

Allowed:

  • read-only account/campaign audits
  • budget pacing checks
  • conversion tracking checks
  • policy/disapproval checks
  • search term analysis
  • negative keyword candidates
  • landing page recommendations
#
  • ad copy/extension drafts
  • anomaly detection
  • report drafts
  • Supabase proposed changes + receipts

Not allowed without approval:

  • applying negatives live
  • budget changes
  • bid strategy changes
  • campaign/ad/ad group status changes
  • new ads live
  • removing keywords
  • changing conversion actions
  • client-facing reports/sends

Best default: Mika writes proposed_ads_actions with evidence, confidence, expected impact, and risk. Human approves. Then a deterministic applier job executes the approved actions and writes before/after receipts.

#

The core shape stays the same everywhere: agents may inspect, draft, propose, verify, and record. They need approval to spend, send, delete, or ship to production. That’s how it feels powerful without becoming terrifying. 🦞

novel gorge
#
  1. Google Ads Autopilot
    What should Mika be allowed to do autonomously overnight: read-only audits, negative keyword candidates, landing page recommendations, budget pacing checks, conversion tracking
    checks, report drafts?
  2. Life Ops Autopilot
    How should Morgan handle life ops automatically: calendar, tasks, Garmin, finances, reminders, daily plan, blockers, and “what matters today” without nagging or hallucinating
    stale tasks?
  3. Dashboard + Receipts
    How should dashboards, receipts, artifacts, and live external IDs be linked so every claim can be traced without making chat robotic?
  4. Anti-Stupidity Guardrails
    What are the top OpenClaw patterns that make agents act dumb: giant boot files, stale memory, too many tasks injected, overactive summarization, broad crons, ACP misuse,
    missing receipts? What is the fix for each?
  5. Nightly Agency Brain
    What should the nightly “agency brain” produce every morning: automation candidates, client risks, revenue opportunities, broken systems, priority queue, suggested agent tasks,
    and ready-to-approve actions?
  6. Simple Final Architecture
    Given all of this, what is the simplest robust architecture for an autonomous agency OS that feels magical in chat but remains auditable, modular, safe, and easy to debug?
slate vortexBOT
#
  1. Google Ads Autopilot

Mika’s overnight autonomy should be read/analyze/propose, not mutate live ads.

Allowed overnight:

  • read-only account/campaign/ad group/ad/search term audits
  • budget pacing checks
  • conversion tracking checks
  • policy/disapproval checks
  • anomaly detection
  • negative keyword candidates
  • landing page recommendations
  • ad copy / extension drafts
  • report drafts
  • Supabase proposed action rows with evidence

Not allowed without approval:

#
  • applying negatives live
  • budget/bid changes
  • pausing/enabling campaigns, ads, ad groups, keywords
  • changing conversion actions
  • launching new ads
  • sending client reports

Best shape: Mika writes proposed_ads_actions, each with evidence, expected impact, risk, confidence, live external IDs, and “approval required”. A separate deterministic applier executes approved actions and writes before/after receipts.
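
A minimal sketch of that propose, approve, apply split. The `ProposedAdsAction` fields, the `apply_approved` helper, and `fake_execute` are all hypothetical; a real applier would call the Google Ads API where the stand-in returns canned state.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProposedAdsAction:
    """Hypothetical row shape for proposed_ads_actions; columns are illustrative."""
    action_id: str
    kind: str                 # e.g. "add_negative_keyword"
    external_id: str          # live campaign/ad group ID
    evidence: list
    expected_impact: str
    risk: str
    confidence: float
    approved_by: Optional[str] = None  # stays None until a human approves


def apply_approved(actions: list, execute: Callable) -> list:
    """Run only approved actions and return one before/after receipt per action."""
    receipts = []
    for action in actions:
        if action.approved_by is None:
            continue  # unapproved proposals are never executed
        before, after = execute(action)
        receipts.append({
            "action_id": action.action_id,
            "approved_by": action.approved_by,
            "before": before,
            "after": after,
        })
    return receipts


def fake_execute(action):
    # Stand-in for a real Ads write; returns (before, after) state.
    return ({"negatives": 10}, {"negatives": 11})


proposals = [
    ProposedAdsAction("a1", "add_negative_keyword", "campaign-123",
                      ["search-term-report-1"], "less wasted spend", "low", 0.8,
                      approved_by="human-1"),
    ProposedAdsAction("a2", "raise_budget", "campaign-123",
                      ["pacing-check-7"], "more volume", "high", 0.5),
]
receipts = apply_approved(proposals, fake_execute)
print([r["action_id"] for r in receipts])  # ['a1']
```

Keeping the applier free of any model reasoning is the point: it only executes what a human already approved.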

  2. Life Ops Autopilot

Morgan should act like a calm chief of staff, not a nag bot.

Autonomous:

  • read calendar/tasks/Garmin/finance summaries
  • produce daily plan
  • detect conflicts/blockers
#
  • suggest priorities
  • draft reminders
  • update internal task status when evidence is clear
  • surface “what matters today”

Avoid hallucinated stale tasks by using freshness boundaries:

  • calendar: check live before scheduling claims
  • finance: check live before balance/spend claims
  • Garmin: check latest sync timestamp
  • tasks: only trust tasks with fresh receipts or live source links
  • context cards expire

Morning output should be small:

  • today’s top 3
  • hard calendar constraints
  • health/energy note if useful
  • blockers
#
  • one suggested next action

No endless “checking in” unless a reminder was explicitly requested.

  3. Dashboard + Receipts

Use chat for human summaries; use receipts for truth.

Link chain:

  • dashboard claim → SQL view/materialized view → canonical table row → receipt → artifact/live external ID

Example:

  • “Landing page shipped” in dashboard
  • project row says status=preview_ready
  • latest receipt says deploy_preview_created
  • artifact row stores preview URL, commit SHA, build log
  • external ID points to Vercel/GitHub/etc
#

Chat should say:
“Preview is ready: <url>. Build passed. Two QA notes remain.”

Receipt stores the machine detail. Don’t dump JSON into normal chat unless debugging.
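
One way to keep chat human while the receipt stays machine-readable is to render the sentence from the receipt instead of writing it by hand. This is only a sketch; the receipt keys and the example URL are hypothetical.

```python
def chat_summary(receipt: dict) -> str:
    """Render a short human sentence from a receipt; keys are hypothetical."""
    build = "Build passed" if receipt["build_passed"] else "Build failed"
    notes = receipt["open_qa_notes"]
    qa = "No QA notes remain" if notes == 0 else f"{notes} QA note(s) remain"
    return f"Preview is ready: {receipt['preview_url']}. {build}. {qa}."


receipt = {
    "type": "deploy_preview_created",
    "preview_url": "https://preview.example.com/abc",
    "commit_sha": "0000000",   # placeholder, not a real commit
    "build_passed": True,
    "open_qa_notes": 2,
}
print(chat_summary(receipt))
# Preview is ready: https://preview.example.com/abc. Build passed. 2 QA note(s) remain.
```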

  4. Anti-stupidity guardrails

Common dumb-making patterns + fixes:

  • Giant boot files → tiny AGENTS.md, lazy protocol reads
  • Stale memory → freshness timestamps, memory search only when relevant
  • Too many tasks injected → context cards with top blockers/actions only
  • Overactive summarization → keep source receipts/artifacts; summaries are views, not truth
  • Broad crons → split sync/review/reminder jobs; make them idempotent
  • ACP misuse → use ACP/Codex for repo mutation, not every strategy task
  • Missing receipts → require terminal receipt for every state-changing run
  • No live checks → check APIs before current-state claims
#
  • Too much autonomy → approval matrix: spend/send/delete/production require approval
  • Too many agents → Morgan orchestrates; workers stay narrow
  • Context fork bloat → isolated subagents by default; fork only when needed
  • Dashboard from chat → dashboard reads tables/views/receipts, never vibes
  5. Nightly agency brain

Every morning it should produce one clean operating brief:

  • top client risks
  • revenue opportunities
  • broken systems / failed jobs
  • stale blockers
  • automation candidates
  • priority queue for the day
  • suggested agent tasks
  • ready-to-approve actions
#
  • overnight artifacts created
  • what was auto-fixed
  • what needs human approval

Keep it ranked and short. The real detail lives behind links/IDs.

I’d structure it as:

  1. “Needs your approval”

  2. “Risks today”

  3. “Best opportunities”

  4. “Work queue”

  5. “System health”

  6. “Automation candidates”

  6. Simple final architecture

#

The simplest robust architecture:

  • Morgan = COO/orchestrator/state owner
  • Supabase = canonical agency ledger
  • live tools = canonical external reality
  • Kit = website/build implementation
  • Mika = ads/conversion intelligence
  • Rex = outreach/sales ops
  • ACP/Codex = coding harness for repo mutation
  • native subagents = bounded research/QA/review workers
  • cron = syncs, audits, reminders, reviews
  • receipts/artifacts = audit trail
  • context cards = small fresh snapshots
  • protocols = lazy-loaded workflow instructions
  • tiny AGENTS.md = role/routing/safety only

Normal chat path:

#

User asks → Morgan answers naturally → pulls memory/card/live API only if needed → delegates only when useful → writes receipts only for real state changes.

Background path:
Deterministic syncs update ledger → isolated review jobs reason over ledger → nightly brain produces brief → humans approve risky actions → workers execute → receipts close the loop.

Safety rule:
Agents can inspect, draft, test, propose, sync, and create internal artifacts autonomously. They need approval to spend, send, delete, alter live ads, or ship production.

That’s the whole system: magical chat on top, boring ledger underneath, narrow agents around the edges. Boring underneath is what makes the magic safe. 🦞

novel gorge
#

Out of all of this, what are the biggest immediate changes that make my OpenClaw go beast mode and one-shot a bunch of deliverables while I fix the system in the background? I need to get work done for my clients: full Codex GPT, almost autonomous, churning out high-quality, specced revisions, optimizations, and improvements for websites and Google Ads, connected to the command center. We have a lot of this stuff built out already. Keep your output direct and to the point.

slate vortexBOT
#

Biggest immediate “beast mode” changes:

  1. Stop trying to perfect the whole agency OS first
    Use a delivery lane now:
  • Morgan = scope + acceptance criteria + final review
  • Codex/ACP = implementation engine
  • Kit = build/test/deploy-preview coordinator
  • Mika = ads/conversion recommendations only, no live ad writes yet
  • Command Center = dashboard/receipts, not the thing every chat depends on
  2. Make every client task a tight spec packet
    Before Codex touches anything, Morgan should create:
  • goal
  • exact files/pages/account involved
  • acceptance criteria
  • out-of-scope list
  • test/build command
#
  • expected artifact: diff, preview URL, notes, receipt

Bad prompt: “improve the site”
Good prompt: “Improve hero + CTA clarity on /services for roofing leads. Keep brand colors. Build must pass. Produce preview URL and QA notes.”
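
A sketch of the spec packet as a structure that refuses to produce a handoff prompt while any field is empty. The `SpecPacket` class, its field names, and the `npm run build` command are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class SpecPacket:
    """Hypothetical spec packet; one field per item in the list above."""
    goal: str
    targets: list               # exact files/pages/accounts involved
    acceptance_criteria: list
    out_of_scope: list
    build_command: str
    expected_artifacts: list

    def missing_fields(self) -> list:
        # Any empty string or empty list counts as missing.
        return [name for name, value in vars(self).items() if not value]

    def to_prompt(self) -> str:
        missing = self.missing_fields()
        if missing:
            raise ValueError(f"spec packet incomplete: {', '.join(missing)}")
        lines = [
            f"Goal: {self.goal}",
            "Targets: " + ", ".join(self.targets),
            "Acceptance criteria:",
            *[f"- {c}" for c in self.acceptance_criteria],
            "Out of scope:",
            *[f"- {c}" for c in self.out_of_scope],
            f"Build/test command: {self.build_command}",
            "Expected artifacts: " + ", ".join(self.expected_artifacts),
        ]
        return "\n".join(lines)


good = SpecPacket(
    goal="Improve hero + CTA clarity on /services for roofing leads",
    targets=["/services"],
    acceptance_criteria=["Keep brand colors", "Build must pass"],
    out_of_scope=["Other pages", "Production deploy"],
    build_command="npm run build",  # assumed; use the repo's real command
    expected_artifacts=["diff", "preview URL", "QA notes", "receipt"],
)
vague = SpecPacket("improve the site", [], [], [], "", [])

print(good.to_prompt().splitlines()[0])
print(vague.missing_fields())
```

The vague packet fails fast with a list of what is missing, so the bad prompt never reaches Codex.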

  3. Use Codex/ACP for repo mutation, native agents for review
    Default split:
  • ACP/Codex: code changes, tests, build loops, repo edits
  • native subagents: copy QA, SEO review, conversion critique, mobile/accessibility review

Don’t make native agents do heavy repo implementation. Don’t make Codex do broad strategy.

  4. Create a one-shot deliverable workflow
    For each website deliverable:
    1. Morgan writes spec
    2. Mika gives conversion/SEO brief
    3. Codex implements
    4. Kit runs build/test
    5. QA subagents review preview
    6. Codex fixes issues
    7. Morgan summarizes + writes receipt

That’s the machine. Repeat it.

  5. Keep Supabase/receipts minimal while shipping
    For now, only write:
  • project/task status
  • artifact URL
  • commit/branch
  • preview URL
  • build result
  • blocker
  • final receipt
#

Do not force every chat/planning thought into Supabase. That will slow you down.

  6. Use approval gates only where they matter
    Auto-run:
  • local edits
  • branch/worktree changes
  • builds/tests
  • preview deploys if already allowed
  • internal receipts
  • report drafts

Require approval:

  • production deploy
  • client send
  • Google Ads live changes
  • budget changes
  • destructive deletes
#
  7. For Google Ads right now: Mika proposes, human approves
    Let Mika churn out:
  • audit findings
  • negative keyword candidates
  • landing page fixes
  • budget pacing warnings
  • conversion tracking issues
  • report drafts

Do not let it apply live changes yet. Get value from recommendations immediately without risking spend.

  8. Fix context dullness fast
    Do this now:
  • shrink giant AGENTS.md / bootstrap files
  • move workflows into PROTOCOL.md
  • stop injecting huge task lists
  • use context cards per project
#
  • run /context list and remove anything bloated
  • use fresh sessions for big deliverables
  9. Fix Codex timeout pain operationally
    Until the system is cleaner:
  • make tasks smaller
  • one branch/worktree per deliverable
  • require build/test command in prompt
  • ask Codex for incremental commits/checkpoints
  • don’t send huge agency context into coding jobs
  • keep Codex focused on repo + spec only
  10. Immediate operating mode
    Morning:
  • pick 3 client deliverables
  • Morgan writes specs
  • spawn Codex/ACP jobs
#
  • run QA subagents on outputs
  • ship preview links + receipts

Background:

  • slowly repair Supabase, crons, command center, evals

The fastest path is not “fully autonomous agency OS.” It’s a repeatable deliverable factory: tight specs → Codex implementation → QA swarm → preview/artifact/receipt → human approval for risky stuff.

If you want, paste one real client deliverable and I’ll turn it into the exact Morgan → Mika → Codex → QA handoff prompt.

novel gorge
#

what about the config settings

slate vortexBOT
#

First: paste openclaw --version before you actually apply this, because config keys/behavior have been moving. But for “beast mode delivery lane”, these are the big knobs I’d tune.

Core goals:

  • longer timeouts for Codex/ACP
  • less aggressive context pruning
  • small bootstrap injection
  • allow subagent/ACP concurrency, but not infinite chaos
  • keep Discord/thread-bound work usable

Suggested starting config shape:

{
  agents: {
    defaults: {
      // Give real work time to finish.
      timeoutSeconds: 1800,
      thinkingDefault: "medium",
      verboseDefault: "off",
      maxConcurrent: 3,

      // Keep boot context lean. Fix files first; don't just raise this forever.
      bootstrapMaxChars: 8000,
      bootstrapTotalMaxChars: 30000,

      // If chats feel dull / context gets nuked, relax pruning while debugging.
      contextPruning: {
        mode: "cache-ttl",
        ttl: "2h",
        keepLastAssistants: 6,
        minPrunableToolChars: 100000,
        softTrimRatio: 0.5,
        hardClearRatio: 0.8,
        softTrim: { maxChars: 8000, headChars: 2500, tailChars: 2500 },
        hardClear: { enabled: true, placeholder: "[Old tool result content cleared]" },
      },

      subagents: {
        allowAgents: ["*"],
        maxConcurrent: 6,
        runTimeoutSeconds: 1800,
        archiveAfterMinutes: 240,
        // Optional: use cheaper workers for QA/research, but use strong model if quality is bad.
        // model: "provider/model"
      },
    },
  },

  acp: {
    enabled: true,
    dispatch: { enabled: true },
    backend: "acpx",
    defaultAgent: "codex",
    allowedAgents: ["codex", "main", "kit"],
    maxConcurrentSessions: 3,
    runtime: { ttlMinutes: 240 },
    stream: {
      deliveryMode: "final_only",
      maxOutputChars: 50000,
      maxSessionUpdateChars: 1000,
      repeatSuppression: true,
    },
  },

  plugins: {
    entries: {
      acpx: {
        enabled: true,
        config: {
          timeoutSeconds: 1800,
          probeAgent: "codex",
        },
      },
    },
  },
}
#

Also if you’re using Discord threads for work sessions, make sure thread bindings are enabled for subagents/ACP in your Discord config. Exact shape can depend on version, but the relevant settings are around:

channels: {
  discord: {
    threadBindings: {
      enabled: true,
      spawnSubagentSessions: true,
      spawnAcpSessions: true,
      idleHours: 12,
      maxAgeHours: 72,
    },
  },
}

My immediate priorities:

#
  1. Raise ACP timeout
openclaw config set plugins.entries.acpx.config.timeoutSeconds 1800
  2. Raise normal agent timeout
openclaw config set agents.defaults.timeoutSeconds 1800
  3. Set subagent timeout
openclaw config set agents.defaults.subagents.runTimeoutSeconds 1800
  4. Relax context pruning or temporarily disable it while testing
openclaw config set agents.defaults.contextPruning.mode off

If disabling pruning makes chats feel smart again, re-enable it later with less aggressive thresholds.

  5. Restart after config changes
openclaw gateway restart
  6. Verify
openclaw status --all
openclaw doctor
/acp doctor
/context list
/context detail

Most important non-config fix: shrink the workspace boot files. If AGENTS.md, TOOLS.md, or huge protocol docs are injected every turn, no config setting will fully save you. Put workflows into protocol files and have agents read them on demand.

Short version: increase ACP/subagent timeouts, relax pruning, cap concurrency around 3 ACP / 6 subagents, keep boot context small, use final-only ACP streaming, and route big website jobs into Codex threads with tight specs.

novel gorge
#

I basically never want the ACP to time out; it can run for 20 minutes if it needs to, same with all the other agents. I'm using Telegram groups with topics. Yeah, disable pruning; I compact context manually. I also want the ACP to be able to stream everything and use skills, etc.

slate vortexBOT
#

Yep — if your target is “let long jobs cook”, set it up like this.

Use 20 min = 1200s as the sane default. I would not truly make everything infinite; infinite jobs create zombie runs and stuck bindings. But 20–30 min is fine for beast-mode Codex work.

Config direction:

{
  agents: {
    defaults: {
      timeoutSeconds: 1200,

      // You said you manually compact: disable automatic pruning.
      contextPruning: { mode: "off" },

      subagents: {
        allowAgents: ["*"],
        maxConcurrent: 6,
        runTimeoutSeconds: 1200,
        archiveAfterMinutes: 240,
      },
    },
  },

  acp: {
    enabled: true,
    dispatch: { enabled: true },
    backend: "acpx",
    defaultAgent: "codex",
    allowedAgents: ["codex", "kit", "main", "openclaw"],
    maxConcurrentSessions: 3,
    runtime: { ttlMinutes: 240 },
    stream: {
      deliveryMode: "live",
      coalesceIdleMs: 100,
      maxChunkChars: 4000,
      maxOutputChars: 200000,
      maxSessionUpdateChars: 10000,
      repeatSuppression: false,
    },
  },

  plugins: {
    entries: {
      acpx: {
        enabled: true,
        config: {
          timeoutSeconds: 1200,
          probeAgent: "codex",
        },
      },
    },
  },

  channels: {
    telegram: {
      threadBindings: {
        enabled: true,
        spawnAcpSessions: true,
        spawnSubagentSessions: true,
        idleHours: 24,
        maxAgeHours: 72,
      },
    },
  },
}

Commands for the key bits:

openclaw config set agents.defaults.timeoutSeconds 1200
openclaw config set agents.defaults.contextPruning.mode off
openclaw config set agents.defaults.subagents.runTimeoutSeconds 1200
openclaw config set plugins.entries.acpx.config.timeoutSeconds 1200
openclaw config set acp.stream.deliveryMode live
openclaw config set acp.stream.maxOutputChars 200000
openclaw config set acp.stream.maxSessionUpdateChars 10000
openclaw config set acp.stream.repeatSuppression false
openclaw config set channels.telegram.threadBindings.enabled true
openclaw config set channels.telegram.threadBindings.spawnAcpSessions true
openclaw config set channels.telegram.threadBindings.spawnSubagentSessions true
openclaw gateway restart

For Telegram topics, spawn like:

/acp spawn codex --thread here --cwd /path/to/repo

or bind the current topic/conversation:

/acp spawn codex --bind here --cwd /path/to/repo

Important skills note: OpenClaw agents/subagents use OpenClaw skills. Codex ACP itself may not automatically read OpenClaw skills unless the harness is an OpenClaw-style agent. For Codex, put the essential workflow in the repo/task prompt or repo AGENTS.md, and have Morgan/Kit read OpenClaw skills/protocols before handing Codex a tight spec.

Best setup:

  • Morgan/Kit: OpenClaw skills + protocols
  • Codex ACP: tight implementation spec + repo AGENTS.md + long timeout + live stream
  • QA/review: native subagents with skills

That gets you long-running, streaming, high-output Codex without turning every normal chat into context sludge.

novel gorge
#

Real issues still present:

  • Active-memory hook timeouts are recurring.
  • qmd memory searches are timing out.
  • Bootstrap files are still getting truncated: AGENTS.md, MEMORY.md, USER.md.
  • Codex OAuth expires in about 13h.
  • memory-core says managed dreaming cron could not reconcile because cron service was unavailable.
  • Doctor still flags stale/missing session transcript cleanup.
  • Gateway is healthy, but the stuff making agents feel dull is still there.
slate vortexBOT
#

Yep — these are real blockers. Before changing anything, get the version so we don’t hand you stale keys:

openclaw --version
openclaw status --all
openclaw memory status --deep

Immediate “make it sharp again” order:

  1. Disable Active Memory for now

Recurring active-memory hook timeouts are exactly the kind of thing that makes every chat feel slow/dull. Turn it off while you ship work.

openclaw config set plugins.entries.active-memory.config.enabled false
#

If that key doesn’t exist on your version, use the chat command:

/active-memory off --global

Later, re-enable with queryMode: "message" or "recent", small summaries, and a capped timeout. But right now: remove it from the critical path.

  2. Fix QMD timeout or bypass QMD

QMD search timing out means memory recall is hurting you instead of helping. Either raise timeout hard:

openclaw config set agents.defaults.memorySearch.qmd.limits.timeoutMs 120000

Or temporarily switch away from QMD/broad session search until the machine is stable. Also prewarm QMD from the same service environment if possible:

#
qmd query "test"

Docs note: QMD default timeout is low; docs explicitly say use 120000 on slower hardware. https://docs.openclaw.ai/concepts/memory-qmd

  3. Stop bootstrap truncation

This is probably the biggest “dumb agent” cause on your list.

Short-term workaround:

openclaw config set agents.defaults.bootstrapMaxChars 30000
openclaw config set agents.defaults.bootstrapTotalMaxChars 120000
#

Real fix: shrink AGENTS.md, MEMORY.md, and USER.md.

  • AGENTS.md should be tiny role/rules/routing only
  • move workflows to protocols/*.md
  • move long durable facts out of injected boot path if they don’t need every-turn context
  • use memory search / context cards on demand

Raising caps helps, but if those files are huge you’re just stuffing the model with sludge.

  4. Disable Dreaming until cron/heartbeat is fixed

Dreaming cron reconciliation failing because cron is unavailable is background noise, but it’s still broken state. Disable it until core delivery is stable:

openclaw config set plugins.entries.memory-core.config.dreaming.enabled false
#

Later fix heartbeat/cron and re-enable. Dreaming rides heartbeat; if default-agent heartbeat isn’t firing, dreaming blocks. https://docs.openclaw.ai/concepts/dreaming

  5. Codex OAuth expiring in 13h

That won’t make chats dull, but it will kill overnight work. Refresh/sign in before long jobs. For production delivery, don’t queue a 20-minute Codex swarm if OAuth is near expiry.

  6. Stale/missing session transcript cleanup

Annoying, but lower priority than memory/bootstrap. Run doctor after the memory/config cleanup:

openclaw doctor
openclaw gateway restart
openclaw status --all
  7. Then restart and verify context
openclaw gateway restart
/context list
/context detail
openclaw memory status --deep

My blunt recommendation: for the next 24h, run “delivery mode”:

  • Active Memory off
  • Dreaming off
  • Context pruning off/manual compact only
  • QMD either fixed to 120s or bypassed
  • Bootstrap caps raised temporarily
  • Codex OAuth refreshed
  • tight Codex specs per deliverable
#

Once client work is moving again, clean the boot files and reintroduce memory features one at a time. Right now your agents are dull because every turn is dragging broken memory/context machinery behind it.

novel gorge
#

Idk, I want it smart and context-aware, not a CLI tool.

slate vortexBOT
#

Totally fair. You don’t want “run 12 CLI commands”, you want the agent to feel smart in chat.

The move is not “turn memory off forever.” It’s:

  1. Keep chat hot path fast + smart
    Normal chat should get:
  • current conversation
  • tiny AGENTS/SOUL/USER
  • relevant project/client context card
  • memory search only when obviously useful
  • live API/repo checks when claiming current state

It should not run slow active-memory/QMD/dreaming machinery on every message if that machinery is timing out.

  2. Replace giant memory injection with context cards
    For each client/project, make a compact card:
  • current goal
#
  • active deliverables
  • top blockers
  • latest decisions
  • latest artifacts/preview URLs
  • stale-after timestamp
  • links/IDs for deeper lookup

Morgan pulls the card when the chat is about that project. That gives smart context without stuffing every task/transcript into the prompt.

  3. Use “smart retrieval rules,” not manual CLI
    Put this in Morgan’s instructions:
Before answering project/client work, silently decide if extra context is needed.
Pull:
- memory for prior decisions/preferences
- Supabase context card for project state
- live API/repo state before current-state claims
- protocol files only when entering that workflow
Do not pull memory for casual chat.
Do not write state unless a decision, task, blocker, deliverable, approval, or artifact changed.

That makes it autonomous without wrestling prompts every turn.

  4. Active Memory should be optional assist, not a blocker
    If active-memory hook timeouts are recurring, it’s making chat worse. Either tune it way down or keep it off for now.

Good active-memory profile later:

  • queryMode: message or recent
  • max summary: tiny
  • timeout: short
#
  • only enabled for Morgan/main, not every worker

If it times out, chat should continue normally. It should never be the thing that makes the agent dull.

  5. QMD should be background/deep recall, not every-turn dependency
    QMD is good for deep recall, but if searches time out, don’t put it in the normal chat path. Use built-in memory/context cards for fast work, then QMD for explicit “go search old history” moments.

  6. The target feel
    You ask:
    “Make the roofing landing page better.”

Morgan should automatically:

  • identify the project
  • pull the project card
  • read website protocol
  • ask Mika for conversion brief if needed
  • spawn Codex/Kit with tight spec
  • QA the result
  • return preview + receipt

You should not have to say “query Supabase, read memory, check repo, write receipt.” That should be policy.

So yeah: don’t think of this as CLI cleanup. Think of it as moving broken/slow context systems out of the hot chat loop and replacing them with fresh context cards + smart retrieval rules. That’s how it gets context-aware without feeling like a broken database wrapper.

novel gorge
#

It'll respond with stuff like this:

Agreed. This is recovery sprint, not “final QA.” The 853px bug surviving is clown-car territory.

State changed:

• project.updated: AGW website final integration sprint / 7ba4d322 / active
• task.created: AGW recovery sprint before next review link / d72eb23d / urgent / due May 14 EOD
• interaction.created: AGW visual QA recovery sprint intake / 018c2b55

Verification:

• Supabase readback: pass
• Audit: 0 missing next steps, 0 orphan followups, 0 dupes
• Dashboard refresh: pass

New review link is blocked until tablet overflow, mobile header/CTA, icons, gallery filters, form flow, CMS honesty, and image crops all pass.

INSTEAD OF ACTUALLY FIGURING OUT SOLUTIONS AND SHIT. THE PROMPT I ASKED WAS

slate vortexBOT
#

Yeah — that’s the failure mode exactly.

It’s acting like a state clerk, not an operator. It heard “recovery sprint” and did bookkeeping instead of asking: “what’s broken, where is the repo/preview, what do I inspect, what fixes do I make?”

Immediate fix: change Morgan’s operating rule from record-first to work-first.

Put this near the top of Morgan/Kit instructions:

## Execution priority
When the user asks for a deliverable, fix, sprint, optimization, review, or recovery:
1. Do the work or inspect the artifact first.
2. Identify concrete issues and likely fixes.
3. If tools/repos/previews are available, use them before writing state.
4. Only write Supabase/task/receipt updates after meaningful work happened, or if the user explicitly asked for planning/state tracking.
5. Never respond with only bookkeeping when the user asked for execution.

Bad response: “State changed / task created / verification pass.”
Good response: “I inspected the 853px breakpoint. The overflow is caused by X in file Y. I’m changing Z, then I’ll rerun build and QA.”

Add this too:

## Anti-clerk rule
Receipts are the tail, not the dog. Do not create tasks/interactions as a substitute for solving the problem. If blocked from executing, say exactly what is missing: repo path, preview URL, credentials, or approval.
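If you log which actions a turn actually performed, the clerk pattern is easy to flag after the fact. A rough sketch, assuming a hypothetical per-turn action log; the action names below are made up for illustration and are not OpenClaw event types:

```python
from typing import Iterable

# Hypothetical action labels for one agent turn.
WORK_ACTIONS = frozenset({"inspect", "reproduce", "patch", "build", "qa"})
BOOKKEEPING_ACTIONS = frozenset({"state_write", "task_create", "receipt"})


def is_bookkeeping_only(actions: Iterable[str]) -> bool:
    """True when a turn wrote state or receipts but did no real work."""
    kinds = set(actions)
    did_bookkeeping = bool(kinds & BOOKKEEPING_ACTIONS)
    did_work = bool(kinds & WORK_ACTIONS)
    return did_bookkeeping and not did_work
```

A turn like `["task_create", "state_write", "receipt"]` gets flagged, while `["inspect", "patch", "receipt"]` does not.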

For website work specifically, make the default loop:

Website recovery loop:
1. Open preview/repo.
2. Reproduce issue at named breakpoints.
3. Locate likely component/CSS/content source.
4. Patch in branch/worktree or hand to Codex with exact file/task spec.
5. Run build/test/lint if available.
6. Recheck preview/responsive states.
7. Then write artifact/receipt/status.
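The ordering is the important part: the receipt is written last, and only if the work steps actually completed. A minimal sketch of that gating, assuming each step is a plain callable that returns True on success; `run_recovery_loop` and `write_receipt` are hypothetical names:

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_recovery_loop(
    steps: List[Step],
    write_receipt: Callable[[List[str]], None],
) -> List[str]:
    """Run the work steps in order; write the receipt only if all pass."""
    completed: List[str] = []
    for name, step in steps:
        if not step():
            break  # blocked: report what is missing instead of writing state
        completed.append(name)
    else:
        # Reached only when no step broke out of the loop.
        write_receipt(completed)
    return completed
```

If a step fails, the returned list shows how far the loop got, which is the "say exactly what is missing" case rather than a receipt.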

Also make Supabase writes conditional:

Only write Supabase immediately for:
- explicit user decision