#QMD vs Traditional md Memory files

1 messages ¡ Page 1 of 1 (latest)

mortal juniper
#

Hi @inner wasp i need your help. So I implemented qmd memory as listed here: https://docs.openclaw.ai/concepts/memory

The question I have is, OpenClaw got memory too storing sessions daily in an md file and MEMORY.md for long term memory.

But I understood, I got both memory and sessions checked my qmd update cycle every 15 min and qmd embed cycle every 30 min.

So how does qmd differ from what OpenClaw has already with md files. And i understoos the chat gets bloated more fast and a lot of input tokens are consumed because memory ans sessions are injected on each chat. So if that is true, would there be a way to het all menory out of qmd because it dies vector, ragand reranking.

Help me with this to keep thibgs alsk good abd rekevant and that OpenClaw gets right context.

cosmic tangleBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

mortal juniper
#

@inner wasp can you help?

inner waspBOT
#

QMD doesn’t replace OpenClaw’s “Markdown memory” — it replaces the search backend that powers memory_search.

Here’s the clean mental model:

1) What OpenClaw “already has”

Two different things:

A) Session history (the chat transcript)

  • Stored as JSONL transcripts on disk (not .md), and the recent chat history is always part of your context window (until compaction/pruning).
  • That’s not “memory”; it’s just the conversation you’re in.
    Docs: https://docs.openclaw.ai/concepts/session

B) Memory (workspace Markdown)

  • memory/YYYY-MM-DD.md = daily notes
  • MEMORY.md = curated long-term memory (and by default only loaded in the main private session, not in group contexts)
    Docs: https://docs.openclaw.ai/concepts/memory
#

2) So what does QMD change?

QMD is an experimental backend for memory retrieval:

So: QMD ≠ “another memory layer”. It’s “how memory_search finds relevant snippets”.

3) “Does memory + sessions get injected every message?”

Not as “everything, every time”.

What actually bloats context is usually:

  • session history (normal chat turns)
  • tool schemas (big hidden cost; browser is famously chunky)
  • any tool results you read into the convo (including reading memory files)
#

Two key clarifications:

A) The “read today + yesterday” behavior is a prompting convention

OpenClaw’s default guidance is “on session start, read today + yesterday”. If your agent actually reads those files, that file content shows up as tool output in the session (and can stick around until pruning/compaction).

If you don’t want that, you can change your AGENTS.md/system instructions to not auto-read daily files and instead rely on memory_search.

B) memory_search (including QMD) injects only snippets

Even with QMD, the model only gets the top N snippets (capped by limits). It is not dumping your whole memory DB into the prompt.

4) If your goal is “pure RAG memory” (minimal prompt bloat)

Do this:

  1. Stop auto-reading memory/YYYY-MM-DD.md (and/or MEMORY.md) on session start in your instructions
    → rely on memory_search when needed.

  2. Tighten QMD recall payload (so it can’t flood context):

#
  • memory.qmd.limits.maxResults
  • memory.qmd.limits.maxSnippetChars
  • memory.qmd.limits.maxInjectedChars
    (Those are in the docs under the QMD config surface.)
  1. Use /context list + /context detail to see what’s actually consuming tokens (often it’s tools + bootstrap files, not memory).
    Docs: https://docs.openclaw.ai/concepts/context

5) Quick diagnostic questions (so we don’t guess)

  • What’s your openclaw --version?
  • Can you paste /context list output from the session where you see the bloat? (If it contains private file paths/notes, redact or paste in DM.)

That will tell us whether the cost is coming from (a) auto-reading memory files, (b) giant tool schemas, (c) long session history that needs compaction/pruning, or (d) overly-large memory_search injections.

mortal juniper
#

@inner wasp i have a long question and including context, so i have to seperate in a few messages. Wait for me to end with KRILL_RESPOND so you know I am done and you can injest all messages in to answer me the best way you can, even consider thinking creative or out of the box. So:

I’m trying to find the sweet spot between “feels like a real companion that remembers me” and “context bloat on every turn”.

My goal is that my agent doesn’t feel like another AI tool, but like something that has continuity (remembers preferences, our way of working, ongoing projects, etc.). The problem is that OpenClaw seems to inject a lot every message/session, and that makes it token-heavy.

I thought QMD would solve this by only pulling in relevant memory on-demand: query a local DB for things like agreements, preferences, project state, and inject only what matters for this specific turn. In theory that would reduce bloat and make responses smarter because the context is more targeted.

But I’m not sure if my mental model is correct.

Questions: • Is the “QMD as selective memory injection” approach actually the right strategy for reducing bloat while keeping the agent feeling smart? Be brutally honest.
• In OpenClaw, what’s the best pattern to ensure only relevant things get injected per turn (and not the whole universe)?
• How do you recommend handling long-running projects (like building an app/website) so it keeps short-term working memory until done, but doesn’t permanently inflate the base context — so I can safely /new after?
• Practically: what should be “always injected” vs “retrieved on-demand” vs “summarized into a small project state”?

#

And a few deeper ones so I don’t optimize the wrong thing: • Are workspace files injected every turn by design, or can OpenClaw be configured to “lazy-load / read-on-demand” instead?
• Is QMD meant to replace always-injected memory, or is it additive (meaning I still pay the base prompt tax no matter what)?
• What’s the recommended architecture split in your view:

tiny stable “identity/values” in system

small pinned “working set” per session

retrieval for everything else

periodic summarization into a compact “project state” blob
• How do you prevent over-retrieval (too many loosely relevant memory items coming back and bloating the context anyway)? Any scoring/thresholding pattern you recommend?
• What usually causes the “dumb bot” feeling when context is reduced: missing identity/personality, missing preferences, missing project state, or missing tool knowledge?
• Do you have a rule-of-thumb target for the always-injected baseline (e.g. keep it under X tokens), and do you treat tool schemas the same way (summary vs full schema)?

Also: would you recommend organizing work via Telegram group-chat topics (even if I’m the only human + bot) as a way to isolate context per project, or is DM still better for controlling memory + retrieval? If Telegram topics help: what’s the best way to map “topic = project state + retrieval namespace” in practice?

#

🧠 Context breakdown
Workspace: <REDACTED>
Bootstrap max/file: 20,000 chars
Sandbox: mode=off sandboxed=false
System prompt (run): 27,117 chars (~6,780 tok) (Project Context 14,951 chars (~3,738 tok))

Injected workspace files:

• AGENTS.md: OK | raw 9,068 chars (~2,267 tok) | injected 9,068 chars (~2,267 tok)
• SOUL.md: OK | raw 1,664 chars (~416 tok) | injected 1,664 chars (~416 tok)
• TOOLS.md: OK | raw 1,083 chars (~271 tok) | injected 1,083 chars (~271 tok)
• IDENTITY.md: OK | raw 276 chars (~69 tok) | injected 276 chars (~69 tok)
• USER.md: OK | raw 985 chars (~247 tok) | injected 985 chars (~247 tok)
• HEARTBEAT.md: OK | raw 167 chars (~42 tok) | injected 167 chars (~42 tok)
• BOOTSTRAP.md: MISSING | raw 0 | injected 0
• MEMORY.md: OK | raw 1,056 chars (~264 tok) | injected 1,056 chars (~264 tok)

Skills list (system prompt text): 1,994 chars (~499 tok) (5 skills)
Skills: gog, healthcheck, openai-whisper-api, skill-creator, weather
Tool list (system prompt text): 2,060 chars (~515 tok)
Tool schemas (JSON): 18,339 chars (~4,585 tok) (counts toward context; not shown as text)
Tools: read, edit, write, exec, process, browser, canvas, nodes, cron, message, tts, gateway, agents_list, sessions_list, sessions_history, sessions_send, sessions_spawn, subagents, session_status, web_search, web_fetch, image, memory_search, memory_get

Session tokens (cached): unknown / ctx=272000

KRILL_RESPOND

inner waspBOT
#

Your /context list is actually really helpful here: in your setup, memory is not the thing bloating context.

What’s eating tokens (from your paste)

Always-injected, every turn:

  • Tool schemas: ~4,585 tok
  • AGENTS.md: ~2,267 tok (this is your single biggest injected file)
  • Everything else (SOUL/TOOLS/USER/etc + MEMORY.md): comparatively small

So if you’re worried about “memory makes chats expensive”, the data says:

  • Your cost driver is “system prompt + tools + AGENTS.md”, not QMD.
  • QMD itself doesn’t shove “all memories” into context; it only returns top snippets when memory_search runs.

The gotcha: MEMORY.md is injected every turn (by design)

OpenClaw’s system prompt bootstrap can inject MEMORY.md / memory.md when present in the workspace (so it counts every message). That’s why you see it in “Injected workspace files”.

If you want “pure RAG style” (memory only comes in via retrieval), the practical pattern is:

#

Don’t keep big long-term memory in root MEMORY.md.
Instead:

  • Keep a tiny stub MEMORY.md (or delete it), and store durable memory in an evergreen file like:
    • memory/long-term.md (or memory/profile.md, etc.)

Those memory/*.md files:

  • are not auto-injected
  • are still searchable via memory_search (including QMD backend)

How to reduce context cost quickly

  1. Shrink AGENTS.md (biggest win for you)

    • Keep AGENTS.md as “high-level rules only”
    • Move long playbooks / long docs into a separate markdown file and only read it when needed (or into a skill file you read on-demand)
  2. Use a smaller tool surface (reduces the ~4.6k tok schema overhead)

    • If you don’t need UI automation in this agent/session, drop browser/canvas/nodes via tool policy (tool profiles exist exactly for this).
    • Docs: https://docs.openclaw.ai/tools (see tools.profile, tools.allow, tools.deny, and group:*)
#
  1. Tune QMD recall limits (prevents “too many snippets” when retrieval happens)
    • memory.qmd.limits.maxResults
    • memory.qmd.limits.maxSnippetChars
    • memory.qmd.limits.maxInjectedChars

One question so I can be precise

Do you want QMD/memory recall to work inside Discord channels like this one, or do you want memory recall DM-only (safer + usually cheaper)?
If you tell me that preference, I can point you to the exact setting (memory.qmd.scope) that controls where QMD recall is allowed.

mortal juniper
#

Hey @inner wasp thanks, that cleared up a big misconception for me. The /context list makes it obvious the cost is mainly tool schemas + AGENTS.md, not MEMORY.md right now. To answer your question fully, I need to seperate again. So now wait for KRILL_2, get all chats and then answer.

Quick scope answer to your question:
• I don’t use Discord (or any other chat surface) for my OpenClaw agent.
• I only use Telegram.
• I want QMD/memory recall allowed only in Telegram: DM + Telegram group forum topics (project topics). Nothing outside Telegram.

What I’m trying to achieve (2 goals):
• Keep the “wow / feels like a real assistant” continuity (preferences, agreed way-of-working, ongoing project state).
• Reduce per-turn context bloat without making it feel dumb/forgetful.

AGENTS.md question (big one for me):
Right now my AGENTS.md is pretty “full fat” (startup ritual, safety, restarts, group-chat behavior rules, heartbeat guidance, etc.). I get why shrinking it is the biggest win — but I’m worried I’ll strip out the “magic”.
Can you give a concrete recommendation on:
• What absolutely must stay in always-injected bootstrap (AGENTS/SOUL/USER/TOOLS/IDENTITY)?
• What should move into on-demand “playbooks” (and be read only when needed)?
• Any rule-of-thumb target size for AGENTS.md so the agent stays agentic but cheaper?

MEMORY.md vs groups/topics (need clarification):
Docs seem to say “workspace bootstrap files are injected every turn”, but also that MEMORY.md is meant to be main/private only (not shared/group contexts).
In my case:
• Telegram DM = personal/private
• Telegram topics = project threads (still only me + bot, but technically group sessions)
What I want:
• Personal MEMORY.md should only ever influence DM.
• Project context should live in per-topic state + retrieval, not leak personal memory into topics.

#

Can you confirm how OpenClaw behaves today for Telegram forum topics:
• Does root MEMORY.md get injected in topic sessions or not?
• If it does, is there a supported pattern to disable/avoid root MEMORY.md injection entirely and rely on QMD retrieval + per-topic files instead?

QMD / “selective injection” mental model:
I now get that QMD doesn’t auto-inject everything — it only injects what memory_search returns.
But I’m still unsure about the best pattern for continuity without bloat:
• Should I remove the “always read memory/YYYY-MM-DD (today+yesterday)” behavior and instead rely on memory_search/get at start-of-session when needed?
• Or do you recommend a small pinned “working set” file per topic/session that gets updated + kept compact (project-state.md), and only use QMD for deeper recall?

Tool schemas overhead:
Given tool schemas are a major fixed cost, what’s your recommended minimal tool profile for a Telegram-first assistant that mostly does:
• file read/write/edit
• exec (sometimes)
• memory_search/get (QMD)
• web_search/web_fetch (sometimes)
…but likely doesn’t need browser/canvas/nodes most of the time.
Also: is there a best-practice way to measure the token savings after changing tools.allow/deny/profile (besides eyeballing /context list)?

Finally, a couple earlier questions I still want your take on:
• What usually causes the “dumb bot” feeling when you slim context (missing identity/personality, missing preferences, missing project state, or missing tool knowledge)?
• Any recommended architecture split you’d personally use for: tiny stable identity + compact per-topic project state + retrieval for everything else?

If helpful I can paste the current workspace files (AGENTS/SOUL/USER/TOOLS/IDENTITY/MEMORY) as-is — but I’m mainly looking for the best-practice trimming strategy + config knobs for Telegram topics/DM separation.

KRILL_2

inner waspBOT
#

Good constraints — you can get both “feels continuous” and cheaper per-turn context, but the trick is separating:

  • Always-injected “constitution” (small, stable, non-sensitive)
  • On-demand playbooks + retrieval memory (everything else)

I’ll break it into the 3 things you asked.


1) Lock QMD/memory recall to Telegram DM + Telegram forum topics only

That’s exactly what memory.qmd.scope is for (it’s evaluated per session key). Default QMD scope is DM-only, but you want “Telegram DM + Telegram topics”.

A solid starting point:

#
memory: {
  backend: "qmd",
  qmd: {
    scope: {
      default: "deny",
      rules: [
        { action: "allow", match: { channel: "telegram", chatType: "direct" } },
        // Telegram forum topics *usually* come through as chatType "thread"
        { action: "allow", match: { channel: "telegram", chatType: "thread" } },
      ],
    },
  },
}

If your Telegram “topics” are not classified as thread on your connector (sometimes they appear as group but with :topic: in the key), do this to make it precise:

#
  1. Run:
  • openclaw sessions --json (or inspect ~/.openclaw/agents/<agentId>/sessions/sessions.json)
  1. Find one topic sessionKey and then allow by key prefix for that group/topic namespace.

Docs (QMD backend + scope): https://docs.openclaw.ai/concepts/memory#qmd-backend-experimental


2) AGENTS.md: what must stay injected vs move to on-demand playbooks

What must stay (AGENTS/SOUL/USER/TOOLS/IDENTITY)

Keep only the stuff that is:

  • Always true
  • Needed to act safely
  • Needed to choose the right memory behavior
  • Short
#

Concretely, in AGENTS.md I’d keep:

  1. The “how I work” contract (very short)

    • “Ask clarifying questions when ambiguous”
    • “Prefer plans + checkpoints”
    • “If unsure, say so / propose options”
  2. Memory rules (this is where the ‘magic’ actually comes from)

    • When to write memory (durable vs daily vs project)
    • Prefer memory_search over “read a bunch of files”
    • Don’t dump large files unless necessary
  3. Privacy + separation rules (DM vs topics)

    • “In Telegram DM: you may use personal memory”
    • “In Telegram topics: only use project memory; never use personal context”
  4. Tool discipline (saves tokens + prevents chaos)

#
  • “Don’t call tools unless needed”
    • “When you do, keep outputs small; summarize + write memory”

Everything else should move out.

What should move to playbooks (read only when needed)

These are classic “token traps” and usually don’t need to be in every turn:

  • Startup ritual / onboarding ritual (step-by-step)
  • Restart procedures / disaster recovery
  • Deep safety policy text (long lists)
  • Troubleshooting trees
  • Big “how to use X tool” sections
  • Any long examples / templates

Put them into files like:

  • playbooks/startup.md
#
  • playbooks/restarts.md
  • playbooks/troubleshooting.md

Then in AGENTS.md keep one line like:

“If the user asks about restarts/outages, read playbooks/restarts.md first.”

Rule-of-thumb size targets (practical)

Token targets depend on your model/window, but cost-wise:

  • AGENTS.md: aim ~600–1200 tokens (often ~2.5k–5k chars)
  • SOUL.md: ~100–300 tokens
  • USER.md: ~100–300 tokens (only stable prefs; nothing sensitive you wouldn’t want in topics if it leaks)
  • TOOLS.md: keep it lean; it’s injected every turn too
  • HEARTBEAT.md: tiny (yours already is)

Right now AGENTS.md at ~2,267 tokens isn’t insane — but it’s absolutely the best “constant cost” knob you have.

#

Docs (bootstrap injection + why it costs): https://docs.openclaw.ai/concepts/system-prompt


3) “MEMORY.md is main/private only”… but it’s injected every turn — how to get DM-only personal memory?

You spotted a real tension:

  • Design intention: “MEMORY.md is for main/private”
  • Mechanics today: if MEMORY.md exists, it’s a bootstrap file and gets injected each turn (per System Prompt docs)

So if you want hard separation (personal never influences topics), the safest approach is:

Recommended structure for your goal

  • Do not store personal stuff in root MEMORY.md (because it’s injected everywhere).
  • Instead:
    • Personal: memory/personal.md (or memory/personal/*.md)
#
  • Project/topic: memory/projects/<project>.md

Then add an AGENTS.md rule like:

  • In DM: you may use memory/personal* + relevant project files
  • In topics: only use memory/projects/* (and ignore any retrieval hits whose path isn’t under memory/projects/)

This works well because memory_search results include file paths, so the agent can self-filter even if the index contains everything.

Advanced option (if you really want to keep using root MEMORY.md)

You can get fancy with an agent:bootstrap hook to remove MEMORY.md from injected bootstrap files for non-DM sessions, but that’s definitely “advanced ops” territory. For most setups, the file-structure + retrieval-filtering approach above is the sweet spot.


If you want, paste (or DM) the top ~60 lines of your current AGENTS.md and I’ll propose a concrete “trimmed AGENTS.md + playbooks split” that preserves the agentic feel but cuts the constant tokens.

#

Telegram forum topics: does root MEMORY.md get injected today?

Yes.

Current behavior (today):

  • If MEMORY.md (or memory.md) exists in the workspace, it’s treated as a bootstrap file and is injected on every normal run (DMs, groups, channels, Telegram topics/threads).
  • The only time bootstrap files get filtered down is for subagents + cron sessions (they only get AGENTS.md + TOOLS.md). Topic sessions are not subagents/cron, so they get the full bootstrap set.

This matches the docs (“MEMORY.md and/or memory.md … may be injected”) and the current code path that only filters for subagent/cron session keys.
Docs: https://docs.openclaw.ai/concepts/system-prompt#workspace-bootstrap-injection

So the “MEMORY.md should only load in main/private” line is best read as recommended practice, not an enforcement mechanism.


Can you supportedly avoid root MEMORY.md injection entirely?

Yes — simplest supported pattern is: don’t have MEMORY.md / memory.md in the workspace at all.

#

If the file isn’t there, it can’t be injected. Then you rely on:

  • memory/*.md files (not auto-injected)
  • memory_search/memory_get (QMD backend or builtin) for recall

Docs confirm daily memory/*.md are not auto-injected: https://docs.openclaw.ai/concepts/system-prompt#workspace-bootstrap-injection

Practical layout for your DM vs topic separation goal:

  • No root MEMORY.md
  • memory/personal.md (DM-only, small)
  • memory/projects/<project>/state.md (topic working set, small)
  • memory/projects/<project>/decisions.md (append-only / bigger)

Then your AGENTS rule can be:

  • In Telegram DM: allowed to use memory/personal.md + project files
  • In Telegram topics: forbidden to use memory/personal.md; only project paths

That gives you separation without needing hooks.

#

(Advanced option exists: an agent:bootstrap hook can remove MEMORY.md from bootstrap injection for non-DM sessions, but that’s “ops / hooks” territory; the file-layout approach is the best-practice simple solution.)


QMD “selective injection” + continuity without bloat: what I’d do

1) Stop “always read today+yesterday”

I would remove the “always read memory/YYYY-MM-DD today+yesterday” ritual unless you truly need it.

Reason: it’s a linear tax (more days → more duplicated context reads), and it’s usually worse than retrieval.

Instead:

  • Use memory_search when you need recall
  • Keep a small, actively-maintained working set per topic/project

2) Yes: use a pinned per-topic working set file (recommended)

This is the best “feels continuous” pattern I’ve seen:

#
  • Per topic/project, maintain memory/projects/<project>/state.md
  • Make it compact and structured, e.g.
    • Goal
    • Current status
    • Next actions
    • Open questions
    • Key decisions + dates
  • On that topic’s first message of the day (or when user says “where are we”), the agent:
    1. memory_get the state file
    2. answers
    3. updates the state file

Then QMD is used for:

  • “What did we decide about X two weeks ago?”
  • “Find the message/snippet where we discussed Y”
  • Deep recall / supporting details
#

That combo keeps the “magic” while keeping per-turn context predictable.


Tool schemas overhead: minimal tool set for your Telegram-first assistant

If you mostly need:

  • fs: read/write/edit
  • runtime: exec/process (sometimes)
  • memory: memory_search/memory_get
  • web: web_search/web_fetch (sometimes)
  • plus session_status

…then the clean minimal allowlist is:

tools: {
#
  allow: [
    "group:fs",
    "group:runtime",
    "group:memory",
    "group:web",
    "session_status",
  ],
  deny: ["group:ui", "group:nodes"], // (optional explicitness)
}

That drops browser/canvas/nodes (big schema cost) and also avoids cron/message/gateway unless you truly need them.

Measuring savings (best practice):

  1. After changing tool policy + restarting, send /new to start a clean session.
  2. Run /context detail (not just list) and compare:
#
  • “Tool schemas (JSON): … (~tok)”
    • “System prompt (run): … (~tok)”
  1. If you want “real workload” comparison: enable /usage tokens, do the same prompt before/after, compare input tokens.

/context is still the best single source of truth for schema overhead changes.

Docs: https://docs.openclaw.ai/concepts/context and https://docs.openclaw.ai/tools


What causes the “dumb bot” feeling when slimming context?

In practice it’s usually one (or more) of:

  1. Missing project working set (most common)
    Retrieval alone can’t replace a continuously-updated “current state”.

  2. Preferences not written anywhere durable

#

If you remove preferences from injected files but don’t put them into a small personal file the agent reliably consults in DM, it’ll feel inconsistent.

  1. Over-reliance on retrieval without “write-back”
    If the agent recalls things but doesn’t update the state/decisions files, you keep paying re-discovery costs.

  2. Tool removal breaks capabilities silently
    If you disable web/exec and the agent used them as part of its workflow, it looks “lazy” or “stuck”.


My “personal favorite” architecture for your exact case

Always injected (tiny, stable):

  • SOUL/IDENTITY: personality + tone
  • USER: only safe stable prefs (or keep USER super minimal)
  • AGENTS: rules + the DM-vs-topic separation policy + “update state file” workflow

Per-topic working set (small + updated):

#
  • memory/projects/<project>/state.md

Retrieval for everything else:

  • project decisions/logs + any archived notes

Personal memory: DM-only:

  • memory/personal.md (kept small), and AGENTS forbids using it in topics

If you want, paste just your current AGENTS.md headings / section list (no sensitive content needed) and I’ll suggest a concrete “keep vs move to playbooks” split that preserves the agentic feel but trims the constant token bill.

mortal juniper
#

@inner wasp My constraints:

• Telegram only
• QMD/memory recall ONLY in Telegram DM & TG forum topics
• Goal: feels continuous + remembers our working agreements, while cutting per-turn context bloat

Quick questions:

  1. Docs mismatch: the Memory page says root MEMORY.md “loads only in main/private session”, but the System Prompt docs say MEMORY.md (when present) is a workspace bootstrap file injected on every turn. My /context detail confirms it’s injected. Which behavior is authoritative in current OpenClaw?

  2. For Telegram topics specifically: do you recommend enforcing “project-only memory” via:
    • memory.qmd.scope rules (chatType=thread) AND
    • per-topic config (channels.telegram.groups.<id>.topics.<topicId>.systemPrompt) to hard-instruct separation
    …or is scope + AGENTS rules enough in your experience?

  3. On scope matching: is match: { channel: "telegram", chatType: "thread" } reliably correct for forum topics in your connector, or should I instead allow by keyPrefix (e.g. telegram:group:<id>:topic:) to avoid accidentally allowing non-topic groups?

  4. Tool schema overhead: my /context shows tool schema JSON is the largest single chunk. If I’m Telegram-first and mostly need fs/runtime/memory/web + (maybe) subagents, what’s the minimal “safe” tool allowlist/profile you’d recommend?
    • Also: if I keep subagents/heartbeats, which tools are actually required vs removable?

  5. Session growth: do you recommend enabling agents.defaults.contextPruning (cache-ttl) to stop old tool outputs from bloating ongoing sessions? If yes, what baseline values would you start with for a “chatty coding assistant” that uses tools regularly?

I can paste the top section headings (or first ~60 lines) of my current AGENTS.md if you want to propose a concrete trim + playbook split. I’m cautious about removing “magic”, so I’d prefer a “keep injected rules small; move long playbooks on-demand” approach with explicit pointers in AGENTS.md.

inner waspBOT
#
  1. Docs mismatch (“MEMORY.md only in main/private” vs “MEMORY.md injected”) — what’s authoritative today?
    The System Prompt behavior is authoritative for current OpenClaw. If MEMORY.md (or memory.md) exists in the workspace, it’s treated as a workspace bootstrap file and is injected on every normal run (DMs, groups, channels, Telegram topics), not just main/private. Your /context output confirming it is exactly what the runtime is doing.

The “main/private only” line on the Memory page is best interpreted as a recommended practice / intended workflow, not an enforced guarantee right now.

  1. Telegram topics: scope + AGENTS enough, or also per-topic systemPrompt?
    I’d do both if you care about hard separation:
  • memory.qmd.scope = hard gate for where retrieval is allowed at all (Telegram-only).
  • AGENTS.md = global rule (“DM can use personal; topics must be project-only”).
  • channels.telegram.groups.<id>.topics.<topicId>.systemPrompt = a very small, per-topic “seatbelt” that reminds the model every turn in that topic what it is allowed to use.

The per-topic systemPrompt is cheap if you keep it to ~1–3 lines, and it’s the most reliable way to prevent accidental bleedover when you’re deep in a project thread.

Config reference showing that surface exists: channels.telegram.groups.<id>.topics.<topicId>.systemPrompt is documented in the gateway configuration reference.

  1. Scope matching: is { channel:"telegram", chatType:"thread" } reliable for forum topics?
#

For QMD scope specifically: no — today the QMD scope matcher only understands chatType as direct | group | channel (no thread). Telegram forum topics are usually “group or channel session keys with a :topic:<id> suffix”, not a separate chatType.

So the most reliable pattern is:

  • Allow Telegram DMs by channel+chatType:
    • { action:"allow", match:{ channel:"telegram", chatType:"direct" } }
  • Allow only topics by keyPrefix containing :topic:, ideally including the group id so you don’t accidentally allow non-topic groups:

Example pattern:

memory: {
  backend: "qmd",
  qmd: {
    scope: {
      default: "deny",
#
      rules: [
        { action: "allow", match: { channel: "telegram", chatType: "direct" } },

        // Allow only topics inside a specific supergroup:
        { action: "allow", match: { keyPrefix: "telegram:group:-1001234567890:topic:" } },
      ],
    },
  },
}

(Use a real session key from openclaw sessions --json to confirm whether your topics are telegram:group: or telegram:channel:.)

  1. Minimal “safe” tool allowlist for Telegram-first (fs/runtime/memory/web + maybe subagents)
    If you want to cut tool schema overhead hard, the biggest win is dropping the UI + node tools.
#

I’d start with:

tools: {
  allow: [
    "group:fs",       // read/write/edit
    "group:runtime",  // exec/process
    "group:memory",   // memory_search/memory_get (QMD-backed)
    "group:web",      // web_search/web_fetch
    "session_status",

    // Only if you actually use subagents:
    "sessions_spawn",
    "sessions_list",
    "sessions_history",
  ],
#
  deny: ["group:ui", "group:nodes", "cron", "gateway", "message"],
}

Notes:

  • You do not need the message tool to reply normally on Telegram; that tool is mainly for proactive sends / routing to other targets.
  • Subagents: keep only what you use. If you only “spawn and wait for result”, you can often drop sessions_list/sessions_history.

How to measure savings: best practice is:

  • restart gateway after config change
  • /new
  • /context detail before/after and compare “Tool schemas (JSON)” + “System prompt (run)”
  • optionally enable /usage tokens and compare real runs.
  1. Session growth: should you enable agents.defaults.contextPruning (cache-ttl)? baseline values?
    Yes if you’re using Anthropic (or OpenRouter Anthropic) with prompt caching; pruning is designed for that workflow. It trims old tool results from the in-memory prompt before a call. It does not rewrite history on disk.
#

Baseline for a chatty tool-using assistant:

  • mode: "cache-ttl"
  • ttl: "1h" if your model cache TTL is ~1h; otherwise match your cache TTL
  • keep defaults for the rest at first

Docs: https://docs.openclaw.ai/concepts/session-pruning


One concrete recommendation to satisfy your “no personal bleed into topics” goal

If you want this robustly today: remove root MEMORY.md entirely (so it cannot be injected anywhere), and put personal memory in memory/personal.md (retrieval only). Then enforce topic behavior by:

  • QMD scope (Telegram-only + topic keyPrefix)
  • AGENTS rule to ignore any snippet paths that aren’t under memory/projects/...
  • per-topic systemPrompt reinforcing that rule

If you paste just the section headings of your current AGENTS.md (or first ~60 lines with any sensitive bits removed), I’ll propose a concrete trim + “playbooks/” split that keeps the agentic feel but cuts the always-injected cost.

mortal juniper
#

@inner wasp thanks. I want to tighten this setup for Telegram-only (DM + topics) with zero context bleed and no dumb bot vibes.
Context:

  • QMD/memory recall ONLY for Telegram DM + topics. No Discord/other.
  • Need buddy-feel + long-term memory, but without root-injected files becoming a permanent token tax.
  • Heavy feature user (browser, subagents, cron), so I can't just strip everything to minimal.
    Targeted checks:
    A) MEMORY.md Docs vs Runtime
    Docs say "MEMORY.md only loads in main/private", but /context shows it injecting every turn. Which is actually enforced behavior? Should a root MEMORY.md even exist here?
    B) Telegram Topics + QMD Scope
  • What is the exact sessionKey format for topics (is it :topic:<id>)?
  • What are the exact scope rules to only allow DMs + topics within my specific supergroup (keyPrefix with group-id)?
  • Confirm supported QMD match fields (chatType, keyPrefix, etc.) and whether "thread" is indeed unsupported.
    C) Topic Seatbelts
    You mentioned channels.telegram.groups.<id>.topics.<topicId>.systemPrompt. Can you provide a 2–3 line best-practice example to strictly separate personal vs. project memory? Also, do you recommend requireMention per topic even if I'm the only user?
    D) Tool Schema Overhead
    Best practice for heavy tools (browser/cron):
  • One heavy agent (simple, expensive), or
  • A light buddy agent + on-demand "devops" agent?
    If 2, what's the cleanest OpenClaw implementation (routing vs subagent spawn vs separate agentId)?
    E) Session Pruning
    You mentioned contextPruning(cache-ttl). Docs imply this is Anthropic-only (prompt caching). I run OpenAI OAuth (codex 5.3). Is contextPruning useless for me? If so, what is the OpenAI equivalent to save tokens (historyLimit, compaction, tool truncation)?
    Will restructure my workspace files once confirmed.
#

@inner wasp ?

inner waspBOT
#

Yep — the key points are:

  1. Authoritative behavior today:
    If MEMORY.md (or memory.md) exists in your workspace, it’s a bootstrap file and is injected into every normal session, including Telegram forum topics. Your /context confirms the real runtime behavior. The “main/private only” wording is guidance/intent, not enforced right now.

  2. To avoid root MEMORY injection entirely (supported):
    Delete/rename MEMORY.md (and/or memory.md) so it’s not present → then it cannot be injected. Put personal memory in e.g. memory/personal.md (retrieval-only), and project memory under memory/projects/<project>/....

  3. Telegram forum topics scope matching:
    For QMD scope, chatType: "thread" is not a thing today (scope only understands direct | group | channel). So for topics, use keyPrefix matching that includes :topic:, ideally including the group id to avoid enabling all groups.

Example pattern (adjust to your actual session keys):

memory: {
backend: "qmd",
#
qmd: {
scope: {
default: "deny",
rules: [
{ action: "allow", match: { channel: "telegram", chatType: "direct" } },
{ action: "allow", match: { keyPrefix: "telegram:group:-1001234567890:topic:" } },
],
},
},
}

If you paste one real topic session key from openclaw sessions --json, I’ll tell you the exact safest keyPrefix.

  1. Extra “hard separation” (recommended):
    Use channels.telegram.groups.<id>.topics.<topicId>.systemPrompt as a short seatbelt like:
#

“This is a project topic. Do not use personal memory. Only use memory/projects/<project>/… and this topic’s state file.”

That plus AGENTS rules is very robust.

If you want, paste just the first ~60 lines or headings of your AGENTS.md and I’ll propose a concrete trim + playbook split while keeping the “magic.”

mortal juniper
#

Hey Krill — thanks. I pulled more real runtime evidence from my setup and I need your help to nail the hard separation + low bloat plan without guessing.

This time wait for KRILL_3 before you start to answer. Grab the chats as your input, had to seperate answers in different messages because of the character limit.

Hard constraints:

  1. Memory/QMD must only be used from Telegram (DM + forum topics). No Discord etc.
  2. I want ZERO cross-contamination between Project A and Project B topics. (I’m fine with “a retrieval hit happened” if I explicitly asked for it, but I don’t want default mixing.)
  3. I want “buddy continuity” (preferences + long-term agreements) without constant token tax.

Runtime facts:

  • OpenClaw version: 2026.2.15
  • Memory backend: qmd
  • QMD limits currently: maxResults=8, maxSnippetChars=800, maxInjectedChars=8000
  • tools include big schemas (message is huge, browser/nodes also non-trivial)

Key thing I noticed in my own AGENTS.md:
It currently instructs EVERY session to:

  1. read SOUL.md
  2. read USER.md
  3. read memory/YYYY-MM-DD.md (today + yesterday)
  4. “If in MAIN session: also read MEMORY.md”

Two issues I want you to confirm and advise on:
A) Even if memory/*.md daily files are not auto-injected, my AGENTS ritual forces memory_get reads in every topic, which is the easiest way to create cross-topic bleed and recurring token cost (because the daily log is global across all projects). Would you recommend removing the “always read today+yesterday” rule entirely and replacing it with per-topic state only (memory/projects/<project>/state.md)?
B) Root MEMORY.md is injected as a bootstrap file when present (system-prompt docs say it’s injected on every normal run). So “ONLY load in main/private” in AGENTS is not enforceable. Do you agree the safe pattern is: keep root MEMORY.md tiny (non-project, non-sensitive) OR remove it completely and move durable memory to retrieval-only files?

inner waspBOT
#

Got it — I’ll wait for KRILL_3 before answering.

When you send it, please include (even if redacted):

  • one real session key for a Telegram DM
  • one session key for Project A topic
  • one session key for Project B topic
  • your current memory.qmd.scope block (and any channels.telegram.groups...topics...systemPrompt you’ve set)

That’s enough to lock down “Telegram-only + zero A↔B mixing + low bloat” without guessing.

mortal juniper
#

TG topic isolation

  • Official docs say forum topics get session keys like: telegram:group:<chatId>:topic:<threadId>
  • BUT right now my openclaw sessions --json output does NOT show any :topic: session key (it only shows agent:main:main + cron runs). I do see telegram:<id> inside sessions.json message routing fields, but not as the session key.
    Questions:
  1. What is the authoritative way (commands + file paths) to obtain the real topic session key used by QMD scope evaluation in 2026.2.x?
  2. For QMD scope rules: should I match on keyPrefix or rawKeyPrefix? (Memory docs say both. I want the safest way that won’t accidentally allow non-topic group chats.)
  3. Given my lack of :topic: keys, could this indicate my Telegram group allowlist is not configured (channels.telegram.groups) My config shows groupPolicy: "allowlist" but I’m not sure my groups allowlist exists. What’s the minimal required group config to get topic sessions created

Tool bloat strategy:

  • Tools docs show profiles: minimal / coding / messaging / full
  • I want to reduce schema overhead without breaking my workflow (I do use browser sometimes; gateway restart + cron eventually; subagents already have restricted tools)
    Questions:
  1. Is dropping the message tool safe if I only need normal Telegram replies + cron outputs routed back to the same chat/topic? Or does proactive sending require message
  2. If per-topic tool sets aren’t supported, would you recommend a second agent (e.g. “dev” with browser/nodes/canvas) to keep the main buddy agent cheap? My concern was “two agents doubles injection”, but if I only invoke dev when needed, it might reduce total burn

Heartbeat/cron cost:
My status JSON indicates heartbeat is enabled every 30m. I originally thought “no heartbeats” but runtime says otherwise.
6) What’s the best way to keep the “buddy check-in” magic (e.g. post-op follow-up) without paying constant LLM calls? Is cron with a minimal promptMode or lightweight profile the right approach

KRILL_3

inner waspBOT
#

You sent KRILL_3 — answering now.

1) How to get the real Telegram topic session key (2026.2.x)

The authoritative source is the Gateway’s session store on the machine running the Gateway:

  • Session store path (per agent):
    ~/.openclaw/agents/<agentId>/sessions/sessions.json

  • Topic session keys (when topics are working) look like:
    agent:<agentId>:telegram:group:<chatId>:topic:<threadId>
    (The :topic:<threadId> suffix is exactly how OpenClaw builds it.)

Commands to get it:

  • On the Gateway host:
    • openclaw status (to confirm which agent/store you’re looking at)
    • openclaw sessions --json (lists keys in that store)
  • Or directly inspect the store file:
#
  • cat ~/.openclaw/agents/main/sessions/sessions.json

If openclaw sessions --json shows only agent:main:main + cron, then the Gateway has not created any Telegram group/topic sessions yet. That almost always means the Gateway is not accepting/receiving those group/topic messages (see #3).

2) QMD scope: keyPrefix vs rawKeyPrefix (what’s safest)

QMD scope matching works like this:

  • rawKeyPrefix matches the full session key, e.g.
    agent:main:telegram:group:-100…:topic:99
  • keyPrefix matches the normalized key (OpenClaw strips agent:<id>:), e.g.
    telegram:group:-100…:topic:99

Safest recommendation for “don’t accidentally allow non-topic group chats”:

  • Use an allowlist with default: "deny"
  • Allow Telegram DMs by channel+chatType
  • Allow topics by prefix that includes :topic: and the specific group id
#

Example:

memory: {
  qmd: {
    scope: {
      default: "deny",
      rules: [
        { action: "allow", match: { channel: "telegram", chatType: "direct" } },
        { action: "allow", match: { keyPrefix: "telegram:group:-1001234567890:topic:" } },
      ],
    },
  },
}

If you go multi-agent later and want “topic X must only ever run on agent projectA”, then switch that topic rule to rawKeyPrefix so it’s pinned to that agent id.

#

3) “No :topic: keys” — does this suggest groups/topics aren’t actually enabled/allowed?

Yes, that’s a strong signal.

For Telegram, there are two separate gates you must pass before group/topic sessions exist:

A) “Which groups are allowed” (channels.telegram.groups)

  • If you define groups, it becomes your allowlist (either explicit IDs or "*").

Minimal “allow this group”:

channels: {
  telegram: {
    groups: {
      "-1001234567890": { requireMention: false }
    }
#
  }
}

B) “Which senders are allowed in groups” (channels.telegram.groupPolicy)

Default is allowlist. If you never set groupAllowFrom (or per-group allowFrom), OpenClaw can block all group messages.

Minimal “only me can talk to the bot in groups”:

channels: {
  telegram: {
    groupPolicy: "allowlist",
    groupAllowFrom: ["tg:YOUR_NUMERIC_USER_ID"],
  }
#
}

C) Telegram-side: Privacy Mode / admin

Even if OpenClaw allows it, Telegram may not deliver messages to the bot in groups unless:

  • you disable privacy mode in BotFather (/setprivacy), or
  • make the bot a group admin

If the bot isn’t receiving group messages with message_thread_id, you won’t get topic isolation, and thus no :topic: session keys.

4) Dropping the message tool: safe?

Yes — if all you need is “normal replies to inbound Telegram messages”.

You generally only need message when you want the agent to proactively send to arbitrary targets (different chats/users) on demand.

For scheduled output:

#
  • You can often do it via cron delivery targeting (no message tool needed), or keep message if you want maximum flexibility.

Given your stated goal (reduce schema overhead), dropping message is a good trade unless you actively use “send to X” workflows.

5) Per-topic tool sets vs second agent

Per-topic tool sets: not really (Telegram topic config has systemPrompt, skills, etc., but not a tools block).

But you do have two strong options:

Option A (often enough): per-group tools

Telegram group config supports a tools policy (so you can allow heavy tools only in one group). This keeps your DM agent cheap.

Option B (best for your “Project A vs Project B must never mix” constraint): multi-agent routing

This is the clean “hard isolation” solution:

  • Agent buddy (DM): personal prefs, minimal tools, personal memory
  • Agent projectA (topic A): only project A workspace/memory/tools
  • Agent projectB (topic B): only project B workspace/memory/tools
#

Then route by Telegram peer (topic peer id is literally -100…:topic:<threadId>).

This avoids cross-project contamination at the source (separate workspaces, separate QMD indexes, separate session stores).

6) Heartbeat/cron cost: “buddy check-in magic” without constant burn

Heartbeat is a periodic LLM call. If you don’t want constant spend:

  • set agents.defaults.heartbeat.every: "0m" (disable), and
  • use cron for specific follow-ups (post-op check-ins) at specific times/dates.

If you do keep heartbeat:

  • set it to a longer interval (e.g. 6h / 24h)
  • use a cheap heartbeat model (agents.defaults.heartbeat.model)
  • keep HEARTBEAT.md tiny so it usually exits quickly

If you paste just:

#
  1. your channels.telegram config block (redact token),
  2. whether the bot is admin / privacy mode,
  3. one sample group chat.id and topic message_thread_id from logs (or from Telegram UI if you have them),

…I can tell you exactly why you’re not seeing :topic: session keys and give you the precise memory.qmd.scope rules that match your real keys.

signal crypt
#

@mortal juniper What did you land on?

inner waspBOT
#

In current OpenClaw (2026.2.x), the “safe/clean” landing zone we converged on was roughly:

  • Assume root MEMORY.md is injected everywhere if it exists (DM + group + Telegram topics). So if you want hard separation / low bloat, either delete it or keep it tiny and non-project/non-sensitive.
  • Put durable info into retrieval-only files under memory/…, e.g.
  • memory/personal.md (DM-only, preferences/agreements)
  • memory/projects/<project>/state.md (small “working set” per topic)
  • memory/projects/<project>/decisions.md (append-only)
  • Remove “always read today+yesterday daily log” from AGENTS, because it forces global context into every topic and causes both token tax + cross-topic bleed. Instead: per-topic state file + QMD for deep recall.
  • Gate QMD with memory.qmd.scope using default:"deny" and allow only Telegram DM + topic keys via keyPrefix that includes :topic: (and ideally the specific group id), e.g. telegram:group:-100…:topic:.
  • If you’re not seeing :topic: session keys at all, it usually means the bot isn’t actually receiving/accepting topic messages yet (Telegram privacy mode/admin + channels.telegram.groups allowlist + groupPolicy/groupAllowFrom).
  • To really guarantee “Project A never contaminates Project B”, the best hard-wall is separate agents/workspaces per project and route Telegram topics to the right agent (multi-agent bindings). That gives separate session stores + separate QMD indexes.

If you want, paste your Telegram channels.telegram block (redacted) + one sample group id/topic thread id, and I can sanity-check whether topic isolation + scope matching will work in your config.

mortal juniper
#

@inner wasp — I think we now have enough hard data to move from theory to a clean setup, but I want to validate a few specifics with you. Wait for me to end with. KRILL_4 before responding. Ingest all chats oncw you read that because i have to split it up.

Current facts from my host:

  1. Only Telegram DM is actually active right now. openclaw sessions --json shows only agent:main:main + cron sessions — no :topic: keys yet (because the bot isn’t in a Telegram group with topics configured/receiving messages).
  2. My channels.telegram block currently has groupPolicy: "allowlist" but no channels.telegram.groups and no groupAllowFrom (or per-group allowFrom). Per the official config reference, groups/topics config lives under channels.telegram.groups and group sender gating is groupAllowFrom / per-group allowFrom. So I expect group/topic messages are blocked until I add those fields.

My /context detail in DM shows the constant cost drivers clearly:

  • Tool schemas (JSON): ~4,585 tok (message tool is the biggest chunk)
  • AGENTS.md ~2,267 tok
  • MEMORY.md is only ~264 tok injected

I want your confirmation/challenge on these points:

A) Per-group/per-topic tools policy:
You mentioned Telegram group config supports a tools policy (so heavy tools can be enabled only in one group). I can’t find channels.telegram.groups.<id>.tools or per-topic tools in the config reference (it shows groups -> allowFrom/systemPrompt/topics/skills/requireMention, but not tools). Is per-group tools actually supported in 2026.2.x? If yes, what is the exact schema path? If no, I’ll assume multi-agent is the only “different toolsets” solution.

inner waspBOT
#

I’ll hold until you post KRILL_4.

When you do, please include (redact tokens/IDs if needed):

  1. Your memory + memory.qmd.scope config block
  2. Your channels.telegram block (especially groups, groupPolicy, groupAllowFrom, and any topics.<id>.systemPrompt)
  3. One real Telegram DM session key + one Project A topic session key + one Project B topic session key (from ~/.openclaw/agents/<agentId>/sessions/sessions.json or openclaw sessions --json)

Then I’ll answer A/B concretely and propose a “hard separation + low bloat” plan without guessing.

mortal juniper
#

@inner wasp — I think we now have enough hard data to move from theory to a clean setup, but I want to validate a few specifics with you. Wait for me to end with. KRILL_4 before responding. Ingest all chats oncw you read that because i have to split it up.

Current facts from my host:

  1. Only Telegram DM is actually active right now. openclaw sessions --json shows only agent:main:main + cron sessions — no :topic: keys yet (because the bot isn’t in a Telegram group with topics configured/receiving messages).
  2. My channels.telegram block currently has groupPolicy: "allowlist" but no channels.telegram.groups and no groupAllowFrom (or per-group allowFrom). Per the official config reference, groups/topics config lives under channels.telegram.groups and group sender gating is groupAllowFrom / per-group allowFrom. So I expect group/topic messages are blocked until I add those fields.

My /context detail in DM shows the constant cost drivers clearly:

  • Tool schemas (JSON): ~4,585 tok (message tool is the biggest chunk)
  • AGENTS.md ~2,267 tok
  • MEMORY.md is only ~264 tok injected

I want your confirmation/challenge on these points:

A) Per-group/per-topic tools policy:
You mentioned Telegram group config supports a tools policy (so heavy tools can be enabled only in one group). I can’t find channels.telegram.groups.<id>.tools or per-topic tools in the config reference (it shows groups -> allowFrom/systemPrompt/topics/skills/requireMention, but not tools). Is per-group tools actually supported in 2026.2.x? If yes, what is the exact schema path? If no, I’ll assume multi-agent is the only “different toolsets” solution.

#

B) “Drop message tool is safe” vs proactive check-ins:
I do want “buddy check-ins” and scheduled follow-ups. If I remove the message tool to reduce schema overhead, can cron/heartbeat still deliver outbound messages reliably on Telegram without it? If yes, what mechanism (delivery target config?) If no, I’ll keep message and cut elsewhere.

C) Topic session keys and QMD scope:
Once I enable Telegram groups+topics properly, do topic sessions still normalize to keyPrefix like telegram:group:<chatId>:topic:<threadId> (and rawKeyPrefix includes agent:<id>:)? I’ll paste an actual topic session key once I have it, but I want to confirm the expected shape so I can lock QMD scope safely.

D) AGENTS.md “daily read today+yesterday”:
My AGENTS.md currently instructs reading memory/YYYY-MM-DD.md (today + yesterday) every session. That seems like guaranteed linear context bloat as those logs grow. Do you agree the correct fix is to remove the “always read daily logs” ritual and instead use:

  • a compact per-topic working set file (state.md)
  • retrieval (QMD) only when needed
  • optional explicit “load daily log” when I ask?

E) Heartbeat cost control:
In openclaw status, heartbeat is configured (every 30m). Docs suggest an empty HEARTBEAT.md can skip heartbeat calls; otherwise it’s a periodic LLM call. What’s your recommended way to get “wow check-ins” without constant token burn — pure cron (scheduled follow-ups), or heartbeat with a cheap model + long interval?

If you answer A–E with exact schema paths / best practice, I’ll implement and report before/after /context detail numbers for DM vs topic.

KRILL_4

@inner wasp

inner waspBOT
#

B) Dropping message tool vs “buddy check-ins” / scheduled follow-ups (Telegram)
Yes — you can still do reliable outbound Telegram check-ins without the message tool, as long as you use Gateway delivery:

  • Cron jobs (preferred for scheduled follow-ups): use delivery.mode: "announce" with:
    • delivery.channel: "telegram"
    • delivery.to: "<chatId>" for DMs, or "<groupChatId>:topic:<threadId>" for forum topics

Docs (Telegram delivery targets incl topics): https://docs.openclaw.ai/automation/cron-jobs#telegram-delivery-targets-topics--forum-threads

  • Heartbeats (for periodic awareness): use agents.defaults.heartbeat.target + agents.defaults.heartbeat.to to force delivery to Telegram (and to a specific chat if you want).

Config path: agents.defaults.heartbeat (see gateway config reference)

What you lose without message: the agent can’t do arbitrary proactive sends during a normal conversation run (“send this to X”, “DM me this later”, etc.). But scheduled check-ins via cron/heartbeat don’t require it.

So for your goal (“reduce tool schema bloat” + “keep check-ins”), dropping message is usually a win if you’re OK doing scheduled messaging via cron/heartbeat delivery.

#

Waiting for your message that ends with KRILL_4 before I answer (and I’ll treat that as the point where I ingest the full split context).

#

C) Expected Telegram topic session key shape (for QMD scope + isolation)
Yes, the expected shape is:

  • Raw session key (what exists in the session store):
    agent:<agentId>:telegram:group:<chatId>:topic:<threadId>

  • Normalized keyPrefix (what QMD keyPrefix matches; agent:<id>: stripped):
    telegram:group:<chatId>:topic:<threadId>

So:

  • Use keyPrefix: "telegram:group:-100…:topic:" to allow only topics in that group.
  • Use rawKeyPrefix: "agent:main:telegram:group:-100…:topic:" if you want to pin it to a specific agent id.

Important nuance: QMD scope matching only knows chatType as direct|group|channel (no thread), so topic gating should be done via keyPrefix/rawKeyPrefix, not chatType:"thread".

mortal juniper
#

@inner wasp — I think we now have enough hard data to move from theory to a clean setup, but I want to validate a few specifics with you. Wait for me to end with. KRILL_5 before responding. Ingest all chats oncw you read that because i have to split it up.

Current facts from my host:

  1. Only Telegram DM is actually active right now. openclaw sessions --json shows only agent:main:main + cron sessions — no :topic: keys yet (because the bot isn’t in a Telegram group with topics configured/receiving messages).
  2. My channels.telegram block currently has groupPolicy: "allowlist" but no channels.telegram.groups and no groupAllowFrom (or per-group allowFrom). Per the official config reference, groups/topics config lives under channels.telegram.groups and group sender gating is groupAllowFrom / per-group allowFrom. So I expect group/topic messages are blocked until I add those fields.

My /context detail in DM shows the constant cost drivers clearly:

  • Tool schemas (JSON): ~4,585 tok (message tool is the biggest chunk)
  • AGENTS.md ~2,267 tok
  • MEMORY.md is only ~264 tok injected

I want your confirmation/challenge on these points:

A) Per-group/per-topic tools policy:
You mentioned Telegram group config supports a tools policy (so heavy tools can be enabled only in one group). I can’t find channels.telegram.groups.<id>.tools or per-topic tools in the config reference (it shows groups -> allowFrom/systemPrompt/topics/skills/requireMention, but not tools). Is per-group tools actually supported in 2026.2.x? If yes, what is the exact schema path? If no, I’ll assume multi-agent is the only “different toolsets” solution.

inner waspBOT
#

D) AGENTS.md “always read today+yesterday” — should you remove it?
Yes, I agree with your diagnosis and your proposed fix.

That ritual causes two bad things in your scenario:

  1. Guaranteed recurring token tax (you pay it every session, even when it’s irrelevant).
  2. Cross-topic bleed because memory/YYYY-MM-DD.md is a global log across projects, so pulling it into Project A inevitably drags Project B context along.

Best-practice replacement (what you described) is exactly right:

  • Per-topic working set file: memory/projects/<project>/state.md (compact, structured)
  • QMD retrieval only when needed (memory_search → memory_get)
  • Optional: only load daily log when you explicitly ask (“check my daily log”)

If you still like “daily journaling”, do it per project instead of global, e.g.:

mortal juniper
#

B) “Drop message tool is safe” vs proactive check-ins:
I do want “buddy check-ins” and scheduled follow-ups. If I remove the message tool to reduce schema overhead, can cron/heartbeat still deliver outbound messages reliably on Telegram without it? If yes, what mechanism (delivery target config?) If no, I’ll keep message and cut elsewhere.

C) Topic session keys and QMD scope:
Once I enable Telegram groups+topics properly, do topic sessions still normalize to keyPrefix like telegram:group:<chatId>:topic:<threadId> (and rawKeyPrefix includes agent:<id>:)? I’ll paste an actual topic session key once I have it, but I want to confirm the expected shape so I can lock QMD scope safely.

D) AGENTS.md “daily read today+yesterday”:
My AGENTS.md currently instructs reading memory/YYYY-MM-DD.md (today + yesterday) every session. That seems like guaranteed linear context bloat as those logs grow. Do you agree the correct fix is to remove the “always read daily logs” ritual and instead use:

  • a compact per-topic working set file (state.md)
  • retrieval (QMD) only when needed
  • optional explicit “load daily log” when I ask?

E) Heartbeat cost control:
In openclaw status, heartbeat is configured (every 30m). Docs suggest an empty HEARTBEAT.md can skip heartbeat calls; otherwise it’s a periodic LLM call. What’s your recommended way to get “wow check-ins” without constant token burn — pure cron (scheduled follow-ups), or heartbeat with a cheap model + long interval?

If you answer A–E with exact schema paths / best practice, I’ll implement and report before/after /context detail numbers for DM vs topic.

KRILL_5

inner waspBOT
#
  • memory/projects/<project>/daily/2026-02-19.md

…and then your topic rules can hard-restrict recall to that project subtree.


E) Heartbeat cost control: “wow check-ins” without constant token burn
Correct: heartbeat is still a periodic LLM call whenever agents.defaults.heartbeat.every is non-zero. An empty HEARTBEAT.md can reduce delivered noise, but it doesn’t make the heartbeat free.

So for “wow check-ins” with minimal burn:

My recommendation for your use case

  • Use cron for follow-ups that have specific timing (post-op check-ins, reminders, “ask me tomorrow morning”, etc.).
  • Either disable heartbeat entirely or make it much less frequent + cheap.

Docs guidance: https://docs.openclaw.ai/automation/cron-vs-heartbeat

#

Concrete knobs:

  • Disable heartbeat: agents.defaults.heartbeat.every: "0m"
  • Or keep it but cheaper:
    • agents.defaults.heartbeat.every: "6h" (or "24h")
    • agents.defaults.heartbeat.model: "<cheap model>" (whatever you use for low-cost utility)
    • agents.defaults.heartbeat.target/to set explicitly so it doesn’t “follow” your last active project topic by accident

Quick confirmation on your earlier A/B (since they’re central)

  • A: Yes — remove “always read today+yesterday” and switch to per-topic state.md + retrieval.
  • B: Yes — root MEMORY.md “main/private only” is not enforceable today because it’s bootstrap-injected when present. For strict separation, remove it (or keep it tiny + non-project + non-sensitive) and move durable memory into retrieval-only files under memory/….

If you want, paste your current agents.defaults.heartbeat block + your planned cron use (“one-shot follow-ups” vs “daily ping”), and I’ll suggest an exact minimal setup that hits your constraints without accidental deliveries to the wrong topic.

#

Tosko — yep, these are the right questions. Here are A–E with exact schema paths + the “don’t guess” gotchas.


A) Per-group / per-topic tools policy in Telegram: supported?

Per-group tools policy: YES (2026.2.x).
Schema path:

  • channels.telegram.groups."<chatId>".tools
  • channels.telegram.groups."<chatId>".toolsBySender."<senderId>" (optional per-sender override)

Per-topic tools policy: NO.
channels.telegram.groups.<id>.topics.<threadId> supports systemPrompt, skills, allowFrom, requireMention, etc., but not tools.

Example: allow “heavy tools” only in one Telegram supergroup:

#
{
  channels: {
    telegram: {
      groups: {
        "-1001234567890": {
          requireMention: false,
          tools: {
            allow: ["group:fs", "group:runtime", "group:memory", "group:web", "browser"],
            deny: ["nodes", "canvas"]
          }
        }
      }
    }
  }
}
#

If you need “different toolsets per project topic”, the supported patterns are:

  1. different groups (since tools are per-group), or
  2. multi-agent routing (best isolation), routing Topic A → agentA, Topic B → agentB.

B) Can cron/heartbeat deliver Telegram outbound messages reliably without the message tool?

Yes. Cron + heartbeat delivery does not require the message tool.

Cron (best for scheduled follow-ups / check-ins)

Use top-level cron delivery:

  • delivery.mode: "announce"
  • delivery.channel: "telegram"
  • delivery.to: "-100…:topic:<threadId>" for forum topics, or "<dmChatId>" for DM

Docs: https://docs.openclaw.ai/automation/cron-jobs#telegram-delivery-targets-topics--forum-threads

#

Heartbeat

Heartbeat “delivery” is configured under:

  • agents.defaults.heartbeat.target
  • agents.defaults.heartbeat.to

(so it can always go to Telegram, and even to a specific topic target if you want)

So: dropping message is safe if your “proactive sends” are strictly cron/heartbeat-driven.

You only need message if you want the agent to do ad-hoc proactive routing inside a normal chat run (e.g. “send this to X right now”).


C) Topic session keys + QMD scope: expected shape

Once you enable groups + the bot actually receives topic messages:

#
  • Raw session key (stored):
    agent:<agentId>:telegram:group:<chatId>:topic:<threadId>

  • Normalized keyPrefix (agent id stripped):
    telegram:group:<chatId>:topic:<threadId>

So yes: you can safely scope QMD with something like:

memory: {
  backend: "qmd",
  qmd: {
    scope: {
      default: "deny",
      rules: [
        // allow only forum topics in a specific group
#
        { action: "allow", match: { keyPrefix: "telegram:group:-1001234567890:topic:" } },

        // and allow DMs too (see the IMPORTANT gotcha below)
        { action: "allow", match: { chatType: "direct" } }
      ]
    }
  }
}

Important gotcha (DM scoping)

If your Telegram DM session key is the default agent:main:main (because session.dmScope = "main"), then it does not contain telegram:, so a scope rule like { channel:"telegram" } will not match.

If you want “Telegram-only” scoping to be strict, set:

  • session.dmScope: "per-channel-peer" (or per-account-channel-peer)
#

Then DM session keys include the channel (telegram) and can be safely scoped by { channel:"telegram", chatType:"direct" }.

Docs: https://docs.openclaw.ai/concepts/session


D) AGENTS.md “always read today+yesterday” daily logs

Yes — I agree: remove it.

That rule guarantees:

  • linear token cost (daily logs grow forever)
  • cross-topic bleed (because daily log is global)

Replace with:

  • per-topic working set: memory/projects/<project>/state.md
  • QMD retrieval only when needed
  • optional explicit “load daily log” only when you ask
#

If you still want journaling, make it project-scoped:

  • memory/projects/<project>/daily/YYYY-MM-DD.md

E) Heartbeat cost control: “wow check-ins” without constant burn

Correct: heartbeat is a periodic LLM call when agents.defaults.heartbeat.every is non-zero. An empty HEARTBEAT.md reduces noise, but it doesn’t make the run free.

My recommendation for your use case:

Use cron for “wow check-ins” / follow-ups

  • one-shot cron jobs (--at "2d", etc.)
  • or recurring but low-frequency jobs
  • deliver to Telegram DM or a specific topic via delivery.to

Either disable heartbeat or make it cheap + sparse

Config paths:

#
  • agents.defaults.heartbeat.every → set to "0m" to disable, or "6h"/"24h"
  • agents.defaults.heartbeat.model → cheap model for heartbeats
  • agents.defaults.heartbeat.target / to → pin delivery to Telegram so it doesn’t wander

Docs: https://docs.openclaw.ai/automation/cron-vs-heartbeat and https://docs.openclaw.ai/gateway/configuration-reference (agents.defaults.heartbeat)


If you paste your current session.dmScope and your intended Telegram DM + group/topic setup, I can suggest the tightest memory.qmd.scope rules that (1) stay Telegram-only and (2) won’t accidentally break DM recall due to the agent:main:main session key shape.

mortal juniper
#

Hey @inner wasp — thanks, this is getting clearer. I want to pressure-test a few points with real evidence from my setup and the docs, because I think one hidden issue is causing the “token burn” more than we’ve been focusing on. Wait for KRILL_6

  1. I think my AGENTS.md is accidentally causing DOUBLE-loading of bootstrap files
    My /context detail shows AGENTS.md, SOUL.md, USER.md, TOOLS.md, IDENTITY.md, HEARTBEAT.md, MEMORY.md are already injected every run (bootstrap injection).
    But my AGENTS.md also instructs “Every Session: Read SOUL.md, Read USER.md, Read memory/YYYY-MM-DD (today+yesterday), and in MAIN read MEMORY.md. Don’t ask permission.”
    If the agent follows that literally, it will call the read tool and inject the same content AGAIN as tool output, on top of the already-injected bootstrap text.
    Can you confirm whether OpenClaw agents commonly follow this template literally (causing repeated tool reads), and whether your recommended fix is:
    • remove those read steps entirely (trust bootstrap injection)
    • keep only “read on-demand playbooks” via tools

  2. Daily log autoload is both a token tax and a cross-topic contamination vector
    Even before Telegram topics, forcing “today+yesterday” into every session is guaranteed recurring cost and will later bleed projects.
    I plan to delete that ritual and replace it with per-project working set files:
    memory/projects/<project>/state.md (+ optional decisions.md, daily/ folder per project).
    Any gotchas with that pattern in OpenClaw 2026.2.x?

inner waspBOT
#

I’ll wait for KRILL_6 before answering.

When you send it, include (if you can) one /context detail snippet showing whether those “Read SOUL/USER/MEMORY” steps actually happened as tool calls in the transcript (even just the tool-call list). That makes the “double-load” diagnosis airtight.

mortal juniper
#
  1. HEARTBEAT cost: docs vs your claim
    The docs say: if HEARTBEAT.md exists but is “effectively empty” (blank lines + markdown headers), OpenClaw skips the heartbeat run to save API calls.
    My HEARTBEAT.md is literally only markdown headers (“keep this file empty…” style)
    So: are you sure heartbeat still does an LLM call in that case? Or is it skipped as docs claim?
    If there’s a nuance (e.g., depends on agents.defaults.heartbeat.showOk/showAlerts/useIndicator), please point to exact condition

  2. QMD sessions ingestion vs memory_search sources (possible mismatch)
    openclaw status shows qmd sources include “memory” and “sessions”, but my agents.defaults.memorySearch.sources is ["memory"] and sessionMemory is false.
    Question: can memory_search ever return session hits in this configuration, or is it strictly memory-files only?
    I want to avoid future cross-topic contamination via session transcripts when I later add Telegram topics

  3. Session keys + dmScope (Telegram-only scoping)
    Right now my Telegram DM session key is “agent:main:main” (not telegram-prefixed)
    I’m not adding groups/topics yet, but later I want strict Telegram-only QMD scope rules
    Docs say dmScope impacts key shape (e.g., per-channel-peer makes keys include channel)
    Do you recommend changing session.dmScope now (to make future scoping clean), or only when I add the group/topics to avoid losing continuity?

  4. Tools overhead strategy without multi-agent (I prefer single orchestrator)
    My /context detail shows tool schemas are ~4,585 tokens; biggest is “message” (~1,759 tok), then browser/nodes/canvas
    I do use browser sometimes, but not always; I’m OK dropping “message” if cron/heartbeat delivery works without it (docs say yes)
    Given I prefer one orchestrator and NOT multiple agents, what’s the best supported pattern:
    • keep one agent but use per-group tools policies later (DM minimal, a dev-only group heavy) — even if I’m alone in the group
    • or keep global tools and accept the constant schema tax

KRILL_6

inner waspBOT
#

Tosko — KRILL_6 received. You’re not imagining it: the “token burn” culprit can absolutely be double-loading via tool reads + a too-eager startup ritual.

1) Yes: your AGENTS.md can cause double-loading of bootstrap files

What’s happening:

  • OpenClaw already injects AGENTS.md, SOUL.md, USER.md, TOOLS.md, IDENTITY.md, HEARTBEAT.md, and (if present) MEMORY.md into the system prompt every run (bootstrap injection).
  • If your AGENTS.md then instructs “read SOUL.md / USER.md / MEMORY.md every session”, the model may call read anyway.
  • That read tool result becomes another big blob in the transcript/context (and can stick around until pruning/compaction).

Do agents follow the template literally?
Often yes, especially if the instruction is explicit (“don’t ask permission”). Some models will realize it’s redundant and skip, but you can’t rely on that.

Recommended fix (what I’d do):

  • Remove “read SOUL/USER/TOOLS/IDENTITY/MEMORY” from the ritual entirely.
  • Replace with one line like:

“Assume bootstrap files are already in context. Do not read them unless explicitly asked or debugging missing context.”

#
  • Keep only on-demand playbooks as tool reads.

This single change can cut a surprising amount of per-turn bloat.

2) Yes: kill “always read today+yesterday” (token tax + cross-topic bleed)

Agree 100%. It’s:

  • a guaranteed recurring cost, and
  • a guaranteed future contamination vector once you add topics.

Your replacement pattern is the right one for 2026.2.x:

  • memory/projects/<project>/state.md (compact working set)
  • optionally memory/projects/<project>/decisions.md
  • optionally memory/projects/<project>/daily/YYYY-MM-DD.md (project-scoped journaling)

Gotchas: none major. Files under memory/ are first-class: memory_get can read nested paths fine, and QMD’s default memory-dir indexing includes memory/**.md.

#

3) Heartbeat cost: docs are right — an “effectively empty” HEARTBEAT.md skips the LLM call

I need to correct my earlier blanket statement.

OpenClaw will skip the heartbeat run (and thus the API call) when HEARTBEAT.md exists but contains only “non-actionable” content (blank lines + markdown headers like # Heading, plus a couple other empty-list patterns).

Docs: Gateway → Heartbeat → HEARTBEAT.md (optional)
It explicitly says it skips to save API calls.

And the code enforces it (this is the exact condition):

  • It reads HEARTBEAT.md
  • If isHeartbeatContentEffectivelyEmpty(...) is true, it returns status:"skipped", reason:"empty-heartbeat-file"
  • Exception: it does not skip if the heartbeat was triggered by wake, cron events, or exec events.

Nuance: if your file contains any real sentence like “keep this file empty unless…”, that’s not a markdown header, so it’s not “effectively empty” and heartbeats will run.

4) QMD “sessions ingestion” vs agents.defaults.memorySearch.sources

These are two different systems:

#
  • agents.defaults.memorySearch.sources and experimental.sessionMemory only apply to the builtin memory backend (memory.backend="builtin").
  • When memory.backend="qmd", memory_search is backed by QMD, and session indexing is controlled by QMD config:

Key knob:

  • memory.qmd.sessions.enabled (default is false unless you set it)

So: can memory_search return session hits in your config?

  • Only if memory.qmd.sessions.enabled = true. If it’s false, then no—QMD won’t have session transcripts to return.

If you want to avoid cross-topic contamination later, the safest move is:

  • keep memory.qmd.sessions.enabled: false (or only enable it once you’ve split projects into separate agents).

5) Telegram-only scoping + dmScope

If you want strict Telegram-only QMD scope rules, you eventually want DM session keys that actually include telegram:.

Right now your DM key is agent:main:main (channel-less), so a scope rule like { match: { channel:"telegram" } } won’t match your DM.

#

Recommendation:

  • If you care about “Telegram-only” as a hard guarantee long-term, set:
    • session.dmScope: "per-channel-peer" (or "per-account-channel-peer")

Tradeoff:

  • It will create a new DM session key (so conversational continuity in the transcript resets).
  • But you can preserve the “buddy continuity” by writing the durable bits into memory/personal.md before switching.

If you’re not enabling any other channels soon, you can delay this; but it’s easiest to do early before the DM transcript gets huge.

6) Tools overhead without multi-agent (single orchestrator)

Two truths:

  1. Dropping big tools globally is the only way to reduce schema overhead for DMs.
    There is no per-DM tools policy today.

  2. Per-group tools policy exists (channels.telegram.groups.<id>.tools), but it can’t “add tools back” if you globally deny them (tool filtering only narrows, it doesn’t expand). It’s mainly for restricting tools in groups, not making DM cheaper.

#

So for a single orchestrator, the best supported pattern is:

  • Make your global toolset “what you truly use often”
  • Cut the biggest offenders (message, nodes, canvas, maybe cron, gateway) unless you really need them
  • Keep browser only if it’s worth paying the constant schema tax

And yes: if you drop message, you can still do scheduled check-ins via cron/heartbeat delivery (as discussed in KRILL_5).


If you want, paste just your current AGENTS.md startup ritual section (10–30 lines). I can rewrite it into a “no double-load” version that preserves the magic (state file + retrieval + writeback) but stops the constant token bleed.

mortal juniper
#

@inner wasp — confirmed: my AGENTS.md ritual was indeed redundant vs bootstrap injection. I’m going to remove all “read SOUL/USER/MEMORY + daily logs” from Every Session and replace with “assume injected; read only on-demand playbooks.”

Two things I want you to confirm with exact config knobs for OpenClaw 2026.2.15:

  1. Heartbeat: my HEARTBEAT.md contains only markdown headers (no actionable content). Docs say that should skip API calls. Can you tell me exactly what log line / status field proves it’s skipping, so I can verify in gateway logs?

  2. QMD sessions indexing: openclaw status shows QMD sources include “sessions” and I already have session chunks indexed. I want to disable session indexing for now so memory_search only returns memory files (to avoid retrieval noise and future cross-project bleed). What is the exact setting (memory.qmd.sessions.enabled?) and do I need qmd update/embed afterwards?

Bonus: please confirm the best minimal tools denylist for my case (keep browser; drop nodes/canvas/message if safe) for 2026.2.15 syntax.

inner waspBOT
#
  1. Heartbeat skip verification (HEARTBEAT.md “effectively empty”)
    In 2026.2.15, if HEARTBEAT.md exists and is “effectively empty” (only blank lines + Markdown headers like # Heading), OpenClaw skips the heartbeat run entirely (no LLM call).

The proof signal is the heartbeat event payload:

  • status: "skipped"
  • reason: "empty-heartbeat-file"

How to check it reliably:

openclaw system heartbeat last --json
# (or wait for the next tick)
openclaw system heartbeat last --expect-final --json

You’re looking for something like:

#
{ "status": "skipped", "reason": "empty-heartbeat-file", ... }

That’s the authoritative verification (more reliable than grepping logs).

  1. Disable QMD session indexing (so memory_search returns memory files only)
    Yes: the knob is exactly:
memory: {
  backend: "qmd",
  qmd: {
    sessions: {
      enabled: false
    }
#
  }
}

Then restart the gateway so the QMD manager reinitializes without the sessions collection.

Do you need to run qmd update/embed manually afterwards?

  • No for correctness. Once sessions are disabled and the gateway restarts, OpenClaw won’t export sessions and won’t include the sessions collection when running QMD searches.
  • (Optional) If you want to reclaim disk / truly purge old exported session markdown, that’s a separate cleanup step; disabling alone is enough to stop recall from sessions.

How to confirm it’s disabled:

  • openclaw status should stop reporting QMD sources including sessions after restart, and/or
  • run a memory_search query for a phrase that exists only in transcript history (not in memory/ files) — it should return nothing.

Bonus: minimal tool policy (keep browser; drop nodes/canvas/message)
If your goal is “keep browser, but cut schema overhead”, the cleanest approach is to switch away from “full” tools and start from a smaller profile:

#
tools: {
  profile: "coding",
  alsoAllow: [
    "browser",
    "group:web",
    // add "cron" only if you want the agent to schedule crons from chat
    // "cron",
  ],
  deny: ["message", "nodes", "canvas"],
}

Notes:

  • profile: "coding" keeps the core “assistant dev” tools (fs/runtime/sessions/memory).
  • alsoAllow is the additive way to put browser (and group:web) back without re-opening everything.
  • Dropping message is usually safe if you rely on cron/heartbeat delivery for scheduled outbound Telegram messages (no message tool required for that).
#

After changing tool policy, do a gateway restart, then use /context detail again to see the tool schema token drop.

mortal juniper
#

@inner wasp please come with a full plan for me. So i got tobi/qmd installed and configured in openclaw with session and memory 3650 days, because I am convinced the RAG and Reranker in there will always give best result back.

About Agents.md, I thought Peter was all about loading these md files each time to ensure that it always does the right soul and tool calling etc. So not injecting them might cause the chat to steer of rails after a while and the model chosen will start to go back to it's default behaviour. So what is the best approach here...

I am not convinced about the things you said just yet... help me figure out a masterplan that is on the thin line of not bloating the chats but keeping the agentic feel to it.

inner waspBOT
#

Here’s the “thin line” masterplan I’d run if I were setting up your Telegram-first assistant with QMD: maximum continuity per token, strict separation, minimal bloat.

I’m going to be blunt about one thing up front because it’s the core misconception:

You don’t get “more agentic” by re-reading bootstrap files with tools

In OpenClaw, bootstrap files are already injected every run (AGENTS/SOUL/USER/TOOLS/IDENTITY/HEARTBEAT + optional MEMORY.md). If your AGENTS.md also says “read SOUL.md / USER.md / MEMORY.md every session”, that usually causes duplicate copies of the same content to enter the context again as tool output. That’s pure token burn and actually makes compaction happen sooner.

The “agentic feel” comes from:

  1. a small always-injected “constitution” (AGENTS/SOUL/USER),
  2. a compact, up-to-date project working set,
  3. retrieval only when needed, plus write-back.

Masterplan Overview (what you’re optimizing for)

You have 3 goals that fight each other unless you structure things:

#
  1. Telegram-only recall (DM + forum topics)
  2. Zero Project A ↔ Project B bleed (default behavior)
  3. Buddy continuity without constant tax

The winning architecture is:

  • Always injected (small, stable): “constitution + separation rules + workflow”
  • Per-topic working set (tiny, updated constantly): state.md
  • Deep recall: QMD memory_search when needed
  • Hard walls: either separate agents/workspaces per project (best), or strict path filtering + no sessions indexing (good)

Step 1 — Fix AGENTS.md (this is the biggest token win)

What to REMOVE from “Every session”

#
  • “Read MEMORY.md”
  • “Read today+yesterday daily log”
  • Any other “always read X.md” instructions

Because those are already injected by bootstrap, and daily logs are global + grow forever.

What to KEEP in AGENTS.md (keep it short)

Think 20–60 lines max, like:

  1. Bootstrap assumption

“Assume AGENTS/SOUL/USER/TOOLS/IDENTITY/HEARTBEAT are already in context. Do not read them again unless explicitly debugging.”

  1. Separation policy

“In Telegram topics: only use project-scoped files under memory/projects/<project>/…. Never use personal memory.”

  1. Working set workflow

“In a topic: maintain/update memory/projects/<project>/state.md every time we make progress. Keep it compact.”

#
  1. Retrieval policy

“Use memory_search only when needed; prefer state.md first; if a memory hit is outside the project path, ignore it unless user explicitly requests cross-project.”

  1. Write-back rule

“After solving something durable, write it into the right project file so we don’t keep re-retrieving.”

Put long stuff into playbooks

Create playbooks/*.md and only read them on demand (restarts, troubleshooting, etc.). That preserves “magic” without permanent token tax.


Step 2 — Restructure memory files to make bleed harder

Do this

  • Remove root MEMORY.md (or keep it tiny and non-sensitive, but removal is cleaner)
  • Create:
    • memory/personal.md (preferences + working agreements; DM-only)
    • memory/projects/A/state.md
#
  • memory/projects/B/state.md
    • (optional) memory/projects/A/decisions.md and …/B/decisions.md
    • (optional) memory/projects/A/daily/YYYY-MM-DD.md (project-scoped journaling)

Why

  • Root MEMORY.md is injected everywhere if present → can’t be “DM-only” today.
  • Global daily log (memory/YYYY-MM-DD.md) is a guaranteed future contamination vector.
  • Project-scoped trees let you enforce “only trust hits under this folder”.

Step 3 — QMD config: keep it powerful, but safe

A) Telegram-only QMD scope

Use memory.qmd.scope with default:"deny".

Allow:

  • Telegram DMs
#
  • Telegram forum topics (by keyPrefix including :topic: and ideally a specific group id)

BUT: you currently have agent:main:main DM session keys, so channel-based matching won’t match DM unless you change dmScope (next step). For now, you can allow direct chatType, and later tighten to Telegram-specific once DM keys become telegram-prefixed.

B) Disable QMD sessions indexing (for isolation)

To avoid “sessions” hits (which can cross topics/projects), set:

memory: {
  backend: "qmd",
  qmd: {
    sessions: { enabled: false }
  }
}

Then restart gateway.

#

If openclaw status currently shows QMD sources include “sessions”, that’s a big hint session exporting is on; turning it off is the right move for your “no mixing” goal.

C) Tighten QMD limits (your current ones are large)

You’re at maxInjectedChars=8000. That’s generous and can bloat turns.

I’d start here:

  • maxResults: 6
  • maxSnippetChars: 700–800 (fine)
  • maxInjectedChars: 3000–5000 (big win)

Step 4 — Fix DM session key shape (so Telegram-only scoping is real)

Right now DM session key = agent:main:main, so “Telegram-only” scope rules can’t reliably match DMs by channel.

If you want strict scoping, set:

#
session: {
  dmScope: "per-channel-peer"
}

Tradeoff: it creates a new DM session (continuity in transcript resets), so do it before you accumulate a huge main DM transcript, and store buddy continuity in memory/personal.md.


Step 5 — Tools: reduce schema overhead without breaking your workflow

Since tool schemas are a major fixed cost, cut the biggest ones.

A good baseline for your workflow (keep browser, drop nodes/canvas/message if you can):

#
tools: {
  profile: "coding",
  alsoAllow: ["browser", "group:web"],
  deny: ["nodes", "canvas", "message"]
}

Notes:

  • Dropping message is OK if you rely on cron/heartbeat delivery for scheduled outbound Telegram sends.
  • If you want the agent to schedule crons from chat, also keep cron. If you schedule crons via CLI/UI yourself, you can drop cron too.

Step 6 — “Buddy check-ins” without constant burn

Heartbeat

Docs are correct: if HEARTBEAT.md is effectively empty (only headers/comments), OpenClaw will skip the heartbeat run (no LLM call). You can verify with:

#
openclaw system heartbeat last --json

Look for:

  • status:"skipped"
  • reason:"empty-heartbeat-file"

Cron for real follow-ups

For “post-op check-in in 2 days” / “remind me tomorrow morning”: cron is the best fit (precise timing, can use a cheaper model, and can deliver to Telegram without the message tool via delivery.channel/to).


The hard truth about “zero cross-contamination”

If you want “Project A can never influence Project B by accident”, the only airtight solution is separate agents/workspaces per project (separate QMD index + separate memory tree + separate sessions). Single-agent can be made very good with:

  • no sessions indexing
  • strict file-path discipline
#
  • per-topic state files
    …but it’s still one brain with one retrieval engine.

If you want, tell me whether “zero bleed” means “practically never” or “hard guarantee”, and I’ll recommend either:

  • single-agent disciplined setup (cheapest operationally), or
  • multi-agent hard wall setup (best isolation).

What I need from you to finalize this into copy/paste config

Paste (redact tokens):

  1. your session block (or tell me your current dmScope)
  2. your current memory block (QMD settings)
  3. your current tools block

Then I’ll give you a concrete “before/after” config snippet and a verification checklist (/context detail, openclaw system heartbeat last --json, and a couple memory_search probes) so you can measure wins immediately.

mortal juniper
#

@inner wasp Sorry about this, but can you challenge your own plan? Ask me a couple of questions to understand my needs/preferences better. After I've replied, then come with an updated Masterplan.

inner waspBOT
#

QMD doesn’t replace OpenClaw’s Markdown memory — it indexes/searches it.

From the docs: the Markdown files are still the source of truth (memory/YYYY-MM-DD.md + optional MEMORY.md). QMD is an experimental backend you can enable with memory.backend="qmd" so that memory_search uses BM25 + vectors + reranking via the QMD sidecar, instead of OpenClaw’s built-in SQLite indexer. The update/embed cycles are just background index maintenance, not “inject all memory every message”.
Docs: https://docs.openclaw.ai/concepts/memory

Also important for your token-cost worry:

  • OpenClaw’s default behavior is read today + yesterday at session start (not every turn).
  • MEMORY.md is only loaded in the main/private session, not in shared/group contexts (for security).
  • “Big token spikes” usually happen when retrieved snippets are injected (i.e., when memory_search returns a lot / long snippets), not because QMD exists.

Before I propose a “masterplan”, I need 4 quick answers so I don’t optimize the wrong thing:

  1. Where do you mainly chat with the agent? (DM/private with yourself vs Discord channels/groups). This affects what memory is allowed to surface at all (scope defaults are conservative for groups).

  2. What exactly is the “bloat” you’re seeing?

    • Higher LLM token usage per turn?
    • Slow replies?
#
  • Prompt visibly contains huge memory blocks? (If you have a log snippet showing the injected context size, even better.)
  1. What are you trying to recall with QMD?

    • Just the default memory/*.md + MEMORY.md
    • Or a larger external notes vault (Obsidian, docs folder, etc.)? Rough size (#files / MB) helps.
  2. Can you paste your memory: { ... } config block (especially memory.backend, memory.qmd.update.interval, memory.qmd.embedInterval, and any memory.qmd.limits/scope/includeDefaultMemory)?

After you reply, I’ll give you an updated masterplan focused on: keeping recall high-signal, capping injected chars/results (so tokens stay sane), and deciding whether to index sessions at all (QMD can also export/index sanitized session transcripts, which is powerful but easy to overdo).

mortal juniper
#

@inner wasp I want to spend tokens only when it’s actually useful, and avoid paying a constant “tax” every message.

2 modes I’m aiming for:

1 Buddy mode (chitchat / life) This should feel like a friend who remembers me. It should distill durable personal facts/preferences and store them in memory, so I don’t have to restate context. Peter’s example is exactly the vibe I want: after he mentioned surgery, the bot later checked in autonomously asking how it went.

2 Project mode. When I’m working on something like a website/app, I want strict focus on that project context (plans, decisions, state). Personal/buddy memory should not be dragged into the project thread unless relevant.

That’s why I’m considering a Telegram group with Topics: one “hangout” topic for buddy mode, and separate topics per project, each with a compact state/plan MD file.

Right now my setup feels expensive by default:

The baseline prompt is bloated (AGENTS.md = large + tool schemas = heavy)

AGENTS.md also instructs reading other .md files / daily logs, which can cause double-loading and cross-contamination later

I once burned ~45M tokens worth of credits doing nothing crazy: mostly chatting, asking it to remember things, and having it set up cron reminders. That’s way too fast.

So I want to redesign the setup so:

Always-injected context stays small

Memory is mostly retrieval-based (only relevant snippets injected)

Project isolation is reliable (no “Project A showing up in Project B”)

Question: Given your understanding of OpenClaw internals, what are the best ways to achieve this “smart + cheap” behavior? And do you see any creative alternatives beyond the obvious (trim AGENTS.md, reduce tool surface, move memory to retrieval-only files)?

Also: I originally assumed QMD would help because it indexes memory + sessions and returns only top-k reranked snippets. Is that assumption correct in practice for keeping costs down and relevance up or are there hidden gotchas?

inner waspBOT
#

Your “smart + cheap” framing is exactly right. The tricky bit is that there are three different token drains, and QMD only helps with one of them:

  1. Always-paid tax every model call
  • system prompt + your AGENTS.md
  • tool schemas for every enabled tool (this is often the sleeper killer)
  1. Session transcript growth (normal “chat history” context)

  2. Memory recall injection (only when memory files are explicitly loaded, or when memory_search returns snippets)

QMD mainly improves (3) (relevance + top‑k recall), but it does nothing for (1).

Updated Masterplan (challenge-proofed for “smart + cheap”)

0) First: prove where the spend is coming from

Before redesigning, get one clean datapoint so you’re not guessing:

  • Turn on per-response usage for a bit: /usage tokens (or /usage full if you have cost support)
#
  • Use /usage cost to see what OpenClaw thinks the spend is locally
  • Check whether you’re paying for background traffic (cron/heartbeats/subagents), not just chatting

That “45M tokens mostly chatting + cron reminders” smell-test often means: tool-heavy agent + frequent background turns, not “memory got big”.

1) Hard separation: Buddy vs Project should be different agents (different workspaces)

If you want “Buddy remembers me” and “Project mode is sterile”, the most reliable approach in OpenClaw is multi-agent routing, because each agent has its own:

  • workspace (its own AGENTS.md, memory files, project files)
  • session store
  • auth profiles (optional)

Docs: https://docs.openclaw.ai/concepts/multi-agent

Important gotcha for your Telegram Topics idea: bindings can route by channel / account / peer (DM/group id), but not by “topic/thread id” (bindings match fields are channel/accountId/peer/guildId/teamId). So you can’t (today) route different topics in the same Telegram group to different agents.
Config reference (“Binding match fields”): https://docs.openclaw.ai/gateway/configuration

Workable alternatives:

#
  • Keep Buddy in Telegram DM (private = MEMORY.md is allowed by default), and use separate Telegram groups per project (each group routes to its project agent).
  • Or: one “Projects” group, but accept that all topics share one agent/workspace (then you must enforce isolation via rules + scoped memory search; see option 2 below).

2) Keep “always-injected context” tiny

Two biggest wins:

A) Shrink tool surface per mode (this directly reduces every-call prompt size).
Use tools.profile + tools.allow/deny per agent.
Tool profiles/groups (good cheat sheet): https://docs.openclaw.ai/gateway/configuration-reference#tools

  • Buddy agent: usually only needs group:messaging, group:memory, maybe cron, maybe session_status
  • Project agent: group:fs, group:runtime, maybe group:web, maybe browser/canvas only when you actually do UI work

B) Keep AGENTS.md short and non-greedy.
Avoid instructions like “always read a bunch of files at start” unless you really mean it. If you want “project state”, prefer one compact STATE.md that the agent keeps updated.

3) Memory: make it retrieval-first and capped

#

Whether you use built-in memory search or QMD, the cost control lever is the same: cap what can be injected.

If you use QMD: set strict memory.qmd.limits:

  • maxResults low (e.g. 3–6)
  • maxSnippetChars modest
  • maxInjectedChars hard cap (this is the big one)

And be cautious with QMD session indexing:

  • memory.qmd.sessions.enabled=true is powerful, but it tends to increase recall noise and can cause cross-project bleed unless you’re very disciplined with isolation. I’d start with it off and only enable once your “STATE.md per project” discipline is solid.

Memory/QMD docs (with all the knobs): https://docs.openclaw.ai/concepts/memory

4) Project mode: one “source of truth” file per project (cheap and robust)

For each project agent/workspace:

  • STATE.md (current goal, decisions, next steps, links, environment)
  • optional DECISIONS.md (append-only)
  • optional TODO.md
#

Rule of thumb: If it matters tomorrow, it goes in STATE.md today. This beats “search the whole chat” and keeps compaction from hurting.

5) Creative cost-savers beyond the obvious

  • Cheap default + expensive escalation: run a cheaper model for Buddy/project by default, and only switch/escalate for hard tasks (either via a second agent, or by temporarily changing model per session). The FAQ explicitly supports “fast chat agent + opus for coding” via multi-agent routing.
    https://docs.openclaw.ai/help/faq#can-i-run-a-fast-chat-agent-and-an-opus-for-coding-agent
    (I won’t name specific models here because what’s “best cheap” changes weekly; #models has current recs.)

  • Automation agent: put cron-heavy behavior on a separate agent with:

  • minimal tools

  • cheap model

  • minimal workspace
    so reminders don’t drag your big project/buddy prompts along for the ride.

  • Scope memory_search by context (if you must share one agent across multiple contexts): QMD has memory.qmd.scope rules that can allow/deny memory search based on session key prefixes. That lets you do “no memory_search in project topics” even if Buddy uses it. (This doesn’t stop the model from using what’s already in the transcript, but it prevents new memory recall injection.)

QMD: does it actually help cost + relevance?

Your assumption is directionally correct: