#QMD not indexing subfolders — how to add custom paths like /workspace/vault and /workspace/projects

torn thunder
#

I have QMD running and working for my memory folder but it's not picking up content in subdirectories. My agent can search memory files fine but can't find anything in /workspace/vault/working-context/ or /workspace/projects/refurb/ when asked about those topics.
What's the correct config key in openclaw.json to add additional folders as indexed QMD sources? And does each subfolder need adding separately or can I point it at a parent folder and have it recurse?
Running OpenClaw 2026.4.21 on Ubuntu 24.04 / WSL2.

raven matrixBOT
#

Use memory.qmd.paths for the explicit QMD config.

{
  memory: {
    backend: "qmd",
    qmd: {
      paths: [
        { name: "working-context", path: "/workspace/vault/working-context" },
        { name: "refurb", path: "/workspace/projects/refurb" }
      ]
    }
  }
}

A couple important bits:

  • paths[] entries are { path, name?, pattern? }
  • default pattern is **/*.md, so yes, it recurses
  • you do not need to add every subfolder separately; pointing at a parent folder indexes everything beneath it that matches the pattern

So these would also work if you want broader roots:

qmd: {
  paths: [
    { name: "vault", path: "/workspace/vault" },
    { name: "projects", path: "/workspace/projects" }
  ]
}

But I’d usually recommend indexing the narrower roots you actually want, otherwise recall gets noisier and reindexing gets heavier.
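
To gauge how much a broader root would pull in, a rough sketch like the one below counts Markdown files under a candidate folder. It only approximates the default **/*.md pattern using find; how QMD itself treats hidden folders or symlinks is not confirmed here, and count_md is a hypothetical helper name, not an OpenClaw command.

```shell
# Hypothetical helper: count Markdown files under a directory, recursively.
# Approximates a **/*.md pattern; the indexer's own glob rules may differ.
count_md() {
  find "$1" -type f -name '*.md' | wc -l | tr -d '[:space:]'
}

# Example (paths from the question; adjust to the real host paths):
#   count_md /workspace/vault
#   count_md /workspace/vault/working-context
```

Comparing the count for a parent folder against the count for the narrower folder gives a quick sense of how much extra material a broad root would add to the index.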

#

Also, by default QMD only auto-indexes:

  • MEMORY.md
  • memory/**/*.md

So other workspace folders are not picked up unless you add them.

After changing config, restart the gateway or wait for the next QMD refresh cycle, then check with pnpm openclaw memory status.

torn thunder
#

Added custom paths to openclaw.json per your earlier advice but after gateway restart and 8+ minutes, memory status still only shows "memory" as a source with 38 files. The new vault and projects paths are not appearing.
Config snippet:
"paths": [
  {"name": "vault", "path": "/workspace/vault"},
  {"name": "projects", "path": "/workspace/projects"}
]
Running OpenClaw 2026.4.21. Is there a way to force reindex, or is there another config key needed to activate custom paths?

#

?

raven matrixBOT
#

Yep, sorry, clearer version:

Simplest key: agents.defaults.memorySearch.extraPaths

{
  memory: {
    backend: "qmd"
  },
  agents: {
    defaults: {
      memorySearch: {
        extraPaths: [
          "/workspace/vault/working-context",
          "/workspace/projects/refurb"
        ]
      }
    }
  }
}

With QMD enabled, those get turned into QMD collections automatically.

A few notes:

  • Yes, directories recurse
  • You can point at a parent folder instead of every child
  • By default it looks for Markdown files recursively (**/*.md)

If you want more control, use the QMD-specific form instead:

{
  memory: {
    backend: "qmd",
    qmd: {
      paths: [
        { name: "working-context", path: "/workspace/vault/working-context" },
        { name: "refurb", path: "/workspace/projects/refurb" }
      ]
    }
  }
}

Use memory.qmd.paths if you want a custom name or pattern. Use extraPaths if you just want “index these folders too”.

torn thunder
#

Tried both methods you suggested. Neither is being picked up after gateway restart. Memory status still shows only "memory" source with 38 files.
Current config has both:
memory.qmd.paths = [vault, projects]
agents.defaults.memorySearch.extraPaths = [/workspace/vault, /workspace/projects]
Can you confirm the exact location in openclaw.json these keys need to sit? Happy to paste my full config if that helps diagnose it.
OpenClaw 2026.4.21

raven matrixBOT
#

Yeah, let’s pin this down.

memory and agents must be top-level keys in openclaw.json, with the path settings nested beneath them like this, not under plugins, tools, or channels:

{
  memory: {
    backend: "qmd",
    qmd: {
      paths: [
        { name: "vault", path: "vault", pattern: "**/*.md" },
        { name: "refurb", path: "projects/refurb", pattern: "**/*.md" }
      ]
    }
  },
  agents: {
    defaults: {
      workspace: "/actual/host/workspace/path",
      memorySearch: {
        extraPaths: [
          "vault",
          "projects/refurb"
        ]
      }
    }
  }
}

A few important gotchas:

  • memory status showing only the memory source is normal: memory is the source type, not a per-collection listing
  • paths / extraPaths are resolved on the gateway host filesystem, not inside the sandbox/tool view of /workspace, unless your real workspace actually is /workspace
  • default pattern is recursive **/*.md, so parent folders recurse fine
  • if those folders contain mostly non-.md files, they will not be indexed unless you change pattern
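
The host-path and file-type gotchas above can be checked from a shell on the gateway host. This is a minimal sketch under assumptions: resolve_in_workspace and ext_breakdown are hypothetical helper names rather than OpenClaw commands, and the workspace path is whatever agents.defaults.workspace points at on your machine.

```shell
# Hypothetical helpers, not OpenClaw commands.

# Print the resolved directory if <relative path> exists under <workspace>;
# otherwise report it on stderr and return non-zero.
resolve_in_workspace() {
  if [ -d "$1/$2" ]; then
    (cd "$1/$2" && pwd)
  else
    echo "missing on this host: $1/$2" >&2
    return 1
  fi
}

# List file extensions under a directory with counts, most common first,
# to spot folders that hold mostly non-.md files.
ext_breakdown() {
  find "$1" -type f -name '*.*' | sed 's/.*\.//' | sort | uniq -c | sort -rn
}

# Example, run on the gateway host:
#   resolve_in_workspace /actual/host/workspace/path vault/working-context
#   ext_breakdown /actual/host/workspace/path/vault
```

If the first call reports the folder as missing, the configured path is probably a sandbox path rather than a host path. If the second shows few md entries, the default pattern will skip most of the folder.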

So the biggest suspicion here is this:
/workspace/vault/... may be a sandbox path, not the real host path QMD can see.

If those folders are inside the agent workspace, prefer workspace-relative paths like:

qmd: {
  paths: [
    { path: "vault/working-context" },
    { path: "projects/refurb" }
  ]
}

torn thunder
#

Here's the --deep output. Still only showing memory source. Indexed count went from 38 to 54 chunks after the config change so something is being picked up, but the source still just says "memory". Is "memory" actually the source type covering all QMD collections, or are vault and projects genuinely not being indexed?

Memory Search (main)
Provider: qmd (requested: qmd)
Model: qmd
Sources: memory
Indexed: 54/3 files · 54 chunks
Dirty: no
Store: ~/.openclaw/agents/main/qmd/xdg-cache/qmd/index.sqlite
Workspace: ~/.openclaw/workspace
Dreaming: off
Embeddings: ready
By source:
  memory · 54/3 files · 54 chunks
Vector: ready
Batch: disabled (failures 0/0)
Recall store: 3 entries · 0 promoted · 3 concept-tagged · 0 spaced · scripts=3 latin
Recall path: ~/.openclaw/workspace/memory/.dreams/short-term-recall.json
Recall updated: 2026-04-23T20:41:40.868Z
QMD audit: ~/.openclaw/agents/main/qmd/xdg-cache/qmd/index.sqlite · 3739648 bytes · 4 collections
Dreaming artifacts: diary absent · 0 corpus files · ingestion state absent
Dream corpus: ~/.openclaw/workspace/memory/.dreams/session-corpus
Dream ingestion: ~/.openclaw/workspace/memory/.dreams/session-ingestion.json

raven matrixBOT
#

Yep, this is the key bit:

memory is the source type, not the individual QMD collection name.

So in your output:

  • Sources: memory = all non-session memory content
  • By source: memory = aggregated across all QMD memory collections
  • QMD audit ... 4 collections = the important signal

OpenClaw currently reports memory sources as coarse buckets like:

  • memory
  • sessions

It does not break status out by each QMD collection in that view.

#

So if you previously had the 2 default QMD collections, and now --deep shows 4 collections, that matches:

  1. MEMORY.md
  2. memory/**/*.md
  3. your vault path
  4. your projects path

And since your chunk count went 38 → 54, I’m pretty confident those extra paths are being indexed now.

So the issue has probably shifted from indexing to retrieval/relevance.

Quick checks next:

  • make sure the files in those folders are actually *.md
  • test with a very unique phrase from one of those files
  • paste one example file path plus the exact question you asked the agent, and I can help narrow down whether it’s query wording, QMD relevance, or something else
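
For the first two checks, a small sketch along these lines can confirm that a phrase really lives in a .md file under the indexed folder before you test retrieval with it. files_with_phrase is a hypothetical helper name, and the example path assumes the workspace shown in your status output.

```shell
# Hypothetical helper: list Markdown files under <root> that contain <phrase>
# as a fixed string. Relies on grep's --include option (GNU grep on Ubuntu).
files_with_phrase() {
  grep -rlF --include='*.md' -- "$1" "$2"
}

# Example:
#   files_with_phrase "some unique phrase" "$HOME/.openclaw/workspace/vault"
```

A phrase that shows up in exactly one file makes the cleanest retrieval test, since any hit can only have come from that file.
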
torn thunder
#

Indexed confirms 4 collections, 54 chunks. But when I ask the agent about "Lettio" — a specific tool only in the indexed vault file — she answers from general knowledge, not the file. She's not retrieving from the custom paths at all. File is at workspace/vault/working-context/property-mgmt-research.md. How do I confirm QMD is actually being queried for those collections at inference time, not just indexed?

raven matrixBOT
#

Yep, there are 2 separate things to verify now, plus a follow-up if the first passes and the second does not:

1) Prove QMD itself can retrieve that file

Run this locally:

pnpm openclaw memory search "Lettio" --json

If QMD is really searching the custom collections, you should get a hit from something like:

qmd/<collection>/property-mgmt-research.md

with a snippet from that file.

#

If that command does not return the file, the problem is still in QMD retrieval/indexing.
If it does return the file, then QMD is fine and the problem is higher up, in chat-time memory usage.
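
To make that pass/fail check less dependent on reading the output by eye, one option is to pipe the search result through a plain substring filter. This is only a sketch: mentions_file is a hypothetical helper, and it deliberately assumes nothing about the layout of the --json output.

```shell
# Hypothetical helper: succeed if the text on stdin mentions <file name>.
# Plain substring match, so it does not depend on the JSON layout.
mentions_file() {
  grep -qF -- "$1"
}

# Example:
#   pnpm openclaw memory search "Lettio" --json \
#     | mentions_file property-mgmt-research.md && echo "QMD returned the file"
```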

2) Prove the live chat session is actually consulting memory

For that, test in a private session / local TUI / DM if possible, not a public thread, because trace output can leak context.

Turn on:

/verbose on
/trace on

Then ask about Lettio again.

What you should see:

  • /verbose on → an Active Memory status line after the reply
  • /trace on → an Active Memory Debug summary after the reply

If the model is doing a normal tool call, verbose can also show memory_search activity.
If it’s using the Active Memory plugin path, the debug lines are the better signal.

3) If CLI search works but chat still ignores it

Then check:

  • plugins.entries.active-memory.enabled: true
  • your current agent is included in plugins.entries.active-memory.config.agents
  • pnpm openclaw logs --follow

If needed, temporarily enable:

plugins: {
  entries: {
    "active-memory": {
      config: {
        logging: true
      }
    }
  }
}

Then watch logs for active-memory start/done lines.

What your current output already proves

4 collections means the extra QMD collections exist.
What it doesn’t prove is that the agent is using them during reply generation.

torn thunder
#

QMD search works perfectly, returns 3 hits from vault files with scores 0.74-0.78. But live chat ignores them completely and answers from general knowledge. So the issue is Active Memory not consulting QMD at chat time. How do I verify active-memory plugin is enabled and covering the main agent?

raven matrixBOT
#

Yep, if pnpm openclaw memory search "Lettio" works, then QMD is fine. The missing layer is almost certainly Active Memory config/scope.

Check these exact keys in openclaw.json:

{
  plugins: {
    entries: {
      "active-memory": {
        enabled: true,
        config: {
          enabled: true,
          agents: ["main"],
          allowedChatTypes: ["direct", "channel"]
        }
      }
    }
  }
}

The important bits:

  • plugins.entries["active-memory"].enabled: true = plugin loaded
  • plugins.entries["active-memory"].config.enabled: true = globally on
  • config.agents must include "main"
  • config.allowedChatTypes must include "channel" for this Discord thread

That last one is the big gotcha. The default is:

allowedChatTypes: ["direct"]

So by default Active Memory will not run in channel/thread sessions, even though QMD indexing/search works fine.

Good verification steps:

  • /active-memory status --global → shows global plugin state
  • /active-memory status → shows session-level override
  • restart gateway after config change

If you want live proof, safest in a DM/local chat, not a public thread:

  • /verbose on
  • /trace on