#[Plugin] episodic-claw – long-term memory for OpenClaw agents, built on actual "arXiv" research


flat plaza
#

Hey folks,
I've been hanging around this community for a bit and figured it was time to actually share something.

I built a memory plugin for OpenClaw called episodic-claw. It's v0.1.1, it has rough edges, and I won't pretend otherwise. But the thing it's trying to solve is real, and the approach is grounded in actual arXiv research rather than vibes.

The problem with most memory plugins

They either clip a sliding window (so you just lose old context), or they do naive RAG (dump everything into a vector store and hope the right chunk floats up).

Neither one really fixes agent amnesia. The agent isn't remembering anything, it's just doing text similarity on whatever you stuffed into a database.

What episodic-claw does instead

It pulls from three papers on episodic memory in language agents:

  • Bayesian Surprise scoring (arXiv:2310.08560) detects when a topic actually shifts and seals that chunk as an episode. No fixed schedule, no noise, just real conversation boundaries.
  • HNSW vector search (arXiv:2407.09450) means recalled memories are semantically ranked, not keyword matched.
  • A D0/D1 memory hierarchy (arXiv:2502.06975) lets raw episodes get distilled into summaries over time, so recall doesn't slow down as memory grows.
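
To make the first bullet concrete, here is a minimal sketch of surprise-driven episode sealing. It is an illustration under assumptions, not the plugin's actual code: surprise is approximated here by the cosine distance between consecutive turn embeddings (a stand-in for a model-based surprise score), and the names `findEpisodeBoundaries`, `gamma`, `minDelta`, and `minHistory` are hypothetical.

```typescript
// Hypothetical sketch, not episodic-claw's implementation.
// Assumption: "surprise" of a turn = cosine distance to the previous turn's embedding.

type Vec = number[];

function cosineDistance(a: Vec, b: Vec): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  if (na === 0 || nb === 0) return 1; // treat zero vectors as maximally distant
  return 1 - dot / Math.sqrt(na * nb);
}

/**
 * Returns the turn indices at which a new episode starts.
 * A boundary is declared when a turn's surprise exceeds
 * mean + max(gamma * stddev, minDelta) of the surprises seen so far
 * in the current episode. There is no fixed schedule.
 */
function findEpisodeBoundaries(
  turns: Vec[],
  gamma = 2,
  minDelta = 0.05,
  minHistory = 2,
): number[] {
  const boundaries: number[] = [];
  let history: number[] = [];
  for (let i = 1; i < turns.length; i++) {
    const surprise = cosineDistance(turns[i - 1], turns[i]);
    if (history.length >= minHistory) {
      const mean = history.reduce((s, x) => s + x, 0) / history.length;
      const variance =
        history.reduce((s, x) => s + (x - mean) ** 2, 0) / history.length;
      // minDelta keeps floating-point noise from triggering a boundary
      const threshold = mean + Math.max(gamma * Math.sqrt(variance), minDelta);
      if (surprise > threshold) {
        boundaries.push(i); // seal the previous episode, start a new one at turn i
        history = [];
        continue;
      }
    }
    history.push(surprise);
  }
  return boundaries;
}
```

With four turns pointing roughly one way and three pointing roughly the other, the only boundary lands on the first turn of the second group.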

In practice: every prompt your agent sees already has the most relevant past episodes prepended, automatically. It'll remember what it was working on three weeks ago without you touching anything.
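
As a rough sketch of what "prepended automatically" can look like: rank stored episodes against the prompt's embedding, take the top few, and put them in front of the prompt. Everything here is an assumption for illustration: the `Episode` shape, the function names, and the prepended block's wording are hypothetical, and a brute-force cosine scan stands in for the HNSW index.

```typescript
// Hypothetical sketch, not the plugin's API.
// Assumption: a linear cosine scan replaces the HNSW lookup for brevity.

interface Episode {
  id: string;
  summary: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

/** Returns the k episodes most similar to the query embedding, best first. */
function recallTopK(query: number[], episodes: Episode[], k = 3): Episode[] {
  return episodes
    .map((episode) => ({ episode, score: cosineSimilarity(query, episode.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.episode);
}

/** Puts the recalled episodes in front of the prompt; leaves it untouched if none. */
function prependMemories(prompt: string, recalled: Episode[]): string {
  if (recalled.length === 0) return prompt;
  const block = recalled.map((e) => `- [${e.id}] ${e.summary}`).join("\n");
  return `Relevant past episodes:\n${block}\n\n${prompt}`;
}
```

The linear scan costs O(n) per prompt, which is the part an approximate index such as HNSW is meant to avoid once the store gets large.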

#

The stack

  • TypeScript plugin layer on the OpenClaw extension API
  • Go sidecar handles embedding and vector indexing (Pebble DB + HNSW)
  • Gemini Embedding API for vectors (your key stays in your env, never stored)
  • MPL-2.0, free forever, forks have to stay open

What doesn't work yet (being honest)

  • You need a Gemini API key (free tier is fine though)
  • The Go binary (~24 MB) downloads from GitHub Releases at install time
  • D1 distillation is designed but not shipped yet
  • Cross-agent memory sharing is Phase 6, not today

Install

openclaw plugins install clawhub:episodic-claw
Or just clone it: https://github.com/YoshiaKefasu/episodic-claw

If you run into anything weird, open an issue or drop it here. And if you're actually building something on top of this, I really want to hear about it.

#


ornate token
#

Prepending is a solid choice; I've been dropping in YAML prepends since early 2025 with good results. Vector search is also the only way to go. Keyword search is always a backup option, but it should never be needed with proper vector encoding, so that's solid as well.

I'm very interested in your memory hierarchy and the Bayesian surprise scoring. The embedding API looks solid, though I'd probably default to a free option. I have AI memories going back to 2021 on the early GPT-3 betas (pure raw storage, direct imports), plus every text-readable note I ever created going back fairly far, so there isn't really any API that's cheap for me. Switching costs run roughly 2 weeks of dedicated encoding, and that's if the database plays nice (Postgres, Supabase, and the LanceDB that came with claw, which I've modified).

I'll definitely check it out and look at your implementation under the hood. I'm especially curious whether you chose optional local DB options over the session md files (no offense, Peter, but that was a rookie choice most of us solved in 2024 for internal memories, lol). Markdown is insufficient for backups, Docker sandboxes, and built-in recoverability. Only the newbs wipe out databases with AI; we generally force AI to go through a dedicated gateway/agent/API with set limits, and we back up 1-2-3, including internal differential backups, etc.

I hate providing feedback when I haven't tried a thing, but I do want to say what I'll be looking for:

sqlite/pg local DB support as the basics; on-prem MS SQL/MySQL with a good credential store method; remote API options (Supabase, Vercel, etc.) with API-key credential store options; and how this handles vector encoding done natively on the platform vs. pre-encoding on the local machine. I'll also be curious whether the quality threshold for the Bayesian scores is hard-coded, set through a variable, or chosen dynamically (adjusting up and down from, say, a 0.8 baseline depending on the number of memories, detecting memory collisions so the same memory isn't resurfaced too soon, etc.).

Gimme a few days to test! And TY for the contribution!

flat plaza
#

Hey DrDew, thanks for the detailed breakdown, this is exactly the kind of feedback worth having before someone with 5 years of memory data tries it and hits a wall.

On the markdown vs db debate: episodic-claw is markdown-first by design. Episodes are written as readable markdown files on disk first, and Pebble DB is just the index on top of them. If the DB gets corrupted or wiped, you don't lose anything. You just rebuild the vector index from the existing episode files. Closer to "git repo with a search index" than "database that happens to store text." That said, I hear you on the broader recoverability point and I want to make the rebuild path more explicit in the docs.
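
For what that rebuild path could look like in the abstract, here is a minimal sketch. It is not the plugin's code: the `EpisodeFile` and `IndexEntry` shapes and the `rebuildIndex` name are hypothetical, reading the files off disk is left to the caller, and the embedder is passed in as a plain synchronous function to keep the sketch short (a real embedding call would be asynchronous).

```typescript
// Hypothetical sketch of "markdown is the source of truth, the index is derived".
// Assumption: the caller has already read the episode files from disk.

interface EpisodeFile {
  path: string;     // e.g. "episodes/2025-01-03-refactor.md" (illustrative)
  markdown: string; // full file contents
}

interface IndexEntry {
  path: string;
  vector: number[];
}

// Stand-in for whatever produces embeddings (remote API or local model).
type Embedder = (text: string) => number[];

/**
 * Recomputes the whole vector index from the episode files alone.
 * Nothing from the old index is needed, so a corrupted or deleted
 * index can simply be thrown away and regenerated.
 */
function rebuildIndex(files: EpisodeFile[], embed: Embedder): Map<string, IndexEntry> {
  const index = new Map<string, IndexEntry>();
  for (const file of files) {
    if (!file.path.endsWith(".md")) continue; // ignore non-episode files
    const text = file.markdown.trim();
    if (text.length === 0) continue; // nothing to embed
    index.set(file.path, { path: file.path, vector: embed(text) });
  }
  return index;
}
```

The point of the design is that the index holds nothing that cannot be recomputed, so backups only need to cover the markdown directory.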

On embedding API costs with that volume of data: yeah, Gemini free tier won't survive 5 years of GPT-3 era memories without hitting limits. A local embedding option (nomic-embed or a llama.cpp endpoint) is a real gap I want to close.

On DB backends: right now it's Pebble only because I wanted zero config for the initial release. sqlite/pg, on-prem SQL, Supabase and other remote options are all on the table if there's enough interest from the community. If people actually want it, I'll build it.

Dynamic Bayesian thresholds are a concrete improvement I hadn't framed that clearly before you wrote it out. An adaptive baseline based on memory density, plus collision detection to avoid re-surfacing the same episode too soon: I'm adding both to the roadmap properly.

Take your time testing. If something breaks with your data volume, I genuinely want to know 👍
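
One possible shape for the adaptive threshold and collision detection discussed above, purely as a sketch of a roadmap idea and not something the plugin ships: the 0.8 baseline comes from the suggestion in this thread, while the logarithmic schedule, the clamp range, the cooldown length, and all names are made-up illustrative choices.

```typescript
// Hypothetical sketch of a roadmap idea; none of this exists in the plugin.

interface ScoredEpisode {
  id: string;
  score: number; // similarity to the current prompt, higher is better
}

/**
 * Moves the recall threshold with store size: `step` per order of magnitude
 * relative to 100 episodes, clamped to [min, max]. The schedule is arbitrary.
 */
function adaptiveThreshold(
  episodeCount: number,
  base = 0.8,
  step = 0.03,
  min = 0.7,
  max = 0.95,
): number {
  const magnitude = Math.log10(Math.max(episodeCount, 1) / 100);
  return Math.min(max, Math.max(min, base + step * magnitude));
}

/** Drops weak matches and episodes that were surfaced too recently. */
class RecallFilter {
  private lastSurfaced = new Map<string, number>();
  private cooldownTurns: number;

  constructor(cooldownTurns = 5) {
    this.cooldownTurns = cooldownTurns;
  }

  select(candidates: ScoredEpisode[], threshold: number, turn: number): ScoredEpisode[] {
    const picked = candidates.filter((c) => {
      if (c.score < threshold) return false;
      const last = this.lastSurfaced.get(c.id);
      // "collision": the same episode was already surfaced within the cooldown window
      return last === undefined || turn - last >= this.cooldownTurns;
    });
    for (const c of picked) this.lastSurfaced.set(c.id, turn);
    return picked;
  }
}
```

A typical call would be `filter.select(candidates, adaptiveThreshold(totalEpisodes), currentTurn)`, where `totalEpisodes` and `currentTurn` are whatever counters the host keeps.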

ornate token
#

Absolutely, I'll bounce it around fairly hard for you. I cloned the repo, and I'll make sure to specifically look for break points. Yeah, the free tier does break, but I have two pre-vectorized backups made with two local models, so I'll look at an import and a slight mod just to use one of those. My tokens are tied up with a software project this weekend, but I'll write up a breakdown in markdown and have the model do some auto testing when my tokens free up. Good stuff, and I love that you made this research-based. I'll ping you in this thread when ready.

flat plaza
#

Appreciate it, seriously. Having someone actually stress test it with real data is way more valuable than lab conditions.

Fair warning though: the implementation comments and /docs are written in Japanese. Everything is in there, pretty detailed, but if you hit a wall reading through it just throw it at your agent. It'll probably handle it better than a Google Translate paste anyway lol. Ping me whenever you're ready 👍