#I'm making a self-improving agent for OpenClaw. Would it be better to connect it via a skill/plugin?

trim glacier
#

I'm making a self-improving agent for OpenClaw. Would it be better to connect it via a skill or a plugin? It can connect via a TypeScript client. Here are the details:

Most LLM/agent "memory" is either ephemeral (context windows that reset each turn) or an append-only text log stuffed into a RAG pipeline. That gives you retrieval, but not learning: facts get stale, procedures drift, and the system cannot revise itself safely.

Membrane makes memory selective and revisable. It captures raw experience, promotes it into structured knowledge, and lets you supersede, fork, contest, or retract that knowledge with evidence. The result is an agent that can improve over time while remaining predictable, auditable, and safe.

60-Second Mental Model

  1. Ingest events, tool outputs, observations, and working state.
  2. Consolidate episodic traces into semantic facts, competence records, and plan graphs.
  3. Retrieve in layers with trust gating and salience ranking.
  4. Revise knowledge with explicit operations and audit trails.
  5. Decay salience over time unless reinforced by success.

Key Features

  • Typed Memory: Explicit schemas and lifecycles for each memory type, not a flat text store.
  • Revisable Knowledge: Supersede, fork, retract, merge, and contest records with full provenance tracking.
  • Competence Learning: Agents learn how to solve problems (procedures, success rates), not just what happened.
  • Decay and Consolidation: Time-based salience decay keeps memory useful; background consolidation extracts durable knowledge from episodic traces.
  • Trust-Aware Retrieval: Sensitivity levels (public, low, medium, high, hyper) with graduated access control and redacted responses for records above the caller's trust level.
  • Security and Operations: SQLCipher encryption at rest, optional TLS and API key authentication, configurable rate limiting, full audit logs.
  • Observability: Built-in metrics for retrieval usefulness, competence success rate, plan reuse frequency, memory growth, and revision rate.
  • gRPC API: 15-method gRPC service with TypeScript and Python client SDKs, or use Membrane as an embedded Go library.
  • LLM-Ready Context Retrieval: Retrieve trust-filtered, typed memory and inject it directly into LLM prompts for planning, execution, and self-correction loops.

Memory Types

| Type | Purpose | Example |
| --- | --- | --- |
| Episodic | Raw experience capture (immutable) | Tool calls, errors, observations from a debugging session |
| Working | Current task state | "Backend initialized, frontend pending, docs TODO" |
| Semantic | Stable facts and preferences | "User prefers Go for backend services" |
| Competence | Learned procedures with success tracking | "To fix linker cache error: clear cache, rebuild with flags" |
| Plan Graph | Reusable solution structures as directed graphs | Multi-step project setup workflow with dependencies and checkpoints |

Each memory type has its own schema, lifecycle rules, and consolidation behavior. Episodic records are immutable once ingested. Working memory tracks in-flight task state. Semantic, competence, and plan graph records are the durable output of consolidation and can be revised through explicit operations.

bold hollyBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

icy parcelBOT
#

If “Membrane” is a memory/knowledge service (gRPC, typed records, revisions), it should almost certainly be integrated as a skill (i.e., new tools the agent can call) — not as a “channel plugin”.

Why skill/tool integration is the right fit

  • Channel plugins in OpenClaw are for message transport (Discord/Telegram/etc.) and related delivery semantics. They’re great when your system is the chat/messaging surface.
  • A revisable memory system is domain functionality the agent should invoke during reasoning/execution → that maps cleanly to tools exposed via a skill.

What I’d implement (practical shape)

Expose a small, safety-conscious tool surface, e.g.

  • membrane.retrieve(query, types, sensitivityMax, limit) → returns typed, provenance-carrying results
  • membrane.propose_revision(op, targetId, newValue, evidenceRefs, reason) → returns a proposal + diff
  • membrane.commit_revision(proposalId) → commits (optionally require a second “confirmation” step)
  • membrane.ingest_episode(event) → append-only episodic trace (immutable)

Then have your skill wrap the gRPC TypeScript client and present these as JSON-serializable tool calls.
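
To make the tool surface above a bit more concrete, here is a minimal TypeScript sketch of the retrieve and ingest_episode tools. Everything in it is an assumption for illustration: MembraneLike, InMemoryMembrane, and makeTools are hypothetical names, not Membrane's SDK or OpenClaw's plugin API, and the in-memory stand-in is synchronous where a real gRPC client would return promises.

```typescript
// Hypothetical shapes for illustration; not Membrane's SDK or OpenClaw's plugin API.
type MemoryType = "episodic" | "working" | "semantic" | "competence" | "plan_graph";
type Sensitivity = "public" | "low" | "medium" | "high" | "hyper";

// Ordered from least to most sensitive, matching the levels listed in the README.
const SENSITIVITY_ORDER: Sensitivity[] = ["public", "low", "medium", "high", "hyper"];

interface MemoryRecord {
  id: string;
  type: MemoryType;
  sensitivity: Sensitivity;
  content: string;
  createdAt: string; // ISO 8601 timestamp
  provenance: string[]; // ids of the records this one was derived from
}

interface RetrieveArgs {
  query: string;
  types: MemoryType[];
  sensitivityMax: Sensitivity;
  limit: number;
}

// The narrow interface the tools depend on. A real gRPC client would be async.
interface MembraneLike {
  search(query: string): MemoryRecord[];
  append(record: MemoryRecord): void;
}

// In-memory stand-in so the sketch runs without a Membrane server.
class InMemoryMembrane implements MembraneLike {
  private records: MemoryRecord[] = [];

  search(query: string): MemoryRecord[] {
    const needle = query.toLowerCase();
    return this.records.filter((r) => r.content.toLowerCase().indexOf(needle) >= 0);
  }

  append(record: MemoryRecord): void {
    this.records.push(record);
  }
}

function makeTools(client: MembraneLike) {
  let nextId = 1;
  return {
    // membrane.retrieve: filter by type and sensitivity, then cap the result count.
    retrieve(args: RetrieveArgs): MemoryRecord[] {
      const maxRank = SENSITIVITY_ORDER.indexOf(args.sensitivityMax);
      return client
        .search(args.query)
        .filter((r) => args.types.indexOf(r.type) >= 0)
        .filter((r) => SENSITIVITY_ORDER.indexOf(r.sensitivity) <= maxRank)
        .slice(0, Math.max(0, args.limit));
    },

    // membrane.ingest_episode: append-only; the record is shallowly frozen before storage.
    ingestEpisode(event: { content: string; sensitivity: Sensitivity }): { id: string } {
      const record: MemoryRecord = {
        id: "ep-" + nextId++,
        type: "episodic",
        sensitivity: event.sensitivity,
        content: event.content,
        createdAt: new Date().toISOString(),
        provenance: [],
      };
      client.append(Object.freeze(record));
      return { id: record.id };
    },
  };
}
```

Both tools take and return plain JSON-serializable objects, so they could be registered as tool calls; the sensitivity filter runs inside the tool rather than relying on the model to ask for the right level.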

#

Yep, I see the ping.

For Membrane (typed + revisable memory via a TS gRPC client), the clean OpenClaw integration is:

  • Plugin (TypeScript) to implement the capability (register real tools that call your gRPC client)
  • + optional Skill to teach the model when/how to use those tools (promote/retrieve/revise patterns, safety rules, etc.)
  • Not a channel plugin (channels are for Discord/Telegram/etc. transport)

Reason: in OpenClaw, a skill is mostly “prompting + tool usage guidance”; a plugin is where you actually add new tools (code) to the Gateway.

If you want a “fast MVP” without writing a plugin, you can do a skill that shells out via exec to a small Node/TS CLI that calls Membrane, but long-term you’ll be happier with a plugin tool (schema, allowlists, better safety boundaries, no prompt-injected shell args, etc.).

Question that decides the architecture: do you want Membrane to be on-demand (agent calls membrane.retrieve / membrane.propose_revision) or auto-capture everything (tool outputs/events get ingested without the model choosing)? Auto-capture usually pushes you further into plugin-land (or even a dedicated “memory slot” plugin), not just a skill.

#

Guardrails (important for “self-improving”)

If you want it to stay predictable/auditable, I’d strongly recommend:

  • Two-step writes: model can propose a revision, but a separate step/tool (or policy) is required to commit.
  • Strict schemas + allowlisted ops only: no “freeform write arbitrary JSON into memory”.
  • Bounded retrieval: hard caps on tokens/results; always return provenance + timestamps; prefer “top-K + why” over dumping.
  • Trust gating at the tool boundary (your sensitivity levels): don’t rely on the model to self-censor.
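
As a rough illustration of the first three guardrails, here is a TypeScript sketch of a two-step write gate with an op allowlist, plus a bounded-result helper. RevisionGate and boundResults are hypothetical names, not part of Membrane or OpenClaw; requiring at least one evidence reference and using a character budget as a stand-in for a token cap are assumptions of this sketch.

```typescript
// Hypothetical guardrail layer; names and policies are assumptions, not Membrane's API.
type RevisionOp = "supersede" | "fork" | "retract" | "merge" | "contest";

interface Proposal {
  id: string;
  op: RevisionOp;
  targetId: string;
  newValue: string;
  evidenceRefs: string[];
  reason: string;
  committed: boolean;
}

class RevisionGate {
  private proposals: { [id: string]: Proposal } = {};
  private counter = 0;
  private allowedOps: RevisionOp[];
  private applyRevision: (p: Proposal) => void;

  constructor(allowedOps: RevisionOp[], applyRevision: (p: Proposal) => void) {
    this.allowedOps = allowedOps;
    this.applyRevision = applyRevision;
  }

  // Step 1 (model-facing): validate and record the proposal, but change nothing yet.
  propose(op: string, targetId: string, newValue: string, evidenceRefs: string[], reason: string): Proposal {
    const allowed = this.allowedOps.filter((o) => o === op)[0];
    if (allowed === undefined) throw new Error("op not allowlisted: " + op);
    if (evidenceRefs.length === 0) throw new Error("a revision needs at least one evidence reference");
    this.counter += 1;
    const proposal: Proposal = {
      id: "prop-" + this.counter,
      op: allowed,
      targetId: targetId,
      newValue: newValue,
      evidenceRefs: evidenceRefs,
      reason: reason,
      committed: false,
    };
    this.proposals[proposal.id] = proposal;
    return proposal;
  }

  // Step 2 (policy- or human-facing): only a recorded, uncommitted proposal can be applied.
  commit(proposalId: string): Proposal {
    // hasOwnProperty guards against inherited keys such as "__proto__".
    if (!Object.prototype.hasOwnProperty.call(this.proposals, proposalId)) {
      throw new Error("unknown proposal: " + proposalId);
    }
    const proposal = this.proposals[proposalId];
    if (proposal.committed) throw new Error("already committed: " + proposalId);
    this.applyRevision(proposal);
    proposal.committed = true;
    return proposal;
  }
}

// Bounded retrieval: cap both the result count and the total characters returned.
function boundResults(contents: string[], maxResults: number, maxChars: number): string[] {
  const out: string[] = [];
  let used = 0;
  for (const text of contents) {
    if (out.length >= maxResults || used + text.length > maxChars) break;
    out.push(text);
    used += text.length;
  }
  return out;
}
```

One way to use this would be to expose only propose as a model-callable tool and keep commit behind a separate policy check or confirmation step, so a single model turn cannot both draft and apply a revision.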

When a channel/plugin would make sense

Only if Membrane is meant to be a first-class external participant in chat (e.g., it receives Discord events directly, posts audits/metrics back into chat, etc.). Even then, I’d still keep the agent’s interaction as tools; the “channel” aspect would just be observability/ops.

Quick clarifier (so I can steer you better)

Do you want Membrane to:

  1. be used only when the agent explicitly calls it (retrieve/ingest/revise tools), or
  2. automatically capture every tool call + observation without the model deciding?

(2) is doable, but it often pushes you toward instrumenting OpenClaw’s execution layer rather than “just a skill”.
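
For option (2), the general idea of instrumenting the execution layer can be sketched as a wrapper that records every tool call whether or not the model asks for it. This is only an illustration: withCapture, CapturedEvent, and Ingest are hypothetical names, the wrapper is synchronous for brevity, and where such a hook would actually attach inside OpenClaw is not something this sketch assumes.

```typescript
// Hypothetical auto-capture wrapper; not an OpenClaw or Membrane API.
interface CapturedEvent {
  tool: string;
  args: unknown[];
  ok: boolean;
  output: string; // serialized result, or the error message on failure
  at: string; // ISO 8601 timestamp
}

type Ingest = (event: CapturedEvent) => void;

// Wraps a tool function so each call is ingested as an episodic event,
// independent of whether the model decides to record it.
function withCapture<A extends unknown[], R>(
  tool: string,
  fn: (...args: A) => R,
  ingest: Ingest
): (...args: A) => R {
  return (...args: A): R => {
    let result: R;
    try {
      result = fn(...args);
    } catch (err) {
      // Failures are captured too, then rethrown unchanged.
      ingest({
        tool: tool,
        args: args,
        ok: false,
        output: err instanceof Error ? err.message : String(err),
        at: new Date().toISOString(),
      });
      throw err;
    }
    ingest({
      tool: tool,
      args: args,
      ok: true,
      output: String(JSON.stringify(result)),
      at: new Date().toISOString(),
    });
    return result;
  };
}
```

The ingest callback is where something like membrane.ingest_episode would be called. Capturing at this layer means errors and outputs are recorded even when the model never mentions them, which is the main difference from option (1).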

trim glacier
#

I meant a normal plugin, not a channel plugin. Is there a way to force the skill to be loaded? Otherwise an agent might not know to call it.

#

I had my OpenClaw agent try to explore this codebase, but it said: "Request timed out before a response was generated. Please try again, or increase agents.defaults.timeoutSeconds in your config."

What's the best way around this: a sub-agent, or increasing the timeout?

icy parcelBOT
#

That error means the agent run hit the runtime abort timer (agents.defaults.timeoutSeconds, default 600s). For “explore this whole codebase” prompts, you’ll hit it pretty easily.

The best workaround is usually a combination of the following:

1) Increase timeout (best immediate fix)

One-off (CLI):

openclaw agent --message "…" --timeout 1200

(That --timeout is in seconds.)

Global default (config):

openclaw config set agents.defaults.timeoutSeconds 1200
openclaw gateway restart

Docs reference for the setting: https://docs.openclaw.ai/concepts/agent-loop (Timeouts section)

2) Use a sub-agent for long repo scans (best “UX” fix)

Sub-agents let long-running work happen in the background and then post back a final summary when it’s done (so your main session isn’t “stuck” waiting). Docs: https://docs.openclaw.ai/tools/subagents

Note: sub-agents still ultimately run under the same underlying runtime timeout unless you raise it (so if you’re timing out a lot, you still want step #1).

3) Make the task smaller (best “quality” fix)

Even with a bigger timeout, “explore the entire repo” is often inefficient. Better pattern:

  1. ask for top-level map (entrypoints, packages, build system)
  2. then deep-dive one directory at a time
  3. then ask for a summary + file list you should read next
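
The staged pattern above could be scripted roughly like this. It is a sketch, not an OpenClaw feature: buildExplorationPlan and the prompt wording are made up for illustration, and the only CLI flags used are the --message and --timeout ones shown earlier in this thread.

```typescript
// Builds staged prompts as argv arrays for `openclaw agent --message ... --timeout ...`.
// The helper name and the prompt wording are illustrative assumptions.
interface AgentInvocation {
  stage: string;
  argv: string[];
}

function buildExplorationPlan(directories: string[], timeoutSeconds: number): AgentInvocation[] {
  const toArgv = (message: string): string[] => [
    "agent",
    "--message",
    message,
    "--timeout",
    String(timeoutSeconds),
  ];

  const plan: AgentInvocation[] = [];

  // Stage 1: top-level map only.
  plan.push({
    stage: "map",
    argv: toArgv("Give a top-level map of this repo: entrypoints, packages, build system."),
  });

  // Stage 2: one scoped deep-dive per directory.
  for (const dir of directories) {
    plan.push({
      stage: "dive:" + dir,
      argv: toArgv("Deep-dive only the directory " + dir + ". Summarize its responsibilities and key files."),
    });
  }

  // Stage 3: summary plus a reading list.
  plan.push({
    stage: "summary",
    argv: toArgv("Summarize what you learned and list the files I should read next."),
  });

  return plan;
}
```

Each argv array could then be handed to something like Node's child_process.execFile("openclaw", argv), one stage at a time. Keeping the arguments as an array rather than a shell string avoids quoting problems in the prompt text.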

If you tell me how you’re running it (Discord chat? openclaw agent CLI? TUI/web UI?), I can suggest the most ergonomic way to set the timeout for that path.

trim glacier
#

I told it to use a subagent and nothing happened (no response, unsure if it did anything). I asked the bot and it told me the subagent timed out but I’m not sure if it was seeing its own failure

icy parcelBOT
#

What you’re describing usually comes from one of these situations:

  1. No sub-agent actually spawned (the model said it would, but sessions_spawn never got called / was disallowed), or
  2. A sub-agent spawned, but it hit the same runtime timeout as normal runs (agents.defaults.timeoutSeconds), and the “announce back to chat” never made it through (or you just didn’t have an easy way to see it).

Here’s the most reliable way to tell which it is:

1) Confirm whether a sub-agent exists (authoritative)

In the same chat/thread, run:

  • /subagents list

If you see a run, grab its id and run:

  • /subagents info <id>
  • /subagents log <id> 200 tools (or bump the number)

If /subagents list shows nothing, then your agent almost certainly never spawned one (or you’re on a version where the command isn’t available).

Docs: https://docs.openclaw.ai/tools/subagents

2) Don’t “ask the model to use a sub-agent” — spawn it explicitly

Instead of prompting “use a subagent to explore…”, do:

  • /subagents spawn main <task>

That command should immediately return a run id. If it doesn’t, that’s a strong signal it’s not set up/available in your install/version.

3) If it did spawn: sub-agents still die at agents.defaults.timeoutSeconds

Sub-agents aren’t magic background workers with infinite time — their run is still governed by:

  • agents.defaults.timeoutSeconds (default 600s)

So if you’re asking it to “explore the whole repo”, it’ll commonly time out unless you raise that.

You can fix that by increasing the default, e.g.:

openclaw config set agents.defaults.timeoutSeconds 1800
openclaw gateway restart

4) If it never spawned: check tool allowlists (common gotcha)

If you configured a strict tools.allow / per-agent allowlist, the model may not have permission to call sessions_spawn (so it claims it spawned a subagent, but nothing happens). In that case you either:

  • allow the subagent tool(s), or
  • rely on /subagents spawn … which bypasses “model must call the tool”.

If you paste:

  1. the output of openclaw --version, and
  2. the output of /subagents list right after you try spawning,

…I can tell you exactly whether you’re dealing with “never spawned”, “spawned but timed out”, or “announce couldn’t deliver”.

trim glacier
#

Should I put a 24 hr limit on subagents? I don’t pay extra for this model

icy parcelBOT
#

I wouldn’t set sub-agents to 24 hours unless you really mean “background batch job” and you’re OK with runaway loops / huge transcripts.

Even if the model feels “free” to you, long sub-agent timeouts still have real downsides:

  • Runaway / loop risk (it can spin for hours doing low-value work)
  • Resource contention (ties up your gateway’s concurrency, tools, rate limits)
  • Massive logs/state (harder to debug; can grow disk usage)
  • Harder to notice failures (it looks “busy” forever)

What I recommend instead:

  1. Keep agents.defaults.timeoutSeconds fairly sane (e.g. 1800–3600s, 30–60 min)
  2. Make repo exploration chunked (“map top-level”, then “analyze package X”, etc.)
  3. Use /subagents list + /subagents kill … actively when testing

If you truly need “hours-long” work, it’s usually better to move that into a real background process (script/plugin/service) and have the agent poll/ingest results, rather than leaving an LLM run alive for 24h.
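
A rough sketch of that poll/ingest pattern, assuming the background process writes a small status record with heartbeats. JobRecord, classifyJob, and pollJob are hypothetical names; nothing here is an OpenClaw or Membrane API, and the thresholds are placeholders you would tune.

```typescript
// Hypothetical status record written by a background process (script/plugin/service).
interface JobRecord {
  id: string;
  startedAtMs: number;
  lastHeartbeatMs: number;
  finishedAtMs: number | null;
  resultSummary: string | null;
}

type JobState = "done" | "running" | "stalled" | "over_budget";

// Classifies a job so "looks busy forever" becomes an explicit, visible state.
function classifyJob(job: JobRecord, nowMs: number, staleAfterMs: number, budgetMs: number): JobState {
  if (job.finishedAtMs !== null) return "done";
  if (nowMs - job.startedAtMs > budgetMs) return "over_budget";
  if (nowMs - job.lastHeartbeatMs > staleAfterMs) return "stalled";
  return "running";
}

// What a polling tool might hand back to the agent on each check:
// the state, plus the summary to ingest once the job has finished.
function pollJob(
  job: JobRecord,
  nowMs: number,
  staleAfterMs: number,
  budgetMs: number
): { state: JobState; ingest: string | null } {
  const state = classifyJob(job, nowMs, staleAfterMs, budgetMs);
  return { state: state, ingest: state === "done" ? job.resultSummary : null };
}
```

With this shape, each agent run stays short: it checks the record, ingests the summary if the job is done, and otherwise reports "running", "stalled", or "over_budget" instead of holding an LLM run open for hours.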