Architecture of Openclaw | Friends of the Crustacean 🦞🤝 | Page 1

trim stratus Mar 20, 2026, 11:59 AM

Hey krill, I've got a terribly scoped question for you.
How does Openclaw actually work?
Like, I know it's a wrapper that calls LLMs with some context + the user prompt and then lets them run until (?). But can we go through how exactly that context is built, and what kind of design decisions were made for that?

trim stratus Mar 20, 2026, 12:12 PM

@neat leaf can you help me with this?

neat leafBOT Mar 20, 2026, 12:13 PM

Yep — absolutely.

Short version of the “until (?)” part first:
A run keeps looping assistant → tool call(s) → tool result(s) → assistant... until the model emits a final assistant response with no more tool calls (or it gets stopped/aborted/times out).

How context is built per run:

Resolve session (DM/group/thread key)
Load session history from that session transcript
Build system prompt (OpenClaw-owned sections: tooling, safety, runtime/time, etc.)
Inject workspace files (AGENTS.md, SOUL.md, USER.md, TOOLS.md, etc., with size caps)
Attach tool schemas (important: these also consume context)
Add channel context (e.g. recent channel history, topic/sender metadata depending on connector)
Fit into model window via compaction/pruning rules if needed
Send to model and run the loop

Big design choices they made:

Gateway is source of truth for sessions/state (not clients)
Per-session serialized runs to avoid races/corrupt history
Prompt is rebuilt each run (deterministic + debuggable)
Skills are listed, loaded on-demand to keep base context smaller
Compaction + pruning split (persistent summary vs transient trimming)
Pluggable context engine (legacy default, plugin engines possible)
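
To make the "per-session serialized runs" point concrete, here is a minimal sketch of a per-key promise queue. `SessionQueue` and `enqueue` are hypothetical names for illustration, not OpenClaw's actual API, and the real queue modes may behave differently.

```typescript
// Hypothetical sketch: serialize async runs per session key.
// Runs for different keys can overlap; runs for the same key cannot.
class SessionQueue {
  private tails = new Map<string, Promise<unknown>>();

  enqueue<T>(sessionKey: string, task: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(sessionKey) ?? Promise.resolve();
    // Start the task only after the previous run for this key has settled.
    const next = prev.then(task, task);
    // Store a tail that never rejects, so one failed run does not block later ones.
    this.tails.set(sessionKey, next.catch(() => undefined));
    return next;
  }
}
```

Two messages arriving back-to-back in the same thread would then append to the transcript in arrival order, since the second run cannot start until the first has finished.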

If you want, I can do a concrete “single Discord message walkthrough” next (exactly what gets assembled, in what order, and where each piece comes from).

trim stratus Mar 20, 2026, 12:43 PM

Please do

neat leafBOT Mar 20, 2026, 12:44 PM

Perfect — here’s the concrete walkthrough for your message in this Discord thread.

1) Inbound + routing

Your Discord message hits the Gateway as an inbound envelope (provider=discord, chat type channel/thread context).
Gateway resolves a session key (usually shaped like):

agent:<agentId>:discord:channel:<channelOrThreadId>
(unless thread bindings/focus remap it)

So this thread has its own running conversation state, separate from DMs or other channels.
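
As a rough illustration of that key shape, here is a small sketch. `InboundEnvelope` and `resolveSessionKey` are made-up names, and the sketch ignores the thread binding/focus remapping mentioned above.

```typescript
// Hypothetical envelope fields; the real inbound envelope likely carries more.
interface InboundEnvelope {
  agentId: string;
  provider: string; // e.g. "discord"
  channelId: string;
  threadId?: string; // set when the message arrives inside a thread
}

// Build a key shaped like agent:<agentId>:discord:channel:<channelOrThreadId>.
function resolveSessionKey(env: InboundEnvelope): string {
  const target = env.threadId ?? env.channelId;
  return `agent:${env.agentId}:${env.provider}:channel:${target}`;
}
```

Because the thread id wins over the channel id, a thread and its parent channel end up with different keys, which is what keeps their histories separate.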

2) Pre-model processing

Before any LLM call:

Slash commands/directives are parsed first (/model, /think, etc.)
Mention/allowlist rules are checked
Queue mode is applied (serialize/collect/steer behavior)
Run is accepted and assigned a runId
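
For the directive step, a minimal sketch of the idea: leading `/name value` lines are split off before anything reaches the model. The exact directive syntax here is an assumption, and `parseDirectives` is a hypothetical helper, not the real parser.

```typescript
interface ParsedInbound {
  directives: Record<string, string>; // e.g. { think: "high" }
  body: string; // remaining text that would go to the model
}

// Assumed syntax: directives are leading lines of the form "/name optional-value".
function parseDirectives(text: string): ParsedInbound {
  const directives: Record<string, string> = {};
  const bodyLines: string[] = [];
  let inHeader = true;
  for (const line of text.split("\n")) {
    const match = inHeader ? /^\/(\w+)(?:\s+(.*))?$/.exec(line.trim()) : null;
    if (match) {
      directives[match[1]] = (match[2] ?? "").trim();
    } else {
      inHeader = false; // first non-directive line ends the directive header
      bodyLines.push(line);
    }
  }
  return { directives, body: bodyLines.join("\n").trim() };
}
```

A plain message like "Please do" passes through with no directives and an unchanged body.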

3) Context assembly (actual stack)

For that run, OpenClaw assembles context roughly in this order:

A) System prompt (OpenClaw-owned)

Built fresh every run; includes sections like:

Tooling available
Safety/guardrails
Skills catalog (metadata only)
Workspace/docs/runtime/time info
Reply behavior rules

B) Project-context file injection

From workspace (if present), e.g.:

AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, etc.

These are hard-capped/truncated by config so one giant file can’t nuke the context window.
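
A sketch of what a per-file cap could look like. The character-based cap, the `## name` header, and the truncation marker are all assumptions for illustration; the real config keys and formatting may differ.

```typescript
interface WorkspaceFile {
  name: string;
  content: string;
}

// Hypothetical: cap each file by character count and mark where it was cut.
function injectWorkspaceFiles(files: WorkspaceFile[], maxCharsPerFile: number): string {
  return files
    .map((f) => {
      const truncated = f.content.length > maxCharsPerFile;
      const text = truncated
        ? f.content.slice(0, maxCharsPerFile) + "\n[...truncated]"
        : f.content;
      return `## ${f.name}\n${text}`;
    })
    .join("\n\n");
}
```

The cap applies per file, so a huge AGENTS.md gets cut without squeezing out SOUL.md or USER.md.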

C) Session history

Loads prior turns from this session transcript (.jsonl), including:

user/assistant messages
tool call/result records
previous compaction summary (if any)
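
To picture the transcript, here is a sketch that parses JSONL into typed records. The record shapes are invented for illustration (the real schema may look quite different), and starting history at the latest compaction summary is an assumption about how the summary replaces older turns.

```typescript
// Hypothetical record shapes, one JSON object per transcript line.
type TranscriptEntry =
  | { type: "message"; role: "user" | "assistant"; content: string }
  | { type: "tool_call"; id: string; name: string; input: unknown }
  | { type: "tool_result"; toolCallId: string; content: string }
  | { type: "compaction_summary"; summary: string };

function parseTranscript(jsonl: string): TranscriptEntry[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0) // skip blank lines
    .map((line) => JSON.parse(line) as TranscriptEntry);
}

// Assumption: everything before the latest summary is represented by that summary.
function historyForRun(entries: TranscriptEntry[]): TranscriptEntry[] {
  let start = 0;
  entries.forEach((entry, i) => {
    if (entry.type === "compaction_summary") start = i;
  });
  return entries.slice(start);
}
```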

D) Channel context extras

Discord channel history/topic metadata can be added per config (historyLimit, etc.), as untrusted context (not elevated to system prompt authority).
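
A small sketch of the "untrusted" idea: recent channel messages are trimmed to `historyLimit` and wrapped in a clearly labeled block. The label text and the `buildChannelContext` helper are assumptions, not the actual format.

```typescript
interface ChannelMessage {
  sender: string;
  text: string;
}

// Hypothetical: keep the last `historyLimit` messages and label them as untrusted.
function buildChannelContext(history: ChannelMessage[], historyLimit: number): string {
  if (historyLimit <= 0) return ""; // also avoids slice(-0), which would keep everything
  const recent = history.slice(-historyLimit);
  if (recent.length === 0) return "";
  const lines = recent.map((m) => `${m.sender}: ${m.text}`);
  return ["[Untrusted channel history, for reference only]", ...lines].join("\n");
}
```

The result would travel with the message stack rather than the system prompt, so instructions typed by other channel members carry no system-level authority.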

E) Tool schemas

Function/tool schemas are attached to the model request (this is a hidden-but-real token cost).

F) Current message

Your current “Please do” message is appended as the active user turn.

4) Budget enforcement

Now OpenClaw checks context pressure:

If window is tight: compaction and/or pruning logic kicks in
Compaction = persistent summary in transcript
Pruning = transient trimming (mostly old tool-result bulk)

Then model call starts.
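
Here is a rough sketch of the pruning half. It works on a copy (so nothing persisted changes), replaces old tool results with a placeholder, and stops once the estimate fits. The 4-characters-per-token estimate, the `keepRecent` parameter, and the placeholder text are assumptions for illustration.

```typescript
interface Msg {
  role: "user" | "assistant" | "tool";
  content: string;
}

// Rough heuristic (about 4 characters per token), not a real tokenizer.
const estimateTokens = (msgs: Msg[]): number =>
  msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);

// Transient pruning: returns a trimmed copy and leaves the input untouched.
function pruneToolResults(msgs: Msg[], tokenBudget: number, keepRecent: number): Msg[] {
  const out = msgs.map((m) => ({ ...m }));
  const cutoff = out.length - keepRecent; // never touch the most recent messages
  for (let i = 0; i < cutoff && estimateTokens(out) > tokenBudget; i++) {
    if (out[i].role === "tool") out[i].content = "[tool result pruned]";
  }
  return out;
}
```

Compaction would be the other half: summarize older turns and write that summary into the transcript so later runs start from it.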

5) Agent loop (“until what?”)

Run loop is:

model streams assistant output
if model emits tool calls, OpenClaw executes them
tool results are fed back into the same run
repeat

It ends when model returns a final assistant response with no more tool calls (or stop/timeout/abort/error).
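
A self-contained sketch of that loop with a stubbed model, in case it helps to see it run. Every type and name here is hypothetical, and `maxSteps` is a stand-in for the stop/timeout/abort paths rather than a real OpenClaw setting.

```typescript
type ToolCall = { id: string; name: string; input: string };
type ModelTurn = { text: string; toolCalls: ToolCall[] };
type LoopMsg = { role: "user" | "assistant" | "tool"; content: string; toolCallId?: string };
type Model = (messages: LoopMsg[]) => Promise<ModelTurn>;
type Tools = Record<string, (input: string) => Promise<string>>;

async function runAgentLoop(
  model: Model,
  tools: Tools,
  messages: LoopMsg[],
  maxSteps = 8
): Promise<string> {
  for (let step = 0; step < maxSteps; step++) {
    const turn = await model(messages);
    messages.push({ role: "assistant", content: turn.text });
    // A turn with no tool calls is the final response and ends the run.
    if (turn.toolCalls.length === 0) return turn.text;
    for (const call of turn.toolCalls) {
      const tool = tools[call.name];
      const result = tool ? await tool(call.input) : `unknown tool: ${call.name}`;
      // Feed the tool result back so the next model call can see it.
      messages.push({ role: "tool", toolCallId: call.id, content: result });
    }
  }
  throw new Error("run stopped: step limit reached");
}
```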

6) Persist + stream

During run, Gateway emits assistant/tool/lifecycle events; on completion it persists:

transcript entries
token/cost/session metadata
updated session pointers
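
A sketch of the persistence step as a pure function: new entries are appended as JSONL lines and the session metadata is updated. In-memory strings stand in for the real files, and the `SessionMeta` fields are assumptions.

```typescript
// Hypothetical metadata fields; the real session record likely tracks more.
interface SessionMeta {
  sessionKey: string;
  totalTokens: number;
  updatedAt: string;
}

function persistRun(
  transcriptJsonl: string,
  newEntries: object[],
  meta: SessionMeta,
  tokensUsed: number,
  now: Date
): { transcriptJsonl: string; meta: SessionMeta } {
  // One JSON object per line, appended after the existing transcript.
  const appended = newEntries.map((entry) => JSON.stringify(entry) + "\n").join("");
  return {
    transcriptJsonl: transcriptJsonl + appended,
    meta: {
      ...meta,
      totalTokens: meta.totalTokens + tokensUsed,
      updatedAt: now.toISOString(),
    },
  };
}
```

Append-only lines keep earlier turns untouched, which fits the idea that the prompt is rebuilt from the transcript on every run.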

Why it’s designed this way (important bits)

Gateway is source of truth → clients stay dumb/reliable
Per-session serialized execution → no racey state corruption
Rebuilt prompt each run → deterministic/debuggable
Skills listed, loaded on demand → lower baseline token burn
Context-engine pluggability → future retrieval/assembly strategies without rewriting core loop

If you want, next I can do the same walkthrough with a tiny pseudo payload (showing exactly what the outbound model request shape looks like, including where tools + history sit).

trim stratus Mar 20, 2026, 12:51 PM

Go ahead

neat leafBOT Mar 20, 2026, 12:53 PM

Awesome — here’s the pseudo payload walkthrough.

Outbound model request shape (conceptual)

Provider wire format differs (Anthropic/OpenAI/etc), but logically it’s this:

// 1) Build system prompt
const baseSystemPrompt = buildAgentSystemPrompt({
  tools, skills, runtime, workspaceFiles, time, safety, ...
});

// 2) Assemble context via active context engine (legacy by default)
const assembled = contextEngine.assemble({
  sessionId,
  messages: sessionTranscriptMessages,
  tokenBudget,
});
// assembled.messages = history selected for this run
// assembled.systemPromptAddition? optional extra prefix

const systemPrompt =
  (assembled.systemPromptAddition ? assembled.systemPromptAddition + "\n\n" : "") +
  baseSystemPrompt;

// 3) Final message stack for this turn
const modelMessages = [
  ...assembled.messages, // prior session turns (possibly compacted/pruned)
  { role: "user", content: inboundUserText }, // current message
];

// 4) Attach tool schemas (counts toward context)
const toolSchemas = resolveAllowedToolSchemasForRun(...);

// 5) Send request
await provider.stream({
  model,
  systemPrompt,
  messages: modelMessages,
  tools: toolSchemas,
  thinking,
  verbosity,
  ...
});

Loop behavior after request starts

while (runActive) {
  const event = await stream.next();

  if (event.type === "assistant_text_delta") {
    emitStream("assistant", event.delta);
  }

  if (event.type === "tool_call") {
    emitStream("tool", { phase: "start", ...event });

    const result = await executeTool(event.name, event.input);

    emitStream("tool", { phase: "end", resultMeta: ... });

    // feed tool result back into model context in same run
    appendMessage({ role: "tool", toolCallId: event.id, content: result });
    continue;
  }

  if (event.type === "assistant_final" && !event.hasFurtherToolCalls) {
    break;
  }
}

Then OpenClaw persists transcript + session metadata (sessions.json + JSONL turn lines), emits lifecycle end, done.
