#Architecture of Openclaw

1 messages · Page 1 of 1 (latest)

trim stratus
#

Hey krill I got a terribly scoped question for you.
How does Openclaw actually work?
Like I know it's a wrapper that calls LLM's with some context + the user prompt and then allows them to run untill (?). But can we go through how exactly that context is build, and what kind of design decisions were made for that ?

night nightBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> âś… Mark Solution

trim stratus
#

@neat leaf can you help me with this ?

neat leafBOT
#

Yep — absolutely.

Short version of the “until (?)” part first:
A run keeps looping assistant → tool call(s) → tool result(s) → assistant... until the model emits a final assistant response with no more tool calls (or it gets stopped/aborted/times out).

How context is built per run:

  1. Resolve session (DM/group/thread key)
  2. Load session history from that session transcript
  3. Build system prompt (OpenClaw-owned sections: tooling, safety, runtime/time, etc.)
  4. Inject workspace files (AGENTS.md, SOUL.md, USER.md, TOOLS.md, etc., with size caps)
  5. Attach tool schemas (important: these also consume context)
  6. Add channel context (e.g. recent channel history, topic/sender metadata depending on connector)
  7. Fit into model window via compaction/pruning rules if needed
  8. Send to model and run the loop

Big design choices they made:

#
  • Gateway is source of truth for sessions/state (not clients)
  • Per-session serialized runs to avoid races/corrupt history
  • Prompt is rebuilt each run (deterministic + debuggable)
  • Skills are listed, loaded on-demand to keep base context smaller
  • Compaction + pruning split (persistent summary vs transient trimming)
  • Pluggable context engine (legacy default, plugin engines possible)

If you want, I can do a concrete “single Discord message walkthrough” next (exactly what gets assembled, in what order, and where each piece comes from).

trim stratus
#

Please do

neat leafBOT
#

Perfect — here’s the concrete walkthrough for your message in this Discord thread.

1) Inbound + routing

Your Discord message hits the Gateway as an inbound envelope (provider=discord, chat type channel/thread context).
Gateway resolves a session key (usually shaped like):

  • agent:<agentId>:discord:channel:<channelOrThreadId>
    (unless thread bindings/focus remap it)

So this thread has its own running conversation state, separate from DMs or other channels.


2) Pre-model processing

Before any LLM call:

#
  • Slash commands/directives are parsed first (/model, /think, etc.)
  • Mention/allowlist rules are checked
  • Queue mode is applied (serialize/collect/steer behavior)
  • Run is accepted and assigned a runId

3) Context assembly (actual stack)

For that run, OpenClaw assembles context roughly in this order:

A) System prompt (OpenClaw-owned)

Built fresh every run; includes sections like:

  • Tooling available
  • Safety/guardrails
  • Skills catalog (metadata only)
#
  • Workspace/docs/runtime/time info
  • Reply behavior rules

B) Project-context file injection

From workspace (if present), e.g.:

  • AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, etc.

These are hard-capped/truncated by config so one giant file can’t nuke the context window.

C) Session history

Loads prior turns from this session transcript (.jsonl), including:

  • user/assistant messages
  • tool call/result records
  • previous compaction summary (if any)
#

D) Channel context extras

Discord channel history/topic metadata can be added per config (historyLimit, etc.), as untrusted context (not elevated to system prompt authority).

E) Tool schemas

Function/tool schemas are attached to the model request (this is a hidden-but-real token cost).

F) Current message

Your current “Please do” message is appended as the active user turn.


4) Budget enforcement

Now OpenClaw checks context pressure:

  • If window is tight: compaction and/or pruning logic kicks in
  • Compaction = persistent summary in transcript
#
  • Pruning = transient trimming (mostly old tool-result bulk)

Then model call starts.


5) Agent loop (“until what?”)

Run loop is:

  1. model streams assistant output
  2. if model emits tool calls, OpenClaw executes them
  3. tool results are fed back into the same run
  4. repeat

It ends when model returns a final assistant response with no more tool calls (or stop/timeout/abort/error).

#

6) Persist + stream

During run, Gateway emits assistant/tool/lifecycle events; on completion it persists:

  • transcript entries
  • token/cost/session metadata
  • updated session pointers

Why it’s designed this way (important bits)

  • Gateway is source of truth → clients stay dumb/reliable
  • Per-session serialized execution → no racey state corruption
  • Rebuilt prompt each run → deterministic/debuggable
#
  • Skills listed, loaded on demand → lower baseline token burn
  • Context-engine pluggability → future retrieval/assembly strategies without rewriting core loop

If you want, next I can do the same walkthrough with a tiny pseudo payload (showing exactly what the outbound model request shape looks like, including where tools + history sit).

trim stratus
#

Go ahead

neat leafBOT
#

Awesome — here’s the pseudo payload walkthrough.

Outbound model request shape (conceptual)

Provider wire format differs (Anthropic/OpenAI/etc), but logically it’s this:

// 1) Build system prompt
const baseSystemPrompt = buildAgentSystemPrompt({
tools, skills, runtime, workspaceFiles, time, safety, ...
});

// 2) Assemble context via active context engine (legacy by default)
const assembled = contextEngine.assemble({
sessionId,
messages: sessionTranscriptMessages,
#
tokenBudget,
});
// assembled.messages = history selected for this run
// assembled.systemPromptAddition? optional extra prefix

const systemPrompt =
(assembled.systemPromptAddition ? assembled.systemPromptAddition + "\n\n" : "") +
baseSystemPrompt;

// 3) Final message stack for this turn
const modelMessages = [
...assembled.messages, // prior session turns (possibly compacted/pruned)
{ role: "user", content: inboundUserText }, // current message
];
#
// 4) Attach tool schemas (counts toward context)
const toolSchemas = resolveAllowedToolSchemasForRun(...);

// 5) Send request
await provider.stream({
model,
systemPrompt,
messages: modelMessages,
tools: toolSchemas,
thinking,
verbosity,
...
});

#

Loop behavior after request starts

while (runActive) {
const event = await stream.next();

if (event.type === "assistant_text_delta") {
emitStream("assistant", event.delta);
}

if (event.type === "tool_call") {
emitStream("tool", { phase: "start", ...event });

const result = await executeTool(event.name, event.input);

emitStream("tool", { phase: "end", resultMeta: ... });
#

// feed tool result back into model context in same run
appendMessage({ role: "tool", toolCallId: event.id, content: result });
continue;
}

if (event.type === "assistant_final" && !event.hasFurtherToolCalls) {
break;
}
}

Then OpenClaw persists transcript + session metadata (sessions.json + JSONL turn lines), emits lifecycle end, done.