#Pruning for all providers?

forest hull
#

I have turned on context window pruning for my instance of Openclaw, but it's not actually running.
As I understand it, context window pruning only runs if the model you are using is from a provider that uses the "anthropic-messages" API format or is from a short list of other providers.
Is this correct? Is it really the case that context-window pruning can't be turned on for an arbitrary model provider?
If that's the case, is there a workaround? Is there any discussion to make pruning available for use by all models?
If, in fact, pruning can be done for all models, what might be preventing it from happening in my case? I've put the requisite keys and settings in my openclaw.json file and set the TTL rather low (just 4 minutes). The configuration file is being read with no errors. So it seems to me pruning should be taking place.
Thanks!

bitter pondBOT
# forest hull I have turned on context window pruning for my instance of Openclaw, but it's no...

I checked the current docs/source. Short version: yes, cache-TTL session pruning is not universal right now. The pruning extension is only registered, and only actually prunes, when all of the following hold:

  1. agents.defaults.contextPruning.mode === "cache-ttl"
  2. the active provider/model is considered cache-TTL eligible
  3. a later LLM call happens after the TTL has expired

That eligibility is tied to prompt-cache semantics: Anthropic-family APIs / anthropic-messages, Google prompt-cache eligible models through google-generative-ai, and providers/plugins that explicitly opt in. For an arbitrary OpenAI-compatible provider, config alone won’t force this pruning path on.

A few gotchas that make it look like “nothing happened”:

  • It is not a background timer. A 4 minute TTL means “on the next model call after 4 minutes,” not “prune exactly at minute 4.”
  • The first eligible turn after enabling usually records the cache-touch timestamp; pruning can happen on a later turn after TTL expiry.
  • It only trims old tool results, not normal chat text.
  • It is in-memory only; the transcript on disk will still look unpruned.
  • By default, pruning is skipped when the session has fewer than 3 assistant messages (keepLastAssistants: 3).
  • Defaults may not do much unless the context is large enough: softTrimRatio: 0.3, hardClearRatio: 0.5, and minPrunableToolChars: 50000.
  • Some tools can be denied from pruning via config, and image blocks are preserved.
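
To make the gotchas above concrete, here is a minimal sketch of the decision as I described it. Every name here (decidePruneAction, SessionSnapshot, PruneSettings) is hypothetical, and the way the ratios combine is my assumption rather than OpenClaw's actual code; only the default values come from this thread.

```typescript
// Hypothetical names throughout; this is not OpenClaw's internal code.
type PruneAction = "none" | "soft-trim" | "hard-clear";

interface PruneSettings {
  ttlMs: number;
  keepLastAssistants: number;
  softTrimRatio: number;
  hardClearRatio: number;
  minPrunableToolChars: number;
}

interface SessionSnapshot {
  eligible: boolean;               // provider/model is cache-TTL eligible
  lastCacheTouchMs: number | null; // null until an eligible turn records it
  assistantMessages: number;
  contextChars: number;            // estimated size of the assembled context
  contextWindowChars: number;      // model window, expressed in characters
  prunableToolChars: number;       // tool-result text that may be pruned
}

const defaults: PruneSettings = {
  ttlMs: 4 * 60000, // the 4-minute TTL from this thread
  keepLastAssistants: 3,
  softTrimRatio: 0.3,
  hardClearRatio: 0.5,
  minPrunableToolChars: 50000,
};

// Evaluated lazily on each model call; there is no background timer.
function decidePruneAction(
  s: SessionSnapshot,
  nowMs: number,
  cfg: PruneSettings = defaults,
): PruneAction {
  if (!s.eligible) return "none";                            // ineligible provider
  if (s.lastCacheTouchMs === null) return "none";            // first turn only records the touch
  if (nowMs - s.lastCacheTouchMs < cfg.ttlMs) return "none"; // TTL not expired yet
  if (s.assistantMessages < cfg.keepLastAssistants) return "none";

  // Assumption: the ratios compare context size against the model window.
  const ratio = s.contextChars / s.contextWindowChars;
  if (ratio < cfg.softTrimRatio) return "none";
  if (ratio >= cfg.hardClearRatio && s.prunableToolChars >= cfg.minPrunableToolChars) {
    return "hard-clear";
  }
  return "soft-trim";
}
```

In this sketch a session that is eligible and past its TTL can still come back as "none" if the context is small, which is one way a low TTL can look like nothing happened.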

For general “make this work with any provider” behavior, the closest workaround today is compaction, not cache-TTL pruning. Pruning is currently cache/provider-aware; making it a generic provider-independent feature would be a reasonable GitHub issue: https://github.com/openclaw/openclaw/issues

Docs:
https://docs.openclaw.ai/concepts/session-pruning
https://docs.openclaw.ai/gateway/configuration-reference#agentsdefaultscontextpruning

For your specific case, the first thing I’d ask for is the output of openclaw --version, plus the provider id and the configured API adapter for the model. A version or provider mismatch is the most likely cause here.

forest hull
#

I'm running Openclaw v2026.5.12, and the model I'm using where I'm testing whether or not pruning takes place is openrouter/owl-alpha, which uses the openai-completions API.

bitter pondBOT
# forest hull I'm running Openclaw v2026.5.12, and the model I'm using where I'm testing wheth...

That combination explains it: openrouter/owl-alpha using openai-completions is not currently cache-TTL pruning eligible, so contextPruning: { mode: "cache-ttl" } will be read from config but the pruning extension will not be registered for that run.

For OpenRouter, the current eligibility path is model-family specific, not “all OpenRouter models.” Anthropic-family refs like openrouter/anthropic/..., plus the explicitly supported cache-aware families/providers, can qualify. openrouter/owl-alpha via openai-completions does not.

Practical options:

  • Use compaction for provider-independent context reduction.
  • Test pruning with an eligible model/provider if your goal is to verify the cache-TTL pruning feature itself.
  • If you want cache-TTL pruning decoupled from prompt-cache eligibility and available for OpenAI-compatible providers too, that’s worth filing as a feature request: https://github.com/openclaw/openclaw/issues

Docs for the exact knobs/behavior:
https://docs.openclaw.ai/concepts/session-pruning
https://docs.openclaw.ai/gateway/configuration-reference#agentsdefaultscontextpruning

forest hull
#

Can you think of any trick or workaround that would enable pruning for the provider-model combination I've mentioned?

bitter pondBOT
# forest hull Can you think of any trick or workaround that would enable pruning for the provi...

For that exact combo, there isn’t a clean config-only trick that I’d recommend.

What can work, but crosses into “custom extension / patch” territory: a provider runtime plugin can implement isCacheTtlEligible and return true for a provider/model. Core checks that hook before the built-in eligibility rules. But that’s really meant for providers/proxies that know their prompt-cache semantics, not as a user-facing override knob.
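
For orientation only, the resolution order could be sketched like this. The hook name isCacheTtlEligible is the one mentioned above; its signature, the ModelRef and ProviderRuntimePlugin shapes, and the simplified built-in rules are my assumptions, so check the provider plugin SDK docs for the real contract.

```typescript
// Hypothetical shapes for illustration; the real SDK types may differ.
interface ModelRef {
  provider: string; // e.g. "openrouter"
  modelId: string;  // e.g. "owl-alpha"
  api: string;      // e.g. "openai-completions"
}

interface ProviderRuntimePlugin {
  provider: string;
  // Assumed contract: return true/false to decide, undefined to defer.
  isCacheTtlEligible?: (ref: ModelRef) => boolean | undefined;
}

// Simplified stand-in for the built-in rules described in this thread.
// Google prompt-cache eligible models are omitted for brevity.
function builtInEligible(ref: ModelRef): boolean {
  if (ref.api === "anthropic-messages") return true;
  if (ref.provider === "openrouter" && ref.modelId.startsWith("anthropic/")) return true;
  return false;
}

// The plugin hook is consulted first; built-in rules are the fallback.
function resolveEligibility(ref: ModelRef, plugins: ProviderRuntimePlugin[]): boolean {
  for (const plugin of plugins) {
    if (plugin.provider !== ref.provider) continue;
    const verdict = plugin.isCacheTtlEligible?.(ref);
    if (verdict !== undefined) return verdict;
  }
  return builtInEligible(ref);
}

// A plugin that opts one specific model in (illustrative only).
const optInPlugin: ProviderRuntimePlugin = {
  provider: "openrouter",
  isCacheTtlEligible: (ref) => (ref.modelId === "owl-alpha" ? true : undefined),
};
```

Opting a model in this way only flips the eligibility check. It does not give the upstream provider real prompt-cache behavior, which is why I would treat it as a patch rather than a setting.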

Things I would not do:

  • Don’t change api to anthropic-messages for owl-alpha; the request shape won’t match the model endpoint.
  • Don’t alias the model as openrouter/anthropic/...; that may pass a string check in some paths but it misrepresents the upstream model and can create weird behavior elsewhere.

Best practical workaround today is still compaction. If you specifically want tool-result trimming for OpenAI-compatible models, I’d frame the feature request as: “allow provider-independent session pruning, separate from prompt-cache TTL eligibility.” That’s a cleaner design than tricking the cache eligibility path.

Docs for the plugin hook, if you want to inspect the extension route:
https://docs.openclaw.ai/plugins/architecture
https://docs.openclaw.ai/plugins/sdk-provider-plugins

forest hull
#

Can an agent manipulate its own context window?

bitter pondBOT
# forest hull Can an agent manipulate its own context window?

Depends what you mean by “context window.”

If you mean the model’s actual context-window size: no, the agent can’t expand or manipulate that at runtime. That limit comes from the selected model/provider metadata, with optional config overrides like contextWindow / contextTokens. The agent can’t make a 128k model become a 1M model from inside the conversation.

If you mean “what gets sent inside the window”: partly, yes, but through OpenClaw mechanisms rather than direct self-editing. An agent can trigger/benefit from things like:

  • /compact or auto-compaction, which persists a summary and frees space
  • using narrower tools / smaller file reads / less verbose command output
  • pruning, when enabled and provider-eligible, which trims old tool results in-memory
  • session resets/new sessions, when you want a clean context

Important distinction: compaction is persistent in the transcript; pruning is per-request/in-memory and does not rewrite the session file.

Docs:
https://docs.openclaw.ai/concepts/context
https://docs.openclaw.ai/concepts/compaction
https://docs.openclaw.ai/concepts/session-pruning

forest hull
#

Also, just to be clear, if pruning and compaction are both turned on, and compaction occurs, the token window that will be compacted is the token window that already has been pruned? It is not some pre-pruning token window that gets compacted, but the post pruning window -- right?

bitter pondBOT
# forest hull Also, just to be clear, if pruning and compaction are both turned on, and compac...

No — that’s the part I’d correct.

Cache-TTL pruning is applied to the in-memory messages for an LLM request. It does not rewrite the session transcript. Compaction, on the other hand, is prepared from the session history/transcript and then writes a persistent compaction summary.

So if pruning and compaction are both enabled:

  • pruning may reduce what the next normal model call sees
  • compaction still works from the session’s stored/validated history, not from a permanently pruned transcript
  • because pruning is in-memory only, old tool results can still be available to the compaction process later
  • the compaction summarizer may do its own chunking/history budget management to fit the summarization call, but that is separate from cache-TTL pruning

So the mental model is: pruning affects per-request prompt assembly; compaction affects persistent session history. It’s not “prune first, then compact the already-pruned window” in the durable sense.
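
If it helps, here is that mental model as a small sketch. The Message type and both function names are hypothetical and heavily simplified; the point is only that pruning returns a per-request copy while compaction reads the stored history.

```typescript
// Hypothetical, simplified types; real session messages carry more structure.
interface Message {
  role: "user" | "assistant" | "toolResult";
  text: string;
}

const PLACEHOLDER = "[old tool result cleared]";

// Pruning: builds a new array for one request and never mutates the transcript.
function pruneForRequest(transcript: readonly Message[], keepLastToolResults: number): Message[] {
  const toolIndexes: number[] = [];
  transcript.forEach((m, i) => {
    if (m.role === "toolResult") toolIndexes.push(i);
  });
  // Guard against slice(-0), which would keep every tool result.
  const kept = new Set<number>(
    keepLastToolResults > 0 ? toolIndexes.slice(-keepLastToolResults) : [],
  );
  return transcript.map((m, i): Message =>
    m.role === "toolResult" && !kept.has(i) ? { role: "toolResult", text: PLACEHOLDER } : m,
  );
}

// Compaction: prepared from the stored history, not from the pruned view.
function compactionInput(transcript: readonly Message[]): readonly Message[] {
  return transcript;
}
```

Calling pruneForRequest leaves the stored transcript untouched, so a later compactionInput call still sees the original tool output.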

Docs say this pretty directly:
https://docs.openclaw.ai/concepts/session-pruning
https://docs.openclaw.ai/concepts/context
https://docs.openclaw.ai/concepts/compaction

forest hull
#

Interesting. Thanks for the clarification.
To be honest, I did not find the documentation clear on the issue I just asked about. I read it -- several times -- but did not get out of it what you just wrote.
Back to what agents can do with their own context window ... Can an agent read its own context window and edit it?
Alternatively, can an Openclaw cron job be written that reads the context window and, under certain conditions, manipulates the context window in a way similar to what happens with pruning?

bitter pondBOT
# forest hull Interesting. Thanks for the clarification. To be honest, I did not find the doc...

You’re not wrong about the docs; that distinction could be much clearer.

For the agent itself:

  • It can inspect diagnostics about context, e.g. /context list, /context detail, /status.
  • It can inspect session history when it has session tools.
  • It generally cannot read the exact final serialized “context window” the provider receives, including all hidden provider/tool-schema/wrapper details.
  • It cannot directly edit its current in-flight context window like a text buffer.

What it can do is influence future context indirectly: run/request /compact, use smaller tool calls, avoid huge outputs, write durable memory/workspace notes where appropriate, start a fresh session, etc.

For cron: a cron job can run an agent task, and depending on how it is configured it can operate in an isolated/custom/main session. But a normal cron job is not a supported “context mutator” for another live session’s assembled prompt. You could theoretically write custom code that edits session files, but I would not recommend that: it is easy to break transcript/tool-call pairing, race session locks, or corrupt compaction state.

The clean extension point for what you’re describing is a context engine plugin. That gets an assemble() lifecycle hook before model runs, so it can choose which messages to include, summarize, trim, or inject. That’s basically the supported place to implement provider-independent pruning-like behavior.
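
As a rough illustration of what such a hook might do, here is a sketch of provider-independent trimming of old tool results. The assemble() name comes from the description above, but the input/output shapes, the constants, and trimToolResult are all hypothetical; the real context engine plugin contract may look quite different.

```typescript
// Hypothetical interface; consult the plugin docs for the actual contract.
interface CtxMessage {
  role: "user" | "assistant" | "toolResult";
  content: string;
}
interface AssembleInput {
  messages: CtxMessage[];
}
interface AssembleOutput {
  messages: CtxMessage[];
}

const KEEP_RECENT_MESSAGES = 6; // illustrative: the newest messages stay intact
const MAX_OLD_TOOL_CHARS = 200; // illustrative: budget for each older tool result

// Keep the head and tail of a long tool result and drop the middle.
function trimToolResult(content: string, maxChars: number): string {
  if (content.length <= maxChars) return content;
  const half = Math.floor(maxChars / 2);
  return content.slice(0, half) + "\n[...trimmed...]\n" + content.slice(content.length - half);
}

// Runs before each model call; returns a new list and leaves the input untouched.
function assemble(input: AssembleInput): AssembleOutput {
  const cutoff = Math.max(0, input.messages.length - KEEP_RECENT_MESSAGES);
  const messages = input.messages.map((m, i): CtxMessage =>
    i < cutoff && m.role === "toolResult"
      ? { ...m, content: trimToolResult(m.content, MAX_OLD_TOOL_CHARS) }
      : m,
  );
  return { messages };
}
```

Because nothing here depends on prompt-cache eligibility, the same logic would apply to an openai-completions model; the cost is that you own the trimming policy yourself.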

Docs:
https://docs.openclaw.ai/concepts/context

forest hull
#

Any suggestions for what would be the best text for a feature request related to this issue?

bitter pondBOT
# forest hull Any suggestions for what would be the best text for a feature request related to...

Yep. I’d frame it as a request to separate “tool-result pruning” from “prompt-cache TTL eligibility,” because that’s the core design issue.

Draft:

Title: Provider-independent session pruning for old tool results

Feature request

OpenClaw currently has cache-TTL session pruning via agents.defaults.contextPruning, but pruning only appears to activate for providers/models that are prompt-cache/TTL eligible, such as Anthropic-family providers, some Google models, and provider plugins that opt in.

I would like a provider-independent pruning mode that can trim old tool results before model calls even when the active model/provider does not support prompt caching. For example, openrouter/owl-alpha using the openai-completions adapter can accumulate large tool outputs, but contextPruning: { mode: "cache-ttl" } does not register pruning because the provider/model is not cache-TTL eligible.

Motivation

The current pruning behavior is useful for prompt-cache cost control, but the same mechanism is also valuable for ordinary context-window management:

  • reduce old tool-result bloat
  • delay or avoid compaction
  • make smaller-context or non-cache providers more usable
  • preserve normal conversation text while trimming low-value historical tool output

Compaction helps, but it is persistent summarization and has different semantics. I’m looking for a lightweight per-request pruning path that works across providers.

Possible design

Add a provider-independent mode or explicit override, for example:

{
  agents: {
    defaults: {
      contextPruning: {
        mode: "tool-results", // or "always" / "provider-independent"
      },
    },
  },
}

The important part is that users can opt into pruning old tool results for non-cache providers without pretending the provider has Anthropic/Google prompt-cache semantics.

Open questions

  • Should provider-independent pruning run on a TTL, on context-size thresholds, or both?
  • Should it record or reset cache-touch metadata only for cache-eligible providers?
  • Should the existing tools.allow/deny, softTrim, hardClear, and keepLastAssistants behavior be reused unchanged?

Relevant docs

https://docs.openclaw.ai/concepts/session-pruning
https://docs.openclaw.ai/concepts/context
https://docs.openclaw.ai/concepts/compaction

forest hull
#

Thanks!
Can we mark this question / help request "solved"?

bitter pondBOT
# forest hull Thanks! Can we mark this question / help request "solved"?

I can’t mark it solved from my side here, but yes — this looks solved.

Please right-click the message that best answers it → Apps → ✅ Mark Solution. I’d pick the message explaining that openrouter/owl-alpha via openai-completions is not cache-TTL pruning eligible, or the feature-request draft if that’s the final outcome. That helps Answer Overflow index it for the next person.

forest hull
#

Got it.