This is on a local-only Minisforum MS-S1 Max that I set up with the help of github.com/xcodelyokox/quickstartclaw. It was working fine until I tried switching models (I finally settled on llama-3.1-8b-instruct). I have tried everything, including completely blowing away WSL and removing every trace of Ubuntu before reinstalling today, and I am still having the same issues:
Running OpenClaw 2026.5.18 with LM Studio + llama-3.1-8b-instruct on Ubuntu 24.04 under WSL2.
Symptoms:
- First prompt often succeeds.
- LM Studio completes inference successfully.
- Immediately afterward OpenClaw says “compacting context”.
- User prompt disappears from UI/session.
- Session status becomes “failed”.
- No further requests reach LM Studio until gateway/session restart.
Observed:
- Context can instantly jump to 98% after tiny prompts.
- Manual compaction sometimes temporarily fixes it.
- Session JSON/trajectory files appear to get stuck/corrupted.
- Earlier runs showed stale/irrelevant compaction summaries injected into context.
- Multiple sessions worsened behavior.
- Reducing context to 32768, simplifying AGENTS.md, disabling heartbeat behavior, reducing maxConcurrent/subagents to 1 helped somewhat but did not solve it.
- Tailscale/gateway itself appears stable.
- LM Studio itself appears stable.
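In case it helps anyone narrow down the "stuck/corrupted" session files, here is a minimal sketch of how they could be checked for plain JSON parse errors. It assumes the session and trajectory files are ordinary `.json` or line-delimited `.jsonl` files under a single directory; I do not know the exact on-disk layout, so the directory is passed in as an argument and `find_unparseable` is just a hypothetical helper name, not anything OpenClaw ships.

```python
import json
import sys
from pathlib import Path


def find_unparseable(session_dir):
    """Return (path, line_no, error) tuples for files that fail to parse.

    .json files are parsed whole (line_no is 0).
    .jsonl files are parsed line by line (line_no starts at 1).
    """
    problems = []
    for path in sorted(Path(session_dir).rglob("*")):
        if not path.is_file() or path.suffix not in (".json", ".jsonl"):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        if path.suffix == ".json":
            chunks = [(0, text)]
        else:
            # Skip blank lines; a truncated final line will still be caught.
            chunks = [
                (i, line)
                for i, line in enumerate(text.splitlines(), 1)
                if line.strip()
            ]
        for line_no, chunk in chunks:
            try:
                json.loads(chunk)
            except json.JSONDecodeError as exc:
                problems.append((str(path), line_no, exc.msg))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_sessions.py <directory holding the session files>
    for path, line_no, msg in find_unparseable(sys.argv[1]):
        print(f"{path}:{line_no}: {msg}")
```

This only detects files that are not valid JSON (for example, a write that was cut off mid-record). It would not catch a file that parses fine but contains a stale compaction summary.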
Questions:
- Is there a known regression in compaction/session lifecycle in 2026.5.18?
- Is there a way to fully disable compaction/summarization?
- Are there recommended settings for local small-context models (32K) to avoid session corruption?
I realize that fully disabling compaction is probably a no-go, but is anyone else having this problem? Do I need to greatly increase the context window to make the model usable? Right now it can answer one question, maybe two (and I mean something like "What is 2+2?"), before I need to restart the gateway.
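To make the context question concrete, here is a rough back-of-the-envelope sketch of how a 32768-token window could reach 98% before the user prompt even matters. The token counts in the example are made-up placeholders, not measurements from my setup, and the 4-characters-per-token ratio is only a common rule of thumb for English text; the function names are hypothetical.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Very rough token estimate; real tokenizers will differ."""
    return int(len(text) / chars_per_token)


def context_usage(parts, context_window=32768, reserve_for_reply=2048):
    """Summarize how much of the window fixed overhead consumes.

    parts: dict mapping a label to its token count.
    Returns (tokens_used, percent_of_window, headroom_after_reply_reserve).
    """
    used = sum(parts.values())
    percent = 100.0 * used / context_window
    headroom = context_window - used - reserve_for_reply
    return used, percent, headroom


if __name__ == "__main__":
    # Illustrative numbers only (assumptions, not measured values).
    parts = {
        "system_prompt": 6000,
        "agents_md": 3000,
        "tool_schemas": 12000,
        "compaction_summary": 11000,
        "user_prompt": estimate_tokens("What is 2+2?"),
    }
    used, percent, headroom = context_usage(parts)
    print(f"used={used} tokens, {percent:.1f}% of window, headroom={headroom}")
```

If the fixed overhead (system prompt, AGENTS.md, tool definitions, any injected summary) is anywhere near these sizes, a tiny prompt would be enough to trigger compaction on every turn, and a negative headroom would mean there is no room left for the reply at all.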