phantom-progress that's persisted through extensive remediation. OpenClaw 2026.4.26.
Agent claims actions without invoking the tool ("I'll patch X" / "I checked memory and...") with 0 tool calls in the trajectory. Most pronounced on Discord and webchat (messaging surfaces).
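For anyone who wants to check their own traces for the same symptom, here is a minimal sketch of how I flag it. The `Turn` shape, the `toolCalls` field, and the phrase list are my own assumptions for illustration; the real trajectory dump format may differ.

```typescript
// Hypothetical trajectory shape; the real dump format may differ.
interface Turn {
  role: "user" | "assistant";
  text: string;
  toolCalls: string[]; // names of tools invoked during this turn
}

// Phrases that claim an action was or will be taken (illustrative, not exhaustive).
const ACTION_CLAIMS: RegExp[] = [
  /\bI'll (patch|edit|update|fix|read|check)\b/i,
  /\bI (checked|read|patched|updated|verified)\b/i,
];

// Returns assistant turns that claim an action but invoked no tool.
function findPhantomTurns(trajectory: Turn[]): Turn[] {
  return trajectory.filter(
    (t) =>
      t.role === "assistant" &&
      t.toolCalls.length === 0 &&
      ACTION_CLAIMS.some((re) => re.test(t.text)),
  );
}

const sample: Turn[] = [
  { role: "user", text: "Can you fix the config?", toolCalls: [] },
  { role: "assistant", text: "I'll patch X now.", toolCalls: [] },
  { role: "assistant", text: "I checked memory and found it.", toolCalls: ["memory_search"] },
];

// Only the second turn is flagged: it claims a patch but made no tool call.
console.log(findPhantomTurns(sample).map((t) => t.text));
```

The regex list is deliberately crude. It only matches straight apostrophes and a handful of verbs, so treat a hit as a prompt to read the turn, not as proof.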
What I tried (clean dev profile + sequential tuning):
embeddedPi.executionContract: "strict-agentic" enabled
Primary model: gpt-5.5 → gpt-5.3-codex (codex auth was silently failing on legacy auth.json with an expired token; fixed)
Added memory-core plugin (Dreaming was never installed)
hooks.internal.enabled: true (was false, silently neutering boot-md/command-logger/session-memory despite child entries being true)
Embeddings: broken local (no node-llama-cpp) → text-embedding-3-large via OpenAI
Disabled GPT-5 friendly overlay (promptOverlays.gpt5.personality: "off")
Active-memory: switched to precision-heavy, then disabled entirely as an A/B test
Trimmed AGENTS.md (10K → 6K) and TOOLS.md (11K → 4K) to remove duplicates and stale content
Curated MEMORY.md from 15K → 3K durable spine
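The hooks.internal.enabled item is easy to miss, so here is a rough sketch of the check I would run against a config to catch it. The `InternalHooks` shape and the `entries` key are assumptions inferred from the dotted key name, not the documented schema.

```typescript
// Hypothetical config shape inferred from the dotted key hooks.internal.enabled;
// the real schema may nest differently.
interface InternalHooks {
  enabled: boolean;
  entries: Record<string, { enabled: boolean }>;
}

// Lists child hooks that are marked enabled but cannot run because the parent flag is off.
function findNeuteredHooks(hooks: InternalHooks): string[] {
  if (hooks.enabled) return [];
  return Object.entries(hooks.entries)
    .filter(([, entry]) => entry.enabled)
    .map(([name]) => name);
}

// State before the fix: parent off, children on.
const before: InternalHooks = {
  enabled: false,
  entries: {
    "boot-md": { enabled: true },
    "command-logger": { enabled: true },
    "session-memory": { enabled: true },
  },
};

console.log(findNeuteredHooks(before)); // all three child hooks are listed
console.log(findNeuteredHooks({ ...before, enabled: true })); // empty once the parent is on
```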
Where I am:
Substrate is measurably cleaner (cacheRead dropped 96K → 4.6K with Phase 1 edits, about a 95% reduction; output prose tighter)
Phantom progress still persists on messaging turns. Test: "Pick three wiki entries. Read each one before deciding" → 0 tool calls, even with active-memory completely disabled
Traced to selection-D9uTvvsw.js:1474: resolvePlanningOnlyRetryInstruction early-returns when didSendViaMessagingTool=true, so the strict-agentic retry never fires on Discord/webchat surfaces, apparently by design
Only reliable fix so far: explicit operational framing per prompt ("invoke Edit on X and confirm with file size"). Not sustainable for natural conversation
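To make the gate concrete, this is how I read the behavior, written as a simplified sketch. It is my reconstruction from the trace, not the actual OpenClaw source: only didSendViaMessagingTool and the "strict-agentic" value come from what I observed, while the function name, the other fields, the order of checks, and the retry text are hypothetical.

```typescript
// Hypothetical reconstruction of the observed behavior, NOT the actual OpenClaw source.
interface AttemptSummary {
  toolCallCount: number;            // assumed field: tool calls other than the messaging send
  didSendViaMessagingTool: boolean; // name taken from the trace
  executionContract: "strict-agentic" | "default"; // "default" is a placeholder value
}

// Mirrors what resolvePlanningOnlyRetryInstruction appears to do, in simplified form.
function sketchPlanningOnlyRetry(a: AttemptSummary): string | null {
  if (a.executionContract !== "strict-agentic") return null;
  // The gate in question: a reply sent through the messaging tool skips the retry.
  if (a.didSendViaMessagingTool) return null;
  if (a.toolCallCount > 0) return null;
  return "You described a plan but called no tools. Invoke the tools now.";
}

const nonMessagingTurn: AttemptSummary = {
  toolCallCount: 0,
  didSendViaMessagingTool: false,
  executionContract: "strict-agentic",
};
const discordTurn: AttemptSummary = { ...nonMessagingTurn, didSendViaMessagingTool: true };

console.log(sketchPlanningOnlyRetry(nonMessagingTurn)); // a retry instruction is returned
console.log(sketchPlanningOnlyRetry(discordTurn));      // null: no retry on messaging surfaces
```

If this reading is right, a planning-only reply on Discord or webchat is treated as a completed turn because the message itself was delivered, which matches the 0-tool-call traces above.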
Questions:
Anyone fully solved this without per-prompt operator framing?
Is the messaging-surface gate intentional?
Happy to share trace data (trajectory dumps, before/after token counts).
Thanks.