Question: Is this a known cleanup bug in the native Codex harness/app-server in 2026.5.12? If so, is the recommended workaround to temporarily move these agents off native Codex, reset the affected session row, or restart the gateway/app-server?
OpenClaw version: 2026.5.12 (f066dd2)
Install: pnpm, stable/default
Runtime/model path: openai/gpt-5.5 via native Codex app-server / Codex harness
Gateway: local loopback, Telegram direct sessions
Relevant log pattern:
08:52:53 warn agent/embedded
codex app-server connection closed during startup; restarting app-server and retrying
08:52:54 warn agents/harness
{"harnessId":"codex","provider":"openai","modelId":"gpt-5.5","error":"codex app-server client is closed"}
Codex agent harness failed; not falling back to embedded PI backend
Then for the affected agent/session:
08:56:20 warn diagnostic
stalled session: sessionId=b5f02d3e... sessionKey=agent:chan:telegram:direct:... state=processing age=145s queueDepth=1 reason=active_work_without_progress classification=stalled_agent_run activeWorkKind=embedded_run lastProgress=codex_app_server:notification:rawResponseItem/completed terminalProgressStale=true recovery=none
Later, after a reset/new attempt:
09:16:46 warn diagnostic
stalled session: sessionId=ec349745... sessionKey=agent:chan:telegram:direct:... state=processing age=123s queueDepth=1 reason=active_work_without_progress classification=stalled_agent_run activeWorkKind=embedded_run lastProgress=codex_app_server:notification:rawResponseItem/completed terminalProgressStale=true recovery=none
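To compare the two stalled-session lines field by field, a minimal sketch like the following can split them into key/value pairs. It assumes the diagnostic line is a flat list of space-separated key=value tokens, as in the excerpts above; parse_diagnostic is a hypothetical helper name and is not part of OpenClaw.

```python
def parse_diagnostic(line: str) -> dict[str, str]:
    """Split a 'stalled session: k=v k=v ...' line into a dict of fields.

    Assumption: values never contain spaces, which holds for the
    excerpts quoted in this post.
    """
    # Drop the 'stalled session:' prefix if it is present.
    _, _, body = line.partition("stalled session:")
    fields: dict[str, str] = {}
    for token in (body or line).split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value
            fields[key] = value
    return fields


# Shortened copy of the first diagnostic line from the log above.
LINE = (
    "stalled session: sessionId=b5f02d3e... state=processing age=145s "
    "queueDepth=1 reason=active_work_without_progress "
    "classification=stalled_agent_run activeWorkKind=embedded_run "
    "terminalProgressStale=true recovery=none"
)

fields = parse_diagnostic(LINE)
print(fields["state"], fields["age"], fields["recovery"])
```

In both excerpts the fields that stand out are terminalProgressStale=true together with recovery=none.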
The weird part: the transcript shows the agent did send a Telegram message and produced a final assistant message (“Sent.”), but the session store still says status="running". After that, both /stop and RPC abort report no active abortable run, while the lane/session remains marked busy. So it looks like the abort controller/final event is gone, but session/lane cleanup never completed.
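To state the suspected condition precisely, here is a rough sketch of the check I am applying by hand. Every name in it (SessionSnapshot, its fields, looks_orphaned) is hypothetical and chosen only for illustration; none of it reflects OpenClaw's actual session store schema or API.

```python
from dataclasses import dataclass


@dataclass
class SessionSnapshot:
    # All fields are hypothetical stand-ins for what was observed manually.
    store_status: str                 # e.g. "running", as shown by the session store
    transcript_has_final_reply: bool  # transcript ends with a final assistant message
    abortable_run_found: bool         # False when /stop and RPC abort find no active run


def looks_orphaned(s: SessionSnapshot) -> bool:
    """True when the store says 'running' but the run has finished and nothing is left to abort."""
    return (
        s.store_status == "running"
        and s.transcript_has_final_reply
        and not s.abortable_run_found
    )


# The state described above: final "Sent." in the transcript, status still
# "running", and no abortable run reported.
observed = SessionSnapshot(
    store_status="running",
    transcript_has_final_reply=True,
    abortable_run_found=False,
)
print(looks_orphaned(observed))
```

If all three conditions hold at once, the run itself appears finished and only the bookkeeping is stale, which is why I suspect cleanup rather than a hung model call.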