#2026.5.16 beta regression: trivial tool calls stall while text-only turns work

1 messages · Page 1 of 1 (latest)

latent mortar
#

I’m trying to test/upgrade past 2026.5.7, but I’m hitting a repeatable tool-call stall in the 2026.5.16 betas.

Setup

• macOS LaunchAgent gateway
• Telegram forum topic
• OpenClaw source checkout / local build
• Model/auth tested:
• openai/gpt-5.5 via Codex OAuth
• also tried Pi runtime path
• Currently rolled back to 2026.5.7

What works

• Plain text replies work
• /status works
• Telegram reply context works
• On 2026.5.7, trivial tool calls work

What fails
On both v2026.5.16-beta.2 and v2026.5.16-beta.3, a trivial agent tool call like asking the agent to run date stalls/aborts. Text-only turns still work.

Regression matrix

• 2026.5.7: plain reply ✅, /status ✅, date tool call ✅
• 2026.5.16-beta.2: plain reply ✅, /status ✅, date tool call ❌ stalls
• 2026.5.16-beta.3: plain reply ✅, /status ✅, date tool call ❌ stalls

I tested this under both:

• Runtime: OpenAI Codex
• Runtime: OpenClaw Pi Default

Both showed the same user-visible symptom: text works, tool calls stall.

Relevant log shape
For the Codex path, logs showed:
classification=stalled_agent_run
activeWorkKind=embedded_run
lastProgress=codex_app_server:notification:thread/tokenUsage/updated
recovery=none
In another run, last progress looked like:
codex_app_server:notification:rawResponseItem/completed
So this seems related to missing terminal/progress handling in the embedded run path. I noticed #82775 / PR #82816 look close for Codex app-server, but since I also saw the visible tool-call stall with Runtime: OpenClaw Pi Default, I’m not sure whether there’s a shared embedded-run/tool bridge regression too.

Happy to provide more logs/status snippets if useful.

novel forgeBOT
# latent mortar I’m trying to test/upgrade past 2026.5.7, but I’m hitting a repeatable tool-call...

This looks like a good regression report, and the cross-runtime part is the important bit.

I checked the linked GitHub context: #82775 / PR #82816 are specifically about the Codex app-server path where Codex goes quiet after a post-tool raw assistant completion / shutdown and OpenClaw never receives a terminal event. That matches your Codex log shape, especially:

activeWorkKind=embedded_run
lastProgress=codex_app_server:notification:rawResponseItem/completed

But if Runtime: OpenClaw Pi Default shows the same “text works, trivial tool call stalls” symptom, I would not assume #82816 fully covers it. It may be the same supervisor-visible failure mode with a different embedded runner/tool bridge underneath.

The most useful next data would be:

  • exact openclaw --version and package/build source for each beta you tested
  • the stalled-session/status block from the Pi Default run, especially activeWorkKind, lastProgress, terminalProgressStale, and any recovery/classification fields
  • gateway logs from the moment the tool call starts until the stall classifier fires
  • whether a non-shell tool stalls too, if you have one easy safe repro, e.g. memory/status/read-only tool vs date

If you can, please put this into a GitHub issue, or add it to the closest existing one if maintainers prefer consolidation: https://github.com/openclaw/openclaw/issues

novel forgeBOT
latent mortar
#

Thanks — I checked the local logs again and one correction: even during the run where /status showed Runtime: OpenClaw Pi Default, the stalled-session log still referenced codex_app_server, e.g. activeWorkKind=embedded_run and lastProgress=codex_app_server:notification:thread/tokenUsage/updated, recovery=none.

So it may not be a separate Pi-native tool bug. It might be config/status/routing showing Pi while the actual tool-call turn still goes through Codex app-server.

I’ll file a GitHub issue with the regression matrix unless you think #82775/#82816 is the right place to consolidate.

novel forgeBOT
# latent mortar Thanks — I checked the local logs again and one correction: even during the run ...

That correction makes it much more likely this belongs with the Codex app-server stall family, not a separate Pi-native tool bug.

I’d consolidate into #82775 / #82816 context if GitHub lets you comment there, because the key signature is still codex_app_server + embedded_run + no terminal recovery. The extra bit you found is useful though: /status reporting Runtime: OpenClaw Pi Default while the actual tool-call stall logs show codex_app_server. That’s worth calling out explicitly because it may be either status/routing confusion or a fallback/selection path that still routes tool turns through Codex.

Suggested wording to include:

  • macOS LaunchAgent gateway, Telegram forum topic
  • 2026.5.7 works; 2026.5.16-beta.2 / beta.3 stall on trivial tool call
  • text-only and /status still work
  • even when /status says Runtime: OpenClaw Pi Default, stalled-session logs show activeWorkKind=embedded_run and lastProgress=codex_app_server:notification:thread/tokenUsage/updated, recovery=none
  • ask whether this is expected routing/fallback behavior or a status mismatch

If the existing issue is too narrowly framed around rawResponseItem/completed / op.dispatch.shutdown, a new issue is fine, but reference #82775 and #82816 so maintainers can decide whether to merge it.

latent mortar
#

thx