#telegram 2026.5.14 (77e5a49)
That symptom usually means Telegram is receiving the message and OpenClaw has started a run, but the run never finishes. Telegram uses per-chat/per-thread sequencing, so once one run is stuck, later messages in that same chat can look like “typing forever” and not get answered.
First, can you confirm the exact version with:
openclaw --version
Then run these while reproducing it once:
openclaw status --all
openclaw channels status --probe
openclaw logs --follow
Look especially for model/provider errors, tool-call stalls, and lines mentioning "Polling stall detected", getUpdates, sendChatAction, or the queue. If it’s currently wedged, a quick recovery step is:
openclaw gateway restart
If it starts working after restart and then wedges again after the next message, paste the relevant log chunk around that first stuck Telegram message. Please redact the Telegram bot token / provider keys.
Refs: https://docs.openclaw.ai/help/faq and https://docs.openclaw.ai/channels/telegram#troubleshooting
Here is what is happening. Build: 2026.5.14 (77e5a49), macOS 15.7 x64, Node 22.22.2
Auth: openai-codex OAuth (valid until May 25)
Plugins: codex + acpx both enabled, permissionMode=approve-all
Symptom: embedded_run stalls after model completes. Last progress is codex_app_server:notification:thread/tokenUsage/updated or rawResponseItem/completed, then no further progress. Auto-recovery aborts at 625s but next message stalls identically.
Network: curl to both api.openai.com and chatgpt.com/backend-api responds in <0.2s
Not network: Telegram API also reachable in 0.6s from curl
Workaround: Disabling Codex plugin and using xai/grok-4 works (when xAI isn't overloaded)
Thanks, that narrows it down a lot. This does not look like Telegram or network anymore.
The important bit is that neither rawResponseItem/completed nor thread/tokenUsage/updated is the terminal event OpenClaw is waiting for. In the Codex app-server harness path, OpenClaw releases the run when it sees turn/completed for the active thread/turn. So if the last event is rawResponseItem/completed and no turn/completed follows, the model can be done while OpenClaw still thinks the embedded run is active.
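One rough way to check a saved log chunk for this signature is sketched below. It assumes only that the event names quoted above appear as plain substrings in each log line; the function name and the sample lines are hypothetical, and the real log format may differ.

```python
from typing import Iterable, Optional

# Event names are taken from this thread; substring matching is an assumption.
TERMINAL_EVENT = "turn/completed"
PROGRESS_EVENTS = ("rawResponseItem/completed", "thread/tokenUsage/updated")


def last_unreleased_progress(lines: Iterable[str]) -> Optional[str]:
    """Return the last progress line not followed by turn/completed, else None."""
    pending = None
    for line in lines:
        if TERMINAL_EVENT in line:
            # The run would be released here, so nothing is pending any more.
            pending = None
        elif any(event in line for event in PROGRESS_EVENTS):
            pending = line.rstrip("\n")
    return pending


# Hypothetical sample lines, shaped loosely like the progress names reported above.
sample = [
    "codex_app_server:notification:thread/tokenUsage/updated",
    "codex_app_server:notification:rawResponseItem/completed",
]
print(last_unreleased_progress(sample))
```

A non-empty result means the chunk ends on a progress event with no terminal event after it, which matches the stall described here.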
Immediate recovery/check:
ps aux | rg 'acpx|codex|app-server'
openclaw logs --follow | rg 'codex_app_server|turn/completed|embedded_run|agent harness selected|timeout|abort'
If you see stale ACPX/Codex child processes after the run has timed out, kill those specific PIDs, then restart the gateway:
openclaw gateway restart
For the workaround, I’d avoid the native Codex app-server harness for now and force PI for embedded turns, then start a fresh chat session because harness choice is pinned per session:
openclaw config set agents.defaults.embeddedHarness.runtime pi
openclaw config set agents.defaults.embeddedHarness.fallback pi
openclaw gateway restart
Then send /reset or /new in the Telegram conversation before testing again. If you want Codex OAuth without the app-server harness path, use the openai-codex/... model route and keep embeddedHarness.runtime: "pi"; avoid legacy codex/... refs or embeddedHarness.runtime: "codex" for this setup.
This is probably worth a GitHub issue with exactly the signature you posted: “Codex app-server embedded run stalls after rawResponseItem/completed; no turn/completed; recovery aborts but next run repeats.” Include the build, macOS/Node versions, and the last ~100 log lines around the stuck turn. https://github.com/openclaw/openclaw/issues
Docs refs: https://docs.openclaw.ai/plugins/codex-harness and https://docs.openclaw.ai/tools/acp-agents#troubleshooting
Setting agents.defaults.embeddedHarness.runtime to pi didn't change the runtime path. The stall is identical — still shows codex_app_server:notification:item/completed as last progress. The fallback key was rejected as unrecognized. Same stall pattern, same 129s before first detection. What's the correct config to force the PI runtime and bypass the Codex app server entirely?
You’re right — my previous config was from newer docs/current main, not your 77e5a49 build. In 2026.5.14 (77e5a49), embeddedHarness.fallback does not exist, and agents.defaults.embeddedHarness.runtime is not the effective override for this path.
For that build, force PI at the provider/model runtime policy level:
openclaw config set models.providers.openai-codex.agentRuntime.id pi
If your selected model is listed under agents.defaults.models, also set the model-specific override. Replace the model key with the exact one you’re using from /model status or openclaw status --all:
openclaw config set agents.defaults.models '{"openai-codex/gpt-5.5":{"agentRuntime":{"id":"pi"}}}' --strict-json --merge
Then restart and start a fresh Telegram session:
openclaw gateway restart
In Telegram, send:
/reset
or:
/new
Also remove/ignore the bad newer key:
openclaw config unset agents.defaults.embeddedHarness.fallback
Verification: after the next message, logs should show selectedReason: forced_pi / selectedHarnessId: pi if debug logging is on, and you should not see new codex_app_server:* progress for that turn.
If that still selects Codex, the likely cause is a model-specific runtime entry elsewhere, so check:
openclaw config get models.providers.openai-codex.agentRuntime
openclaw config get agents.defaults.models
openclaw config get agents.list
Hmm nope "Error: Config validation failed: models.providers.openai-codex.baseUrl: Invalid input: expected string, received undefined"
Yep, that error means my provider-level command created a partial models.providers.openai-codex entry. In this build, entries under models.providers.* are custom provider configs and require baseUrl, so don’t use that path for the built-in Codex OAuth provider.
Clean up the bad partial entry first:
openclaw config unset models.providers.openai-codex
Then use only the model-level override under agents.defaults.models:
openclaw config set agents.defaults.models '{"openai-codex/gpt-5.5":{"agentRuntime":{"id":"pi"}}}' --strict-json --merge
openclaw gateway restart
If your active model is not exactly openai-codex/gpt-5.5, use that exact model key instead. Check it with:
openclaw status --all
or in chat:
/model status
After restart, send /reset or /new in Telegram so the old session binding doesn’t keep using the previously selected harness.
Sorry for the bad provider-level path; the key distinction in 77e5a49 is: models.providers.*.agentRuntime only works if you define a full custom provider object with baseUrl + models, while agents.defaults.models["provider/model"].agentRuntime.id is the lighter override for a built-in provider model.
Still no: Config path not found: models.providers.openai-codex. Nothing was changed. Run openclaw config get <path> first if you are unsure of the path.
That unset result is okay. It means the bad provider-level write failed validation and did not persist, so there is nothing to remove there.
Run these instead:
openclaw config get agents.defaults.model
openclaw config get agents.defaults.models
Find the exact active model key. If it is openai-codex/<model>, set the runtime override on that exact key, for example:
openclaw config set agents.defaults.models '{"openai-codex/gpt-5.5":{"agentRuntime":{"id":"pi"}}}' --strict-json --merge
openclaw gateway restart
If your active model key is different, replace openai-codex/gpt-5.5 with the exact key from config/status.
Important: if the active model is a legacy codex/... key, switch away from that. codex/... is the native Codex app-server route. For PI + Codex OAuth, the model ref needs to be openai-codex/..., with the matching agents.defaults.models["openai-codex/..."] runtime override above.
Then send /reset or /new in Telegram before retesting, because the old conversation can keep a pinned harness/session binding.
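To avoid quoting mistakes when swapping in a different model key, the override command above can be generated rather than hand-edited. This is a minimal sketch: the helper name is hypothetical, and it only reproduces the command shape already shown in this thread, rejecting legacy codex/... keys as described above.

```python
import json
import shlex


def pi_override_command(model_key: str) -> str:
    """Build the model-level PI runtime override command for one exact model key."""
    if model_key.startswith("codex/"):
        # Per the thread, legacy codex/... refs are the native app-server route.
        raise ValueError("use an openai-codex/... model ref instead of codex/...")
    payload = {model_key: {"agentRuntime": {"id": "pi"}}}
    # Compact separators reproduce the JSON shape used in the thread.
    payload_json = json.dumps(payload, separators=(",", ":"))
    return " ".join([
        "openclaw config set agents.defaults.models",
        shlex.quote(payload_json),  # single-quote the JSON for the shell
        "--strict-json --merge",
    ])


print(pi_override_command("openai-codex/gpt-5.5"))
```

Pass the exact key reported by /model status or openclaw status --all, then restart the gateway as before.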
The agentRuntime.id: "pi" override on openai/gpt-5.4 and openai/gpt-5.5 was applied and the gateway restarted
Short messages initially worked (session-memory saved, turn/started progressed)
But it's still showing codex_app_server:notification in the stall logs — the PI runtime doesn't seem to be fully taking over
The agent:main:main background session also stalls with the same pattern
There was also a config validation error earlier: models.providers.openai-codex.baseUrl: Invalid input: expected string, received undefined
That baseUrl error might be the real clue — the openai-codex provider may need its baseUrl explicitly set. Ask them about that.
I would not chase baseUrl here. For 77e5a49, openai-codex is a built-in OAuth provider and the code has its own default Codex Responses base URL. Adding models.providers.openai-codex creates a custom provider config shape, and that path requires baseUrl + models; it can shadow the built-in provider. So that earlier error is more likely config-shape fallout than the root cause.
The fact that you still see codex_app_server:* means the actual selected runtime is still codex for the turn that is stalling. In this build the runtime selector is provider/model-policy based; session pins and OPENCLAW_AGENT_RUNTIME=pi do not reliably override OpenAI/Codex defaults.
Can you paste these outputs, redacting secrets/profile emails if needed?
openclaw config get agents.defaults.model
openclaw config get agents.defaults.models
openclaw config get agents.list
openclaw config get models.providers.openai
openclaw config get models.providers.openai-codex
The key thing to find is the exact provider/model ref being used by agent:main:main and Telegram. If it is still codex/... or a model entry with agentRuntime.id: "codex", it will keep using the Codex app-server. If it is openai/gpt-5.4 or openai/gpt-5.5, the matching entry needs to be exactly:
{
"openai/gpt-5.4": { "agentRuntime": { "id": "pi" } },
"openai/gpt-5.5": { "agentRuntime": { "id": "pi" } }
}
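The checks described in this message can be written down as a small sketch. It encodes only the rules stated here, not OpenClaw's actual runtime selector; the function name is hypothetical, and it assumes the agents.defaults.models output has been parsed into a Python dict.

```python
from typing import Any, Dict


def runtime_verdict(model_ref: str, models_cfg: Dict[str, Any]) -> str:
    """Classify a model ref using only the rules stated in this thread."""
    if model_ref.startswith("codex/"):
        return "codex: legacy codex/... ref"
    entry = models_cfg.get(model_ref)
    if entry is None:
        # The override must sit on the exact provider/model key.
        return "unknown: no exact entry for this ref"
    runtime_id = entry.get("agentRuntime", {}).get("id")
    if runtime_id == "pi":
        return "pi: exact override present"
    if runtime_id == "codex":
        return "codex: entry sets agentRuntime.id to codex"
    return "unknown: entry has no agentRuntime.id"


# Hypothetical config mirroring the JSON fragment above.
models_cfg = {
    "openai/gpt-5.4": {"agentRuntime": {"id": "pi"}},
    "openai/gpt-5.5": {"agentRuntime": {"id": "pi"}},
}
for ref in ("openai/gpt-5.4", "codex/gpt-5.5", "openai-codex/gpt-5.5"):
    print(ref, "->", runtime_verdict(ref, models_cfg))
```

An "unknown" verdict for the ref actually in use would suggest the override was written under a different key than the one the session selects.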
Also: I see a newer main commit named fix(codex): fail fast after quiescent app-server turns, which sounds very close to your stall signature. So if you can update to a build containing that fix, that may be cleaner than continuing to fight runtime selection on 77e5a49.
I am now running OpenClaw 2026.5.14 (9e9a825) and I still have the same issue.