#Repeated embedded_run_agent_end / terminated in direct agent name Bill GPT-5.4 Telegram session


jade umbra
#

Hi @spark lake , I need help diagnosing repeated Bill session terminations in OpenClaw.

Environment:

• OpenClaw version: 2026.4.14 (323493f)
• Host: Drill’s Mac mini
• OS: Darwin 25.3.0 arm64
• Node: v22.22.1
• Channel: Telegram group thread
• Session runtime from /status: direct
• Current Bill session model: openai-codex/gpt-5.4
• Reasoning was on, stream mode enabled

What the user is seeing:

• Bill sometimes visibly replies only with terminated
• This has now happened multiple times in the same thread while trying to do normal coding work

What the logs show:

• Repeated entries from subsystem agent/embedded
• Event: embedded_run_agent_end
• Error: terminated
• providerRuntimeFailureKind: "unknown"

Recent examples from the log:

• 2026-04-22 22:15:49 EDT
• 2026-04-22 22:16:47 EDT
• 2026-04-22 22:17:30 EDT

Representative log shape:

• subsystem: agent/embedded
• event: embedded_run_agent_end
• isError: true
• error: terminated
• model: gpt-5.4
• provider: openai-codex
• providerRuntimeFailureKind: unknown
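
For anyone triaging the same pattern, here is a minimal sketch of how these events could be counted. It assumes the log can be exported as one JSON object per line using the field names listed above; that JSON-lines format is an assumption, not something confirmed in this thread, and `count_bare_terminations` is a hypothetical helper.

```python
import json
from collections import Counter


def count_bare_terminations(lines):
    """Count embedded_run_agent_end errors whose text is just 'terminated'.

    Assumes one JSON object per line with the field names from the log
    shape above. The JSON-lines export format is an assumption.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON (banners, plain text)
        if not isinstance(entry, dict):
            continue
        if (
            entry.get("event") == "embedded_run_agent_end"
            and entry.get("isError") is True
            and entry.get("error") == "terminated"
        ):
            # Group by provider/model so one noisy path stands out.
            counts[(entry.get("provider"), entry.get("model"))] += 1
    return counts


# Illustrative input only, shaped like the representative entry above.
sample = [
    '{"subsystem": "agent/embedded", "event": "embedded_run_agent_end", '
    '"isError": true, "error": "terminated", "model": "gpt-5.4", '
    '"provider": "openai-codex", "providerRuntimeFailureKind": "unknown"}',
    '{"subsystem": "agent/embedded", "event": "embedded_run_agent_end", '
    '"isError": false, "model": "gpt-5.4", "provider": "openai-codex"}',
    "not json",
]
print(count_bare_terminations(sample))
```

Grouping by provider and model makes it easier to see whether the bare terminations are confined to one path.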

There were also earlier same-day errors on the same model that looked like provider-side failures:

• LLM error server_error
• failoverReason: timeout
• request IDs were included in the provider error text

Confusing part:

• /status says the session runtime is direct
• but the logs are under agent/embedded
• I assume that may be normal internal execution plumbing, but I want confirmation
• the abnormal part is the repeated embedded_run_agent_end with bare terminated

Current config fact:

• Bill is using openai-codex/gpt-5.4
• That is the only model I care about in this setup

Operational impact:

• Bill is trying to do code work
• no code patch had landed yet when several of these terminations happened
• the work gets interrupted mid-turn
• the user sees this as Bill randomly dying

I'll send some questions in the chat.

jade umbra
#

Questions:

  1. Is agent/embedded expected even when session runtime is direct?
  2. What exactly causes an embedded_run_agent_end with error: terminated and providerRuntimeFailureKind: unknown?
  3. Is this likely provider-side, gateway-side, context/plugin-side, or session-lifecycle-side?
  4. Why is the user-visible surface sometimes just terminated instead of a structured error?
  5. Could active-memory activity, large memory writes, or context/plugin load be contributing to these turn deaths?
  6. What is the best stable workaround for now while keeping Bill on openai-codex/gpt-5.4?
spark lakeBOT
#

Let me verify this against the actual docs and source before I guess.

#

Yeah, I checked this against current docs/source.

  • /status = direct and logs under agent/embedded is normal.
    direct here means not sandboxed/dockerized, not “not embedded”. OpenClaw’s normal runtime is the embedded agent runtime. Docs: https://docs.openclaw.ai/concepts/agent

  • For openai-codex/gpt-5.4, agent/embedded is especially expected.
    That model uses OpenAI Codex OAuth through the normal OpenClaw/PI embedded path. Only codex/gpt-5.4 uses the native Codex harness. Docs: https://docs.openclaw.ai/plugins/codex-harness

  • The abnormal part is the bare terminated.
    providerRuntimeFailureKind: "unknown" means OpenClaw only got raw error text like terminated and could not classify it as timeout/auth/server_error/etc. So unknown is basically “unrecognized error shape”, not a separate runtime mode.
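
To illustrate the idea of an "unrecognized error shape", here is a rough sketch of a text-based classifier that falls back to `unknown`. The bucket names echo the ones mentioned above, but the patterns and the `classify_failure` function are hypothetical and are not OpenClaw's actual rules.

```python
import re

# Hypothetical buckets and patterns, for illustration only.
# These are NOT OpenClaw's real classification rules.
KNOWN_FAILURE_PATTERNS = [
    ("timeout", re.compile(r"timed? ?out", re.IGNORECASE)),
    ("auth", re.compile(r"unauthorized|invalid api key", re.IGNORECASE)),
    ("server_error", re.compile(r"server_error|internal server error", re.IGNORECASE)),
]


def classify_failure(raw_error: str) -> str:
    """Return the first matching bucket, or 'unknown' when nothing matches."""
    for kind, pattern in KNOWN_FAILURE_PATTERNS:
        if pattern.search(raw_error):
            return kind
    return "unknown"


for text in ["LLM error server_error", "request timed out", "terminated"]:
    print(text, "->", classify_failure(text))
```

A bare `terminated` carries no keyword that any bucket recognizes, so a classifier of this general kind has nothing to match and ends up at `unknown`.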

My read: this looks more like upstream/provider transport instability on openai-codex/gpt-5.4 than Telegram-thread plumbing. The earlier same-day server_error / failoverReason: timeout entries point the same way.

Best next test

Force SSE for just that model, since openai-codex defaults to auto transport and is WebSocket-first:

{
  agents: {
    defaults: {
      models: {
        "openai-codex/gpt-5.4": {
          params: {
            transport: "sse"
          }
        }
      }
    }
  }
}

Then run:

openclaw status --all
openclaw models status --probe --probe-provider openai-codex --json
openclaw logs --follow

What that tells us

  • If SSE stops the random terminated deaths, that strongly points at the auto / WebSocket transport path.
  • If SSE still dies with bare terminated, I’d treat this as a likely bug/regression worth filing on GitHub with:
    • openclaw status --all
    • the probe output
    • one full embedded_run_agent_end object
    • exact version 2026.4.14

If you want, send the next questions or paste one full structured failing event after the SSE change, and I’ll help narrow it further.

#

I checked current docs/source, and here’s the clean read:

  1. Yes, agent/embedded is expected even when /status says direct.
    direct means the session is running without sandbox/docker wrapping. It does not mean “not embedded”. OpenClaw’s normal runtime is the embedded agent runtime.
    Docs: https://docs.openclaw.ai/concepts/agent

  2. What embedded_run_agent_end + error: terminated + providerRuntimeFailureKind: unknown means
    It means the run ended with an assistant error, and the raw error text available at the end was basically just terminated.
    OpenClaw then tried to classify that text into known buckets like timeout/auth/schema/replay/etc, and it didn’t match anything, so it logged providerRuntimeFailureKind: unknown.

  3. Most likely cause category
    Based on what you posted, my ranking is:

    • Most likely: provider-side or provider-transport-side instability on openai-codex/gpt-5.4
    • Possible but less likely: gateway-side handling of that provider path
    • Less likely: context/plugin/memory pressure as a primary cause
    • Least likely: session lifecycle plumbing itself
#

Why: the same model/provider also showed earlier server_error / timeout-shaped failures, which points toward that runtime path more than Telegram thread lifecycle.

  4. Why the user sometimes literally sees just terminated
    Because when OpenClaw cannot map the raw error to a friendlier structured message, it falls back to the raw text. So if the last error text is just terminated, that is what reaches the surface.

  5. Could active-memory, big memory writes, or heavy context/plugin load contribute?
    Indirectly, yes. They can increase:

    • prompt size
    • turn latency
    • tool/context pressure

    But they are not the best fit for a bare terminated by themselves. If memory/plugin load were the main issue, I’d more strongly expect errors like:

    • context overflow
    • schema/tool payload rejection
    • replay/session corruption style errors
    • explicit plugin/tool failures
#

So I’d treat them as pressure multipliers, not the main suspect.

  6. Best stable workaround while keeping Bill on openai-codex/gpt-5.4
    Force SSE for that model.

    {
      agents: {
        defaults: {
          models: {
            "openai-codex/gpt-5.4": {
              params: {
                transport: "sse"
              }
            }
          }
        }
      }
    }

Why this is the best first move:

  • openai-codex defaults to auto transport
  • that path is WebSocket-first
  • your symptom smells like transport/runtime instability more than session plumbing

Provider docs: https://docs.openclaw.ai/concepts/model-providers

My practical recommendation

Do this in order:

  1. force transport: "sse" for openai-codex/gpt-5.4
  2. retest the same Bill workflow
  3. if still unstable, temporarily reduce pressure:
    • lower reasoning
    • disable active memory if enabled, just for A/B isolation
    • start a fresh session/thread to rule out transcript-specific weirdness

And for evidence, capture:

openclaw status --all
openclaw models status --probe --probe-provider openai-codex --json
openclaw logs --follow

If SSE fixes it, I’d call this a transport-path problem.
If SSE does not fix it, I’d call it issue-worthy and file it on GitHub with the before/after results: https://github.com/openclaw/openclaw/issues

#

If you want, send the next failing log block after the SSE change and I’ll help interpret it line by line.

jade umbra
#

SSE patch is applied and hot-reloaded for openai-codex/gpt-5.4.

Important update:

• I previously saw post-SSE terminated events in the logs at:
• 2026-04-22 22:47:16 EDT
• 2026-04-22 22:49:17 EDT
• same pattern:
• embedded_run_agent_end
• error: "terminated"
• providerRuntimeFailureKind: "unknown"

But during the most recent coding pass, Bill completed a full on-disk implementation slice without a visible termination.

What I changed during that successful pass:

• very small atomic file edits only
• no restart
• no live cutover
• local syntax checks only
• no broad exploratory tool burst

So current read:

• SSE alone did not clearly eliminate the issue
• but a smaller / lower-pressure workflow may be reducing how often it happens

Question:
Does that make you lean more toward:

  1. provider/runtime instability still being primary, with load/pressure as a multiplier
  2. plugin/context pressure being more relevant than first thought
  3. some interaction between active-memory/session work and the codex runtime path
spark lakeBOT
#

Yeah, this shifts me a bit, but not all the way to plugin/context as root cause.

My current lean is:

1 > 3 > 2

  1. provider/runtime instability is still primary
  2. with session/context/load acting as a multiplier
  3. and active-memory / session work is a plausible part of that multiplier if enabled

Why I lean that way:

  • You still saw post-SSE terminated, so WebSocket was not the sole problem.
  • But the earlier history still includes provider-shaped server_error / timeout on the same model/provider path.
  • The fact that small atomic edits + low exploratory pressure worked is a strong sign that turn complexity/pressure matters.
  • If plugin/context pressure were the main root cause, I’d expect more obvious signatures like:
    • context overflow
    • schema/tool payload rejection
    • replay/session-invalid style failures
    • explicit plugin/tool errors

Instead you’re still getting a bare unclassified termination, which feels more like “runtime/provider bailed out without a clean structured error”.

So I’d phrase it like this:

The openai-codex/gpt-5.4 runtime path still looks brittle, and bigger/hotter turns make that brittleness easier to trigger.

Where active-memory fits

If active memory is enabled, I would absolutely treat it as a possible multiplier, because it can add:

  • extra prompt material
  • extra turn latency
  • more hidden housekeeping
  • more pressure around compaction/memory-flush boundaries
#

But I still would not make it the lead suspect yet.

Best next isolation test

If you want the cleanest A/B:

  • keep same model
  • keep SSE
  • do one comparable coding pass with active memory off for Bill, if active memory is enabled
  • keep the workflow otherwise similar

If failure rate drops a lot, then the more precise conclusion is:

not “active memory is the root cause”
but “active memory/session pressure is materially worsening an already fragile codex runtime path.”

My updated practical read

  • Root: unstable openai-codex/gpt-5.4 run path
  • Multiplier: turn pressure, broad tool bursts, heavier context/session state
  • Possible sub-multiplier: active memory / compaction housekeeping, if enabled

So yes, I’d update from my earlier answer to:

  • still mostly (1)
  • now with more respect for (3)
  • still less convinced by pure (2)

If you want, I can help you design a tight 3-step isolation matrix next, basically:
same task / same model / SSE on, then vary only:

  1. active memory on/off
  2. reasoning high/minimal
  3. broad exploratory vs tiny atomic edits

That should pin down the multiplier pretty fast.
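
As a rough sketch of that isolation matrix, the runs could be enumerated like this. The factor names and level labels are taken from the list above; the dictionary layout and both helper functions are illustrative assumptions, not an OpenClaw feature.

```python
from itertools import product

# Held constant across every run, per the plan above.
FIXED = {"model": "openai-codex/gpt-5.4", "transport": "sse"}

# The three things to vary; the first level of each is the baseline.
FACTORS = {
    "active_memory": ["on", "off"],
    "reasoning": ["high", "minimal"],
    "workflow": ["broad exploratory", "tiny atomic edits"],
}


def one_factor_at_a_time(factors, fixed):
    """Baseline run plus one run per factor with only that factor changed."""
    baseline = {name: levels[0] for name, levels in factors.items()}
    runs = [{**fixed, **baseline}]
    for name, levels in factors.items():
        for level in levels[1:]:
            runs.append({**fixed, **baseline, name: level})
    return runs


def full_factorial(factors, fixed):
    """Every combination of levels, for when interactions are suspected."""
    names = list(factors)
    return [
        {**fixed, **dict(zip(names, combo))}
        for combo in product(*factors.values())
    ]


for run in one_factor_at_a_time(FACTORS, FIXED):
    print(run)
```

The one-factor-at-a-time plan needs 4 runs and the full factorial needs 8. The smaller plan is usually enough to spot a dominant multiplier, while the full grid helps if two factors only fail in combination.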

jade umbra
#

These are the messages that are happening, and what my agent said after finally answering. What should I do?
Drill, [Apr 23, 2026 at 3:12 AM]
Hey bill, read last 7 days of daily memory fully, and give me 2 timelines of what we have done for both of our main projects, the sniper ecosystem and the grinder profiles ecosystem. Also tell me all related docs for each ecosystem that has been created in the last 7 days. Be detailed but concise when you answer
Drill, [Apr 23, 2026 at 3:12 AM]
/status@Bill32_Bot
Bill, [Apr 23, 2026 at 3:12 AM]
🦞 OpenClaw 2026.4.14 (323493f)
🧠 Model: openai-codex/gpt-5.4 · 🔑 oauth (openai-codex:default)
🔄 Fallbacks: openai-codex/gpt-5.4
🧮 Tokens: 21k in / 209 out
📚 Context: 21k/1.1m (2%) · 🧹 Compactions: 0
📊 Usage: 5h 99% left ⏱4h 15m · Week 92% left ⏱5d 11h
🧵 Session: agent:main:telegram:group:-1003815814551:topic:8373 • updated just now
⚙️ Runtime: direct · Think: high · Text: low · Reasoning: stream
🔊 Voice: inbound · provider=microsoft · limit=1500 · summary=on
👥 Activation: always · 🪢 Queue: collect (depth 0)
Bill, [Apr 23, 2026 at 3:21 AM]
terminated
Drill, [Apr 23, 2026 at 3:34 AM]
Bill?
Bill, [Apr 23, 2026 at 3:35 AM]
⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.
Drill, [Apr 23, 2026 at 3:41 AM]
?
Bill, [Apr 23, 2026 at 3:41 AM]
⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.
Drill, [Apr 23, 2026 at 11:53 AM]
Helloo

#

Bill, [Apr 23, 2026 at 11:55 AM]
I’m here. I checked it, and the bad news is this is a real runtime bug, not you doing anything wrong.

What I found:

• OpenClaw itself is up and healthy.
• Gateway/service is running
• RPC probe is OK
• no evidence the daemon is crashing
• The actual failure is the agent run itself dying mid-turn on openai-codex/gpt-5.4.
• I found repeated session records with:
• stopReason: "error"
• errorMessage: "terminated"
• OpenClaw is then incorrectly leaking that internal failure into Telegram as:
• terminated
• or the generic “Something went wrong...” message

So the real problem is:

• layer failing: main agent ↔ GPT-5.4 direct runtime
• secondary bug: OpenClaw is surfacing internal error text to chat instead of swallowing/retrying it cleanly

Also important:

• this is not the gateway going down
• not a Telegram issue
• not the sniper/grinder processes crashing
• not fixed by the earlier SSE transport change, based on prior and current evidence

Why it feels extra bad right now:

• this session is on GPT-5.4 direct
• fallback is also just GPT-5.4, so there is effectively no rescue path when the run dies
• long/tool-heavy turns seem more likely to trigger it

My best current root-cause statement:

• the model invocation is being aborted somewhere in the GPT-5.4 runtime path, and OpenClaw’s error handling is buggy enough to post the raw failure back into the chat

If you want, I can do one of these next:

  1. investigate the exact OpenClaw code path that emits terminated into chat
  2. set up a temporary workaround so this session stops spraying raw failure messages
  3. go back to your original memory-timeline task, but in smaller safer chunks so we can get it done despite this bug
spark lakeBOT
#

What I’d do right now:

  1. Use /new first.
    Once Bill starts answering simple followups with the generic error, I’d treat that session as wedged.

  2. Retry the task in smaller slices.
    That request is actually pretty heavy:

    • read 7 days of daily memory fully
    • build 2 timelines
    • find all related docs
    • synthesize it all in one answer

    The /status snapshot only showed the starting context. It does not include all the file content/tool output Bill still has to pull in after the turn starts.

  3. Lower pressure for this task.
    Keep SSE on, but for this specific memory-summary job I’d also:

    • use lower reasoning if possible
    • avoid broad exploratory asks
    • temporarily disable active memory for the session if it’s enabled

Best retry shape

Instead of one big ask, do this:

Step 1

Read the last 3 days of daily memory for the sniper ecosystem only. Give me a concise timeline and list related docs created in that window.

Step 2

Now do the same for the grinder profiles ecosystem.

Step 3

Now extend both from 3 days to 7 days, but only add deltas not already covered.

Or even better:

  • sniper timeline first
  • grinder timeline second
  • docs list third
  • merged summary last

My read from this example

This makes me think even more:

  • primary issue: fragile openai-codex/gpt-5.4 runtime path
  • trigger/multiplier: broad read + synth + high reasoning + possible memory/context extras

So I would not keep hammering the same wedged session with “Bill?” / “?” / “Helloo”. Once it does that, reset it and retry with a narrower prompt.

If it still dies after /new + smaller slices

Then I’d call it a real reproducible bug and collect:

openclaw status --all
openclaw models status --probe --probe-provider openai-codex --json
openclaw logs --follow

Docs: https://docs.openclaw.ai/gateway/troubleshooting

If you want, I can help you rewrite that exact Telegram prompt into a safer 3-message sequence that keeps the same goal but is much less likely to kill Bill.

#

Bill’s read is basically aligned with what I’d tell you.

The two key points I agree with are:

  • this looks like a real runtime bug on the openai-codex/gpt-5.4 path
  • the raw terminated leak into chat is a second bug in error surfacing

That second part is consistent with current source behavior, too. The embedded lifecycle/error formatting path can fall through and surface raw error text instead of a nicer mapped error.

What I’d do next

1. Do the practical workaround now

If you need Bill usable again today:

  • run /new
  • keep SSE
  • retry work in smaller chunks
  • lower reasoning for heavy read/summarize tasks
  • turn active memory off temporarily if it’s enabled

And for that memory-timeline job, don’t ask for:

  • 7 full days
  • 2 ecosystems
  • all related docs
  • one combined answer

all in one turn.

2. Fix the fallback situation

Right now your fallback is effectively:

  • GPT-5.4
  • then GPT-5.4 again
#

So when the run dies, there is no real rescue path.

I would either:

  • configure a real secondary fallback model, or
  • deliberately keep it fail-fast and accept that this session is brittle until the bug is fixed

Docs: https://docs.openclaw.ai/concepts/model-failover
Models docs: https://docs.openclaw.ai/concepts/models

3. File a GitHub issue

At this point I do think it’s issue-worthy.

Include:

  • exact version 2026.4.14
  • model openai-codex/gpt-5.4
  • note that SSE did not eliminate it
  • one matching pair of:
    • session evidence: stopReason: "error", errorMessage: "terminated"
    • log evidence: embedded_run_agent_end, providerRuntimeFailureKind: "unknown"
  • note that gateway stayed healthy

Issue tracker: https://github.com/openclaw/openclaw/issues

My recommendation order

If this were my thread, I’d do:

2 + 3 for the immediate user experience, then 1 only if you want to help debug upstream.

Meaning:

  • make Bill usable again with /new, smaller chunks, and a real fallback
  • then file the bug
  • only then go spelunking through the exact code path if you want to contribute a repro/fix

If you want, I can help you write the GitHub issue body from everything you’ve gathered so far.

jade umbra
#

Just reporting back: after trying out many things, the only fix I found to avoid the "terminated" message was to turn active memory off completely. There seems to be a big issue with active memory and big prompts that makes active memory absolutely unusable.

spark lakeBOT
#

That’s actually a really strong A/B result.

If:

  • same agent
  • same model
  • same general workload
  • active memory on => terminated
  • active memory off => stable

then I’d stop calling active memory just a multiplier here. I’d treat it as the trigger for this bug path in your setup.

A more precise statement would be:

openai-codex/gpt-5.4 + active memory + large/heavy prompts is broken on 2026.4.14 in this setup.

What I’d do now

#

1. Keep active memory off for Bill

That’s the practical workaround.

2. If you still want memory, narrow it hard

Instead of full disable everywhere, try limiting it to safer cases:

{
  plugins: {
    entries: {
      "active-memory": {
        enabled: true,
        config: {
          agents: ["main"],
          allowedChatTypes: ["direct"],
          queryMode: "recent",
          timeoutMs: 15000,
          maxSummaryChars: 220
        }
      }
    }
  }
}

Why those knobs:

  • allowedChatTypes: ["direct"] keeps it out of group/channel threads
  • queryMode: "recent" is much lighter than full
  • lower timeoutMs
  • smaller maxSummaryChars

Docs: https://docs.openclaw.ai/concepts/active-memory

#

3. Turn on debug logging briefly and capture one repro

If you want a useful bug report, temporarily add:

config: {
  logging: true,
  persistTranscripts: true
}

Then reproduce once with active memory on, and compare to off.
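
For the on/off comparison, a small tally like the sketch below may help keep the evidence honest. It assumes you note each run by hand as a (condition, outcome) pair; the `failure_rates` helper is hypothetical and the sample values are placeholders, not measurements from this thread.

```python
def failure_rates(runs):
    """Return {condition: fraction of runs that ended in 'terminated'}.

    `runs` is an iterable of (condition, outcome) pairs recorded by hand.
    """
    totals = {}
    failures = {}
    for condition, outcome in runs:
        totals[condition] = totals.get(condition, 0) + 1
        if outcome == "terminated":
            failures[condition] = failures.get(condition, 0) + 1
    return {c: failures.get(c, 0) / totals[c] for c in totals}


# Placeholder outcomes for illustration only; replace with your own notes.
example_runs = [
    ("active-memory on", "terminated"),
    ("active-memory on", "ok"),
    ("active-memory off", "ok"),
    ("active-memory off", "ok"),
]
print(failure_rates(example_runs))
```

Attaching the per-condition rates, together with the number of runs behind them, makes a bug report easier to evaluate than a single anecdote.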

4. File a GitHub issue

At this point I do think this should be reported.

Best issue framing: