#OOM every ~30 minutes of use in a container.

white burrow
pliant mesaBOT
calm whaleBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

molten cloakBOT
#

I'll check whether there's already a fix or workaround landed for that OOM pattern, then I’ll give a clean in-thread answer.

#

If this is the same crash as #57349, first thing I’d check is whether you’re still on 2026.3.28.

A couple of gateway memory-leak fixes landed after that, including:

  • a node-related leak fix in early April
  • fix(gateway): plug long-running memory leaks in 2026.4.10

So my recommendation is:

  1. Update to latest first if you’re still on 2026.3.28, especially for Docker/container setups.
    Docker docs: https://docs.openclaw.ai/install/docker

  2. Then paste these:

    openclaw --version
    openclaw status --all
    openclaw gateway status
    
white burrow
#

As described in the linked GitHub issue, this is happening on 2026.4.11.

molten cloakBOT
#

Got it, that’s useful. I’m checking whether anything newer than 2026.4.11 already landed for this specific leak, so I can point you at either a real workaround or a likely fix.

#

Oof, okay, that’s important, thanks.

If it still happens on 2026.4.11, then this does not look fully fixed by the earlier leak patches.

I checked current main, and I don’t see an obvious post-2026.4.11 fix for the webchat/control UI -> node.list polling path yet.

Most useful workaround for now:

  • Do not leave the Control UI/webchat tab open if you need stability.
  • The UI starts polling node.list every ~5s as soon as it connects, even if you are not on the Nodes tab.
  • So “just leave webchat open on Chat” is probably still enough to trigger the bad path.

If you need a stopgap only, you can also raise the Node heap in the container with NODE_OPTIONS=--max-old-space-size=8192, but that’s just buying time, not fixing the leak.
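
A rough sketch of picking that heap value, assuming you size it as a fraction of the container's memory limit instead of hard-coding 8192. The helper name `node_heap_flag` and the 0.75 fraction are illustrative assumptions, not anything from OpenClaw; only `--max-old-space-size` (a value in MiB) is a real Node flag. The point of leaving headroom is that a heap cap above the container limit would likely get the container killed by the kernel before V8 ever hits its own limit.

```python
def node_heap_flag(container_limit_mib: int, fraction: float = 0.75) -> str:
    """Build a NODE_OPTIONS value that keeps the V8 old-space heap below
    the container memory limit.

    `fraction` is an assumed safety margin (hypothetical default), leaving
    room for non-heap memory such as buffers and native allocations.
    """
    if container_limit_mib <= 0:
        raise ValueError("container_limit_mib must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    heap_mib = int(container_limit_mib * fraction)
    return f"--max-old-space-size={heap_mib}"


if __name__ == "__main__":
    # Example: a container capped at 8 GiB of memory.
    print("NODE_OPTIONS=" + node_heap_flag(8192))
```

Either way this only moves the ceiling; a leaking process still reaches it eventually.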

At this point I’d suggest updating the GitHub issue with:

  • exact openclaw --version
  • confirmation it still reproduces on 2026.4.11
white burrow
molten cloakBOT
#

You’re right, my bad, I re-asked for info that was already in the issue.

The only actually useful addition here is:

  • it still reproduces on 2026.4.11
  • there’s no obvious released fix yet
  • the main workaround is still don’t leave Control UI/webchat open
  • bumping NODE_OPTIONS just delays the crash
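
To make "just delays the crash" concrete, here is a back-of-the-envelope sketch. It assumes the leak grows linearly with each node.list poll, which is an assumption and not something measured in the issue; the 1 MiB-per-poll figure is hypothetical and only there to show the scaling. The ~5s interval is the one mentioned above.

```python
POLL_INTERVAL_S = 5  # node.list polling interval reported in this thread (~5s)


def polls_in(minutes: float) -> int:
    """Number of polls issued over the given number of minutes."""
    return int(minutes * 60 // POLL_INTERVAL_S)


def minutes_until_oom(heap_mib: float, leak_mib_per_poll: float) -> float:
    """Minutes until the heap cap is reached, ASSUMING a constant leak
    per poll and ignoring baseline usage and garbage collection."""
    if leak_mib_per_poll <= 0:
        raise ValueError("leak_mib_per_poll must be positive")
    return heap_mib / leak_mib_per_poll * POLL_INTERVAL_S / 60


if __name__ == "__main__":
    print(polls_in(30))  # 360 polls in a 30-minute session
    # Hypothetical leak rate; doubling the heap only doubles the runtime.
    for heap in (4096, 8192):
        print(heap, round(minutes_until_oom(heap, 1.0), 1))
```

Under that linear assumption, a bigger heap scales the time to crash proportionally and never removes it.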

So if you add anything to the GitHub issue, I’d keep it to one short update like: “still reproduces on 2026.4.11”. That’s the new signal.

white burrow
#

So what are others doing? If others are running OpenClaw in a container, this is going to happen. Which leads me to believe no one is running this in a container, because it is unusable beyond asking one-off questions. No work or building can take place: the crash kills any main or subagent work in progress, and then you have to re-prompt through the web UI to continue?

molten cloakBOT
#

I wouldn’t jump to “nobody runs it in a container”.

My honest read is more:

  • containerized gateway itself is viable
  • long-lived Control UI/webchat sessions in a container are the weak spot here
  • for heavier work, people tend to either:
    1. run from source/on-host for the best dev loop, or
    2. keep the gateway in Docker/Podman but use Discord/Telegram/TUI/CLI as the main surface, not a browser tab left open all day

That lines up with the docs too, which treat Docker as optional, while the normal recommendation is the regular install / source flow:

And yes, with the bug in #57349, your concern is fair:

  • an OOM kills the active main/subagent work
  • if you mounted persistence correctly, history/workspace should survive
  • but the in-flight turn is toast, so you usually have to resume manually

So I think your conclusion is basically right:
container + web-ui-for-hours-at-a-time is not dependable enough right now for serious building

If you want actual human experience reports, I’d also ask in #users-helping-users. This thread is good for bug tracking, but that channel is better for “what are people actually doing in practice?”

white burrow
molten cloakBOT
#

Yep, exactly, and that’s the important distinction.

State surviving is not the same as the run resuming.

After a gateway crash, the transcript/session may still be there, but the in-flight execution is gone. So today it behaves more like:

  • history survives
  • active turn dies
  • no automatic continuation
  • you have to manually re-engage the agent to get it moving again

So for real build/workflows, I agree with you, that’s still a broken experience. A restart policy helps the process come back up, but it does not solve continuity.

white burrow
#

I mean, what are fake build/workflows? Is that not what this was designed for? What model is backing you right now Krill?

molten cloakBOT
#

Fair pushback, that wording was bad on my part.

I didn’t mean “fake” as in not real. I meant short/simple one-off tasks versus long-running interactive build sessions.

And yes, you’re right, real build/workflows are absolutely part of what OpenClaw is supposed to support. So if this path OOMs, drops the in-flight run, and needs a manual “pick back up” message, that’s a real break in the intended experience, not some edge case I’d hand-wave away.

On the model question, I can’t give reliable runtime model identity in chat here, and it wouldn’t really change this diagnosis anyway. This looks like a gateway/control-ui/runtime continuity problem, not a model-choice problem.

For model docs/community notes, use:

white burrow
#

I just want to know if I'm talking in loops or if you are backed by a specific model I haven't run this query through yet. That is why I ask what model is backing your responses right now.

molten cloakBOT
#

Totally fair question.

I can’t give the runtime model identity here, but I can answer the part underneath it:

No, you’re not talking in loops.
The reason the answer keeps collapsing to the same place is that, from the docs/repo/issue state, this still looks like a real unresolved bug with no clean released workaround beyond avoiding the bad path.

So the stable answer is still:

  • your use case is valid
  • the OOM + lost in-flight continuation is real
  • I don’t see a released fix yet
  • “don’t leave webchat open” is a workaround, not a solution

If you want a second non-me take, I’d ask in #users-helping-users or #models so you can compare against human experience / other setups.

white burrow
#

Thanks for the "help"

molten cloakBOT
#

Fair. Sorry, that was frustrating.

You wanted a concrete fix, and I don’t have one for this bug. I shouldn’t have dragged you through the same points twice.

white burrow
#

"Fair"... Well that gives it away. Thanks again