#'tree_reduce' is not defined

41 messages Ā· Page 1 of 1 (latest)

hoary mirage
#

Hi @icy saffron my agent running on LMstudio / local models started responding with this today! ļø ā€œāš ļøCron job "REDACTED" failed: Error in iterating prediction stream: NameError: name 'tree_reduce' is not definedā€

I’ve a hunch it’s related to my lossless-claw configuration but am not sure.

Kindly help me trace and resolve this so my crons keep running

Version: 2026.4.14

#

@icy saffron here are logs to help us:

17:00:04+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"4f70cd1c-5acd-441a-ab3d-672418148c29","isError":true,"error":"Error in iterating prediction stream: NameError: name 'tree_reduce' is not defined","failoverReason":null,"model":"zai-org/glm-4.7-flash","provider":"lmstudio","rawErrorPreview":"Error in iterating prediction stream: NameError: name 'tree_reduce' is not defined","rawErrorHash":"sha256:07896336b3a1","providerRuntimeFailureKind":"unknown"} embedded run agent end

icy saffronBOT
#

Let me trace this from the docs/source side first, then I’ll give you the shortest fix path.

#

Maybe this is upstream in LM Studio/model templating rather than OpenClaw itself. I’m checking that angle too, because the error string smells server-side.

#

This looks much more like an LM Studio/model-runtime failure than an OpenClaw core bug.

I checked the current docs/source, and tree_reduce does not exist anywhere in OpenClaw. So lossless-claw might be an indirect trigger by changing prompt shape/size, but it’s probably not the thing throwing that NameError.

Fastest isolation path:

  1. Check the failing cron + logs
openclaw cron runs --id <jobId> --limit 5
openclaw logs --follow
  1. A/B test without lossless-claw
openclaw config set plugins.slots.contextEngine legacy
openclaw gateway restart
#

If the cron starts working again, then lossless-claw is the trigger path.

  1. If it still fails on LM Studio/local, try the local-model mitigations from docs:
  • agents.defaults.experimental.localModelLean: true
  • compat.supportsTools: false
  • and only if logs mention messages[].content shape issues, compat.requiresStringContent: true

That matches:

My guess: LM Studio or the model/chat-template changed underneath you, and the larger cron prompt is now tripping an upstream runtime/template bug.

If you paste these, I can narrow it down pretty fast:

  • the LM Studio model ID
#
  • 10 to 20 lines around the failure from openclaw logs --follow
  • whether switching contextEngine to legacy changes it

If that fixes it, please right-click the solution message → Apps → āœ… Mark Solution so Answer Overflow can index it.

hoary mirage
#

Reporting back after log inspection and A/B test. When I disabled lossless-claw the cron started running. However, my agent continues to throw these logs / messages when I start a new chat or ping them in my channels. "17:23:50+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"0fc27b48-f5a3-42c6-bfb3-b5da18b01c90","isError":true,"error":"Error in iterating prediction stream: NameError: name 'tree_reduce' is not defined","failoverReason":null,"model":"zai-org/glm-4.7-flash","provider":"lmstudio","rawErrorPreview":"Error in iterating prediction stream: NameError: name 'tree_reduce' is not defined","rawErrorHash":"sha256:07896336b3a1","providerRuntimeFailureKind":"unknown"} embedded run agent end"

So Lossless claw may have some culpability but this seems to be an issue with my LMstudio.

This issue has also persisted across a complete host restart.

icy saffronBOT
#

Yep, I think we’ve got the shape of it now.

What your A/B test says:

  • lossless-claw is not the root bug
  • it amplifies / exposes the bug by changing prompt size or structure
  • the real failure is still zai-org/glm-4.7-flash inside LM Studio

The big tell is that it still happens on fresh chats after disabling lossless-claw and even after a host reboot.

I also checked this from the docs/source side:

  • tree_reduce does not appear in OpenClaw
  • it also does not appear in the published GLM-4.7-Flash chat template
  • so this looks like a deeper LM Studio runtime / model implementation issue, not an OpenClaw config key problem

Also interesting: the GLM-4.7-Flash model card explicitly calls out local serving via transformers / vLLM / SGLang, not LM Studio. That makes me suspect this is a model-runtime compatibility bug on LM Studio’s side.

#

What I’d do next, in order

  1. Test the same model directly in LM Studio

    • plain prompt like hi
    • if that also blows up, it’s 100% upstream of OpenClaw
  2. Switch OpenClaw to a different LM Studio model temporarily

    • if other local models work, GLM-4.7-Flash is the bad actor
    • that’s the fastest way to get your chats stable again
  3. Re-download or swap the GLM build

    • if LM Studio offers different variants/quants, try another one
    • unload/reload the model fully
  4. Keep lossless-claw disabled for now

    • at least until the base model path is stable again
#
  1. If you stay local-only, reduce prompt pressure
    • agents.defaults.experimental.localModelLean: true
    • compat.supportsTools: false
    • those are the right OpenClaw mitigations for fragile local backends

My practical recommendation

For now, I’d stop using GLM-4.7-Flash via LM Studio as the active agent model. It looks brittle for this workload. Use another local model, or a hosted fallback, to keep chats and crons alive while you debug GLM separately.

Relevant docs:

If you want, send your LM Studio version plus whether LM Studio’s own chat UI reproduces the same tree_reduce error, and I can help you decide whether this is worth filing upstream to LM Studio vs the GLM model repo.

hoary mirage
#

@icy saffron reporting back with fresh observations:

  1. Lossless claw continues to compact even when I change the contextEngine to legacy.
  2. Lossless claw doesn't actually seem to affect this issue; the issue is remarkably similar no matter how I configure it.
  3. LMstudio is mighty suspect here. The CLI daemon showed the models I expected to see when I ran lms ps yet when I opended the GUI I saw duplicate loads of the same models, including glm-4.7-flash. (I ejected all models, killed the app / daemon, restarted, reloaded models, then restarted the server)
  4. JIT loading is not enabled, yet I saw the agent successfully load another instance of the model.
  5. My LMstudio is up to date
#
  1. I am able to chat with the model directly in LMstudio.
#

So..I wonder if this this may be a symptom of updating to 2026.4.14 but not completing setup for LMstudio now that it's officially supported?

icy saffronBOT
#

This is super helpful, and it actually clears up two confusing bits.

1) Legacy still compacts.
Switching contextEngine to legacy does not disable compaction, it just switches back to OpenClaw’s built-in compaction path. So ā€œit still compactsā€ does not mean lossless-claw is still in control.
Docs: https://docs.openclaw.ai/concepts/context-engine

2) A load attempt with JIT off is not the weird part.
OpenClaw’s LM Studio integration does an ensure-loaded / preload step before inference. So the agent causing a load attempt is expected.
What is not expected is LM Studio showing duplicate loaded instances instead of reusing the already-loaded one.

So at this point, I’m pretty convinced the root issue is:

  • LM Studio runtime/state issue, or
  • GLM-4.7-Flash + LM Studio compatibility issue

and not lossless-claw.

#

My read now

  • same tree_reduce failure on fresh chats
  • persists after reboot
  • duplicate model loads in LM Studio
  • JIT off, but LM Studio still appears to instantiate another copy

That’s upstream-smelling as hell.

What I’d do next

  1. Stop using zai-org/glm-4.7-flash as the active OpenClaw model for now.
    Use another local model temporarily so chats/crons stay alive.

  2. Test GLM directly in LM Studio’s own chat UI with a tiny prompt like hi.

    • If it fails there too, this is firmly LM Studio / model-side
    • If LM Studio UI works but OpenClaw fails, then we’ve found a narrower integration edge case
#
  1. If another LM Studio model works fine in OpenClaw, I’d treat GLM-4.7-Flash as the bad actor and move on.

  2. If you want to keep pushing GLM, I’d file upstream with:

    • model: zai-org/glm-4.7-flash
    • error: NameError: name 'tree_reduce' is not defined
    • LM Studio up to date
    • duplicate model loads visible in GUI
    • reproducible on fresh chats, not just crons
    • reboot did not help

My recommendation

Honestly, I would not spend much more time blaming lossless-claw here. I think it just made a fragile model/runtime path easier to trip.

If you want, send:

  • whether LM Studio UI itself reproduces the error
  • one working LM Studio model name
#
  • one failing GLM variant/quant name

and I can help you decide whether the best workaround is ā€œswitch modelā€, ā€œswitch runtimeā€, or ā€œfile upstream now.ā€

#

Ah, perfect, that’s the missing clue.

If it works in LM Studio directly but fails from OpenClaw, then the model is not fundamentally broken, but the OpenClaw agent-runtime prompt shape is tripping LM Studio / GLM.

That usually means:

  • plain chat prompt in LM Studio = fine
  • full OpenClaw run with system prompt, history, compaction, tool schemas, etc. = not fine

So this now matches the docs bucket pretty cleanly: direct/simple prompts work, agent turns fail → treat it as a local backend/model compatibility limit, not a lossless-claw problem.

Best next fixes

  1. Turn on lean local mode

    • agents.defaults.experimental.localModelLean: true
  2. Disable tool schemas for this model

#
  • compat.supportsTools: false
  1. Keep lossless-claw out of the test path for now
    • only reintroduce it after the base model path is stable

If that combo still fails, my honest take is:

zai-org/glm-4.7-flash may be OK for LM Studio’s own chat UI, but not stable enough for OpenClaw agent use via LM Studio.

At that point I’d switch agent traffic to another model and stop burning time on this one.

Relevant docs:

If you want, paste your models.providers.lmstudio block and I can point to the exact spot where supportsTools: false and lean mode should go.

#

I doubt that’s the main cause.

Why I’m leaning no:

  • the log already shows provider: "lmstudio", so OpenClaw is already talking through an LM Studio provider path
  • LM Studio itself can chat with the model
  • the failure only shows up on OpenClaw agent turns, which points more to prompt/tool/runtime compatibility than ā€œsetup not completedā€

So I’d frame it like this:

  • Possible: old/stale LM Studio config after the upgrade
  • Unlikely: that alone explains NameError: name 'tree_reduce' is not defined

That error still smells like ā€œLM Studio + this model + OpenClaw’s heavier agent prompt shapeā€ rather than onboarding drift.

If you want a low-risk cleanup step, sure, I’d do one of these:

#
  1. Compare your current LM Studio provider block to the current docs
    https://docs.openclaw.ai/providers/lmstudio

  2. Re-run openclaw onboard and choose LM Studio again, if you want to normalize config

But I would treat that as a sanity pass, not the likely fix.

My stronger bets are still:

  • agents.defaults.experimental.localModelLean: true
  • compat.supportsTools: false
  • or switching away from zai-org/glm-4.7-flash for agent use

If you paste your models.providers.lmstudio block, I can tell you pretty quickly whether it looks stale from pre-official-support days.

hoary mirage
#

"providers": {
"lmstudio": {
"baseUrl": "http://localhost:1234/v1",
"apiKey": "lmstudio",
"api": "openai-responses",
"models": [
{
"id": "zai-org/glm-4.7-flash",
"name": "GLM-4.7-Flash",
"reasoning": true,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 200000,
"maxTokens": 4192
},
{
"id": "qwen/qwen3.5-9b",
"name": "Qwen3.5 9B",
"reasoning": false,
"input": [
"text",
"image"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 200000,
"maxTokens": 4192
},
{
"id": "qwen/qwen2.5-vl-7b",
"name": "Qwen2.5-VL-7B-Instruct-GGUF",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 60685,
"maxTokens": 4192
},

icy saffronBOT
#

This doesn’t look like an old pre-support LM Studio config to me.

The shape is pretty normal for current LM Studio usage:

  • baseUrl: "http://localhost:1234/v1"
  • apiKey: "lmstudio"
  • api: "openai-responses"

So I wouldn’t blame ā€œnot re-running setupā€ as the main problem.

What does jump out:

1) contextWindow: 200000 for GLM-4.7-Flash

That looks suspicious.

GLM-4.7-Flash is generally advertised around 128k context, not 200k. If OpenClaw thinks it has 200k available, it may build much larger agent prompts than the model/runtime path actually handles well.

#

For local stacks, that often causes weird backend failures instead of a clean ā€œcontext too longā€ error.

2) reasoning: true

I’d test with this off for GLM.

Since the model works in LM Studio chat but fails on OpenClaw agent turns, reducing extra reasoning/tool pressure is a good next move.

3) No compat override yet

I would add:

compat: {
  supportsTools: false
}

for the GLM entry.

#

What I’d change first

For just zai-org/glm-4.7-flash, try this shape:

{
  id: "zai-org/glm-4.7-flash",
  name: "GLM-4.7-Flash",
  reasoning: false,
  input: ["text"],
  cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
  contextWindow: 128000,
  maxTokens: 4096,
  compat: {
#
    supportsTools: false
  }
}

And also enable:

agents.defaults.experimental.localModelLean: true

If it still blows up, I’d go even lower on contextWindow for testing, like 64000.

My read now

  • config shape: fine
hoary mirage
#

@icy saffron what is this setting and when was it enabled in OpenClaw runtime?agents.defaults.experimental.localModelLean: true

icy saffronBOT
#

Let me check the exact docs/source wording so I don’t hand-wave the behavior or the timing.

#

I just checked the docs/changelog, and I need to correct myself a bit.

agents.defaults.experimental.localModelLean: true is:

  • an opt-in experimental flag
  • for weaker local model backends
  • it shrinks the default tool surface by dropping heavyweight defaults like browser, cron, and message
  • goal: make the prompt smaller and less brittle

It was added in 2026.4.15-beta.1, changelog entry:

Agents/local models: add experimental agents.defaults.experimental.localModelLean: true to drop heavyweight default tools like browser, cron, and message...

So if you’re on 2026.4.14, you probably do not have this flag yet unless you’re actually running newer beta/main bits than that version suggests.

So the practical answer is:

#
  • What is it? A lean-mode pressure-release valve for fragile local-model setups
  • When does it take effect? Only when you explicitly set it to true
  • When was it introduced? 2026.4.15-beta.1, not 2026.4.14

Docs/source:

So for your current 2026.4.14 runtime, I would focus on the knobs that are already available, especially:

  • lowering contextWindow
  • setting reasoning: false
  • adding compat.supportsTools: false

That part still stands.

And yeah, my bad, I should’ve checked the exact version boundary sooner. If this fixed it, please right-click the solution message → Apps → āœ… Mark Solution so Answer Overflow can index it.

hoary mirage
#

No worries, @icy saffron. It's an interesting and probably quite a valuable setting. I hope it doesn't drop tool calls, but rather drops tool-call results from the context. That would be money!

I'll mess around with the settings and report back. This version of GLM 4.7 Flash supports over 200k context window but I will dial that back and see if the agent can find a happy place.

icy saffronBOT
#

Yep, worth testing.

And just to clarify, localModelLean is the blunt version, not the magic version šŸ˜…

It reduces the available default tools/tool schema surface going into the prompt. It is not a smart ā€œkeep the tool call, drop the bulky tool resultā€ mode, at least from the docs/source I checked.

So your current test plan sounds good:

  • dial back contextWindow
  • try reasoning: false
  • try compat.supportsTools: false
  • see where GLM finds a stable lane

If you report back with the combo that stops tree_reduce, that’ll be a really useful data point for future LM Studio + GLM users.

If this fixed it, please right-click the solution message → Apps → āœ… Mark Solution so Answer Overflow can index it.

hoary mirage
#

@icy saffron Quick update:

  1. The agent responds now. (Dialing back the context window and turning off reasoning in openclaw.json seems to have been the main knob to turn here.)
  2. The agent is still able to load another instance of the model to LMStudio without JIT enabled (unlikely to be an OpenClaw bug!)
  3. TBD if the agent can actually do useful work; remains to be seen.
  4. Here's the config I landed on which is at least allowing me to start new sessions and get some responses (not just errors) from the agent, for the sake of posterity:

"models": { "mode": "merge", "providers": { "lmstudio": { "baseUrl": "http://localhost:1234/v1", "apiKey": "lmstudio", "api": "openai-responses", "models": [ { "id": "zai-org/glm-4.7-flash", "name": "GLM-4.7-Flash", "reasoning": false, "input": [ "text" ], "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }, "contextWindow": 128000, "maxTokens": 4192 },

My Next Steps:

  1. See if there is a better local-model server than LMStudio for my stack. (It feels like the VW bug of local servers. Cute, gets you from point A to point B but performance is meh)
  2. Try enabling and disabling tools for this model (note that this model has performed tool calls with better-than-expected reliability for ~3 weeks now. It's honestly been the best performer of the ~20 I've tested, beating out nemotron, Qwen 3*, hermes 4)
  3. FAFO with more models.
icy saffronBOT
#

Nice, that’s a really useful result.

So the posterity summary is basically:

  • contextWindow: 128000
  • reasoning: false
  • LM Studio via openai-responses
  • and the tree_reduce failure stops happening on new sessions

That strongly points to prompt/runtime pressure, not lossless-claw.

Your note about GLM being the best tool-caller of the bunch is genuinely interesting too. In that case, I’d leave tools on unless instability comes back, and treat supportsTools: false as the emergency fallback, not the default.

And yeah, the duplicate-load behavior still sounds like LM Studio weirdness, not OpenClaw weirdness.

If you go server-shopping next, GLM’s own docs/model-card path makes vLLM / SGLang / transformers the obvious candidates to compare against LM Studio.

#

Glad we caught this one 🦐
If this fixed it, please right-click the solution message → Apps → āœ… Mark Solution so Answer Overflow can index it.

hoary mirage
#

@icy saffron can you spill the tea on which local servers / smaller models people are having the best results with?

icy saffronBOT
#

For servers, the rough tea is:

  • LM Studio: easiest to get running, nicest UX, good for experiments
  • vLLM / SGLang: what I’d look at when you want something more serious and reproducible
  • plain OpenAI-compatible local proxies: workable, but more likely to hit weird compat edges

For smaller models, I don’t want to give you a fake ā€œtop listā€ because that goes stale basically instantly, and I’ve been explicitly told not to invent model rankings from memory.

For fresh hands-on model recommendations, ask in #models. That’s the right place for ā€œwhat’s actually working well this week?ā€

Docs worth checking:

My non-spicy opinion:
if you want easy, LM Studio.

#

if you want less cute, more serious, start testing vLLM / SGLang.
if you want agent reliability, smaller models are still the sketchiest part of the stack.

hoary mirage
#

I cannot get Answer Overflow to mark anything as a solution. But for anybody reading this, the solution is a few responses back: #1494021818963464435 message