#Best Configuration Practice

humble zinc
#

Give me a list of highly recommended configurations that I should incorporate for efficient session turns and runs on openclaw.

I have a locally hosted model which I am using.
So, I already have config like,

agents.defaults.compaction.reserveTokensFloor
models.providers.vllm.models.reasoning
models.providers.vllm.models.contextWindow
models.providers.vllm.models.maxTokens

Include some key configurations that I can add to my list, for example model fallbacks.

summer dewBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

steady reefBOT
#

Good list to add. Since you’re on vLLM/local, I’d treat this as the “efficient but resilient” baseline:

{
  agents: {
    defaults: {
      // Main routing: local first, cloud/other local fallback if local is busy/broken
      model: {
        primary: "vllm/your-main-model",
        fallbacks: [
          "vllm/your-smaller-fast-model",
          "openrouter/some-cloud-backup" // optional emergency fallback
        ],
      },

      // Keep model picker/overrides predictable; also lets you define aliases/params
      models: {
        "vllm/your-main-model": {
          alias: "local-main",
          params: { temperature: 0.2, maxTokens: 8192 },
        },
        "vllm/your-smaller-fast-model": {
          alias: "local-fast",
          params: { temperature: 0.1, maxTokens: 4096 },
        },
      },

      // Reduces repeated context bloat from old tool outputs
      contextPruning: {
        mode: "cache-ttl",
        ttl: "10m",
        keepLastAssistants: 3,
        minPrunableToolChars: 20000,
        softTrim: { maxChars: 4000, headChars: 1500, tailChars: 1500 },
        hardClear: { enabled: true, placeholder: "[Old tool result cleared]" },
      },

      // Good for long sessions; use a cheaper/fast model if it summarizes well
      compaction: {
        mode: "safeguard",
        reserveTokensFloor: 24000,
        timeoutSeconds: 900,
        model: "vllm/your-smaller-fast-model",
        identifierPolicy: "strict",
        notifyUser: false,
      },

      // Saves prompt tokens on normal continuation turns
      contextInjection: "continuation-skip",

      // Avoid local model/GPU overload from too many simultaneous sessions
      maxConcurrent: 1,
      timeoutSeconds: 900,

      // Background/heartbeat efficiency
      heartbeat: {
        every: "0m" // or use isolatedSession/lightContext + a cheap model
      },

      // If you use subagents, don’t let them inherit the expensive/slow model blindly
      subagents: {
        model: "vllm/your-smaller-fast-model",
        maxConcurrent: 2,
        runTimeoutSeconds: 900,
      },
    },
  },

  models: {
    mode: "merge",
    providers: {
      vllm: {
        baseUrl: "http://127.0.0.1:8000/v1",
        apiKey: "${VLLM_API_KEY}",
        api: "openai-completions",
        models: [
          {
            id: "your-main-model",
            name: "Local Main",
            reasoning: false,
            input: ["text"],
            contextWindow: 128000,
            contextTokens: 96000, // optional runtime cap below native window
            maxTokens: 8192,
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            compat: {
              requiresStringContent: true, // only if your backend needs plain string content
              // supportsTools: false, // set if tool-calling breaks on your local model
            },
          },
        ],
      },
    },
  },

  auth: {
    cooldowns: {
      overloadedProfileRotations: 0,
      rateLimitedProfileRotations: 0,
      overloadedBackoffMs: 0,
    },
  },
}

Highest-value keys for your list:

  • agents.defaults.model.fallbacks — ordered fallback chain. Put a smaller local model first, then optional cloud fallback.
  • agents.defaults.models — model allowlist + aliases + per-model params. Use --merge when editing so you don’t clobber existing models.
  • agents.defaults.contextPruning — very useful for efficient long sessions; trims old tool results before they balloon context.
  • agents.defaults.contextInjection: "continuation-skip" — skips repeated workspace bootstrap injection on safe continuation turns.
  • agents.defaults.compaction.model — use a cheaper/faster summarizer if your main local model is slow at compaction.
  • models.providers.vllm.models[].contextTokens — optional effective context cap below the model’s native contextWindow; useful if vLLM crashes near the theoretical max.
  • models.providers.vllm.models[].compat.requiresStringContent — for OpenAI-compatible local servers that dislike array content.
  • models.providers.vllm.models[].compat.supportsTools: false — only if your local model/backend fails on tool calls.
  • agents.defaults.maxConcurrent — set low for one local GPU; usually 1 is safest.
  • agents.defaults.subagents.model — route spawned subagents to a cheaper/faster model.
  • agents.defaults.heartbeat.every: "0m" — disable heartbeats unless you actively use them; otherwise set heartbeat model, lightContext, and/or isolatedSession.
  • auth.cooldowns.* — for local-only setups, reduce profile rotations; for mixed local+cloud fallback, defaults are usually fine.

Docs: https://docs.openclaw.ai/providers/vllm, https://docs.openclaw.ai/concepts/models, https://docs.openclaw.ai/concepts/model-failover, https://docs.openclaw.ai/concepts/session-pruning, https://docs.openclaw.ai/concepts/compaction

Also: run openclaw --version before copying examples from Discord threads — config surfaces move fast, and version mismatch is a common source of “this key doesn’t work” weirdness.

humble zinc
#

what is default.timeoutSecond?

steady reefBOT
#

It’s agents.defaults.timeoutSeconds — plural Seconds, not default.timeoutSecond.

It sets the default max runtime for an agent turn/run, in seconds. Example:

{
  agents: {
    defaults: {
      timeoutSeconds: 900 // 15 minutes
    }
  }
}

The config reference example currently shows timeoutSeconds: 600, i.e. 10 minutes. For local models, bumping it to 900 or 1800 can help if long tool-heavy turns or compaction runs are timing out.

Docs: https://docs.openclaw.ai/gateway/configuration-reference

humble zinc
#

agents.llm.idletimeoutSeconds 1200

steady reefBOT
#

Yes — the key is:

{
  agents: {
    defaults: {
      llm: {
        idleTimeoutSeconds: 1200
      }
    }
  }
}

Important casing: idleTimeoutSeconds — capital T and S.

What it does: it aborts a model request if no response chunks arrive for that long. This is different from agents.defaults.timeoutSeconds, which is the outer timeout for the whole agent run.

For local/vLLM models, 1200 is reasonable if the model sometimes sits silently while thinking/loading. Set 0 only if you want to disable the idle watchdog entirely, but I’d avoid that unless you have another timeout protecting runs.

Docs note: if llm.idleTimeoutSeconds is unset, OpenClaw uses agents.defaults.timeoutSeconds when configured, otherwise 120s.
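
A minimal sketch of that fallback order, assuming the config is already loaded as a plain dict. The helper name `effective_idle_timeout` is hypothetical and is not an OpenClaw function; only the key names and the 120s fallback come from the messages above.

```python
from typing import Optional

DEFAULT_IDLE_TIMEOUT_S = 120  # fallback quoted in the docs note above


def effective_idle_timeout(config: dict) -> Optional[int]:
    """Return the idle timeout in seconds, or None when the watchdog is disabled."""
    defaults = config.get("agents", {}).get("defaults", {})
    idle = defaults.get("llm", {}).get("idleTimeoutSeconds")
    if idle is None:
        # Unset: fall back to the run timeout when configured, else 120s.
        idle = defaults.get("timeoutSeconds", DEFAULT_IDLE_TIMEOUT_S)
    # 0 is treated as "idle watchdog disabled", as described above.
    return None if idle == 0 else idle


print(effective_idle_timeout({"agents": {"defaults": {"llm": {"idleTimeoutSeconds": 1200}}}}))
print(effective_idle_timeout({"agents": {"defaults": {"timeoutSeconds": 900}}}))
print(effective_idle_timeout({}))
```

The three calls cover the explicit value, the fallback to the run timeout, and the built-in default.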

humble zinc
#

If my model context length is 64K, what is the best reserveTokensFloor?

steady reefBOT
#

For a 64K context model, I’d start with:

{
  agents: {
    defaults: {
      compaction: {
        reserveTokensFloor: 20000
      }
    }
  }
}

That’s also the documented default floor, and it’s a good safe value: about 31% of a 64K window reserved for reply generation, tool output, memory flush, and compaction housekeeping.

My practical guidance:

  • 16K → more history retained, okay for mostly chat/simple tasks.
  • 20K → best default for 64K. Safer for tools and local-model token estimate wobble.
  • 24K → use if you do lots of tool calls, large file reads, browser/search output, or long final answers.
  • 32K+ → usually too high for 64K; compaction will happen early and you’ll lose useful history sooner.

So for your setup: use 20000 first, then move to 24000 only if you still hit context/overflow issues during tool-heavy runs.
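
The trade-off behind those tiers is simple arithmetic. Here is a rough sketch, assuming "64K" means 64,000 tokens (with 65,536 the percentages shift slightly) and that whatever is not reserved is available for prompt and history. `reserve_breakdown` is a hypothetical helper, not part of OpenClaw.

```python
CONTEXT_WINDOW = 64_000  # assumption: "64K" read as 64,000 tokens


def reserve_breakdown(reserve_tokens: int, context_window: int = CONTEXT_WINDOW):
    """Return (percent of window reserved, tokens left for prompt + history)."""
    percent = round(100 * reserve_tokens / context_window, 2)
    return percent, context_window - reserve_tokens


for reserve in (16_000, 20_000, 24_000, 32_000):
    percent, history_budget = reserve_breakdown(reserve)
    print(f"reserve={reserve}: {percent}% reserved, {history_budget} tokens left for history")
```

At 32,000 half the window is reserved, which is why compaction would kick in early at that setting.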

humble zinc
#

Openclaw installation on linux debian

curl : (28) Failed to connect to port. Could not connect to server

steady reefBOT
#

That curl: (28) is curl’s timeout error: the connection to the installer URL never completed from your Debian machine. The usual causes are DNS, network, proxy, firewall, or IPv6 routing, not an OpenClaw config issue yet.

First, please share:

openclaw --version 2>/dev/null || echo "openclaw not installed yet"
cat /etc/debian_version

Then try these quick checks:

curl -Iv https://openclaw.ai/install.sh
curl -4 -fsSL https://openclaw.ai/install.sh -o /tmp/openclaw-install.sh

If curl -4 works, run:

bash /tmp/openclaw-install.sh

If both fail, test whether the machine can reach GitHub/CDNs at all:

curl -Iv https://github.com
curl -Iv https://registry.npmjs.org/openclaw

Common fixes on fresh Debian/VPS installs:

sudo apt update
sudo apt install -y curl ca-certificates dnsutils

If your network blocks openclaw.ai, use the npm install path instead:

sudo apt update
sudo apt install -y nodejs npm
npm install -g openclaw@latest
openclaw onboard --install-daemon

Docs: https://docs.openclaw.ai/install

Paste the output of the curl -Iv ... command if it still fails — the important part is whether it says DNS failure, timeout, IPv6 route issue, proxy, or TLS/cert problem.

humble zinc
#

explain this config,
models.providers.vllm.models.reasoning

When I add a new model to the openclaw config, this value is automatically set to false.

Why? When should I enable it?

Do I need to enable it for Qwen/Qwen3-30B-A3B-Instruct-2507?

steady reefBOT
#

models.providers.vllm.models[].reasoning is a capability flag: “does this model/provider route support reasoning/thinking payloads that OpenClaw should treat specially?”

It defaults to false because custom/vLLM/OpenAI-compatible backends are safest that way. If OpenClaw guessed true, it could send/expect reasoning controls that many local servers don’t support, or accidentally surface <think> / reasoning_content weirdness in normal replies.

For vLLM specifically, docs say it’s treated as a proxy-style OpenAI-compatible backend, and OpenAI reasoning-compat payload shaping is not applied. So this flag is mostly metadata/capability routing, not a magic “make the model smarter” switch.

Enable it only when all of these are true:

  • the model is actually a thinking/reasoning variant,
  • your vLLM stack/chat template exposes reasoning cleanly,
  • you want OpenClaw to treat it as reasoning-capable for /think / /reasoning style behavior,
  • and you’ve tested that tool calls + streaming don’t leak messy reasoning into final answers.

For Qwen/Qwen3-30B-A3B-Instruct-2507: leave reasoning: false.

That model card says it is the non-thinking mode variant and “does not generate <think></think> blocks”; enable_thinking=False is no longer required. It can still be good at reasoning tasks, but it is not a “thinking output” model in this config sense.

Recommended entry:

{
  id: "Qwen/Qwen3-30B-A3B-Instruct-2507",
  name: "Qwen3 30B A3B Instruct 2507",
  reasoning: false,
  input: ["text"],
  contextWindow: 262144,
  // set lower if your vLLM launch/config uses a smaller max context
  maxTokens: 8192,
  cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }
}

Tiny rule of claw 🦞: if the model name says Instruct / non-thinking, keep reasoning: false; if it says Thinking, R1, Reasoning, etc., then consider true after testing.

humble zinc
#

Why am I getting an incomplete response from openclaw?

I have max_tokens set to 8192, and the model supports a 64K context length.
The session logs show the stopReason as "length".

There were no errors about exceeding the max context or anything like that.

Help me understand why the stopReason is showing length.

My agents.defaults.compaction.reserveTokensFloor is 2000.

In what situations does this appear? Also tell me why the response is not fully completed.

steady reefBOT
#

stopReason: "length" usually means the provider stopped because the generation/output token limit was hit, not because the model’s context window overflowed.

In OpenClaw source this is mapped directly:

  • OpenAI-compatible: provider finish_reason: "length" → OpenClaw stopReason: "length"
  • Anthropic: provider stop_reason: "max_tokens" → OpenClaw stopReason: "length"

So with maxTokens / max_tokens at 8192, the model can have a 64K context window and still stop early if it uses all 8192 output tokens before it reaches a natural ending. Context length and output length are different limits.

Your agents.defaults.compaction.reserveTokensFloor: 2000 is a separate thing. It controls the minimum compaction headroom OpenClaw tries to preserve when history is getting large. It does not raise the model’s allowed answer length. Also, if your effective compaction reserve is only ~2K while you ask for up to 8192 output tokens, that’s a bit mismatched: OpenClaw may keep more history, but you have less safety headroom for a long reply/tool output.

Why the response is incomplete: the model was still generating when it hit the configured/provider output cap. That is a normal terminal condition, not necessarily an error.
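
To make the "two different limits" point concrete, here is a small sketch. The two reason pairs come from the mapping quoted above; the family labels, the pass-through behavior for other reasons, and both function names are assumptions for illustration, not OpenClaw internals.

```python
# (provider API family, raw finish/stop reason) pairs that mean "output cap hit".
# The family labels are hypothetical names used only in this sketch.
LENGTH_REASONS = {
    ("openai-compatible", "length"),
    ("anthropic", "max_tokens"),
}


def normalize_stop_reason(api_family: str, raw_reason: str) -> str:
    """Collapse provider-specific 'ran out of output tokens' reasons into 'length'."""
    if (api_family, raw_reason) in LENGTH_REASONS:
        return "length"
    return raw_reason  # assumption: other reasons pass through unchanged


def fits_context(prompt_tokens: int, max_tokens: int, context_window: int) -> bool:
    """True when the prompt plus the full output budget fits in the context window."""
    return prompt_tokens + max_tokens <= context_window


# Hypothetical turn: 12,000 prompt tokens, 8,192 output cap, 64,000-token window.
print(fits_context(12_000, 8_192, 64_000))
print(normalize_stop_reason("openai-compatible", "length"))
```

The turn fits the context window comfortably, yet the reply is still cut off once the model has produced 8,192 output tokens.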

What to check:

  1. Confirm the exact OpenClaw version: openclaw --version
  2. Check the actual model entry OpenClaw is using: openclaw models status
  3. Make sure you configured OpenClaw’s maxTokens field for that model/agent, not only provider-side max_tokens somewhere else.
  4. If you need longer single replies, raise maxTokens if the provider/model allows it, and consider increasing agents.defaults.compaction.reserveTokens / reserveTokensFloor closer to your desired long-output headroom.

humble zinc
#

What is openclaw:bootstrap-context-full ?

I see this in the logs of an openclaw session. Due to this, the output of my model is incomplete.

steady reefBOT
#

openclaw:bootstrap-context:full is an internal session-history marker, not an error.

It means: “this turn successfully ran with the full workspace bootstrap context injected.” That bootstrap context is the stuff OpenClaw puts into the model prompt from workspace files like AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md, etc. The marker is written so OpenClaw can later know whether a continuation turn has already seen the full bootstrap context.

So the marker itself is not truncating your answer. But the full bootstrap context can consume a lot of prompt tokens, which leaves less practical room for the model’s answer and tool/context overhead. If the provider then reports finish_reason: "length" / stop_reason: "max_tokens", OpenClaw shows stopReason: "length", and the visible answer can be cut off.

Useful knobs:

{
  agents: {
    defaults: {
      contextInjection: "continuation-skip",
      bootstrapMaxChars: 12000,
      bootstrapTotalMaxChars: 60000,
    }
  }
}

  • contextInjection: "always" is the default: bootstrap files are injected each turn.
  • contextInjection: "continuation-skip" skips re-injecting bootstrap files on safe continuation turns after a completed assistant response, reducing prompt size.
  • bootstrapMaxChars limits each bootstrap file.
  • bootstrapTotalMaxChars limits the total bootstrap context.

I’d try contextInjection: "continuation-skip" first, then lower bootstrapTotalMaxChars if your bootstrap files are large. Also check whether you have a huge AGENTS.md/TOOLS.md/BOOTSTRAP.md; those are common culprits.
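
For a feel of how much prompt budget the bootstrap caps can consume, here is a back-of-the-envelope sketch. It assumes roughly 4 characters per token, which is a crude heuristic and not what any real tokenizer guarantees, plus a 64,000-token window and an 8,192-token output cap. Both helper names are hypothetical.

```python
CHARS_PER_TOKEN = 4  # assumption: rough heuristic for English text, not a tokenizer


def estimated_tokens(chars: int) -> int:
    """Very rough character-to-token estimate."""
    return chars // CHARS_PER_TOKEN


def remaining_for_history(context_window: int, bootstrap_chars: int, max_output_tokens: int) -> int:
    """Tokens left for history and tool output after bootstrap and the output budget."""
    return context_window - estimated_tokens(bootstrap_chars) - max_output_tokens


print(estimated_tokens(60_000))
print(remaining_for_history(64_000, 60_000, 8_192))
```

Under those assumptions a fully used bootstrapTotalMaxChars of 60000 costs on the order of 15,000 tokens per injected turn, so lowering it or skipping re-injection frees a meaningful share of a 64K window.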

Docs: https://docs.openclaw.ai/gateway/configuration-reference#agentsdefaultscontextinjection

And please share openclaw --version if this still behaves oddly — the bootstrap/compaction behavior has changed across versions.

humble zinc
#

{"type":"custom","customType":"openclaw:bootstrap-context:full","data":{"timestamp":1778765792777,"runId":"5b00933c-e2fe-46aa-a911-630691b...","sessionId":"cd268efc-0445-4782-ad53-d5ef8fb94c09"},"id":"3ee9534a","parentId":"cee46006","timestamp":"2026-05-14T13:36:32.777Z"}

So this is not the root cause of my response failing to generate?

My request to openclaw gave the following output as the response.

⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.

steady reefBOT
#

Correct — that custom entry is almost certainly not the root cause.

openclaw:bootstrap-context:full is just a bookkeeping marker in the session JSONL saying “a full bootstrap-context turn completed.” It is written after a successful full-context run, so by itself it does not stop generation.

The message you got:

⚠️ Something went wrong while processing your request...

is OpenClaw’s generic external run failure message. It means the actual agent/model run ended in an error path, but the user-facing Discord reply intentionally hid the raw internal error. The real cause should be in the gateway/session logs near that runId / sessionId.

Please check these, in this order:

openclaw --version
openclaw status --all
openclaw logs --follow

Then reproduce once and look for the real error around:

runId: 5b00933c-e2fe-46aa-a911-630691b...
sessionId: cd268efc-0445-4782-ad53-d5ef8fb94c09

Common causes behind that generic message are:

  • provider/API error hidden by the chat reply layer
  • malformed/out-of-sync session history, especially tool-use/tool-result mismatch
  • model output hit length and the final turn did not complete cleanly
  • auth/profile/provider failure
  • compaction/prompt construction error

One thing I’d also try: start a fresh session with /new. If /new fixes it, the old session history may be corrupted or out of sync. If it still fails, it’s probably provider/config/model-related rather than that bootstrap marker.

Docs for the debug commands: https://docs.openclaw.ai/start/faq

humble zinc
#

This issue is observed in all sessions. Even when I start a new session and send a request, it reproduces.

Using openclaw logs --follow,

I got the following error log. Tell me what it means:

"0":"[tools] read failed: ENOENT: no such file or directory, access '/home/qcom/.openclaw/agents/main/sessions/cd268efc-0445-4782-ad53-d5ef8fb94c09.json1.reset.2026-05-14715-04-45.2272

Why this happens?

steady reefBOT
#

That log means OpenClaw tried to read a file that does not exist:

ENOENT: no such file or directory
access '/home/qcom/.openclaw/agents/main/sessions/...jsonl.reset....'

So yes: this is a much more likely root cause than openclaw:bootstrap-context:full.

The path is a reset archive of a session transcript. When /new or /reset happens, OpenClaw can archive the old transcript as something like:

<session-id>.jsonl.reset.<timestamp>

If something later tries to read that archived file and it has already been deleted, moved, pruned, or recorded with a bad path, you get ENOENT.

Two things look suspicious in your pasted path:

  1. You pasted json1 — digit 1 — but session files should usually be jsonl — letter l.
  2. The timestamp/path looks malformed: 2026-05-14715-04-45.2272 looks like it may have lost a T/Z or got copied/truncated.

If it happens even in brand-new sessions, I’d check for a stale/bad reference somewhere rather than blaming one session’s history.

Run these read-only checks:

openclaw --version
ls -lah /home/qcom/.openclaw/agents/main/sessions | tail -50
grep -R "cd268efc-0445-4782-ad53-d5ef8fb94c09" /home/qcom/.openclaw 2>/dev/null | head -20
grep -R "json1.reset\|jsonl.reset" /home/qcom/.openclaw 2>/dev/null | head -50

What I’m looking for:

  • whether the referenced archive actually exists
  • whether some config/session index still points at a deleted reset archive
  • whether there’s a typo/corrupted path using json1 instead of jsonl

Also check if you configured aggressive cleanup:

openclaw status --all

Look for session.maintenance.resetArchiveRetention or custom session reset/maintenance settings.

If those greps show OpenClaw itself is storing a stale .reset.* path and trying to read it on every new run, that’s likely a bug/state corruption. Workaround is usually to clear the bad session binding/session metadata, but don’t delete blindly yet — paste the matching grep lines first. Docs: https://docs.openclaw.ai/start/faq

humble zinc
#

I see the log is looking for an incorrect reset file.

There is a typo in the name of the file it is looking for; the saved reset file has a different name.

steady reefBOT
#

Yep — if OpenClaw is looking for a .jsonl.reset.<timestamp> file whose name does not match the file actually saved, that points to a stale/corrupted session transcript reference or a bug in reset-archive lookup, not the bootstrap marker.

A reset archive should look like:

<sessionId>.jsonl.reset.2026-05-14T15-04-45.227Z

If the log path has a typo/malformed timestamp, OpenClaw is trying to read a path that was recorded or constructed incorrectly. That can happen when:

  • a session store entry still points to an old archived file
  • reset/archive cleanup removed or renamed the file but some metadata still references it
  • a hook/plugin such as before_reset / session-memory is trying to read the previous reset transcript
  • there’s a version-specific bug in reset archive naming/recovery
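
One quick way to compare a logged path against the expected shape is a filename check like the sketch below. The pattern is inferred only from the example archive name above (UUID session id, then .jsonl.reset., then a timestamp with dashes in the time part), so treat it as an assumption about the naming scheme rather than OpenClaw's actual rule.

```python
import re

# Assumed shape: <uuid>.jsonl.reset.YYYY-MM-DDTHH-MM-SS.mmmZ
ARCHIVE_RE = re.compile(
    r"^[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}"  # session id (UUID)
    r"\.jsonl\.reset\."                              # fixed suffix
    r"\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}\.\d{3}Z$"  # archive timestamp
)


def looks_like_reset_archive(filename: str) -> bool:
    """True when the filename matches the assumed reset-archive shape."""
    return ARCHIVE_RE.match(filename) is not None


expected = "cd268efc-0445-4782-ad53-d5ef8fb94c09.jsonl.reset.2026-05-14T15-04-45.227Z"
from_log = "cd268efc-0445-4782-ad53-d5ef8fb94c09.json1.reset.2026-05-14715-04-45.2272"

print(looks_like_reset_archive(expected))
print(looks_like_reset_archive(from_log))
```

The name pasted from the log fails on both the json1 extension and the timestamp, which is consistent with either a copy/transcription error or a badly recorded path.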

Please run these and compare the references:

openclaw --version

grep -R "cd268efc-0445-4782-ad53-d5ef8fb94c09" /home/qcom/.openclaw/agents/main 2>/dev/null | head -50

find /home/qcom/.openclaw/agents/main/sessions -maxdepth 1 -name '*cd268efc-0445-4782-ad53-d5ef8fb94c09*' -print

If grep shows the bad/misspelled reset filename inside a state file like sessions.json, while find shows the real archive has a different name, then that’s the mismatch.

For a temporary workaround, I’d avoid editing/deleting blindly. Safer options:

  1. Start a totally fresh session identity/channel if possible.
  2. If you have custom session-memory or reset hooks enabled, disable them temporarily and test.
  3. Upgrade to latest OpenClaw if you’re not current.
  4. If the bad path is in sessions.json, back up the file first before changing anything.

Example backup only:

cp /home/qcom/.openclaw/agents/main/sessions/sessions.json /home/qcom/.openclaw/agents/main/sessions/sessions.json.bak.$(date +%s)

If you paste the grep/find output plus openclaw --version, we can tell whether it’s stale session metadata, a hook, or a real OpenClaw bug that should go to GitHub: https://github.com/openclaw/openclaw/issues

humble zinc
#

Why is openclaw trying to read this file? What action triggers this read?

steady reefBOT
#

That read is triggered by session rollover/reset handling, not by the model randomly deciding to inspect that file.

From the current source, OpenClaw archives the previous session transcript as:

<sessionId>.jsonl.reset.<timestamp>

That archive path gets used when an old session is replaced. The actions that can trigger it are:

  • explicit /new
  • explicit /reset
  • a soft reset / reset directive
  • automatic stale-session rollover, e.g. daily/scheduled reset policy when the old session is considered no longer fresh
  • plugin hooks around reset/session end, especially before_reset or session_end, which may read the previous transcript so hooks can inspect what ended

So if you see it on a normal new request, the likely trigger is automatic session rollover because the existing session entry is being treated as stale, or a hook/plugin is running during reset/session-end.

The malformed filename is still suspicious though: json1 vs jsonl, and the timestamp shape you pasted earlier looked corrupted. That suggests a stale/bad session-store reference or bug, not expected behavior.

Please share these read-only outputs:

openclaw --version
openclaw status --all
grep -R "cd268efc-0445-4782-ad53-d5ef8fb94c09\|json1.reset\|jsonl.reset" /home/qcom/.openclaw/agents/main 2>/dev/null | head -80

If that bad path is stored in sessions.json or a hook/plugin config, we’ll know what is causing the read. If it’s generated by OpenClaw on the fly, that should go to GitHub as a bug: https://github.com/openclaw/openclaw/issues

humble zinc
#

gateway connection failed: GatewayClientRequestError: scope upgrade pending approval (requestId: )

steady reefBOT
#

That means the client/browser is already known, but it is now asking the Gateway for broader device scopes than were previously approved. OpenClaw blocks that reconnect until you explicitly approve the scope upgrade.

Run this on the Gateway machine:

openclaw --version
openclaw devices list

Then approve the current pending request shown there:

openclaw devices approve <requestId>

Important: use the requestId from openclaw devices list, not the empty one from that error. If the browser keeps retrying, the old pending request can be superseded and a new requestId created, so run openclaw devices list immediately before approval.

humble zinc
#

I am getting a connection error message on openclaw with output response as

⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.

This happens when I access a locally hosted vLLM model. Why does this happen? What is the fix?

steady reefBOT
#

That Discord message is OpenClaw’s generic wrapper. For vLLM the real cause is almost always one of these:

  • OpenClaw Gateway cannot reach the vLLM server
  • baseUrl is wrong or missing /v1
  • the Gateway process does not have VLLM_API_KEY in its environment
  • the configured model id does not exactly match /v1/models
  • vLLM returns a 400/500 because the model/template does not handle OpenClaw’s chat/tools payload
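
The model-id mismatch is easy to check offline once you have the JSON returned by /v1/models. A minimal sketch, assuming the standard OpenAI-compatible list shape (a "data" array of objects with an "id" field) and that the OpenClaw model ref is the provider name plus "/" plus the server-side id. Both helper names and the sample payload are hypothetical.

```python
def provider_model_id(model_ref: str, provider: str = "vllm") -> str:
    """Strip the leading '<provider>/' from a model ref, if present."""
    prefix = f"{provider}/"
    return model_ref[len(prefix):] if model_ref.startswith(prefix) else model_ref


def is_served(models_payload: dict, model_ref: str) -> bool:
    """True when the ref's server-side id appears in a /v1/models response body."""
    served = {entry["id"] for entry in models_payload.get("data", [])}
    return provider_model_id(model_ref) in served


# Hypothetical response body, shaped like an OpenAI-compatible /v1/models reply.
sample_payload = {
    "object": "list",
    "data": [{"id": "Qwen/Qwen3-30B-A3B-Instruct-2507", "object": "model"}],
}

print(is_served(sample_payload, "vllm/Qwen/Qwen3-30B-A3B-Instruct-2507"))
print(is_served(sample_payload, "vllm/qwen3-30b"))
```

The comparison is exact and case-sensitive, so a shortened or lowercased id counts as a mismatch.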

First check the actual error:

openclaw --version
openclaw models status
openclaw logs --follow

Then, from the same machine/process environment where the Gateway runs, verify vLLM directly:

    }
  },
  agents: {
    defaults: {
      model: { primary: "vllm/YOUR_MODEL_ID" }
    }
  }
}

Important fixes depending on what you find:

  • If OpenClaw is running as a service, exporting VLLM_API_KEY in your shell may not reach the Gateway. Put it in the Gateway/service env or configure apiKey explicitly.
  • If vLLM is in Docker/another machine, 127.0.0.1 from OpenClaw may point to the wrong place. Use the reachable host/IP and start vLLM with an accessible bind, e.g. --host 0.0.0.0 --port 8000.
  • If curl works but OpenClaw fails, paste the log line from openclaw logs --follow. If it mentions tool calls/chat template/400/500, the fix is usually changing the vLLM chat template/model, or using a model served with proper tool-call support.

humble zinc
#

15:40:13+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=main durationMs=19945 error="FailoverError: LLM request failed: network connection error."
15:40:13+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=session:agent:main:telegram:direct:6235770749 durationMs=19951 error="FailoverError: LLM request failed: network connection error."
15:40:13+00:00 warn model-fallback/decision {"subsystem":"model-fallback/decision"} {"event":"model_fallback_decision","tags":["error_handling","model_fallback","candidate_failed"],"runId":"0ee8c4a1-95ae-4a04-87e5-0fcd930c7dd6","sessionId":"e3c2c3d1-d45f-44c3-9b98-202bf33b0611","lane":"main","decision":"candidate_failed","requestedProvider":"vllm","requestedModel":"Qwen/Qwen3-30B-A3B-Instruct-2507","candidateProvider":"vllm","candidateModel":"Qwen/Qwen3-30B-A3B-Instruct-2507","attempt":1,"total":1,"reason":"timeout","status":408,"errorPreview":"Connection error.","errorHash":"sha256:8ec9a0b7fe5c","fallbackStepType":"fallback_step","fallbackStepFromModel":"vllm/Qwen/Qwen3-30B-A3B-Instruct-2507","fallbackStepFromFailureReason":"timeout","fallbackStepFromFailureDetail":"Connection error.","fallbackStepChainPosition":1,"fallbackStepFinalOutcome":"chain_exhausted","isPrimary":true,"requestedModelMatched":true,"fallbackConfigured":false} model fallback decision
15:40:13+00:00 error Embedded agent failed before reply: LLM request failed: network connection error.

steady reefBOT
#

This log says the actual failure is a timeout talking to vLLM, not a session/history problem:

reason: "timeout"
status: 408
errorPreview: "Connection error."
durationMs: ~19945
fallbackConfigured: false

So OpenClaw tried exactly one candidate:

vllm/Qwen/Qwen3-30B-A3B-Instruct-2507

It did not get a usable response within ~20s, then the fallback chain was exhausted because no fallback is configured.
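
For intuition, this is roughly what an ordered fallback chain does. It is a simplified sketch and not OpenClaw's failover code: `run_with_fallbacks`, `ChainExhausted`, and `fake_call` are hypothetical names, and the model refs are the placeholders from the config example earlier in the thread.

```python
from typing import Callable, List, Tuple


class ChainExhausted(Exception):
    """Raised when every candidate model in the chain has failed."""


def run_with_fallbacks(candidates: List[str], call: Callable[[str], str]) -> Tuple[str, str]:
    """Try each model in order; return (model, reply) from the first that succeeds."""
    failures = []
    for model in candidates:
        try:
            return model, call(model)
        except (ConnectionError, TimeoutError) as exc:
            failures.append(f"{model}: {exc}")
    raise ChainExhausted("; ".join(failures))


def fake_call(model: str) -> str:
    # Stand-in for a real request: the primary always fails with a connection error.
    if model == "vllm/your-main-model":
        raise ConnectionError("Connection error.")
    return "ok"


print(run_with_fallbacks(["vllm/your-main-model", "vllm/your-smaller-fast-model"], fake_call))
```

With only the primary in the list, the same call raises ChainExhausted after one attempt, which mirrors attempt 1 of 1 and chain_exhausted in the log.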

Most likely fixes:

  1. Verify vLLM is reachable from the Gateway process, not just your shell:

curl http://127.0.0.1:8000/v1/models

If vLLM is in Docker / another host, 127.0.0.1 may be wrong from OpenClaw’s point of view. Use the actual reachable host/IP in models.providers.vllm.baseUrl.

  2. Test the exact model directly:

curl http://127.0.0.1:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer vllm-local' \
  -d '{"model":"Qwen/Qwen3-30B-A3B-Instruct-2507","messages":[{"role":"user","content":"Say ok"}],"max_tokens":20}'

If this also takes >20s or fails, fix vLLM first: model loading, GPU memory, tensor parallel config, server bind/port, or chat template.

  3. Warm the model before using it from OpenClaw. Big local models often time out on the first request while weights/graphs load.

  4. Make sure OpenClaw’s vLLM config has /v1 and a non-empty key:

models: {
  providers: {
    vllm: {
      baseUrl: "http://127.0.0.1:8000/v1",
      apiKey: "${VLLM_API_KEY}",

humble zinc
#

15:47:48+00:00 warn security {"subsystem":"security"} blocked URL fetch (url-fetch) targetOrigin=http://10.235.57.50:8000 reason=Blocked hostname or private/internal/special-use IP address
15:47:48+00:00 warn provider-transport-fetch {"subsystem":"provider-transport-fetch"} [model-fetch] error provider=vllm api=openai-completions model=Qwen/Qwen3-30B-A3B-Instruct-2507 elapsedMs=8 name=SsrFBlockedError code=undefined causeName=undefined causeCode=undefined message=Blocked hostname or private/internal/special-use IP address
15:47:48+00:00 warn security {"subsystem":"security"} blocked URL fetch (url-fetch) targetOrigin=http://10.235.57.50:8000 reason=Blocked hostname or private/internal/special-use IP address
15:47:48+00:00 warn provider-transport-fetch {"subsystem":"provider-transport-fetch"} [model-fetch] error provider=vllm api=openai-completions model=Qwen/Qwen3-30B-A3B-Instruct-2507 elapsedMs=4 name=SsrFBlockedError code=undefined causeName=undefined causeCode=undefined message=Blocked hostname or private/internal/special-use IP address
15:47:49+00:00 warn security {"subsystem":"security"} blocked URL fetch (url-fetch) targetOrigin=http://10.235.57.50:8000 reason=Blocked hostname or private/internal/special-use IP address
15:47:49+00:00 warn provider-transport-fetch {"subsystem":"provider-transport-fetch"} [model-fetch] error provider=vllm api=openai-completions model=Qwen/Qwen3-30B-A3B-Instruct-2507 elapsedMs=6 name=SsrFBlockedError code=undefined causeName=undefined causeCode=undefined message=Blocked hostname or private/internal/special-use IP address
15:47:49+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"b3754bf3-95dd-43c6-bd3a-df93dc0345e6","isError":true,"error":"LLM request failed: network connection error.","failoverReason":"timeout","model":"Qwen/Qwen3-30B-A3B-Instruct-2507","provider":"vllm",

steady reefBOT
#

This log gives the exact cause:

SsrFBlockedError
Blocked hostname or private/internal/special-use IP address
targetOrigin=http://10.235.57.50:8000

OpenClaw is blocking 10.235.57.50 because it is a private/internal IP. That is intentional SSRF protection. Your vLLM server is on a private network, so you need to explicitly opt in for that model provider.
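
You can confirm that the address falls in a private range with Python's standard library. This sketch uses Python's own definition of "private" from the ipaddress module, which is only an approximation of whatever rules OpenClaw's SSRF guard applies. `is_private_target` is a hypothetical helper.

```python
import ipaddress
from urllib.parse import urlsplit


def is_private_target(url: str) -> bool:
    """True when the URL's host is an IP literal in a private/reserved range."""
    host = urlsplit(url).hostname
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # Not an IP literal (for example a DNS name); this sketch does not resolve it.
        return False


print(is_private_target("http://10.235.57.50:8000"))
print(is_private_target("http://8.8.8.8"))
```

10.235.57.50 sits inside 10.0.0.0/8, one of the RFC 1918 private ranges, so a guard that rejects private destinations by default will refuse it until the host is explicitly trusted.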

If this is your trusted vLLM host, add this under the vllm provider:

models: {
  providers: {
    vllm: {

humble zinc
#

openclaw has google/gemini-3.1-flash-lite-preview model.

What is the maxTokens of this model by default?

steady reefBOT
#

For the native OpenClaw google/gemini-3.1-flash-lite-preview catalog entry, the default is:

maxTokens: 65536
contextWindow: 1048576

I verified it from the installed model catalog. Small nuance: some proxy/provider variants can differ. For example the Vercel AI Gateway catalog entry for the same upstream model shows maxTokens: 65000, but the normal Google/OpenRouter-style catalog value is 65536.

Docs for model config/selection: https://docs.openclaw.ai/concepts/models

viral iron
#

With respect to local models, how do sessions get reset? I have a model running forever via ollama, how do I know a session is up if I’ve used all my context window? Or does it happen automatically and openclaw handles it?

steady reefBOT
#

Short version: Ollama keeping a model loaded is separate from OpenClaw session state. The model server is basically stateless between requests; OpenClaw owns the session transcript/context.

There are two different mechanisms:

  • Session reset: /new or /reset creates a fresh session id for the same chat. OpenClaw also has a default daily reset at 4:00 AM gateway-local time, and optional idle reset via session.reset.idleMinutes.
  • Context compaction: when the current session gets close to/exceeds the model context window, OpenClaw auto-compacts older turns into a summary and keeps recent turns intact. It also retries after common overflow errors, including Ollama-style context length exceeded.

How to tell what’s happening:

  • In chat, use /status to see context usage and compaction count.
  • Use /context list for what is currently being injected.
  • Use openclaw sessions --json or openclaw status to inspect stored sessions from the CLI.

Local-model gotcha: make sure your configured contextWindow matches what Ollama/model actually supports. If OpenClaw thinks the model has a bigger window than it really does, you’ll hit overflow earlier than expected. If the local model is weak at summarization, set a stronger/different compaction model with agents.defaults.compaction.model.

Docs:
https://docs.openclaw.ai/concepts/session
https://docs.openclaw.ai/concepts/compaction

viral iron
#

So do I need to worry about sessions? I have subagents set to use the local models for coding tasks. There’s a chance that the session uses a lot of tokens and eats up my context window, so I’m not sure how to handle that.