#Running OpenClaw locally with LM Studio + Qwen3-Coder-30B — agent outputs endless wall of text even

1 messages · Page 1 of 1 (latest)

compact coral
#

Hi — I'm trying to run OpenClaw fully local using LM Studio as the model provider.

Setup:

OS: Windows 10

GPU: NVIDIA 5090 - 32GB VRAM + 64 GB system RAM

Model: qwen3-coder-30b-a3b-instruct

LM Studio server running on: http://127.0.0.1:1234

Context length in LM Studio: 16k

OpenClaw version: 2026.2.9

Config highlights

Provider: lmstudio

Endpoint: http://127.0.0.1:1234/v1

Model ID returned by /v1/models:
qwen3-coder-30b-a3b-instruct

Behavior:

OpenClaw connects successfully and the model clearly runs, but the agent output is unusable.

Even simple prompts like: "OK" or similar produce very long multi-paragraph outputs that include things like:
-timestamps
-message IDs
-internal analysis
-repeated session status text
-The output keeps generating until I manually press Stop.

This happens even after:

-restarting the gateway
-creating a new session
-editing bootstrap.md to force minimal replies
-limiting max tokens in LM Studio
-temperature = 0

Question
-Is this a known issue when using coder/reasoning models (like Qwen3-Coder) with OpenClaw?
-Do agents require a chat-tuned model (e.g., Qwen-Instruct, Llama-Instruct, etc.) to behave correctly?
-Or is there a configuration step I'm missing when using LM Studio as the provider?

Any guidance appreciated.

torn elmBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

subtle pondBOT
#

What you’re seeing is usually not “OpenClaw is broken”, it’s “the local model + API mode + chat template combo isn’t behaving like a chat assistant”, so it starts regurgitating the prompt / internal state and never finds a natural stop.

A few key points:

1) Yes — this is common with coder/reasoning-ish models on local OpenAI-compatible servers

OpenClaw’s prompts contain a lot of structured context (system instructions, tool schemas, metadata). Strong chat models ignore that and just answer. Weaker / coder-optimized models often:

  • echo parts of the prompt (timestamps / ids / status-y text)
  • spill “analysis”
  • don’t emit a clean stop, so streaming keeps going until you hit Stop

So: agents don’t strictly require “chat tuned”, but in practice they need a model that reliably follows chat roles + stopping behavior (and ideally tool calling).

2) The biggest config “gotcha”: don’t use the Completions-style API mode for LM Studio

If your LM Studio provider is running as api: "openai-completions", OpenClaw has to flatten chat into a big prompt string — and that’s exactly when models start continuing/echoing that transcript.

Docs now recommend LM Studio with Responses API instead:
https://docs.openclaw.ai/gateway/local-models

#

Minimal shape (adapt to your model/limits):

{
  agents: { defaults: { model: { primary: "lmstudio/qwen3-coder-30b-a3b-instruct" } } },
  models: {
    mode: "merge",
    providers: {
      lmstudio: {
        baseUrl: "http://127.0.0.1:1234/v1",
        apiKey: "lmstudio",
        api: "openai-responses",
        models: [
          {
            id: "qwen3-coder-30b-a3b-instruct",
            name: "Qwen3 Coder (local)",
#
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 16000,
            maxTokens: 2048
          }
        ]
      }
    }
  }
}

If openai-responses 404s, you’re likely on an older LM Studio server build — update LM Studio and retry.

3) LM Studio-side: confirm the chat template is correct for Qwen

#

Even with the right API mode, if LM Studio is using a “raw”/wrong chat template for Qwen, you’ll get exactly the “it prints internal crap forever” behavior. Make sure the server is using the model’s intended chat/instruct template (not plain completion).


If you paste these two outputs, we can get very precise about what OpenClaw thinks it’s doing:

  • openclaw models status
  • the models.providers.lmstudio block from your config (redact nothing except secrets — and for LM Studio the key is usually just a dummy anyway)

That’ll show whether you’re currently on openai-completions vs openai-responses, which is the usual culprit for this exact symptom.

compact coral
#

The openclaw model status:
PS C:\Windows\system32> openclaw models status

🦞 OpenClaw 2026.2.9 (33c75cb) — Less middlemen, more messages.

Config : ~.openclaw\openclaw.json
Agent dir : ~.openclaw\agents\main\agent
Default : lmstudio/qwen3-coder-30b-a3b-instruct
Fallbacks (0) : -
Image model : -
Image fallbacks (0): -
Aliases (0) : -
Configured models (0): all

Auth overview
Auth store : ~.openclaw\agents\main\agent\auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (0): -

  • lmstudio effective=models.json:lmstudio | models.json=lmstudio | source=models.json: ~.openclaw\agents\main\agent\models.json
  • openai effective=env:Local | env=Local | source=env: OPENAI_API_KEY

OAuth/token status

  • none
subtle pondBOT
#

That models status output is actually a big clue:

  • Your lmstudio provider config is coming from ~\.openclaw\agents\main\agent\models.json (not directly from ~\.openclaw\openclaw.json).
  • You have no explicit model definition for LM Studio (Configured models (0)), so OpenClaw will assume generic defaults (including a way too large contextWindow / maxTokens for many local setups).

With a 16k-context local model, OpenClaw’s system/tool prompt can already be chunky — if the server truncates/warps the beginning of the prompt, the model often starts doing exactly what you described (echoing metadata, ids, “status-y” text, and never cleanly stopping).

What I’d do next (most likely fix)

  1. Open this file and paste just the lmstudio section here (redact nothing except any real secrets):
    ~\.openclaw\agents\main\agent\models.json

I’m specifically looking for what api mode you’re using. If it’s openai-completions, that commonly causes the “transcript regurgitation” vibe on local models.

  1. In that same provider block, make sure you have:
  • api: "openai-responses" (recommended for LM Studio per docs)
  • an explicit model entry that matches your real limits, e.g. contextWindow: 16000 and a sane maxTokens for testing
#

Docs (this is the official “LM Studio local” guidance):
https://docs.openclaw.ai/gateway/local-models

One more important note

OpenClaw usually expects huge context for “agent mode” prompts. The FAQ is pretty blunt that local models are often not great for this unless they’re big-context + strong instruction-followers:
https://docs.openclaw.ai/help/faq#is-a-local-model-ok-for-casual-chats

So even with perfect config, a 16k local model can be a rough experience.

If you paste the lmstudio block from models.json, I’ll tell you exactly what to change (and whether you’re currently on completions vs responses).

compact coral
#

here's the json:

{
"providers": {
"openai": {
"baseUrl": "http://127.0.0.1:1234/v1",
"apiKey": "local",
"api": "openai-completions",
"models": [
{
"id": "qwen3-coder-30b-a3b-instruct",
"name": "Qwen3 Coder 30B A3B Instruct (LM Studio)",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 200000,
"maxTokens": 8192
},
{
"id": "text-embedding-nomic-embed-text-v1.5",
"name": "nomic-embed-text v1.5 (LM Studio)",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 200000,
"maxTokens": 8192
}
]
},
"lmstudio": {
"baseUrl": "http://127.0.0.1:1234/v1",
"apiKey": "lmstudio",
"api": "openai-responses",
"models": [
{
"id": "qwen3-coder-30b-a3b-instruct",
"name": "Qwen3 Coder 30B A3B Instruct (LM Studio)",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 16000,
"maxTokens": 2048
}
]
}
}
}