#OpenClaw - Using a locally hosted Qwen/Qwen3-30B-A3B-Instruct-2507

sour dagger

I have a local Qwen/Qwen3-30B-A3B-Instruct-2507 model hosted via vLLM. This model supports non-thinking mode only and does not generate <think></think> blocks in its output, so specifying enable_thinking=False is no longer required.

As a result, setting a thinking level has no effect on the results.

My openclaw.json configuration looks like this:

"vllm": {
  "baseUrl": "http://192.168.75.81:8000/v1",
  "api": "openai-completions",
  "apiKey": "VLLM_API_KEY",
  "models": [
    {
      "id": "Qwen/Qwen3-30B-A3B-Instruct-2507",
      "name": "Qwen/Qwen3-30B-A3B-Instruct-2507",
      "input": [
        "text"
      ],
      "cost": {
        "input": 0,
        "output": 0,
        "cacheRead": 0,
        "cacheWrite": 0
      },
      "contextWindow": 65536,
      "maxTokens": 16384
    }
  ]
}

Suppose I also have another model with reasoning ability. How do I set the thinking level to medium or high?
There is also a config option called "reasoning".

What should I include, and how do I enable the model's thinking mode?

visual cloudBOT

For OpenClaw there are two separate knobs here:

  1. thinking / /think = asks the provider/model to use reasoning effort.
  2. reasoning / /reasoning = only controls whether reasoning output is shown back to you. It does not enable model thinking by itself.

For a custom OpenAI-compatible/vLLM model, mark the model as reasoning-capable first:

{
  agents: {
    defaults: {
      model: { primary: "vllm/your-reasoning-model" },
      thinkingDefault: "medium" // or "high"
    }
  },
  models: {
    providers: {
      vllm: {
        baseUrl: "http://192.168.75.81:8000/v1",
        api: "openai-completions",
        apiKey: "${VLLM_API_KEY}",
        models: [
          {
            id: "your-reasoning-model",
            name: "Your Reasoning Model",
            reasoning: true,
            compat: {
              supportsReasoningEffort: true,
              supportedReasoningEfforts: ["low", "medium", "high"]
            },
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 65536,
            maxTokens: 16384
          }
        ]
      }
    }
  }
}

Then either set a default in config:

agents: {
  defaults: {
    thinkingDefault: "high"
  }
}

or set it per session in chat:

/think medium
/think high

If the model is Qwen-style and expects enable_thinking instead of reasoning_effort, use this compat shape instead:

reasoning: true,
compat: {
  thinkingFormat: "qwen"
}

That makes non-off /think send enable_thinking: true, and /think off send enable_thinking: false. If your vLLM setup expects it under chat_template_kwargs, use:

compat: { thinkingFormat: "qwen-chat-template" }
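
To make the difference between the three shapes concrete, here is a rough Python sketch of the extra request-body fields each one would add. It only illustrates the mapping described above and is not OpenClaw's actual implementation. The function name build_thinking_fields and the label "reasoning-effort" are hypothetical, and omitting the field entirely for /think off in the reasoning_effort case is an assumption.

```python
from typing import Any, Dict


def build_thinking_fields(thinking_format: str, level: str) -> Dict[str, Any]:
    """Return the extra request-body fields for a /think level (illustrative only).

    thinking_format: "reasoning-effort" (hypothetical label for the
    supportsReasoningEffort case), "qwen", or "qwen-chat-template".
    level: "off", "low", "medium", or "high".
    """
    enabled = level != "off"

    if thinking_format == "reasoning-effort":
        # Assumption: nothing is sent when thinking is off.
        return {"reasoning_effort": level} if enabled else {}

    if thinking_format == "qwen":
        # Top-level boolean; the level itself is not forwarded.
        return {"enable_thinking": enabled}

    if thinking_format == "qwen-chat-template":
        # Same boolean, nested under chat_template_kwargs for vLLM.
        return {"chat_template_kwargs": {"enable_thinking": enabled}}

    raise ValueError(f"unknown thinking format: {thinking_format!r}")


if __name__ == "__main__":
    for fmt in ("reasoning-effort", "qwen", "qwen-chat-template"):
        for lvl in ("off", "high"):
            print(fmt, lvl, build_thinking_fields(fmt, lvl))
```

Note that in the two Qwen shapes, medium and high produce the same request, since only an on/off flag is sent.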

For your current Qwen3-30B-A3B-Instruct-2507 non-thinking-only model, leave reasoning omitted/false. /think medium/high can’t make a non-thinking checkpoint reason if the model/template doesn’t support it.
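
If you want to confirm whether a given checkpoint actually emits thinking blocks, a quick check on the raw response text can help. This is a minimal sketch that assumes the reasoning arrives inline as <think>...</think> in the message content. If your server is set up to parse reasoning into a separate response field, this check will not see it. The helper name split_thinking is hypothetical.

```python
import re
from typing import Tuple

# Matches inline <think>...</think> blocks, including multi-line ones.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> Tuple[str, str]:
    """Split raw model output into (reasoning, answer).

    reasoning is empty when the output contains no <think> block,
    which is what a non-thinking checkpoint should produce.
    """
    reasoning = "\n".join(block.strip() for block in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer


if __name__ == "__main__":
    sample = "<think>Compare both options first.</think>Use option B."
    reasoning, answer = split_thinking(sample)
    print("reasoning:", repr(reasoning))
    print("answer:", repr(answer))
```

An empty reasoning string across several prompts, even with /think high, suggests the model or its chat template does not support thinking.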

Docs: https://docs.openclaw.ai/tools/thinking and https://docs.openclaw.ai/concepts/model-providers