# Troubleshooting Request: OpenClaw bot setup and functionality

tight beacon · 2026-02-14T05:29:09.019Z



tight beacon Feb 14, 2026, 5:29 AM

- Bot consistently fails to understand tasks, its own architecture, and agents.
- Bot lacks agency to propose or implement solutions; often explains how user could do things instead of acting autonomously.
- Agents return nothing or perform incorrect actions; bot reports success when tasks were not completed.
- Token usage is high/wasted despite attempted token-reduction policies.
- Bot overcomplicates simple tasks (e.g., building a dashboard) and fails to deliver functioning results.
- Repeated setup attempts (four different setups).
- Models tested:
  - Opus 4.6 (used for setup/initiation)
  - Kimi 2.5
  - Google Gemini Pro
  - Google Gemini Flash
  - Grok
- Multiple agents created for different purposes; agents either return nothing or perform wrong actions.
- One agent was given access to documentation to self-diagnose/fix but did not improve.
- Bot does not understand what it should or should not do.
- Bot claims tasks are completed when they are not.
- When errors are pointed out, bot acknowledges but then makes further mistakes.
- Bot fails to optimize for token usage despite policies provided.
- Identity/instruction variations tested (different personas and initial instruction methods) with no meaningful improvement.
- Specific examples/issues:
  - Dashboard task: intended to simply display files as a test; bot attempted to build an overly complex server calling its own API, could not make it work, and persisted with the complexity despite direction to simplify.
- Background: I am experienced with AI tools (Codex, Claude Code, various agents).
- What I'm looking for:
  - Best practices for first setup of an OpenClaw bot / a troubleshooting guide.
  - Guides or first-use documentation for making the bot smarter or more autonomous without relying solely on Opus 4.6.
  - How to configure agents, policies, identities, and workflows so the bot reliably understands tasks and executes them with minimal token waste.

golden bough [BOT] Feb 14, 2026, 5:31 AM

What you’re describing usually isn’t “one bug”. It’s a combination of (a) a toolchain that isn’t actually working end-to-end, (b) unreliable model tool calling, and (c) missing “definition of done” guardrails, so the agent can’t prove it did the thing.

Here’s the fastest way to get from “it’s useless” → “it’s predictable”.

1) First: verify the plumbing (otherwise everything feels hallucinated)

On the gateway host, run these and paste the output (especially status --all):

openclaw --version
openclaw status --all
openclaw models status
openclaw doctor
openclaw channels status --probe

openclaw logs --follow

Docs:

- Troubleshooting runbook (this exact command ladder): https://docs.openclaw.ai/gateway/troubleshooting
- FAQ “first 60 seconds”: https://docs.openclaw.ai/help/faq
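
If you'd rather capture the whole ladder in one paste, a small wrapper can run the non-streaming commands and collect their output. This is only a sketch: the command list is copied from above, `run_command` and `build_report` are names made up for illustration, and `openclaw logs --follow` is left out because it streams until interrupted.

```python
import subprocess
from typing import Callable, Sequence

# Commands copied from the ladder above. `openclaw logs --follow` is omitted
# on purpose: it keeps streaming and would never return.
LADDER = [
    ["openclaw", "--version"],
    ["openclaw", "status", "--all"],
    ["openclaw", "models", "status"],
    ["openclaw", "doctor"],
    ["openclaw", "channels", "status", "--probe"],
]


def run_command(argv: Sequence[str]) -> tuple[int, str]:
    """Run one command and return (exit code, combined stdout and stderr)."""
    try:
        proc = subprocess.run(list(argv), capture_output=True, text=True, timeout=120)
    except FileNotFoundError:
        return 127, f"{argv[0]}: command not found"
    except subprocess.TimeoutExpired:
        return 124, "timed out after 120s"
    return proc.returncode, (proc.stdout + proc.stderr).strip()


def build_report(commands=LADDER, runner: Callable = run_command) -> str:
    """Run every command in order and format the results as one pasteable text."""
    sections = []
    for argv in commands:
        code, output = runner(argv)
        sections.append(f"$ {' '.join(argv)}\n(exit {code})\n{output}")
    return "\n\n".join(sections)
```

Calling `print(build_report())` on the gateway host produces one block of text to paste here. The `runner` parameter exists only so the formatting can be tried without the CLI installed.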

If the bot “reports success” but didn’t do anything, the logs usually show one of:

- tool call rejected/failed
- model failed over / “all models failed”
- exec blocked by approvals/allowlist
- routing/policy dropped the message
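
To see which of those four is dominating a saved log excerpt, you could bucket lines by keyword. A rough sketch follows; the patterns are guesses based on the descriptions above, not the actual OpenClaw log wording, so adjust them after looking at real output. `FAILURE_PATTERNS`, `classify`, and `summarize` are illustrative names.

```python
import re

# Hypothetical keyword patterns, one per failure class listed above.
# The real log messages may be worded differently.
FAILURE_PATTERNS = {
    "tool_call_failed": re.compile(r"tool.*(rejected|failed)", re.I),
    "model_failover": re.compile(r"all models failed|failed over|failover", re.I),
    "exec_blocked": re.compile(r"exec.*(blocked|denied)", re.I),
    "message_dropped": re.compile(r"(routing|policy).*dropped", re.I),
}


def classify(line: str) -> list[str]:
    """Return the names of every failure class whose pattern matches the line."""
    return [name for name, pattern in FAILURE_PATTERNS.items() if pattern.search(line)]


def summarize(lines) -> dict[str, int]:
    """Count how many lines fall into each failure class."""
    counts = {name: 0 for name in FAILURE_PATTERNS}
    for line in lines:
        for name in classify(line):
            counts[name] += 1
    return counts
```

Feeding it the lines around a failed task (for example, `summarize(open("excerpt.log"))` with a file you saved yourself) gives a quick count per class, which tells you which section below to read first.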

2) Reduce to a single-agent baseline (then add complexity back)

When people try 4 setups + multiple agents + multiple models, it’s easy to end up in a state where:

- the agent you’re talking to isn’t the one you think
- the model you selected isn’t allowed (which can look like the bot silently not replying)
- a fallback model that’s bad at tool calling is actually running the show

Baseline test:

- Use one agent
- Use one model (no fancy fallbacks initially)
- Test in DM, not a busy group/thread
- Don’t load huge docs into the system prompt

You can inspect what’s going on live with:

- /status
- /model status
- /context detail (shows what’s eating context)
- /usage tokens or /usage cost (to see burn)

Slash commands reference: https://docs.openclaw.ai/tools/slash-commands

3) Treat “autonomy” as: tools + permissions + approvals (not vibes)

A super common mismatch: you expect it to act, but it’s not actually allowed to.

If you want it to build dashboards, touch files, or run commands, you need the exec/tool guardrails set sanely. Otherwise it will “try”, get blocked, and weaker models will paper over the failure.

Read this (it matters): https://docs.openclaw.ai/tools/exec-approvals

Key concept:

Even if the model is willing, exec approvals + allowlists decide whether anything runs on your real machine/node.

If you see approvals getting in the way, you can also forward approval prompts to chat and approve with /approve … (covered in that doc).

4) Stop “claimed done” with one rule: no proof = not done

Add an explicit workflow constraint to your agent instructions (AGENTS / system prompt style):

“Never claim a task is complete unless you provide verifiable artifacts.”

Examples of “proof”:

- file paths + file contents (or a diff)
- command output
- a URL that actually loads + screenshot
- a /context detail snippet showing the created files are in-workspace

This single change eliminates 80% of the “it said it worked” frustration.
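
If you also want to check the proof yourself rather than take the bot's word for it, the rule can be made mechanical. A minimal sketch, assuming the agent reports the files it created relative to a workspace directory; `CompletionClaim` and `missing_proof` are hypothetical names, not part of OpenClaw.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class CompletionClaim:
    """What the agent says it did, plus the artifacts it offers as proof."""
    summary: str
    files: list[str] = field(default_factory=list)  # paths relative to the workspace
    command_output: str = ""


def missing_proof(claim: CompletionClaim, workspace: Path) -> list[str]:
    """Return a list of problems; an empty list means the claim has checkable proof."""
    problems = []
    if not claim.files and not claim.command_output.strip():
        problems.append("no artifacts listed")
    for rel in claim.files:
        path = workspace / rel
        if not path.is_file():
            problems.append(f"missing file: {rel}")
        elif path.stat().st_size == 0:
            problems.append(f"empty file: {rel}")
    return problems
```

Anything returned by `missing_proof` is a reason to treat the task as not done. It only checks that files exist and are non-empty; whether the contents are right still needs a look at the file or a diff.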

5) Token waste: measure it, then fix the real causes

Most token burn comes from:

- huge context (docs pasted into the prompt, giant system prompts, long chat history)
- retries/failover due to tool errors
- “overplanning” because the agent isn’t constrained to a minimal solution

Practical knobs:

- Use /context detail to find prompt bloat (then remove it)
- Use /compact when the session gets big: https://docs.openclaw.ai/concepts/compaction
- Use multi-agent routing so “cheap chat” and “tool work” aren’t the same agent/model: https://docs.openclaw.ai/concepts/multi-agent
- Turn on /usage tokens while iterating so you can see whether changes helped
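
Before loading files into an agent's instructions, it can help to estimate offline which piece will dominate the context. The sketch below ranks text chunks by a rough token estimate. The 4-characters-per-token ratio is a common rule of thumb, not how OpenClaw or any specific model counts tokens, and `estimate_tokens` and `rank_context` are illustrative names.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; real tokenizers will differ."""
    return round(len(text) / chars_per_token)


def rank_context(parts: dict[str, str]) -> list[tuple[str, int, float]]:
    """Return (name, estimated tokens, percent of total), largest first."""
    sizes = {name: estimate_tokens(text) for name, text in parts.items()}
    total = sum(sizes.values()) or 1  # avoid dividing by zero on empty input
    ranked = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    return [(name, tokens, round(100 * tokens / total, 1)) for name, tokens in ranked]
```

If one entry (typically pasted docs) accounts for most of the total, that is the first thing to cut or move out of the prompt. Treat the numbers as relative sizes only and confirm the real figures with /context detail and /usage tokens.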

6) About models (important, but easy to mis-diagnose)

Some models are fine for chatting but unreliable at strict tool calling (schema adherence, function arguments, etc.). That shows up exactly as:

- tools not invoked when they should be
- tools invoked with wrong args
- “success” narratives without real side effects
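
To make “wrong args” concrete, here is a sketch of the kind of check a strict tool layer performs before running anything. The `write_file` schema is invented for the example and is not an OpenClaw tool definition; `validate_args` is likewise hypothetical.

```python
from typing import Any

# Hypothetical schema for an imaginary "write_file" tool: argument name -> type.
WRITE_FILE_SCHEMA = {"path": str, "content": str}


def validate_args(schema: dict[str, type], args: dict[str, Any]) -> list[str]:
    """Return schema violations for a tool call; an empty list means the call is valid."""
    errors = []
    for name, expected in schema.items():
        if name not in args:
            errors.append(f"missing argument: {name}")
        elif not isinstance(args[name], expected):
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(args[name]).__name__}"
            )
    for name in args:
        if name not in schema:
            errors.append(f"unexpected argument: {name}")
    return errors
```

A model that is good at tool calling produces `{"path": "a.txt", "content": "hi"}`. A weaker one might produce `{"filename": "a.txt", "content": 5}`, which fails on three counts here, and the call is rejected even though the model's reply may still read as if the file was written.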

So: don’t try to make one model do everything. Use:

- one “reliable tool-use” model for the agent that touches tools
- optionally a cheaper model for lightweight chat (separate agent)

For up-to-date community guidance on which models are best right now, the server’s #models channel is the best source. The model docs explain how selection and fallbacks work: https://docs.openclaw.ai/concepts/models

If you paste one thing, paste this

openclaw status --all + a short log excerpt around a failed task (from openclaw logs --follow).