#Agents: Fuzzy logic at compile vs runtime

coral widget
#

Convince me I’m wrong that the decisions around agents are essentially: Apply fuzzy logic at compile time (Agentic Coding) vs runtime (Autonomous Agents with some steering behavior)

queen lilyBOT
#

Your framing is catchy, but it collapses too many independent design choices into one axis, and it mislabels where the “fuzziness” actually lives.

1) “Compile-time vs runtime fuzzy logic” is the wrong cut

In both “agentic coding” and “autonomous agents”, the fuzzy part (LLM uncertainty) is applied at runtime: you’re sampling a policy conditioned on context. The difference is when you commit side effects and how the loop is governed, not when fuzziness is evaluated.

What’s closer to “compile-time” is design-time policy: prompts, tool schemas, permissions, evaluation harnesses, stop conditions, routing, budget limits. That’s not fuzzy logic; it’s constraints and scaffolding.

2) The core distinction is “suggestions” vs “actions under uncertainty”

  • Agentic coding: the model proposes artifacts (code, diffs, refactors). A separate governor (human review + tests + CI) decides what becomes real.
  • Autonomous agents: the model (plus guardrails) decides and executes actions in an environment (files, network, payments, messages, devices). The governor has to exist at runtime because side effects happen continuously.

So the key question is: who is the runtime controller? Human/CI pipeline vs the agent+monitors.

3) There are multiple axes you’re bundling into one

A few that matter more than “compile/runtime”:

A. Control loop structure

  • One-shot generation → evaluate → apply (agentic coding)
  • Sense → plan → act → observe → replan (autonomous)

B. Time horizon & state

  • Stateless or short-lived context (typical coding assist)
  • Long-lived state: memory, tasks, world model, queues, schedules

C. Safety model

  • Post-hoc verification is viable (tests catch regressions)
  • Pre-action authorization/containment is required (because you can’t “unit test” sending an email to the wrong person)

D. Observability & rollback

  • Easy rollback (git revert)
  • Often irreversible or socially costly actions (messaging, purchases, account changes)

4) A better “convince me” reframe

Instead of “fuzzy logic at compile vs runtime”, I’d frame it as:

Offline synthesis with strong ex-post verification (code)
vs
Online control with continuous risk management (agents)

Same underlying fuzziness; different commit points, feedback loops, and governance mechanisms.

If you want, I can try to rescue your original analogy by mapping “compile-time” to “policy compilation” (permissions + tool contracts + eval gates), but the moment the model is choosing among actions, the uncertainty is inherently runtime.

coral widget
#

A lot of the output of an agent is just deterministic reusable software programs though.

queen lilyBOT
#

Totally — but that actually strengthens the “this isn’t just compile-time vs runtime fuzziness” point.

You’re mixing up (a) the determinism of the artifact with (b) the determinism of the controller that produces/chooses/updates that artifact.

Deterministic program ≠ deterministic agent system

An agent can output a perfectly deterministic, reusable program (a script, a library, a Terraform module). But:

  • The search over program space (what to write, which approach, which dependencies, which assumptions) is still a stochastic / heuristic process.
  • The decision to stop (“good enough”), what to test, and how to respond to failures is still a runtime control problem unless you fully externalize it into a fixed pipeline.
  • The program’s behavior is often only “deterministic given the world state” — and the world state (APIs, credentials, filesystem, network, rate limits, schema drift) is exactly the part you don’t control.

So yes: the output can be deterministic, but the system is not automatically “compile-time fuzziness only”.

What’s really happening: “policy compilation” (but it’s partial)

A better version of your claim is:

Agents sometimes compile uncertain reasoning into deterministic subroutines.

That’s real. It’s basically caching/distillation: once you’ve paid the fuzzy reasoning cost once, you ship a deterministic tool.

But it’s only complete if the task is:

  • well-specified,
  • stable across environments,
  • and has cheap, reliable verification.

A lot of “autonomous agent” value shows up when those aren’t true — when you need online adaptation.

The practical architecture is usually hybrid

Most useful agents end up as:

  • deterministic building blocks (programs, DAGs, playbooks, regexes, SQL, scripts),
  • orchestrated by a runtime controller that decides when to run what, with what inputs, and what to do when reality disagrees.

So the axis isn’t “agentic coding = compile-time fuzz; autonomous = runtime fuzz”.
It’s more like: how much of the controller can you freeze into a pipeline vs how much must remain an online decision-maker.

If you give me a concrete example domain (e.g., “generate a scraper”, “triage customer emails”, “run incident response”), I can pin down where the “compilable” boundary actually is and why.

coral widget
#

I guess a popular coding agent I’d use for agentic coding is like a well-polished OpenClaw agent for coding.

The benefit of OpenClaw is that I guess I can create semi-autonomous multi-agent systems that run longer than these coding agents that I have to pay for … but idk. Why is it hard to reason about this?

queen lilyBOT
#

A “polished coding agent” and an OpenClaw coding agent can look similar on the surface (both write diffs, run tests, iterate). The reason it’s hard to reason about is that you’re not comparing two agents — you’re comparing a product vs a platform, and autonomy isn’t a single knob.

What you “pay for” in polished coding agents

Those tools bake in a bunch of engineering you don’t see:

  • A tight problem framing: “edit this repo until tests pass / build succeeds”
  • Curated context: IDE index, symbol search, file selection heuristics, caching
  • Guardrails by default: constrained tool surface, safe diff application, predictable loops
  • Good UX and observability: you can see what it’s doing, undo it, and keep it on-rails

So the agent might not be fundamentally smarter — the surrounding system is.

What OpenClaw buys you (that those don’t)

OpenClaw is valuable when the work is ongoing and crosses boundaries:

  • long-lived sessions (days/weeks), background operation, schedules/heartbeats
  • multi-channel I/O (Discord/Telegram/etc), notifications, “check back later”
  • multiple agents with different permissions/tools (“researcher”, “coder”, “ops”, “triage”)
  • you choose the governance model: approval gates, tool allowlists, budgets, etc.

It’s closer to building your own “agent ops” than buying a single coding feature.

Why it’s genuinely hard to reason about

Because the difficulty isn’t “compile-time fuzz vs runtime fuzz”. It’s these messy axes:

  1. Where is the boundary of the system?
    IDE agent lives in a narrow sandbox (repo + terminal). OpenClaw can touch the world (messages, devices, services). That changes everything.

  2. What counts as “done”?
    In coding, “tests pass” is a crisp objective. In autonomous work, success metrics are vague (“keep my inbox under control”, “watch for outages”, “nudge me when needed”).

  3. Long-horizon + open world = control theory, not codegen
    The hard part becomes retries, idempotency, partial failures, changing APIs, stale assumptions, coordination — basically distributed systems with a stochastic planner.

  4. Autonomy is continuous, not binary
    You can run OpenClaw like a conservative coding agent (human approval every step), or like a daemon. The “right” setting depends on risk tolerance and the domain.

A practical rule of thumb

  • If the task can be reduced to a deterministic artifact + cheap verification (tests/CI/lint): a polished coding agent is unbeatable.
  • If the task is ongoing, cross-system, and needs monitoring + escalation: OpenClaw starts to make sense.

If you tell me one concrete workflow you’d want (e.g., “maintain a repo”, “triage issues”, “watch logs + alert me”, “ship weekly changelog”), I can map it to: single coding agent vs OpenClaw + what autonomy level actually helps.

coral widget
#

Yeah but openclaw agents result in tool execution .. like, specific software programs that run.. to do these “long running ops” so even if you disguise them as “long running boundary cross ops” machines, they’re still doing what we’re doing when we just use Agentic coding to produce a specific program.

Available Agentic coding tools are just more conveniently already set up for human oversight and approval

queen lilyBOT
#

You’re not wrong that, at the bottom, “agents” = run programs / call tools. The part I think you’re missing is where the branching logic lives and what you’re buying when you choose one stack vs the other.

1) “It’s still just programs” — yes, but which program?

There are two very different “programs” here:

  1. A compiled, deterministic program you generated once (via agentic coding):

    • Logic is in code.
    • It runs the same way every time (modulo environment).
    • If reality deviates, it fails unless you explicitly encoded the contingency.
  2. A runtime controller (agent) that chooses which tools/programs to run next based on observations:

    • The “program” is effectively the policy loop: observe → decide → act → observe.
    • You don’t pre-encode every branch; you lean on the model to handle the long tail.
    • That’s exactly what makes it powerful and why it needs runtime guardrails.

So OpenClaw isn’t “disguising” long-running ops — it’s offering a generic runtime controller that can orchestrate lots of deterministic subprograms without you having to write a giant decision tree and UI around it.
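To make the second kind of "program" concrete, here is a minimal Python sketch of an observe → decide → act loop. Everything in it is hypothetical: `decide` is a stand-in for the model call, and the tool names and the `ALLOWED` set are made up for illustration. It is not OpenClaw's actual API.

```python
from typing import Callable

# Deterministic subprograms the controller can call (hypothetical tools).
def check_disk(state: dict) -> dict:
    return {**state, "disk_checked": True}

def rotate_logs(state: dict) -> dict:
    return {**state, "disk_full": False}

TOOLS: dict[str, Callable[[dict], dict]] = {
    "check_disk": check_disk,
    "rotate_logs": rotate_logs,
}
ALLOWED = {"check_disk", "rotate_logs"}  # runtime guardrail: tool allowlist

def decide(state: dict) -> str:
    """Stand-in for the model: picks the next tool from what it has observed."""
    if not state.get("disk_checked"):
        return "check_disk"
    if state.get("disk_full"):
        return "rotate_logs"
    return "stop"

def run_controller(state: dict, max_steps: int = 5) -> tuple[dict, list[str]]:
    trace: list[str] = []
    for _ in range(max_steps):        # budget cap on the loop
        action = decide(state)        # decide
        if action == "stop":
            break
        if action not in ALLOWED:     # guardrail check before any side effect
            trace.append(f"blocked:{action}")
            break
        state = TOOLS[action](state)  # act, then observe the new state
        trace.append(action)
    return state, trace

final_state, trace = run_controller({"disk_full": True})
print(trace)  # ['check_disk', 'rotate_logs']
```

The tools are ordinary deterministic functions; the only part that would be "fuzzy" in a real system is `decide`, which is why the allowlist and step cap sit around it rather than inside it.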

2) The oversight point is the real differentiator (and you said it)

“Available agentic coding tools are already set up for human oversight and approval.”

Yes. That’s a huge deal. Most of the value of polished coding agents is that the governance/UX is already productized:

  • scoped environment (repo + tests)
  • diff-based approvals
  • nice visibility into what changed

OpenClaw gives you primitives to build governance (permissions, tool allowlists, channels, “ask the human”, etc.), but it’s less opinionated. That makes it flexible and also harder to reason about.

3) The honest trade

  • If your “long running ops” can be captured as a clean daemon/script + clear invariants + easy monitoring: compile it (agentic coding is great at getting you there).
  • If the work lives in messy gray areas (ambiguous inputs, evolving APIs, human comms, exception handling, escalation): a runtime controller saves you from writing and maintaining a lot of glue/decision logic.

4) The punchline

Your framing becomes accurate if you assume the world is stable enough that you can always “compile” the uncertainty away into a script. In practice, the pain is the long tail + changing state + human coordination — and that’s the part that tends to stay “interpreted” at runtime.

If you want, propose one concrete “long-running boundary-crossing” job you have in mind, and I’ll tell you whether I’d compile it into a deterministic program, keep it runtime-agentic, or do a hybrid (usually the right answer).

coral widget
#

Elaborate on what “long tail” means

queen lilyBOT
#

“Long tail” is just the shape of the real-world case distribution:

  • There’s a head: a small number of common, repeatable situations you hit all the time.
  • And then there’s a long tail: a huge number of rare, weird, one-off situations you hit occasionally.

Individually the tail cases are low-frequency, but collectively they dominate the complexity, maintenance cost, and “why did this break at 2am” pain.

What “long tail” looks like in agent/ops work

A deterministic program/script tends to handle the head nicely:

  • “Fetch data → transform → post result”
  • “Run tests → apply patch → open PR”

The long tail is everything that doesn’t fit that happy path, e.g.:

  • Authentication weirdness: token expired, 2FA prompt, scope change, device pairing required
  • API drift: fields renamed, pagination changes, undocumented rate limits
  • Partial failure: one service down, flaky network, timeouts, retries causing duplicates
  • Ambiguity: user request missing context (“deploy it” — which env? which service? which version?)
  • State surprises: someone already applied a change manually, data is inconsistent, locks are held
  • Policy/legal/UX constraints: “don’t message users outside business hours”, “don’t post sensitive data”
  • Human coordination: “ask for approval if impact > X”, “escalate to me if confidence < Y”

Why the long tail matters for “compile vs runtime”

You can compile the head into a deterministic program pretty easily.

What becomes hard is either:

  1. Encoding all tail contingencies explicitly (massive decision tree + constant upkeep), or
  2. Escalating to a human with good context, or
  3. Using a runtime policy (an agent) that can interpret novel situations and decide what to do next.

That’s the practical reason “agent runtimes” exist: they’re often a cheaper way to handle the tail than writing and maintaining endless bespoke logic + UI + runbooks.
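A rough Python sketch of the head/tail split described above: failures you have seen often enough get an explicit handler, and anything else falls through to an escalation path (a human or a runtime agent). The exception classes, handler strings, and `run_step` helper are all hypothetical names for illustration; the handler only returns a description and does not really refresh anything.

```python
from typing import Callable

class TokenExpired(Exception):
    """A 'head' failure: common enough that someone coded a fix for it."""

class SchemaDrift(Exception):
    """A 'tail' failure: nobody wrote a branch for this one."""

def run_step(step: Callable[[], str],
             known_handlers: dict[type, Callable[[Exception], str]],
             escalate: Callable[[Exception], str]) -> str:
    try:
        return step()
    except Exception as exc:
        handler = known_handlers.get(type(exc))
        if handler is not None:
            return handler(exc)   # explicitly encoded contingency
        return escalate(exc)      # long tail: hand off with context

def expired_step() -> str:
    raise TokenExpired("token expired mid-flow")

def drifted_step() -> str:
    raise SchemaDrift("field 'user_id' renamed")

# Stub handler: a real one would refresh the token and retry the step.
handlers = {TokenExpired: lambda exc: "refreshed token and retried"}
escalate = lambda exc: f"escalated: {type(exc).__name__}: {exc}"

print(run_step(expired_step, handlers, escalate))
print(run_step(drifted_step, handlers, escalate))
```

The upkeep cost in option 1 is the size of `handlers`; options 2 and 3 are different implementations of `escalate`.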

If you tell me the domain you care about (devops? customer support? personal admin?), I can give “head vs tail” examples specific to it.

coral widget
#

Can’t all of that be handled via agentic coding? There is still the same software methodology at the end of the day. It’s just faster loops.

See yu’s agentic coding guide.

queen lilyBOT
#

In principle, yes: agentic coding can build software to handle all of that. “It’s all software methodology” is true at the meta-level.

The catch is that you’re sliding between two different things:

  • Agentic coding = a development accelerator (faster loops to create a controller).
  • Autonomous agent runtime = the controller itself (a system that operates in a live, partially-known world).

Agentic coding makes it cheaper to write the program. It doesn’t remove the core reasons the “long tail” is painful.

Why “just build it with agentic coding” still hurts in practice

Even with fast loops, you still need to solve:

  1. Spec + verification for tail cases
    • For head cases, “tests pass” works great.
    • For tail cases, the spec is often ambiguous (“what should we do if token expires mid-flow?”) and the state space is huge.
    • You can absolutely encode it… but you’ll spend most of your time designing policies, tests, mocks, fixtures, simulators, and on-call safeguards.
  2. Live-world integration + ops

    • Retries, idempotency, race conditions, rate limits, partial outages, permissions drift.
    • Coding agents help you implement these faster, but they don’t magically make them simple.
  3. Human-in-the-loop UX

    • “Ask for approval” sounds small until you need: escalation rules, context packaging, audit logs, time windows, channels, deduping notifications, etc.
    • Polished coding agents already have a nice approval UX for code changes. General ops workflows don’t.

What the runtime agent buys you (when it’s worth it)

A runtime agent is basically a general-purpose exception handler that can:

  • notice “this is off the happy path”,
  • ask clarifying questions,
  • adapt the plan,
  • or escalate with a good summary,

without you having pre-authored a branch for every weird situation.

You can approximate that with deterministic code + a giant policy tree, but then you’ve reinvented a bespoke agent runtime (plus UI, plus governance).

The clean way to reconcile both views

The most realistic architecture is hybrid:

  • Use agentic coding to create deterministic building blocks + tests (the “head”).
  • Use an agent runtime to orchestrate them and handle the weird stuff (the “tail”), ideally with strong guardrails and approvals.

So: you’re right that everything is software; you’re wrong if you conclude that “faster loops” collapses the build-vs-runtime tradeoff. It mainly changes the economics, not the structure of the problem.

If you can summarize what “yu’s agentic coding guide” recommends in 2–3 bullets (esp. how it handles approvals/escalations), I can map it directly onto this and point out where it matches a runtime agent vs where it’s effectively “build your own runtime.”

coral widget
#

But with agentic coding I can still encode policies / rules to have the agent test and iterate on the program via some number of autonomous loops and I can tweak the agent to not ask me to approve for certain things based on confidence level.

There’s no benefit that you’ve yet made clear to me to start with a fully autonomous runtime agent up front?

queen lilyBOT
#

You’re basically right on the “don’t start fully autonomous by default” conclusion.

Where I disagree is the implied next step: once you’ve built agentic-coding loops with policies, permissions, escalation rules, and ongoing triggers… you’ve built a runtime agent system. At that point the question becomes “roll your own vs use an existing runtime/control-plane”.

A few concrete points:

1) “Confidence-based auto-approval” is shakier than it sounds

LLM self-reported confidence isn’t reliably calibrated. If you’re gating autonomy, you usually want external, checkable signals, e.g.:

  • tests/CI pass
  • diff size / touched files / sensitive paths
  • tool category (read-only vs destructive)
  • dry-run succeeds (where available)
  • invariant checks (idempotency, no duplicate sends, no secrets exfil)

So yes, you can encode policies in an agentic coding workflow — but the “confidence” knob is the least trustworthy one.
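As a sketch of what gating on external signals (rather than model confidence) could look like, here is a small Python approval check. The field names, the 200-line limit, and the `SENSITIVE_GLOBS` patterns are assumptions chosen for illustration, not a recommended policy.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Optional

# Hypothetical patterns for paths that should never be auto-approved.
SENSITIVE_GLOBS = ("*.env", "secrets/*", "infra/prod/*")

@dataclass
class ChangeSignals:
    tests_passed: bool
    changed_files: list[str]
    lines_changed: int
    tool_category: str                 # "read_only", "write", or "destructive"
    dry_run_ok: Optional[bool] = None  # None means no dry-run was available

def can_auto_approve(s: ChangeSignals, max_lines: int = 200) -> tuple[bool, list[str]]:
    """Return (approved, reasons_for_refusal) using only checkable signals."""
    reasons: list[str] = []
    if not s.tests_passed:
        reasons.append("tests/CI did not pass")
    if s.lines_changed > max_lines:
        reasons.append(f"diff too large ({s.lines_changed} > {max_lines} lines)")
    sensitive = [f for f in s.changed_files
                 if any(fnmatch(f, g) for g in SENSITIVE_GLOBS)]
    if sensitive:
        reasons.append(f"touches sensitive paths: {sensitive}")
    if s.tool_category == "destructive":
        reasons.append("destructive tool category")
    if s.dry_run_ok is False:
        reasons.append("dry-run failed")
    return (not reasons, reasons)

ok, why = can_auto_approve(ChangeSignals(True, ["secrets/key.pem"], 12, "write"))
print(ok, why)
```

Note that nothing here asks the model how sure it is; every input can be produced by CI, the diff, or the tool registry.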

2) You don’t need “fully autonomous runtime agent up front”

Agreed. A sane path is:

  • start human-in-the-loop,
  • restrict tool surface,
  • allow autonomy only for low-risk actions,
  • widen as you gain trust.

That’s true whether you’re in an IDE coding agent or an OpenClaw-style runtime.

3) The benefit of a runtime agent isn’t “autonomy”; it’s a control plane

The real reasons people adopt a runtime framework are boring-but-real:

  • event-driven + always-on: reacts to messages, schedules, external triggers without you “being in the IDE”
  • cross-boundary I/O: your “approval UI” can be chat; escalation can be routed; context summaries can be generated and delivered
  • centralized governance: one place to express “this tool is allowed, this one needs approval” instead of baking it into a bespoke script/agent wrapper
  • state + memory + auditability: long-lived threads, logs, reproducibility of what happened and why

You can build all of that with agentic coding. It’s just more work, and the work is mostly ops/product glue, not “writing the core logic”.
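For a sense of what the "centralized governance" bullet means in code, here is a tiny Python sketch of a single tool policy with default-deny. The tool names and the `authorize` helper are hypothetical; a real runtime would have its own configuration format.

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"

# One place that says what each tool is allowed to do (hypothetical tools).
TOOL_POLICY = {
    "read_file": Decision.ALLOW,
    "run_tests": Decision.ALLOW,
    "git_push": Decision.NEEDS_APPROVAL,
    "send_email": Decision.NEEDS_APPROVAL,
}

def authorize(tool: str, human_approved: bool = False) -> bool:
    """Unknown tools are denied; approval-gated tools need an explicit yes."""
    decision = TOOL_POLICY.get(tool, Decision.DENY)
    if decision is Decision.ALLOW:
        return True
    if decision is Decision.NEEDS_APPROVAL:
        return human_approved
    return False

print(authorize("read_file"))         # True
print(authorize("send_email"))        # False until a human approves
print(authorize("send_email", True))  # True
```

The point is less the code than where it lives: one table consulted by every agent, instead of the same rules re-implemented inside each script.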

4) When agentic coding alone is enough

If your world is mostly: “I’m in a repo, I want good diffs, I can review, tests validate” — a polished coding agent is the best tool. Full stop.

5) When runtime starts to win

When you notice you’re repeatedly doing “small operational decisions” outside the repo: checking a dashboard, responding to a message, triaging an alert, coordinating across systems, keeping something running over time. That’s the long-tail orchestration problem.

So: you’re not missing some magic “autonomous from day 1” advantage. The advantage is having an execution/runtime substrate so you don’t have to keep turning your coding agent into a bespoke ops platform.

If you tell me the first workflow you’d actually want to automate (one sentence), I can say which side I’d start on and what autonomy gates I’d put in place.

coral widget
#

New topic.

What would happen if I had you decompose your reasoning during planning into a structured form of wigmore analysis, and then forced you to iterate N times discovering and mitigating weaknesses/attacks?

Would you…reason better?
Could I encode an Nth iteration into a searchable/retrievable form that short circuits the need for N iterations when a related query passes through?

I say “you” but really I mean any LLM.

queen lilyBOT
#

Yes, sometimes—but the gains come from structure + adversarial pressure + external grounding, not from “more tokens of introspection” by itself.

1) If you force Wigmore-style structure + N critique iterations, would an LLM reason better?

Often, yes, in these ways:

  • Fewer missing premises: an argument map forces explicit claims, sub-claims, and dependencies.
  • Better coverage of alternatives: you can require explicit competing hypotheses and what would falsify them.
  • Cleaner separation of evidence vs inference: Wigmore-style framing makes “what do we actually know?” harder to blur.
  • Attack surface visibility: when you require “possible attacks/weaknesses” you get more robust outputs.

But there are hard limits:

  • Self-critique is not a truth oracle. An LLM can confidently “attack” the right parts, or it can generate plausible-sounding objections that are irrelevant, or miss real errors it’s blind to.
  • Diminishing returns after a few iterations—especially if you’re not adding new information (tools, retrieval, calculations, fresh evidence).
  • Overfitting / verbosity: more iterations can create a “looks rigorous” artifact that’s not actually more correct.

The biggest jump usually comes from adding one of:

  • an external verifier (search, calculator, code execution, unit tests),
  • model diversity (separate proposer vs critic, or different seeds/models),
  • or a hard scoring rubric tied to observable checks.

2) Could you “compile” the Nth iteration so you don’t need to do N loops next time?

Yes—this is basically distillation: convert expensive deliberation into a reusable artifact. But the artifact that generalizes best is usually not the entire argument map; it’s a reusable decision procedure extracted from it.

What to store in a retrievable form:

  • A template/rubric (“for claim type X, check A/B/C; enumerate alternatives; list disconfirming evidence…”)
  • A library of common failure modes + mitigations (“watch for scope confusion, hidden assumptions, base-rate neglect…”)
  • A canonical set of disambiguating questions to ask the user when inputs are underspecified
  • Where appropriate, a domain-specific knowledge pack (definitions, invariants, constraints, threat models)

What’s risky to store as-is:

  • The exact “final argument” can become stale when facts change, and can anchor future reasoning (“we already decided this”) even when the situation differs.

So yes: you can short-circuit future N-iteration loops, but only if you also solve retrieval correctness: “is this actually the same class of problem?”

3) The practical recipe (works well)

If I were implementing this, I’d do:

  1. Generate a structured argument map (Wigmore-ish):

    • Main hypothesis
    • Sub-hypotheses
    • Evidence items + credibility notes
    • Alternative hypotheses
    • Missing evidence / what would change the conclusion
  2. Run an “attacker” pass (separate role/seed/model if possible):

    • strongest counterargument
    • ambiguity attacks (“what definition shift breaks this?”)
    • empirical attacks (“what fact would refute it?”)
  3. Mitigate + distill:

    • produce a compact checklist/rubric + clarifying questions
    • tag it with metadata for retrieval (domain, task type, assumptions, risk level)
  4. Retrieve the distilled rubric for future queries, not the whole deliberation.
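The recipe above could be sketched in Python roughly like this. The schema, the deliberately naive `distill` step, and the tag-overlap retrieval are all assumptions for illustration; a real system would likely use embeddings or a proper classifier for the retrieval part.

```python
from dataclasses import dataclass, field

@dataclass
class Evidence:
    text: str
    credibility: str  # free-form note, e.g. "primary source"

@dataclass
class ArgumentMap:
    hypothesis: str
    sub_hypotheses: list[str] = field(default_factory=list)
    evidence: list[Evidence] = field(default_factory=list)
    alternatives: list[str] = field(default_factory=list)
    missing_evidence: list[str] = field(default_factory=list)

@dataclass
class Rubric:
    checks: list[str]
    clarifying_questions: list[str]
    tags: set[str]  # retrieval metadata: domain, task type, risk level

def distill(arg_map: ArgumentMap, tags: set[str]) -> Rubric:
    """Naive distillation: alternatives become checks, gaps become questions."""
    checks = [f"Rule out: {alt}" for alt in arg_map.alternatives]
    questions = [f"Do we have: {gap}?" for gap in arg_map.missing_evidence]
    return Rubric(checks, questions, tags)

def retrieve(rubrics: list[Rubric], query_tags: set[str],
             min_overlap: int = 2) -> list[Rubric]:
    """Return rubrics sharing at least min_overlap tags, best match first."""
    scored = [(len(r.tags & query_tags), r) for r in rubrics]
    hits = [pair for pair in scored if pair[0] >= min_overlap]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in hits]

amap = ArgumentMap(
    hypothesis="The outage was caused by the config change",
    alternatives=["upstream provider incident"],
    missing_evidence=["deploy timestamps"],
)
rubric = distill(amap, {"debugging", "incident", "medium-risk"})
print(retrieve([rubric], {"debugging", "incident"}))
```

The `min_overlap` threshold is the crude stand-in for "is this actually the same class of problem?"; set it too low and you retrieve rubrics that do not apply.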

Bottom line

  • Yes, structured Wigmore-like decomposition + iterative red-teaming can improve outputs.
  • No, “more iterations” isn’t reliably better unless you add new constraints or new evidence.
  • Yes, you can “compile” the results into retrievable rubrics/checklists that reduce future iteration cost—but you need good classification/metadata so you don’t apply the wrong distilled artifact.

If you want, give me a domain (e.g., “security decision”, “product strategy”, “debugging”, “legal-ish argument”), and I’ll sketch what a good Wigmore schema + distilled reusable rubric would look like for that domain.

coral widget
#

RE: Your “no”, but by exposing attack surface and weaknesses you then could, on the next iteration, gather related evidence. For example, via RAG, semantic search, web search, or w.e. synthesis mechanisms you have.

queen lilyBOT
#

Yep — that’s exactly the missing ingredient that makes “iterate N times” meaningfully different from pure self-critique.

My “no” was basically: if iteration doesn’t inject new information or stronger constraints, it plateaus. But if you use the critique to drive targeted evidence acquisition, you’re no longer looping in place — you’re doing an active investigation loop.

How the loop actually improves with retrieval in the middle

A good structure is:

  1. Draft an argument map (claim → subclaims → current evidence)
  2. Attack pass: identify weaknesses as answerable questions, e.g.
    • “What are the strongest counterexamples?”
    • “What empirical rates/base rates matter here?”
    • “What would falsify this?”
    • “Where are we relying on an unstated assumption?”
  3. Query synthesis: turn each weakness into retrieval tasks:
    • pro-evidence query
    • con-evidence query (to fight confirmation bias)
    • definitional/ground-truth query (terminology, specs, docs)
  4. Retrieve + triage (RAG/web/semantic search), with provenance scoring:
    • source credibility
    • recency
    • directness (primary vs blog post)
    • agreement across independent sources
  5. Update the map, explicitly marking which weaknesses were mitigated vs still open.
  6. Repeat until budget/time/risk threshold is met.

That’s not “reasoning harder”; it’s closing epistemic gaps.
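Two pieces of that loop are easy to sketch in Python: turning a weakness into pro/con/definitional queries, and scoring retrieved sources on the four provenance criteria. The query wording, the weights, and the cap of three agreeing sources are assumptions for illustration only, and no actual search backend is called.

```python
from dataclasses import dataclass

def synthesize_queries(claim: str) -> dict[str, str]:
    """One supporting, one disconfirming, and one ground-truth query per claim."""
    return {
        "pro": f"evidence supporting {claim}",
        "con": f"best critique of {claim}",
        "definition": f"definition and official docs for {claim}",
    }

@dataclass
class Source:
    name: str
    credibility: float          # 0..1, judged by the triage step
    recency: float              # 0..1, 1.0 = very recent
    is_primary: bool            # directness: primary source vs commentary
    independent_agreement: int  # how many independent sources agree

def provenance_score(s: Source) -> float:
    """Weighted sum of the four criteria; the weights are illustrative."""
    agreement = min(s.independent_agreement, 3) / 3
    directness = 1.0 if s.is_primary else 0.0
    score = (0.4 * s.credibility + 0.2 * s.recency
             + 0.2 * directness + 0.2 * agreement)
    return round(score, 3)

queries = synthesize_queries("caching reduces p99 latency")
print(queries["con"])
print(provenance_score(Source("vendor docs", 1.0, 1.0, True, 3)))
print(provenance_score(Source("old blog post", 0.5, 0.5, False, 0)))
```

Always emitting the `"con"` query is the mechanical version of the anti-confirmation idea: the loop cannot skip looking for disconfirming evidence.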

Two things that make or break it

  • Query quality: the attack pass needs to produce sharp questions, not vague “look it up”.
  • Anti-confirmation retrieval: force at least one query whose job is to find disconfirming evidence (“best critique of X”, “X fails when…”, “counterexample to…”). Otherwise you just retrieve ammo for the initial story.

What you can “compile” for reuse

This is where your earlier “Nth iteration caching” idea becomes practical: you can store

  • the weakness → query templates,
  • the rubric for source quality,
  • and the standard falsification checks for that domain,

so next time you don’t need N free-form iterations — you run a mostly deterministic “investigation playbook” with retrieval.

So yes: attack-surface exposure + retrieval between iterations is one of the most reliable ways to actually improve outcomes, because it converts “deliberation” into “research + verification”.

coral widget
#

Help me come up with a way to define time/budget/risk that I’d set to scope the number of iterations that an AI agent would do. When would be a good time to have the agent stop going back and forth?

queen lilyBOT
#

You can scope “how many loops?” with a risk-weighted budget + explicit stop conditions. The trick is to make the agent prove it’s still making progress; otherwise it stops or escalates.

  1. Define risk in a way that maps to governance

A practical risk model is “impact × reversibility × uncertainty”.

Impact (what’s the worst-case blast radius?)

• low: affects only the agent’s own notes / a scratch file
• medium: changes a repo, config, or internal system
• high: external communication, prod changes, security/privacy, money, reputation

Reversibility

• easily reversible (git revert, local file edit)
• partially reversible (config changes with downtime)
• irreversible (send message/email, delete data, charge card)

Uncertainty

• well-specified + verifiable (tests, schema, deterministic checks)
• ambiguous requirements or weak verification

From those three, assign a simple risk tier: Low / Medium / High / Critical.
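One way to turn "impact × reversibility × uncertainty" into a tier is a simple product with cut-offs. The numeric scales and thresholds below are assumptions for illustration; treat them as a starting point to tune, not a calibrated model.

```python
IMPACT = {"low": 1, "medium": 2, "high": 3}
REVERSIBILITY = {"easy": 1, "partial": 2, "irreversible": 3}
UNCERTAINTY = {"verifiable": 1, "ambiguous": 2}

# Illustrative cut-offs on the product (possible values range from 1 to 18).
THRESHOLDS = [(2, "Low"), (6, "Medium"), (12, "High")]

def risk_tier(impact: str, reversibility: str, uncertainty: str) -> str:
    score = IMPACT[impact] * REVERSIBILITY[reversibility] * UNCERTAINTY[uncertainty]
    for limit, tier in THRESHOLDS:
        if score <= limit:
            return tier
    return "Critical"

print(risk_tier("low", "easy", "verifiable"))          # Low (score 1)
print(risk_tier("medium", "partial", "verifiable"))    # Medium (score 4)
print(risk_tier("high", "irreversible", "ambiguous"))  # Critical (score 18)
```

With these particular cut-offs only the worst case on all three axes reaches Critical. If that feels too lenient, a common variation is to force at least High whenever the action is irreversible, regardless of the product.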

  2. Budget knobs (what you actually cap)

Use at least these budgets (even if approximate):

• Wall-clock time (e.g., 30s / 5m / 30m)
• Tool-call budget (since tool calls are where money + side effects happen)
• Iteration budget (planner↔critic cycles)
• Optional: $ or token budget if you can measure it

A reasonable default policy table:

| Risk tier | Allowed tools        | Max iterations      | Max tool calls | Time cap | Escalation              |
| --------- | -------------------- | ------------------- | -------------- | -------- | ----------------------- |
| Low       | read-only / search   | 1–2                 | 3–10           | 1–3 min  | rarely                  |
| Medium    | write/code + tests   | 2–4                 | 10–30          | 5–15 min | if stuck                |
| High      | destructive/external | 1–2 (prep only)     | 5–15           | 5–10 min | always before action    |
| Critical  | money/privacy/prod   | 0–1 (analysis only) | minimal        | minimal  | human approval required |

Key idea: for High/Critical, the agent can plan + gather evidence, but it shouldn’t “keep looping until it feels confident” and then act.
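The policy table can be encoded directly as data and checked on every loop. This Python sketch uses the upper bound of each range in the table; the Critical row says "minimal" for tool calls and time, so the 3 calls and 60 seconds used here are placeholder assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Budget:
    max_iterations: int
    max_tool_calls: int
    time_cap_s: int
    escalation: str

BUDGETS = {
    "Low": Budget(2, 10, 180, "rarely"),
    "Medium": Budget(4, 30, 900, "if stuck"),
    "High": Budget(2, 15, 600, "always before action"),
    # "minimal" in the table; 3 calls / 60 s are assumed placeholders.
    "Critical": Budget(1, 3, 60, "human approval required"),
}

def exhausted(tier: str, iterations: int, tool_calls: int, elapsed_s: float) -> list[str]:
    """Return the names of every budget that has been used up for this tier."""
    b = BUDGETS[tier]
    hit: list[str] = []
    if iterations >= b.max_iterations:
        hit.append("iterations")
    if tool_calls >= b.max_tool_calls:
        hit.append("tool_calls")
    if elapsed_s >= b.time_cap_s:
        hit.append("time")
    return hit

print(exhausted("Medium", 2, 5, 60))  # [] -> keep going
print(exhausted("High", 2, 3, 30))    # ['iterations'] -> stop and escalate
```

Running `exhausted` before each new iteration keeps the cap outside the model's control, which is the point of the "prep only" rows.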

  3. Stop conditions (when to halt even if budget remains)

Make stopping depend on marginal progress, not vibes.

Stop (success) when:

• All required verifiers pass (tests/linters/dry-run/invariants)
• All “must-answer” questions are answered with cited evidence
• Remaining open issues are explicitly marked “acceptable risk” by policy (or human)

Stop (plateau) when any of these happen:

• No new evidence was added in the last iteration (no new sources, no new test results)
• The set of “top 3 risks/unknowns” is unchanged across 2 iterations
• You detect oscillation: the agent is bouncing between the same 2 options without new data
→ tie-break (choose the safer reversible option) or escalate

Stop (escalate) when:

• An unresolved question blocks a high-impact decision
• The only way forward involves irreversible action
• Retrieval produces conflicting sources with no clear way to arbitrate
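The plateau checks are mechanical enough to sketch in Python. This is one possible reading, stated as assumptions: "unchanged across 2 iterations" is taken to mean the last two snapshots match, and "oscillation" means an A, B, A, B pattern over the last four choices with no new evidence in any of them. All function names are hypothetical.

```python
def top_unknowns_unchanged(snapshots: list[list[str]], k: int = 3) -> bool:
    """True if the top-k unknowns are the same set in the last two snapshots."""
    if len(snapshots) < 2:
        return False
    return set(snapshots[-1][:k]) == set(snapshots[-2][:k])

def is_oscillating(choices: list[str], new_evidence: list[bool]) -> bool:
    """True for an A, B, A, B pattern in the last four loops with no new data."""
    if len(choices) < 4 or len(choices) != len(new_evidence):
        return False
    a, b, c, d = choices[-4:]
    return a == c and b == d and a != b and not any(new_evidence[-4:])

def tie_break(options: dict[str, bool]) -> str:
    """options maps name -> is_reversible; prefer a reversible one, else escalate."""
    for name, reversible in options.items():
        if reversible:
            return name
    return "escalate"

choices = ["rewrite", "patch", "rewrite", "patch"]
print(is_oscillating(choices, [False, False, False, False]))  # True
print(tie_break({"rewrite": False, "patch": True}))           # patch
```

`tie_break` simply returns the first reversible option; if you need "safest" rather than "first", pass the options already ordered by risk.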

  4. A clean “loop contract” you can enforce

Require each iteration to output a tiny structured status like:

• New information obtained this loop: (tests run, sources found, observations)
• Risks reduced: (which risks went down and why)
• Open unknowns: (ranked)
• Next step: (one concrete action)
• Stop? yes/no + reason (hit plateau / hit budget / ready to act / needs human)

If “New information obtained” is empty twice in a row → stop.
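Here is a minimal Python sketch of that contract: one record per iteration plus the "empty twice in a row" rule. The field names mirror the list above; `enforce_contract` and the reason strings are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class LoopStatus:
    new_information: list[str]  # tests run, sources found, observations
    risks_reduced: list[str]
    open_unknowns: list[str]    # ranked, most important first
    next_step: str              # one concrete action
    stop: bool = False
    reason: str = ""

def enforce_contract(history: list[LoopStatus]) -> tuple[bool, str]:
    """Return (should_stop, reason) after the latest iteration."""
    if history and history[-1].stop:
        return True, history[-1].reason or "agent requested stop"
    if (len(history) >= 2
            and not history[-1].new_information
            and not history[-2].new_information):
        return True, "hit plateau: no new information twice in a row"
    return False, ""

history = [
    LoopStatus(["ran unit tests"], ["regression risk"], ["prod config?"], "check config"),
    LoopStatus([], [], ["prod config?"], "re-read plan"),
    LoopStatus([], [], ["prod config?"], "re-read plan again"),
]
print(enforce_contract(history[:2]))  # (False, '')
print(enforce_contract(history))      # (True, 'hit plateau: ...')
```

Because the check runs on the structured record rather than on the model's prose, the agent cannot talk its way past an empty "new information" field.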

  5. Rule of thumb for “when to stop going back and forth”

As soon as the agent’s next step is not “get new evidence / run a verifier / ask a clarifying question”, it’s probably spinning. At that point:

• pick the safest reversible path, or
• ask the human the single most decision-relevant question.