#i'll give it a try. not sure what

gleaming heath
#

ok i asked my boss:

two separate issues are getting conflated:
1. Tool-looping (Codex keeps calling the same MCP tool/read/query repeatedly)
2. Cross-session memory (starting a new session feels like amnesia)

They interact: if the agent doesn’t have a stable “I already did this” state, it’s more likely to re-run the same tool calls.

What changed recently in Codex (relevant bits)

Recent Codex CLI updates added connectors support via the app server + MCP, plus improvements around resume reliability and “threads”/spawned context handling. 

That’s directionally good—but it still doesn’t automatically give you “long-term memory” unless you resume the same thread or you persist state outside the model.

Ranked solutions for “agentic memory” (maximizing OpenAI/Codex)

  1. Codex thread/session resume + app-server-backed history (highest utility, lowest friction)

If your team is bouncing between “new sessions,” stop doing that—treat work as resumable threads. The Codex app-server is explicitly designed for deep integrations including conversation history and streamed events. 
Why #1: This is the only approach that keeps you “inside Codex” while preserving continuity with minimal new infrastructure.

Productivity tip: enforce “one workspace = one long-lived thread,” and only fork when intentionally branching.

  2. Repo-native “project memory” files + Codex rules (high utility, simple, very reliable)

Codex is great at reloading state from the workspace if you make it explicit:
• AGENTS.md / rules
• PROJECT_STATE.md (current objectives, decisions, constraints)
• TASK_LOG.md (what’s done, what’s pending, links to files)

OpenAI’s own “AI-native engineering” guidance calls out persistent project memory as a key pattern (even if it’s implemented as artifacts the agent can reread). 
Why #2: Works even when “resume” breaks, and it’s audit-friendly.
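A minimal sketch of keeping TASK_LOG.md current from a script or hook, so a fresh session can reread what was done. The file name comes from the list above; the function names and the "- [status] timestamp: summary" entry format are assumptions for illustration, not anything Codex prescribes.

```python
from datetime import datetime, timezone
from pathlib import Path


def log_task(repo_root: Path, status: str, summary: str) -> str:
    """Append one timestamped entry to TASK_LOG.md and return the line written."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    line = f"- [{status}] {stamp}: {summary}"
    log_path = repo_root / "TASK_LOG.md"
    # Append-only, so earlier entries stay intact for auditing.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line


def read_task_log(repo_root: Path) -> list[str]:
    """Return existing entries, or an empty list if the log does not exist yet."""
    log_path = repo_root / "TASK_LOG.md"
    if not log_path.exists():
        return []
    text = log_path.read_text(encoding="utf-8")
    return [ln for ln in text.splitlines() if ln.startswith("- [")]
```

Append-only plain text keeps the log diffable in code review, which is where the audit-friendliness comes from.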

  3. Memory layer behind an MCP tool (still used from Codex) — Zep first

If you’re okay adding one external component but keeping Codex as the primary agent, add a “memory MCP”:
• Zep (temporal knowledge graph / memory API) 
Why Zep: purpose-built for cross-session recall + temporal updates; strong benchmarks + practical APIs.

  4. Agent-framework persistence (LangGraph / LlamaIndex) used as an orchestrated “brain”

If you want richer state machines (checkpoints, resumability, HITL, time-travel):
• LangGraph persistence/checkpointers 
• LlamaIndex memory abstractions 
Why #4: Excellent if you’re building a durable agent platform, but it’s “more system” than most teams need for coding-agent workflows.

  5. MemGPT-style stateful agents (Letta)

Letta is explicitly about agents with persistent memory (MemGPT lineage). 
Why #5: Powerful, but it’s effectively adopting a new agent runtime. Great for productized agents; heavier for “Codex-first coding.”

Quick fixes for the endless MCP read/query loop (do these regardless)

These reduce loops today even without “true memory”:
• Tool result caching + idempotency keys in the MCP server: same input → same output + “already served” marker.
• Hard tool budget: max N calls per tool per turn/session; after that, force the model to summarize and ask for help.
• Return a strong terminal signal from tools (e.g., status: "ok_done" + a short “next recommended step”). Models loop less when tool outputs are “final-looking.”
• Log + detect duplicate tool calls and respond with “duplicate_request” plus the prior result (server-side guardrail).
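Rough sketch of the caching + duplicate-detection idea as a wrapper around a tool handler. It is plain Python with no MCP SDK involved; the class name and the response fields (`already_served`, `result`) are made up for illustration, while the `ok_done` and `duplicate_request` status strings are the ones suggested above.

```python
import hashlib
import json
from typing import Any, Callable


class DedupingTool:
    """Wraps a tool handler; repeats of the same arguments get the cached result."""

    def __init__(self, handler: Callable[[dict], Any]):
        self._handler = handler
        self._cache: dict[str, Any] = {}

    @staticmethod
    def _key(args: dict) -> str:
        # Canonical JSON so argument order does not change the idempotency key.
        payload = json.dumps(args, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def call(self, args: dict) -> dict:
        key = self._key(args)
        if key in self._cache:
            # Same input as before: do not re-run, hand back the prior result.
            return {
                "status": "duplicate_request",
                "already_served": True,
                "result": self._cache[key],
            }
        result = self._handler(args)
        self._cache[key] = result
        return {"status": "ok_done", "already_served": False, "result": result}
```

The cache here lives for the life of the process and never expires, which is only reasonable for read-only tools whose answers do not change mid-session.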

Looping is a known failure mode across tool-calling systems (not Codex-specific), so treating it as an engineering control problem pays off fast.

regal shell
#

I think it's less memory and more relentless effort. The calls to mcp fail or grep searches don't return results and it keeps trying. Once I intervene it will stop and continue with the task though

gleaming heath
#

potential solution:

that’s a retry-policy / tool-failure-handling problem (the agent is stuck in “try harder” mode), not a cross-session memory problem.

What’s happening: when an MCP call errors or grep returns nothing, the agent treats that as “insufficient evidence” and keeps searching instead of switching strategies. Your intervention effectively provides the missing “stop condition.”

Fix it at the tool boundary (highest leverage)

  1. Make MCP failures explicit and typed
    Have the MCP tool return structured outcomes like:
    • status: "ok"
    • status: "no_results" (search succeeded, nothing found)
    • status: "retryable_error" (transient: timeout, 429, network)
    • status: "fatal_error" (bad args, auth, 404 resource)

Agents loop way less when they can distinguish “no results” from “error.”
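One way the four statuses could look in a Python tool handler. This is a sketch under assumptions: `run_search` and `ToolOutcome` are hypothetical names, and which exceptions count as retryable vs. fatal depends on your backend (an HTTP 429 would need its own mapping, which is not shown).

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(str, Enum):
    OK = "ok"
    NO_RESULTS = "no_results"            # search succeeded, nothing found
    RETRYABLE_ERROR = "retryable_error"  # transient: timeout, network
    FATAL_ERROR = "fatal_error"          # bad args, auth, missing resource


@dataclass
class ToolOutcome:
    status: Status
    results: list = field(default_factory=list)
    message: str = ""


def run_search(search_fn, query: str) -> ToolOutcome:
    """Run search_fn(query) and translate what happened into a typed outcome."""
    try:
        hits = list(search_fn(query))
    except (TimeoutError, ConnectionError) as exc:
        return ToolOutcome(Status.RETRYABLE_ERROR, message=str(exc))
    except (ValueError, PermissionError, FileNotFoundError) as exc:
        return ToolOutcome(Status.FATAL_ERROR, message=str(exc))
    if not hits:
        # An empty result is a successful answer, not a failure.
        return ToolOutcome(Status.NO_RESULTS, message="search succeeded, nothing found")
    return ToolOutcome(Status.OK, results=hits)
```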

  2. Add a circuit breaker in the MCP server
    • Same request (or same normalized query) fails 2–3 times → return fatal_error: "circuit_open" with the last error + what to do next.
    • If no_results repeats on the same target, return no_results plus a “stop recommendation” field.
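A small sketch of the circuit breaker for the repeated-failure case (the first bullet). Assumptions: queries are normalized by lowercasing and collapsing whitespace, the threshold defaults to 3, and the response field names other than `status` are invented for the example. The repeated-no_results case would need a second counter and is not shown.

```python
from collections import defaultdict


class CircuitBreaker:
    """Stops forwarding a query once it has failed max_failures times in a row."""

    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self._failures: dict[str, int] = defaultdict(int)
        self._last_error: dict[str, str] = {}

    @staticmethod
    def normalize(query: str) -> str:
        # Treat "Find  Foo" and "find foo" as the same request.
        return " ".join(query.lower().split())

    def call(self, query: str, handler) -> dict:
        key = self.normalize(query)
        if self._failures[key] >= self.max_failures:
            # Circuit is open: do not call the handler again.
            return {
                "status": "fatal_error",
                "code": "circuit_open",
                "last_error": self._last_error.get(key, ""),
                "next_step": "stop retrying this query; change approach or ask a human",
            }
        try:
            result = handler(query)
        except Exception as exc:
            self._failures[key] += 1
            self._last_error[key] = str(exc)
            return {"status": "retryable_error", "error": str(exc)}
        self._failures[key] = 0  # a success closes the circuit again
        return {"status": "ok", "result": result}
```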

  3. Add budgets
    Per-session caps like:
    • max_tool_calls_per_kind = 10
    • max_identical_calls = 2
    • max_total_failures = 5

After budget exceeded, force a “summarize + ask human or switch approach” response.
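The three caps could be tracked with something like the following. The cap names and defaults are the ones listed above; the `ToolBudget` class, its methods, and the idea of returning the name of the exceeded cap are assumptions for the sketch. What the server does once a cap is hit (e.g. return a "summarize + ask" message) is left to the caller.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolBudget:
    max_tool_calls_per_kind: int = 10
    max_identical_calls: int = 2
    max_total_failures: int = 5
    _per_kind: Counter = field(default_factory=Counter)
    _identical: Counter = field(default_factory=Counter)
    _failures: int = 0

    def check(self, kind: str, args_key: str) -> Optional[str]:
        """Return None if the call is allowed, else the name of the exceeded cap."""
        if self._failures >= self.max_total_failures:
            return "max_total_failures"
        if self._per_kind[kind] >= self.max_tool_calls_per_kind:
            return "max_tool_calls_per_kind"
        if self._identical[(kind, args_key)] >= self.max_identical_calls:
            return "max_identical_calls"
        # Only count calls that are actually allowed through.
        self._per_kind[kind] += 1
        self._identical[(kind, args_key)] += 1
        return None

    def record_failure(self) -> None:
        """Call this whenever an allowed tool call ends in an error."""
        self._failures += 1
```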

Fix it in the agent policy (fast, no infra changes)

Drop a rule into your repo’s agent instructions (e.g., AGENTS.md or whatever Codex respects):
• After 2 failed tool calls of the same type, change strategy.
• After 3 total failures (any tools), stop tool use and ask for a decision.
• Treat empty grep/search as a valid terminal signal (“no match found”), not as an error.

A minimal policy snippet:
• “If a tool returns an error twice, stop retrying and propose 2 alternative next steps.”
• “If grep/find returns no matches twice, stop searching and ask what file/dir to target or broaden scope.”

Quick diagnostic: why do the calls “fail”?

If you can capture just one loop instance, look for which bucket it is:
• Transient: timeouts / rate limits → needs backoff + retry cap
• Bad query: wrong path / wrong working dir / wrong file glob → needs “validate assumptions” step (list dir, print pwd)
• No results: query is fine, target doesn’t exist → needs “accept no” + pivot
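If it helps, the bucketing can be done mechanically on a captured call. This is a rough heuristic sketch: the `diagnose` function, its inputs, and the error substrings it looks for are all assumptions, and the extra `other_error` label is just a catch-all for errors that do not look transient.

```python
from pathlib import Path
from typing import Optional


def diagnose(error: Optional[str], search_path: str, match_count: int) -> str:
    """Place one captured tool call into a bucket from the list above."""
    if error is not None:
        lowered = error.lower()
        if any(tok in lowered for tok in ("timeout", "timed out", "429", "rate limit")):
            return "transient"      # needs backoff + retry cap
        return "other_error"        # catch-all, not one of the three buckets
    if not Path(search_path).exists():
        return "bad_query"          # wrong path / working dir: validate assumptions first
    if match_count == 0:
        return "no_results"         # query was fine: accept "no" and pivot
    return "ok"
```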