#Optimize token usage


heavy saffron
#

Can you evaluate whether this approach will improve the Codex experience by drastically reducing token consumption and keeping the model from hallucinating on huge projects, through decomposition and code maps?

https://github.com/vitotafuni/bemyagent

GitHub

BEMYAGENT is a lightweight, self-bootstrapping protocol designed to initialize AI-assisted software projects. It establishes a structured environment that prevents AI context bloat, reduces token c...

mellow gladeBOT
#

Hi @heavy saffron. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

The knowledge sources don’t mention the bemyagent project or evaluate specific third‑party repos, so I can’t directly assess that implementation. I can only compare your goals (decomposition, code maps, reduced tokens, fewer hallucinations) with documented best practices for Codex and the API.

From the docs, your general direction is aligned with recommended techniques:

  1. Reducing token usage

    Your idea of decomposing a large repo into smaller, targeted contexts and using a “code map” to load only the relevant parts is consistent with OpenAI's documented suggestions for reducing tokens:

    • Chunking documents into smaller pieces for retrieval. [Data prep]
    • Minimizing “chattiness” and replacing broad LLM calls with more focused operations. [Minimizing token cost]

So, if your code map lets you:

  • Select only the minimal set of files/chunks relevant to the current task, and
  • Avoid re‑sending large, repeated context (e.g., via caching),
    then it should indeed reduce token consumption compared to naïvely stuffing the whole project into context.
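
As a rough illustration of the selection step described above, here is a minimal sketch. It is not taken from bemyagent or from the OpenAI docs: the `MapEntry` structure, the keyword-overlap scoring, and the token estimates are all hypothetical stand-ins for whatever a real code map would store.

```python
from dataclasses import dataclass


@dataclass
class MapEntry:
    path: str            # file the entry describes
    keywords: set[str]   # tags for the file (hypothetical; could be generated)
    est_tokens: int      # rough token estimate for the file's contents


def select_context(code_map: list[MapEntry], task: str, budget: int) -> list[str]:
    """Pick the most relevant files for a task without exceeding a token budget."""
    task_words = set(task.lower().split())
    # Score each entry by keyword overlap with the task description.
    scored = [(len(e.keywords & task_words), e) for e in code_map]
    # Most relevant first; break ties in favour of cheaper files.
    scored.sort(key=lambda pair: (-pair[0], pair[1].est_tokens))

    selected, used = [], 0
    for score, entry in scored:
        if score == 0:
            break  # everything after this point is irrelevant too
        if used + entry.est_tokens <= budget:
            selected.append(entry.path)
            used += entry.est_tokens
    return selected


code_map = [
    MapEntry("auth/login.py", {"auth", "login", "session"}, 1200),
    MapEntry("auth/tokens.py", {"auth", "jwt", "refresh"}, 900),
    MapEntry("billing/invoice.py", {"billing", "invoice", "pdf"}, 2500),
]
print(select_context(code_map, "fix login session expiry in auth", budget=2000))
```

With a 2000-token budget only `auth/login.py` fits; the billing file is never considered because it shares no keywords with the task. A real implementation would likely replace the keyword overlap with embeddings or a dependency graph.
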
  2. Decomposition / subagents

    Codex docs explicitly recommend:

    • Splitting complex tasks into smaller subtasks. [Faithful reasoning]
    • Using subagent workflows to keep the main thread clean and move noisy work (logs, tests, exploration) into separate agents. [Subagent concepts; Subagents]

    Your approach of decomposing work and possibly using separate “agents” per concern (e.g., per module or task) is aligned with this guidance and can:

    • Reduce context pollution/rot.
    • Keep prompts smaller and more focused.
      Note: subagents increase total tokens if you run many in parallel, so the savings come from better scoping and less wasted context, not from the mechanism itself. [Subagents]
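
The trade-off in the note above can be sketched as follows. This is a toy model under stated assumptions, not Codex's actual subagent API: `run_subtask` is a hypothetical stand-in for a real agent call, and the cost of 500 tokens per file is made up.

```python
from dataclasses import dataclass, field


@dataclass
class Subtask:
    name: str
    files: list[str]  # the only files this subtask is allowed to see


@dataclass
class MainThread:
    notes: list[str] = field(default_factory=list)  # short summaries only

    def record(self, subtask: Subtask, summary: str) -> None:
        # Keep the main context small: store a one-line summary, not raw logs.
        self.notes.append(f"{subtask.name}: {summary}")


def run_subtask(subtask: Subtask) -> tuple[str, int]:
    """Hypothetical stand-in for an agent call. Returns (summary, tokens_spent)."""
    tokens_spent = 500 * len(subtask.files)  # made-up cost per file
    return f"reviewed {len(subtask.files)} file(s)", tokens_spent


main = MainThread()
total_tokens = 0
subtasks = [
    Subtask("auth", ["auth/login.py", "auth/tokens.py"]),
    Subtask("billing", ["billing/invoice.py"]),
]
for st in subtasks:
    summary, spent = run_subtask(st)
    main.record(st, summary)
    total_tokens += spent

print(main.notes)
print(total_tokens)
```

Every subtask still spends tokens, so the total grows with the number of subagents. What stays small is the main thread, which only accumulates one-line summaries.
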
  3. Code maps / knowledge structures

    While the docs don’t mention your specific “code map” idea, they do recommend:

    • Building structured knowledge bases and chunked representations for retrieval. [Data prep]
    • Using embeddings and metadata to surface only the most relevant chunks. [Data prep]

A code map that:

  • Encodes architecture, dependencies, and ownership, and
  • Drives which chunks/files are retrieved for a given task,
    is conceptually similar to a well‑designed RAG index over a codebase. That’s a recommended pattern for large corpora.
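
One way such a map could drive retrieval is a bounded walk over the dependency graph. The sketch below assumes a hypothetical map format (a dict with `deps` and `owner` per file); it does not reflect how bemyagent actually stores its map.

```python
from collections import deque

# Hypothetical code map: each file lists the files it depends on and an owner.
CODE_MAP = {
    "api/routes.py": {"deps": ["services/orders.py"], "owner": "api-team"},
    "services/orders.py": {"deps": ["db/models.py"], "owner": "orders-team"},
    "db/models.py": {"deps": [], "owner": "platform-team"},
    "ui/cart.tsx": {"deps": [], "owner": "web-team"},
}


def files_for_task(entry_points: list[str], max_depth: int = 1) -> list[str]:
    """Breadth-first walk over dependencies, limited to max_depth hops."""
    seen = set(entry_points)
    order = list(entry_points)
    queue = deque((path, 0) for path in entry_points)
    while queue:
        path, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the allowed number of hops
        for dep in CODE_MAP.get(path, {}).get("deps", []):
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append((dep, depth + 1))
    return order


print(files_for_task(["api/routes.py"]))
print(files_for_task(["api/routes.py"], max_depth=2))
```

Limiting the depth is the design choice that keeps context small: one hop returns the route file and its direct service, while two hops also pulls in `db/models.py`. Unrelated files such as `ui/cart.tsx` are never loaded.
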
  4. Hallucinations on huge projects

    The sources are clear that:

    • RAG / retrieval and better context control can reduce hallucinations but not eliminate them. [Data prep]
    • Constraining what the model can say and splitting tasks improves reliability. [Faithful reasoning]

    So:

    • Decomposition + targeted context (via a code map) should help the model stay grounded in the right parts of the repo.
    • But it will not “avoid hallucinations” entirely; it just makes them less likely and easier to detect.
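
As one example of "easier to detect": if the code map lists every known file, a simple check can flag file paths in a model reply that do not exist in the project. This is a minimal sketch; the `KNOWN_FILES` set and the regular expression for `.py` paths are assumptions made for illustration.

```python
import re

# Hypothetical set of files taken from the code map.
KNOWN_FILES = {"auth/login.py", "auth/tokens.py", "billing/invoice.py"}

# Matches simple relative paths ending in .py (an assumption about how paths appear).
PATH_PATTERN = re.compile(r"\b[\w./-]+\.py\b")


def unknown_paths(model_output: str, known_files: set[str]) -> list[str]:
    """Return paths mentioned in the output that are absent from the code map."""
    mentioned = PATH_PATTERN.findall(model_output)
    return sorted({p for p in mentioned if p not in known_files})


reply = "I updated auth/login.py and added a helper in auth/session_utils.py."
print(unknown_paths(reply, KNOWN_FILES))
```

Here the check flags `auth/session_utils.py`. That path may be a legitimately new file or an invented one, so a check like this only surfaces candidates for review; it does not prove a hallucination.
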

Summary

  • I can’t evaluate the specific bemyagent implementation from the docs, but the strategy you describe—decomposition, architecture/code maps, targeted context, and sub‑tasking—is strongly aligned with OpenAI’s guidance for:
    • Reducing token usage, and
    • Improving reliability on large projects.
  • Expect significant token savings vs. naive “dump the repo into context” approaches, and meaningful but not total reduction in hallucinations, assuming:
    • Your code map is accurate and granular,
    • You aggressively filter context to only what’s needed per step, and
    • You use caching and smaller models where appropriate. [Managing costs; Minimizing token cost]

If you want, you can describe how bemyagent currently builds and uses its code map (e.g., embeddings, graph, heuristics), and I can map each part more explicitly to the patterns in these docs.


heavy saffron
#

Thanks for your analysis! If you want, I can provide more details, though I assume your job is to answer questions about OpenAI docs topics. This project works well with Codex.

harsh parcel
#

Really interesting approach — context bloat is one of the biggest hidden costs in production agent systems.
The self-bootstrapping protocol idea reminds me of how I structure LangGraph graphs — defining clear state schemas upfront prevents the agent from carrying unnecessary context through the workflow.
Have you tested BEMYAGENT with multi-agent setups where each sub-agent needs its own context scope?