#Too many input tokens

snow condor

Why are my input tokens so much larger than my output tokens? (Output: 43.7k, input: 3M)

rustic sapphire BOT

That’s usually normal: input tokens count the full prompt sent to the model on every model call, while output tokens count only what the model generates.

So input includes repeated copies of:

  • system prompt + tool definitions
  • workspace/bootstrap files like AGENTS.md, MEMORY.md, etc.
  • conversation history
  • tool calls + tool results
  • attachments/transcripts
  • compaction summaries / cached prompt material
  • multiple model calls inside one “reply” if tools are used

Example: if your live context is ~100k tokens and a tool-heavy task makes 30 model calls, that’s already ~3M input tokens, even if the assistant only writes 40k output tokens total.
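
The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration, not how the tool actually meters usage: the function name `total_input_tokens` and the `growth_per_call` parameter are hypothetical, and the 2,000-token growth figure is an assumed value chosen to show how tool results appended to the history push the total even higher.

```python
def total_input_tokens(context_tokens: int, model_calls: int,
                       growth_per_call: int = 0) -> int:
    """Sum the prompt size over every model call in a task.

    context_tokens:  size of the live context on the first call
    model_calls:     number of model calls the task makes
    growth_per_call: hypothetical tokens added to the context after each
                     call (e.g. tool results appended to the history)
    """
    total = 0
    for call in range(model_calls):
        # The whole context is re-sent as input on every call.
        total += context_tokens + call * growth_per_call
    return total


# Flat context, as in the example above: 100k tokens x 30 calls.
print(total_input_tokens(100_000, 30))  # 3000000

# Assumed growth of 2,000 tokens per call (illustrative figure only).
print(total_input_tokens(100_000, 30, growth_per_call=2_000))  # 3870000
```

Output tokens do not enter this sum at all, which is why they can stay in the tens of thousands while input reaches the millions.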

To inspect it:

/status
/context list
/context detail
/usage tokens

To reduce it:

/compact

or start fresh with: