#Too many input tokens
That’s usually normal: input tokens are the full prompt sent to the model on every model call, while output tokens are only what the model generates.
So input includes repeated copies of:
- system prompt + tool definitions
- workspace/bootstrap files like AGENTS.md, MEMORY.md, etc.
- conversation history
- tool calls + tool results
- attachments/transcripts
- compaction summaries / cached prompt material
- multiple model calls inside one “reply” if tools are used
Example: if your live context is ~100k tokens and a tool-heavy task makes 30 model calls, that’s already ~3M input tokens, even if the assistant only writes 40k output tokens total.
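The arithmetic in that example can be sketched in a few lines of Python. This is only a rough estimate: the function name `estimate_input_tokens` and the `growth_per_call` parameter are made up for illustration, and the sketch ignores prompt caching and compaction, both of which change what you are actually billed for.

```python
def estimate_input_tokens(context_tokens: int, model_calls: int,
                          growth_per_call: int = 0) -> int:
    """Rough total of input tokens across all model calls in one task.

    Assumes every call resends the whole live context, and that the
    context grows by a fixed (hypothetical) amount after each call,
    e.g. from appended tool results.
    """
    total = 0
    current = context_tokens
    for _ in range(model_calls):
        total += current            # the full context is sent again
        current += growth_per_call  # tool output makes the next call larger
    return total


# The case from the example: ~100k context, 30 model calls.
print(estimate_input_tokens(100_000, 30))         # 3000000

# Same task, but each call adds ~2k tokens of tool results.
print(estimate_input_tokens(100_000, 30, 2_000))  # 3870000
```

The second call shows why tool-heavy tasks tend to cost more than the flat estimate: the context keeps growing, and every later call pays for all of it again.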
To inspect it:
/status
/context list
/context detail
/usage tokens
To reduce it:
/compact
or start fresh with: