#Usage/Charge differences between premium models
Hi @blazing adder. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
Yes, there is a difference in measured usage between different premium models in Warp. AI request usage is not a flat rate per prompt; instead, it scales with the number of tokens processed during the interaction, and the model you choose is a significant factor. Larger, more advanced models like Claude 4 Opus generally consume more tokens and therefore more AI requests per interaction than smaller or faster models like Claude 4 Sonnet or Gemini 2.5 Pro. The documentation specifically notes:
Generally, smaller, faster models typically consume fewer requests than larger, reasoning-based models. For example, Claude 4 Opus tends to consume the most tokens and requests in Warp, followed by Claude 4 Sonnet, GPT-5, Gemini 2.5 Pro, and others in roughly that order. This generally correlates with model pricing as well.
So, while the difference may not be exactly 10x as with GitHub Copilot, you should expect that using a more advanced model like Opus 4.1 will consume more of your AI request quota per interaction than Sonnet 4. The exact multiplier is not specified and can vary depending on the complexity of your prompt, the amount of context, and other factors. There is no published table of per-model request costs, as the number of requests per interaction is non-deterministic and depends on several variables, including the model, prompt size, and tool calls made by the agent (see "How are Warp AI requests calculated?").
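Since no per-model table is published, one way to get a rough feel for the difference is to record how many requests your own interactions consume and compare the averages per model. Here is a minimal Python sketch of that idea, assuming you note by hand how much your request count changes after each interaction. The model labels, the numbers, and the function names are hypothetical placeholders, not Warp data or a Warp API.

```python
from collections import defaultdict
from statistics import mean


def average_requests_by_model(observations):
    """Group (model, requests_used) pairs and return the mean per model."""
    by_model = defaultdict(list)
    for model, requests_used in observations:
        by_model[model].append(requests_used)
    return {model: mean(values) for model, values in by_model.items()}


def relative_to(averages, baseline):
    """Express each model's average as a multiple of the baseline model's."""
    return {model: avg / averages[baseline] for model, avg in averages.items()}


# Hypothetical, hand-recorded observations: (model label, requests consumed).
# These values are made up for illustration and do not reflect real usage.
observations = [
    ("model-a", 2), ("model-a", 4),
    ("model-b", 1), ("model-b", 2),
]

averages = average_requests_by_model(observations)
ratios = relative_to(averages, "model-b")
print(ratios)
```

Because usage is non-deterministic, a ratio like this only describes your own workload, and it will shift with prompt size, context, and tool calls. It is not a fixed multiplier.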