It's understandable to be surprised by hitting a quota limit so quickly after upgrading to Warp Pro. Based on the information in the knowledge sources, here’s what’s likely happening:
Warp AI usage is governed by two separate limits: the number of AI requests and the number of tokens used. Tokens are the chunks of text (words, parts of words, or pieces of code) that the AI model processes. Each request consumes a certain number of tokens, depending on the length and complexity of your prompt and the AI's response. Even if you have requests left, you can still hit your monthly token limit, which blocks further AI usage until your quota resets at the start of the next billing cycle. This is a common source of confusion: the request counter and the token counter are separate, and currently only the request count is typically visible in the UI [AI usage numbers in setting do not appear to match value used to limit requests].
You can hit the token limit quickly if your prompts or the AI's responses are large, or if you use features like Agent mode, which can consume more tokens per request. Some users have reported hitting the token limit after only a few hours or a handful of complex queries, especially when the AI is processing large amounts of context or code [Token limit reached after 4 or 5 queries].
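To make the two-limit behavior concrete, here is a minimal sketch in Python. It is not Warp's actual implementation: the `Quota` class, the `requests_until_blocked` helper, and every number in it are hypothetical and chosen only to show how a token cap can be reached long before a request cap.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    """Hypothetical quota with two independent caps (not Warp's real limits)."""
    max_requests: int
    max_tokens: int
    requests_used: int = 0
    tokens_used: int = 0

    def can_send(self) -> bool:
        # A new request is allowed only while BOTH counters are under their caps.
        return (self.requests_used < self.max_requests
                and self.tokens_used < self.max_tokens)

    def record(self, tokens: int) -> None:
        # Every request counts once, plus however many tokens it consumed.
        self.requests_used += 1
        self.tokens_used += tokens


def requests_until_blocked(quota: Quota, tokens_per_request: int) -> int:
    """Count how many equal-sized requests fit before either cap is reached."""
    sent = 0
    while quota.can_send():
        quota.record(tokens_per_request)
        sent += 1
    return sent


# Made-up plan: 1,000 requests and 1,000,000 tokens per billing cycle.
small = requests_until_blocked(Quota(1000, 1_000_000), tokens_per_request=500)
large = requests_until_blocked(Quota(1000, 1_000_000), tokens_per_request=200_000)
print(small)  # 1000: the request cap is reached first
print(large)  # 5: the token cap is reached with 995 requests still unused
```

With these invented numbers, short prompts exhaust the request count, while a few context-heavy requests exhaust the token budget after only five queries, which mirrors the pattern described in the user reports above.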
If you believe your usage was not excessive, it's possible that something in your workflow (such as large prompts, long responses, or repeated context) is consuming tokens faster than expected. Unfortunately, Warp currently does not provide a detailed token usage counter, which makes it hard to track exactly how your quota is being used [Need AI Request / Token counter].
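In the absence of a built-in counter, you can get a very rough sense of scale yourself. The sketch below is an approximation only: it assumes about 4 characters per token, a common rule of thumb for English text that varies by model and tokenizer, and the `estimate_tokens` and `estimate_session` helpers are hypothetical names, not part of Warp. It also cannot see any extra context Warp may attach to a request, so treat the result as a lower bound.

```python
import math

# Assumption: roughly 4 characters per token. Real tokenizers differ,
# and code or non-English text often uses more tokens per character.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate for one piece of text (prompt or response)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_session(exchanges: list[tuple[str, str]]) -> int:
    """Sum the estimates for a list of (prompt, response) pairs."""
    return sum(estimate_tokens(p) + estimate_tokens(r) for p, r in exchanges)


# Example: one short question plus one long pasted log as context.
session = [
    ("How do I undo my last git commit?", "Use git reset --soft HEAD~1."),
    ("Explain this log:\n" + "ERROR line\n" * 2000, "The log shows repeated errors."),
]
print(estimate_session(session))
```

Comparing the estimate for a short question with the estimate for a request that includes a pasted log or large file shows how quickly repeated context can dominate token usage.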