#QuotaLimit
Hi @true creek. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
I understand your frustration with encountering a "QuotaLimit" error despite having an "unlimited requests" plan. This is actually a common issue that many Warp users experience.
While your plan may offer unlimited AI requests, there's a separate monthly token limit that applies to all plans, including those with unlimited requests. Tokens represent the amount of text processed by the AI model (both your input and the AI's output), and they're different from the number of requests you can make.
For example, if you're on the legacy Team plan, it includes unlimited requests but has a monthly limit of 70 to 100 million tokens. The newer Turbo plan has 3,000 AI requests per month with a higher token limit (1B per month) and unlimited access to the Lite model when you exceed your premium model limits (source: Discord support thread).
When you hit your monthly token limit, Warp will display a message like "You've reached the monthly token limit" and disable AI features until your quota resets at the start of your next billing cycle.
To resolve this issue:
- Wait for your token limit to reset at the beginning of your next billing cycle
- If you're on the Turbo plan, you'll automatically be switched to the Lite model (a more lightweight AI) when you exceed your premium model limits (source: Plans & Pricing)
- Contact Warp support at support@warp.dev if you believe there's an error with your account
To avoid hitting token limits in the future, try being more concise with your prompts and avoid commands that generate large outputs, as these consume tokens quickly.
Hard to help without context, what was the prompt you used and what was happening when this error popped up?
I'm assuming the prompt was too vague, so the response and the work it entailed were either too expensive token-wise or beyond the context window capacity. I have had similar experiences, and it's mostly because my prompt was not properly contextualized.
Warp is a fantastic tool and I rely on it heavily. It's actually much better than Cursor, but this quota limit is a real issue. I don't want to be careful with my tokens; I want to be able to pay for as much as I need, even if it ends up being expensive. I'm currently on 3 Turbo plans already and I keep hitting this limit. Who do I talk to to be bumped to Enterprise level?
You can just upgrade to Enterprise from inside your app.
Go to https://warp.dev/pricing and click on the Enterprise link to submit a request to our sales team. You'll need a custom quote for the Enterprise plan. Also, we're working on a way for folks to get more requests so you don't have to keep adding accounts, but it's not quite ready yet.
Even if, for some reason, the token limit has been exceeded, I believe there should be an option to activate Lite mode, right? Additionally, it would be very helpful if Warp provided an early warning indicating how close you are to reaching the token limit, since this information currently isn’t available for monitoring.
It should be defaulting to lite mode...
https://docs.warp.dev/agents/warp-ai/agent-mode#exceeding-agent-mode-request-limits
On the Turbo plan, when the 3,000 request limit is reached, you fall back to the Lite model. The token limit is very high, but when it's hit, all AI features will cease to function and you may see the QuotaLimit error. The token limit is a hard limit on all plans, but it's high enough that we don't expect any users to hit it.
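To make the two limits easier to tell apart, here is a rough sketch of the behaviour described above. It is not Warp's actual code or API: the `Usage` type and `resolve_model` function are hypothetical, the 3,000 request figure comes from this thread, and the 1B token figure is the one quoted in the bot's answer and is only an assumption here.

```python
from dataclasses import dataclass

# Figures quoted in this thread for the Turbo plan; treat them as assumptions.
REQUEST_LIMIT = 3_000          # premium-model requests per month
TOKEN_LIMIT = 1_000_000_000    # monthly token cap (value taken from the bot's answer)


@dataclass
class Usage:
    """Hypothetical per-billing-cycle usage counters."""
    requests: int
    tokens: int


def resolve_model(usage: Usage) -> str:
    """Return what would serve the next request, per the thread's description."""
    if usage.tokens >= TOKEN_LIMIT:
        # Hard limit: all AI features stop until the billing cycle resets.
        return "blocked (QuotaLimit)"
    if usage.requests >= REQUEST_LIMIT:
        # Soft limit: premium requests are used up, so fall back to Lite.
        return "lite"
    return "premium"
```

The token check comes first because, as described, the token limit is the hard one: hitting it blocks AI entirely, even if the request limit would otherwise only trigger the Lite fallback.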
We do send email warnings when a user is close to reaching the token limit (I think at 75% and 90%), but token usage isn't shown in the UI at this time.
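Since token usage isn't visible in the UI, anyone tracking their own token counts could mirror the email warnings with something like the sketch below. The 75% and 90% thresholds are the unconfirmed values mentioned above, and `warnings_crossed` is a hypothetical helper, not part of Warp.

```python
from typing import List

# Thresholds as recalled in this thread ("I think 75% and 90%"); unconfirmed.
WARNING_THRESHOLDS = (0.75, 0.90)


def warnings_crossed(tokens_before: int, tokens_after: int, token_limit: int) -> List[float]:
    """Return the thresholds crossed when usage moves from tokens_before to tokens_after."""
    return [
        threshold
        for threshold in WARNING_THRESHOLDS
        if tokens_before < threshold * token_limit <= tokens_after
    ]
```

For instance, with a limit of 1,000 tokens, going from 700 to 800 tokens crosses only the 75% threshold, while going from 700 to 950 crosses both.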
Do Lite model requests count towards the 3,000 before you hit it?