Why are the concurrency limits on the API so low? How is it that I can only make 15 concurrent requests? | ElevenLabs

AI-generated Answer for kes0953

Short answer: ElevenLabs limits by concurrent in-flight requests per account (not requests-per-minute). The exact concurrency you get depends on your subscription tier and the TTS model family you call. Requests above your limit are queued (typically adding ~50 ms) rather than hard-rejected, and you can monitor limits via the response headers maximum-concurrent-requests and current-concurrent-requests. (elevenlabs.io)
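As a rough illustration of the header monitoring mentioned above, the sketch below reads the two header names from a response's header mapping and reports how many more requests could start. The function name `concurrency_headroom` is hypothetical, and the sketch assumes both headers carry plain integers; it works on any dict-like headers object rather than a specific HTTP client.

```python
from typing import Mapping, Optional


def concurrency_headroom(headers: Mapping[str, str]) -> Optional[int]:
    """Return how many more requests could start now, or None if the headers are missing."""
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {key.lower(): value for key, value in headers.items()}
    try:
        current = int(lowered["current-concurrent-requests"])
        maximum = int(lowered["maximum-concurrent-requests"])
    except (KeyError, ValueError):
        # Assumption: treat absent or non-numeric headers as "unknown".
        return None
    return max(maximum - current, 0)


sample = {"current-concurrent-requests": "12", "maximum-concurrent-requests": "15"}
print(concurrency_headroom(sample))  # 3
```

A client could log this value after each response and slow down when it approaches zero.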

What your “15 concurrent” cap likely means

If you’re using the Multilingual v2 TTS model, Scale/Business tops out at 15 concurrent; Pro is 10. By contrast, the Turbo and Flash TTS models have higher limits: Pro 20, Scale/Business 30. Switching to Turbo/Flash can double your effective concurrency on the same plan. (elevenlabs.io)
The Help Center summarizes the same policy: “Flash and Turbo” have higher concurrency than “all other models,” with Scale/Business shown as 30 vs 15, respectively. (help.elevenlabs.io)
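The figures quoted above fit in a small lookup table. This is only a sketch: the keys (`"pro"`, `"multilingual_v2"`, `"turbo_flash"`) are hypothetical labels chosen for illustration, not API model IDs, and plans the answer does not mention are deliberately left out.

```python
# Limits as quoted in the answer above; verify against the current docs before relying on them.
CONCURRENCY_LIMITS = {
    "pro": {"multilingual_v2": 10, "turbo_flash": 20},
    "scale": {"multilingual_v2": 15, "turbo_flash": 30},
    "business": {"multilingual_v2": 15, "turbo_flash": 30},
}


def published_limit(plan: str, model_family: str) -> int:
    """Look up the quoted concurrency limit; raises KeyError for unknown plans or families."""
    return CONCURRENCY_LIMITS[plan.lower()][model_family]


print(published_limit("Scale", "multilingual_v2"))  # 15
print(published_limit("Scale", "turbo_flash"))      # 30
```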

Why this doesn’t equal throughput limits

ElevenLabs uses a queue once you hit your plan’s concurrency; in practice this “typically only adds ~50ms of latency,” so you can still sustain high request volumes even with small concurrency numbers. As a rule of thumb from the docs, a concurrency limit of 5 can often support ~100 simultaneous audio broadcasts, depending on speaking patterns. (elevenlabs.io)
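To see why a small concurrency number can still mean high throughput, a back-of-the-envelope calculation helps. The sketch below applies Little's law (sustained throughput is roughly concurrency divided by average request duration). The 0.5 s average duration is an assumed figure for illustration, not something the docs state, and both function names are hypothetical.

```python
def max_throughput_per_second(concurrency_limit: int, avg_request_seconds: float) -> float:
    """Little's law: requests completed per second when every slot stays busy."""
    if avg_request_seconds <= 0:
        raise ValueError("avg_request_seconds must be positive")
    return concurrency_limit / avg_request_seconds


def implied_duty_cycle(concurrency_limit: int, broadcasts: int) -> float:
    """Fraction of time each broadcast can be generating audio if all share the limit."""
    return concurrency_limit / broadcasts


# Assumed: each request occupies a slot for 0.5 s on average.
print(max_throughput_per_second(15, 0.5))  # 30.0 requests per second
# The docs' rule of thumb (limit 5, ~100 broadcasts) implies about 5% generation time each.
print(implied_duty_cycle(5, 100))  # 0.05
```

Under that assumption, a cap of 15 still allows on the order of 1,800 requests per minute.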

How to raise effective concurrency today

Use Turbo or Flash TTS instead of Multilingual v2 to access the higher per-plan limits. (elevenlabs.io)
Prefer WebSocket TTS: with websockets, only the time while audio is being generated counts toward concurrency; the rest of the connection time does not. (elevenlabs.io)
Upgrade your plan or contact Enterprise for elevated limits beyond the published caps. (help.elevenlabs.io)
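Whichever option you pick, it can help to cap in-flight requests on the client so you stay at or below your plan's limit. The following is a minimal sketch using `asyncio.Semaphore`; `run_limited` and `fake_tts` are hypothetical names, and `fake_tts` is a stand-in that only sleeps, where a real client would make the API call.

```python
import asyncio
from typing import Awaitable, Callable, List

stats = {"in_flight": 0, "peak": 0}


async def fake_tts() -> str:
    """Stand-in for a real TTS request; records how many calls overlap."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # pretend the request takes some time
    stats["in_flight"] -= 1
    return "audio"


async def run_limited(jobs: List[Callable[[], Awaitable[str]]], limit: int) -> List[str]:
    """Run all jobs, but never more than `limit` at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job: Callable[[], Awaitable[str]]) -> str:
        async with semaphore:  # waits here once `limit` jobs are running
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


results = asyncio.run(run_limited([fake_tts] * 10, limit=3))
print(len(results), "results, peak concurrency", stats["peak"])
```

Setting `limit` slightly below the published cap leaves room for other processes that share the same account.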

What error you’ll see if you exceed limits

When a request is rejected for exceeding your plan’s concurrency, the API returns HTTP 429 with the message too_many_concurrent_requests; the message system_busy instead indicates temporary platform load. (help.elevenlabs.io)
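One common way to handle a 429 is to retry with exponential backoff and jitter. The sketch below is an assumption about client-side handling, not a retry policy documented by ElevenLabs. `call_with_backoff` and the `send` callable (which returns a status code and body) are hypothetical, so it can wrap whichever HTTP client you use.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            delay = base_delay * (2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
            # Jitter spreads retries out so clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    return status, body  # still 429 after the final attempt


# Usage with a fake sender that is rejected twice, then succeeds.
responses = iter([(429, "too_many_concurrent_requests"), (429, "system_busy"), (200, "audio")])
print(call_with_backoff(lambda: next(responses), sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.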

If you share which TTS model and plan you’re on, I can point to the exact limit and the least-effort way to increase it per the docs. (elevenlabs.io, help.elevenlabs.io)
