#How does Uptime/Throughput Sorting work? Can I use it for Azure OAI + OpenAI?

2 messages · Page 1 of 1 (latest)

dim lark Mar 12, 2025, 5:37 PM

I'm curious about how your uptime detection for different model providers work. I want to use Azure OpenAI, but have realized they have spikey poor performance. In a dream world, I would use OpenRouter and it would use Azure (which is faster normally) whenever their performance is around their normal (i.e. sorting by throughput). How often is throughput measured? Is it the 30 second window you mentioned in your docs? How does this compare to the yellow status signal shown under Azure OpenAI 4o-mini currently. Is the yellow status detected or inferred?

Would you expect OpenRouter to just work for my use case, or would I have to implement something custom? I'm not just falling back on errors, more so on throughput and latency degradations.

static kindle Mar 12, 2025, 6:26 PM

Throughput Sorting:

It uses the median throughput from the past 24 hours—not a 30-second window—to determine the best-performing provider.
Provider Status:
The status (e.g., “yellow”) is measured separately by monitoring the provider’s returned status codes.
Providers with a significant number of non-success codes are flagged as yellow and automatically ranked lower.
Routing Behavior:
When sorting by throughput, any provider marked as yellow (indicating an outage) is avoided.
The recommended approach is to set the provider.sort option to throughput, as detailed in the provider routing documentation.