Providers throughput on a model’s page feels inaccurate | OpenRouter | Page 1

scarlet flint Mar 16, 2026, 8:27 PM

Currently, as far as I can tell, a provider’s throughput on a model page is its aggregated throughput across all requests, not the throughput for that specific model. OpenRouter clearly collects the data I think is more valuable, as it’s displayed further down the page. This leads to, for example, GLM-5 Turbo’s page reporting 30 TPS for z.ai in one place and 48 TPS in another. I think it would be more useful and less confusing if a model’s page listed each provider’s throughput for that specific model rather than its aggregate.

weak river Mar 16, 2026, 11:16 PM

wdym? it's different for every model

why would it be an aggregated value for every model?

you can just check it by going to the provider's page like Z.AI and see that GLM 4.6 has 69TPS and GLM 5 has 30 TPS

or are you talking about the Performance page? cause there i see a different value but that's probably the daily average

scarlet flint Mar 17, 2026, 12:43 PM

Oh it’s not aggregated? Why are the numbers like all so wrong then?

Hm, I wonder does OpenRouter subtract or include the TTFT when calculating TPS

naive brook Mar 17, 2026, 1:14 PM

scarlet flint Hm, I wonder does OpenRouter subtract or include the TTFT when calculating TPS

🤔 https://openrouter.ai/docs/guides/guides/get-started/for-providers#4-performance-metrics

OpenRouter Documentation

Provider Integration - Add Your Models to OpenRouter

Learn how to integrate your AI models with OpenRouter. Complete guide for providers to make their models available through OpenRouter's unified API.

scarlet flint Mar 17, 2026, 2:26 PM

weak river or are talking about the Performance page? cause there i see a different value b...

What’s strange to me is that the throughput numbers on the provider cards differ from those in the performance section. I’d argue that neither is particularly accurate: OR claims the GLM-5-Turbo request I just made ran at 54 tps, when it’s more like 77 tps. I calculate that by taking the total output tokens generated, as reported by the Vercel AI SDK, and dividing by the request duration minus TTFT, which I get by setting a variable to performance.now on the first “content” token (tool call, text, or reasoning). Both figures are a far cry from the 30 TPS reported on the provider card and the 40 tps reported in the performance section.

I can forgive my own tps differing, since it’s obviously more accurate because I subtract TTFT, but aside from that, how do I have 3 different tps numbers that all quite substantially disagree with each other?
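
For reference, the client-side calculation described above (output tokens divided by request duration minus TTFT) could be sketched roughly as follows. This is only an illustration: the `StreamTiming` shape, the function names, and the sample numbers are hypothetical, not part of the Vercel AI SDK or the OpenRouter API, and nothing here reflects how OpenRouter computes the figures it displays.

```typescript
// Hypothetical timing record for one streamed request.
// Field names are illustrative assumptions, not an SDK or OpenRouter type.
interface StreamTiming {
  requestStartMs: number; // performance.now() when the request was sent
  firstTokenMs: number;   // performance.now() at the first content token (text, reasoning, or tool call)
  requestEndMs: number;   // performance.now() when the stream finished
  outputTokens: number;   // total output tokens reported for the request
}

// Throughput over the generation window only: tokens / (duration - TTFT).
// (duration - TTFT) simplifies to (requestEndMs - firstTokenMs).
function tpsExcludingTtft(t: StreamTiming): number {
  const generationSeconds = (t.requestEndMs - t.firstTokenMs) / 1000;
  return generationSeconds > 0 ? t.outputTokens / generationSeconds : NaN;
}

// Throughput over the whole request: tokens / duration, TTFT included.
function tpsIncludingTtft(t: StreamTiming): number {
  const totalSeconds = (t.requestEndMs - t.requestStartMs) / 1000;
  return totalSeconds > 0 ? t.outputTokens / totalSeconds : NaN;
}

// Made-up numbers purely for illustration: 3 s TTFT, 10 s of generation.
const example: StreamTiming = {
  requestStartMs: 0,
  firstTokenMs: 3000,
  requestEndMs: 13000,
  outputTokens: 770,
};

console.log(tpsExcludingTtft(example)); // 77
console.log(tpsIncludingTtft(example).toFixed(1)); // "59.2"
```

With these made-up numbers, the same request comes out at 77 tps or about 59 tps depending only on whether TTFT is counted, so the choice of time window alone can account for a sizeable gap between two otherwise honest measurements.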

scarlet flint Mar 17, 2026, 2:56 PM

I think my criticism more is just that I feel like the throughput numbers should be consistent, and they don’t feel accurate
