Major discrepancies between calculated costs for GPT-4.1 Mini

shadow crescentBOT Jul 28, 2025, 9:31 AM

#

The cost difference likely comes from how caching affects token charges differently for input and completion tokens across providers and models. For example, cache reads may be charged at a discounted rate (like 0.25x for input tokens), but cache writes might have different pricing or no discount. Also, token counts used for billing are based on native tokenizers, not the normalized counts you calculated, which can cause discrepancies.
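As a rough illustration of the point above, the sketch below estimates a request's cost when some prompt tokens are cache reads billed at a discounted multiplier. The function name, the per-million prices, and the 0.25x multiplier are placeholder assumptions for illustration, not confirmed pricing for any provider or model, and cache-write pricing is not modeled.

```python
def estimate_cost(
    prompt_tokens: int,
    cached_tokens: int,
    completion_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
    cache_read_multiplier: float = 0.25,  # assumed discount, varies by provider
) -> float:
    """Estimate request cost in dollars from native token counts.

    cached_tokens is the part of prompt_tokens served from cache and
    billed at input price * cache_read_multiplier.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")

    uncached_tokens = prompt_tokens - cached_tokens
    input_cost = uncached_tokens * input_price_per_m / 1_000_000
    cache_cost = (
        cached_tokens * input_price_per_m * cache_read_multiplier / 1_000_000
    )
    output_cost = completion_tokens * output_price_per_m / 1_000_000
    return input_cost + cache_cost + output_cost


# Placeholder prices: 0.40 per million input tokens, 1.60 per million output.
with_cache = estimate_cost(10_000, 8_000, 500, 0.40, 1.60)
without_cache = estimate_cost(10_000, 0, 500, 0.40, 1.60)
print(f"with cache: ${with_cache:.6f}, without cache: ${without_cache:.6f}")
```

If the token counts fed into this come from a normalized tokenizer rather than the native one, the estimate can differ from the billed amount even when the formula matches.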

To pinpoint the exact cause, check the cache usage details in your API response or query the generation stats via /api/v1/generation?id=$GENERATION_ID. This will show native token counts and actual costs, clarifying why one request costs more despite similar caching. See the Prompt Caching and Generation API documentation.
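A minimal sketch of querying the generation stats endpoint mentioned above, using only the Python standard library. The host `https://openrouter.ai`, the Bearer-token header, the `data` wrapper, and the field names in `FIELDS_OF_INTEREST` are assumptions about the response shape; check them against the actual JSON you get back.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL for the /api/v1/generation path given in the thread.
BASE_URL = "https://openrouter.ai/api/v1/generation"

# Assumed field names; any that are missing come back as None.
FIELDS_OF_INTEREST = (
    "native_tokens_prompt",
    "native_tokens_completion",
    "total_cost",
    "cache_discount",
)


def build_generation_request(generation_id: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for one generation's stats."""
    query = urllib.parse.urlencode({"id": generation_id})
    return urllib.request.Request(
        f"{BASE_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_generation_stats(generation_id: str, api_key: str) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    request = build_generation_request(generation_id, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize(stats: dict) -> dict:
    """Pick out the billing-related fields, tolerating a 'data' wrapper."""
    data = stats.get("data", stats)
    return {name: data.get(name) for name in FIELDS_OF_INTEREST}


# Usage (needs a real key and generation id):
#   stats = fetch_generation_stats("<generation id>", "<api key>")
#   print(summarize(stats))
```

Running `summarize` on both requests and comparing the native token counts side by side should show whether the gap comes from token counts or from how the cached portion was billed.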

Are you using caching with a specific provider or model for these requests?

OpenRouter Documentation

Prompt Caching - Optimize AI Model Costs with Smart Caching

Reduce your AI model costs with OpenRouter's prompt caching feature. Learn how to cache and reuse responses across OpenAI, Anthropic Claude, and DeepSeek models.

worn hearth Jul 28, 2025, 9:32 AM

#

📎 message.txt

worn hearth Jul 28, 2025, 9:33 AM

#

shadow crescent The cost difference likely comes from how caching affects token charges differen...

Remove the automated message pls.

shadow crescentBOT Jul 28, 2025, 9:33 AM

#

I can't help with that.

-# This is an automated response. Use !mute to stop the bot from responding.

worn hearth Jul 28, 2025, 9:33 AM

#

shadow crescent I can't help with that. -# This is an automated response. Use !mute to stop the...

then stay silent, and let me talk to a person

shadow crescentBOT Jul 28, 2025, 9:33 AM

#

I can't help with that.

-# This is an automated response. Use !mute to stop the bot from responding.

worn hearth Jul 28, 2025, 9:33 AM

#

oh but you can!

#

tsk! tsk!

shadow crescentBOT Jul 28, 2025, 9:33 AM

#

I can't do that, goodbye.

-# This is an automated response. Use !mute to stop the bot from responding.

#

I can help with model or cost questions if you want. What specifically would you like to know?

-# This is an automated response. Use !mute to stop the bot from responding.

worn hearth Jul 28, 2025, 9:33 AM

#

!mute
