#Gemini 3 Flash input token counting bug


sturdy oasis BOT
#

@quick belfry

Description:
The input token counts are wrong for Gemini 3 Flash. Run the script below with these settings:

OPENROUTER_MODEL=google/gemini-3.1-flash-lite-preview ./openrouter-request.sh "set name to hello world" | jq -r '.usage'

OPENROUTER_MODEL=google/gemini-3-flash-preview ./openrouter-request.sh "set name to hello world" | jq -r '.usage'

You will get 953 input tokens for Flash Lite and 5 input tokens for Flash.

I'll provide the script in a follow-up message.

#

I'm an AI assistant - my answers may be incorrect and I cannot see screenshots. Please verify information with the OpenRouter docs.

To help debug the Gemini 3 Flash input token counting issue, I need a few more details:

  1. Can you share the exact error message or output you receive from your script, especially the full .usage object for both models?
  2. Please confirm the exact model identifiers you are using—are they google/gemini-3.1-flash-lite-preview and google/gemini-3-flash-preview?
  3. Can you provide the full code or shell script (openrouter-request.sh) you're running, including any relevant request headers or body parameters?

These details are critical for diagnosing token counting discrepancies, and will help mods or support escalate the issue accurately if needed.

-# This is an automated response powered by Answer HQ. Use !mute to stop the bot from responding, or !close when your question is resolved.

quick belfry
sturdy oasis BOT
#

Thanks for sharing! Please paste the full contents of your openrouter-request.sh script, and (if possible) the raw output (full .usage fields) for both model runs:

  • Paste the actual output received from:
    • OPENROUTER_MODEL=google/gemini-3.1-flash-lite-preview ./openrouter-request.sh "set name to hello world" | jq -r '.usage'
    • OPENROUTER_MODEL=google/gemini-3-flash-preview ./openrouter-request.sh "set name to hello world" | jq -r '.usage'

This will help confirm the input/output token counts and clarify if there are any script or API differences affecting the results.

Once you provide the script and outputs, I can help identify if this is a model, API, or script-level token counting issue.


quick belfry
#

Here's the output for 3.1 Flash Lite:

{
  "input_tokens": 953,
  "input_tokens_details": {
    "cached_tokens": 0
  },
  "output_tokens": 38,
  "output_tokens_details": {
    "reasoning_tokens": 0
  },
  "total_tokens": 991,
  "cost": 0.00029525,
  "is_byok": false,
  "cost_details": {
    "upstream_inference_cost": 0.00029525,
    "upstream_inference_input_cost": 0.00023825,
    "upstream_inference_output_cost": 0.000057
  }
}

Here's the output for Flash:

{
  "input_tokens": 5,
  "input_tokens_details": {
    "cached_tokens": 0
  },
  "output_tokens": 38,
  "output_tokens_details": {
    "reasoning_tokens": 0
  },
  "total_tokens": 43,
  "cost": 0.0001165,
  "is_byok": false,
  "cost_details": {
    "upstream_inference_cost": 0.0001165,
    "upstream_inference_input_cost": 0.0000025,
    "upstream_inference_output_cost": 0.000114
  }
}
sturdy oasis BOT
#

Thank you for providing the detailed outputs. This token counting discrepancy between google/gemini-3.1-flash-lite-preview (953 input tokens) and google/gemini-3-flash-preview (5 input tokens) for the same prompt is highly unusual and suggests a possible bug or inconsistency in tokenization or accounting for at least one of these models.
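The "accounting" half of that question can be checked from the two usage objects already posted, without calling the API. The sketch below is only an arithmetic check under stated assumptions: the numbers are copied by hand from the outputs above, the shortened key names `input_cost` and `output_cost` are hypothetical stand-ins for `cost_details.upstream_inference_input_cost` and `cost_details.upstream_inference_output_cost`, and the per-million figures are rates implied by this data, not published prices.

```python
# Values copied from the two .usage objects posted earlier in this thread.
USAGE = {
    "google/gemini-3.1-flash-lite-preview": {
        "input_tokens": 953,
        "output_tokens": 38,
        "total_tokens": 991,
        "cost": 0.00029525,
        "input_cost": 0.00023825,  # cost_details.upstream_inference_input_cost
        "output_cost": 0.000057,   # cost_details.upstream_inference_output_cost
    },
    "google/gemini-3-flash-preview": {
        "input_tokens": 5,
        "output_tokens": 38,
        "total_tokens": 43,
        "cost": 0.0001165,
        "input_cost": 0.0000025,   # cost_details.upstream_inference_input_cost
        "output_cost": 0.000114,   # cost_details.upstream_inference_output_cost
    },
}


def check_usage(usage):
    """Return consistency checks and the implied USD rate per million tokens."""
    return {
        # Do input and output tokens sum to the reported total?
        "tokens_add_up": usage["input_tokens"] + usage["output_tokens"]
        == usage["total_tokens"],
        # Do the input and output costs sum to the reported cost?
        "costs_add_up": abs(
            usage["input_cost"] + usage["output_cost"] - usage["cost"]
        ) < 1e-12,
        # Rates implied by dividing each cost by its token count.
        "input_usd_per_million": round(
            usage["input_cost"] / usage["input_tokens"] * 1e6, 4
        ),
        "output_usd_per_million": round(
            usage["output_cost"] / usage["output_tokens"] * 1e6, 4
        ),
    }


for model, usage in USAGE.items():
    print(model, check_usage(usage))
```

Both objects come out internally consistent: tokens and costs add up, and the implied rates are 0.25 / 1.5 USD per million input/output tokens for Flash Lite and 0.5 / 3.0 for Flash. So the Flash run appears to have been billed for exactly the 5 input tokens it reported, which points at the reported input count itself rather than a mismatch between the count and the cost.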

I do not have specific documentation addressing token counting issues for these models. The official OpenRouter docs do not mention model-specific tokenization bugs. Please verify with the mods or check the OpenRouter docs directly for the most accurate information.

To escalate this properly, please include:

  • The model identifiers you used (google/gemini-3.1-flash-lite-preview and google/gemini-3-flash-preview)
  • Your script and these raw outputs

Contact the mods in this Discord server with these details so they can investigate or report the issue to OpenRouter engineers. If you have a paid account ($10+ spend), you may also email [email protected] with all the evidence for further assistance. For free or <$10 accounts, Discord is the main support channel.


quick belfry
#

What's the correct way to "notify the mods"?

sturdy oasis BOT
#

To notify the mods about a technical or billing issue, you have several options:

  • Post in the appropriate Discord channel and clearly describe your issue. Include all relevant details (model identifier, error messages, code, outputs).
  • For technical or API bugs, ask in the support or bug-report channels.
  • For billing or account issues, do not use Discord—email [email protected] instead, as billing/account help is not handled on Discord Support.
  • You can also submit feedback directly through the Activity page or via the Chatroom bug icon. Select the appropriate feedback category (e.g., "Billing" or "API Error") and describe your issue. The system will automatically capture generation IDs and context (see the Report Feedback Guide).

If urgent, mention @mod or @admin in your Discord channel post to draw attention, but use this sparingly and always provide complete details to help them assist you efficiently.


#

Bot has reached the maximum number of responses for this thread and will no longer respond.

quick belfry
#

Perhaps <@&1384697330254610442> ?

tacit turtle
#

looking

past frigate
#

I'm having the same issue with gpt-oss-120b

#

#general message

#

OpenAI's tokenizer handles it fine, and other models also work fine

quick belfry
#

@tacit turtle any updates?

past frigate
#

@tacit turtle ?