#models

4 messages · Page 1 of 1 (latest)

amber rivetBOT Apr 2, 2026, 9:15 PM

I'm an AI assistant - my answers may be incorrect and I cannot see screenshots. Please verify information with the OpenRouter docs.

1. Model Intelligence & Context Depth

a) Context window sizes:

Gemma 3 27B (free): "It handles context windows up to 128k tokens" Gemma 3 27B (free)
Mixtral 8x22B (free): I don't have specific documentation for this model. Please verify with the mods or check the OpenRouter docs directly.

b) Coding & tool-calling:

For Gemma 3 27B (free): Composite coding capability score: 9.6 (better than 23% of models compared) Gemma 3 27B (free)
For Qwen 3.6 Plus (free): I don't have benchmark data or coding/tool-calling scores for this model. Please check OpenRouter Models for the latest benchmarks or ask mods directly.

c) Trinity Large Preview (free):

I don't have specific documentation about this model's specialization or long-form reasoning capability. Please verify with mods or check the OpenRouter docs directly.

2. Speed & Stability (TPM/RPM)

a) Rate limits (google/gemma-3-27b-it:free):

No exact RPM/TPM limit is listed. Please check the Gemma 3 27B (free) page or ask mods for current rate limits.

b) Fastest free model (lowest TTFT):

No direct TTFT/latency benchmarks are listed for all free models. Models like Gemini 2.0 Flash Lite are described as "significantly faster time to first token (TTFT) compared to Gemini Flash 1.5" but speed can vary by provider [Gemini 2.0 Flash Lite](<https:/...

.../openrouter.ai/google/gemini-2.0-flash-lite-001>).

For up-to-date latency data, check the Benchmarks or ask mods.

c) Cool-down after 429 error:

I don't have specific documentation on a recommended cool-down period after a 429 error. Please check with the mods or the OpenRouter docs.

3. openrouter/free Router Behavior

a) Fastest vs. most intelligent model:

I don't have documentation specifying if openrouter/free prioritizes speed or intelligence. Please confirm with mods or see OpenRouter docs.

b) System prompts & tool-calling:

I don't have explicit documentation on system prompt or tool-calling support for openrouter/free. Please check OpenRouter docs or ask mods for current router capabilities.

c) 20k token prompt routing:

"OpenRouter routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime" Gemma 3 27B (free). This suggests prompts will be routed to a model capable of handling the length, but please verify with mods for router-specific behavior.

4. Advanced Features

a) Free models with Structured Outputs (JSON mode):

Gemma 3 models (including 27B and 12B) support "structured outputs and function calling" Gemma 3 27B (free), Gemma 3 12B (free).
No comprehensive list of all 2026 free models with JSON mode; for others, check each model's page at OpenRouter Models.

**b) Free Vision (Image Inp...

...ut) models:**

Gemma 3 27B (free): "Gemma 3 introduces multimodality, supporting vision-language input and text outputs" Gemma 3 27B (free).
Gemma 3 12B (free): Also supports vision-language input [Gemma 3 12

-# This is an automated response powered by Answer HQ. Use !mute to stop the bot from responding, or !close when your question is resolved.

Best Free Multimodal (Vision) Model – Exact Model ID
The best free multimodal (Vision) model currently available is:

google/gemma-3-27b-it:free

"Gemma 3 introduces multimodality, supporting vision-language input and text outputs" Gemma 3 27B (free)
If this fails, try:

google/gemma-3-12b-it:free

Check OpenRouter Models for the latest availability.

High-Capacity, 128k Context Model – Exact Model ID
Yes, google/gemma-3-27b-it:free supports up to 128k context:
"It handles context windows up to 128k tokens" Gemma 3 27B (free)
There is currently no Llama 3.1/3.2 70B+ free model with 128k context in the official catalog.
Check OpenRouter Models for updates.

Lowest Latency Free Model – Exact Model ID
The model described as having "significantly faster time to first token (TTFT)" is:

google/gemini-2.0-flash-lite:free

"Gemini 2.0 Flash Lite is a smaller, faster-responding version... with significantly faster time to first token (TTFT) compared to Gemini Flash 1.5" Gemini 2.0 Flash Lite
For real-time latency, confirm at Benchmarks.

-# This is an automated response powered by Answer HQ. Use !mute to stop the bot from responding, or !close when your question is resolved.