If you want to accept any quantization at or above Q8, the documentation suggests defining a list like this in the provider parameters: "quantizations": ["fp32", "fp16", "fp8", "int8", "bf16", "unknown"]
However, models that have no quants (such as google/gemini-2.5-flash) return this error: {'error': {'message': 'No endpoints found for the request with quantization: fp32,fp16,fp8,int8,bf16. To learn more about provider routing, visit: https://openrouter.ai/docs/provider-routing', 'code': 404}}
How can I accept an allowlist of quants AND models with no quants?
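For reference, here is a minimal sketch of a client-side workaround I am considering: send the request with the allowlist first, and retry without any quantization filter if the 404 above comes back. The helper names (`build_payload`, `is_no_endpoint_error`, `complete_with_fallback`) and the `send` callable are hypothetical; `send` is assumed to POST the payload to the chat completions endpoint and return the decoded JSON response. Note that the retry drops the filter entirely, so it would also let through a model that only has lower-precision endpoints, which is not quite the behavior I am after.

```python
# Hypothetical workaround: try with the quantization allowlist, then retry
# without the filter if no endpoint matches. Not an official OpenRouter feature.

ALLOWED_QUANTS = ["fp32", "fp16", "fp8", "int8", "bf16", "unknown"]


def build_payload(model, messages, quantizations=None):
    """Build the request body; add the provider filter only when given."""
    payload = {"model": model, "messages": messages}
    if quantizations:
        payload["provider"] = {"quantizations": list(quantizations)}
    return payload


def is_no_endpoint_error(response):
    """Detect the 404 'No endpoints found' error shown above."""
    err = response.get("error") if isinstance(response, dict) else None
    return (
        bool(err)
        and err.get("code") == 404
        and "No endpoints found" in err.get("message", "")
    )


def complete_with_fallback(send, model, messages, quantizations=ALLOWED_QUANTS):
    """Call `send` with the allowlist, falling back to an unfiltered request.

    `send` is an assumed callable: it takes the payload dict and returns
    the decoded JSON response as a dict.
    """
    response = send(build_payload(model, messages, quantizations))
    if is_no_endpoint_error(response):
        # Assumption: the model exposes no matching quantization metadata,
        # so retry with no filter at all.
        response = send(build_payload(model, messages))
    return response
```

This costs an extra round trip for every model without quants, so a server-side way to express "allowlist OR none" would still be preferable.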