Xiaomi MiMo V2 Flash | OpenRouter | Page 1

viscid fog Dec 16, 2025, 2:56 PM

#

https://openrouter.ai/xiaomi/mimo-v2-flash:free

MiMo-V2-Flash (free) - API, Providers, Stats

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting hybrid attention architecture. Run MiMo-V2-Flash (free) with API

hardy plume Dec 16, 2025, 2:57 PM

#

is this running on ascend chips?

hearty shale Dec 16, 2025, 3:14 PM

#

🔥

short summit Dec 16, 2025, 3:23 PM

#

@viscid fog using that models :free counting to 1000 requests per day limit on 10$+ user plan?

#

what if there is only :free model variant and no paid? We can't use it after reaching 1000 RPD?

#

<@&1384697330254610442>

hardy plume Dec 16, 2025, 3:31 PM

#

short summit what if there is only :free model variant and no paid? We can't use it after rea...

yes

viscid fog Dec 16, 2025, 3:32 PM

#

the Xiaomi free endpoint has unlocked RPD limits

short summit Dec 16, 2025, 3:32 PM

#

@viscid fog Nemotron 3 Nano 30B A3B also have unlocked RPD limits? this model also don't have for now paid variants

viscid fog Dec 16, 2025, 3:33 PM

#

short summit <@165587622243074048> Nemotron 3 Nano 30B A3B also have unlocked RPD limits? thi...

that model also does not have RPD

short summit Dec 16, 2025, 3:33 PM

#

okay thanks ❤️

pastel turret Dec 16, 2025, 3:48 PM

#

viscid fog the Xiaomi free endpoint has unlocked RPD limits

Is this endpoint being trained on? Or just free, but no training?

quiet onyx Dec 16, 2025, 3:48 PM

#

weights: https://huggingface.co/XiaomiMiMo/MiMo-V2-Flash

XiaomiMiMo/MiMo-V2-Flash · Hugging Face

honest oxide Dec 16, 2025, 3:51 PM

#

Prompt: Write a short story about a deal with the djinn gone awry
Output:

📎 message.txt

viscid fog Dec 16, 2025, 3:51 PM

#

pastel turret Is this endpoint being trained on? Or just free, but no training?

free & not training

pastel turret Dec 16, 2025, 3:54 PM

#

Huh this model seems pretty good

honest oxide Dec 16, 2025, 3:54 PM

#

honest oxide Prompt: `Write a short story about a deal with the djinn gone awry` Output:

I actually really like the writing

pastel turret Dec 16, 2025, 3:55 PM

#

I like the coding style

#

Much more than the default style of like GPT 5

#

And it seems to be coding pretty well too

honest oxide Dec 16, 2025, 3:57 PM

#

gonna use this to co-write for a bit and I'll report back if I find the writing to get annoying

quiet onyx Dec 16, 2025, 3:57 PM

#

blog mentions this being under the MIT license which is great but there is no licence in the repo or if its "modified MIT"

pastel turret Dec 16, 2025, 3:59 PM

#

if hybrid attention like this can be widely adopted + competitive it'll be so sick

#

110 TPS on a top model is massive

honest oxide Dec 16, 2025, 3:59 PM

#

anyway this model is good at writing and free so

#

enjoy it before jai finds it

round cradle Dec 16, 2025, 4:00 PM

#

Looks like there's a moderation layer

#

It will reject 'high risk' prompts

honest oxide Dec 16, 2025, 4:00 PM

#

ah

round cradle Dec 16, 2025, 4:00 PM

#

Which in my testing is a pretty wide category :/

pastel turret Dec 16, 2025, 4:01 PM

#

all rules and UI work perfectly

castling, en passant, promotion, etc.

quiet onyx Dec 16, 2025, 4:01 PM

#

this model scored 40% on my spatial reasoning test from a 20 year old children's medieval fantasy game

honest oxide Dec 16, 2025, 4:01 PM

#

😭

pastel turret Dec 16, 2025, 4:04 PM

#

did very well in a personal bench, just as well as gemini 2.5 flash / grok 4 fast / deepseek 3.2

honest oxide Dec 16, 2025, 4:05 PM

#

pastel turret did very well in a personal bench, just as well as gemini 2.5 flash / grok 4 fas...

what's the API pricing? It might replace grok 4 fast for me

pastel turret Dec 16, 2025, 4:06 PM

#

no idea, but I'd guess cheaper

honest oxide Dec 16, 2025, 4:06 PM

#

also 15 day free period ⁉️

pastel turret Dec 16, 2025, 4:06 PM

#

probably figuring out pricing

#

hybrid models are probably harder to price correctly than traditional models

#

idk

honest oxide Dec 16, 2025, 4:07 PM

#

holy

pastel turret Dec 16, 2025, 4:07 PM

#

based

#

not AMAZING at terminal bench, but not awful?

#

GLM 4.6 scores 24.5%

#

this is actually the second highest open model on terminal bench perhaps?

#

yeah I think so

#

deepseek is higher but has other factors that make quite bad to use for coding, like insane hallucination

#

This model is still prone to hallucinations, but vibes seem better on its niche knowledge than most other open models I've tried

quiet onyx Dec 16, 2025, 4:11 PM

#

blog mentions this model is under MIT licence but Github has Apache???

https://github.com/XiaomiMiMo/MiMo-V2-Flash/blob/main/LICENSE

GitHub

MiMo-V2-Flash/LICENSE at main · XiaomiMiMo/MiMo-V2-Flash

Contribute to XiaomiMiMo/MiMo-V2-Flash development by creating an account on GitHub.

brittle notch Dec 16, 2025, 4:12 PM

#

Improvement

pastel turret Dec 16, 2025, 4:12 PM

#

this model might actually be the best flash level model out there rn

#

honest oxide Dec 16, 2025, 4:13 PM

#

this is 100% taking the role of grok 4 fast for me

pastel turret Dec 16, 2025, 4:13 PM

#

scoring higher than grok code fast on terminal bench

hardy plume Dec 16, 2025, 4:13 PM

#

sounds like a great model, now i gotta try it

pastel turret Dec 16, 2025, 4:14 PM

#

and if it wasn't apparent: scoring higher than minimax, glm 4.6, kimi k2 (and thinking variant), and even Claude 4.5 Haiku

hardy plume Dec 16, 2025, 4:14 PM

#

@viscid fog

#

maybe because of the word "brutalist"?

#

nope.. not because of brutalist

#

🤔

#

maybe my system prompt

honest oxide Dec 16, 2025, 4:17 PM

#

oh right refusals

#

hopefully it doesn't do that when I try to use it to answer tool calls?

pastel turret Dec 16, 2025, 4:17 PM

#

sadly, this model does not have up to date knowledge it seems (e.g React Router V7, Tailwind V4)

hardy plume Dec 16, 2025, 4:18 PM

#

hardy plume maybe my system prompt

i have no clue what in my system prompt could be causing this

#

k time to cut out parts till it works

honest oxide Dec 16, 2025, 4:20 PM

#

looks like .8 temp and .95 top_p is recommended?

quiet onyx Dec 16, 2025, 4:21 PM

#

honest oxide holy

where is this from? I couldn't find any pricing info on their official API docs https://platform.xiaomimimo.com/#/docs/pricing

Xiaomi MiMo API Open Platform

A simple open-source product documentation platform

honest oxide Dec 16, 2025, 4:21 PM

#

quiet onyx where is this from? I couldn't find any pricing info on their official API docs ...

https://platform.xiaomimimo.com/#/docs/quick-start/first-api-call

Xiaomi MiMo API Open Platform

A simple open-source product documentation platform

#

oh

#

you meant the pricing

#

https://platform.xiaomimimo.com/#/docs/news/news20251216

Xiaomi MiMo API Open Platform

A simple open-source product documentation platform

hardy plume Dec 16, 2025, 4:23 PM

#

hardy plume i have no clue what in my system prompt could be causing this

okay it seems to be this causing it

#

whatever i can omit that to use it anyway

quiet onyx Dec 16, 2025, 4:24 PM

#

honest oxide https://platform.xiaomimimo.com/#/docs/news/news20251216

$0.1 per million input is low but not as low as like $0.02 from some other models if it doesn't support caching

honest oxide Dec 16, 2025, 4:25 PM

#

quiet onyx $0.1 per million input is low but not as low as like $0.02 from some other model...

tbf it doesn't say

hardy plume Dec 16, 2025, 4:26 PM

#

hardy plume okay it seems to be this causing it

and somehow only sometimes

honest oxide Dec 16, 2025, 4:30 PM

#

oh my god

#

this model is actually really good at being agentic

#

thank you xiaomi

brittle notch Dec 16, 2025, 4:33 PM

#

visual creek Dec 16, 2025, 4:33 PM

#

brittle notch

SO GOOD

quiet onyx Dec 16, 2025, 4:34 PM

#

atleast they didn't commit any chart crimes here

brittle notch Dec 16, 2025, 4:35 PM

#

did retry, stuck in reasoning

steel harbor Dec 16, 2025, 4:35 PM

#

its a good model, sir

brittle notch Dec 16, 2025, 4:35 PM

#

ohhh, the

You are MiMo-V2-Flash (free), a large language model from xiaomi.

Formatting Rules:
- Use Markdown for lists, tables, and styling.
- Use ```code fences``` for all code blocks.
- Format file names, paths, and function names with `inline code` backticks.
- **For all mathematical expressions, you must use dollar-sign delimiters. Use $...$ for inline math and $$...$$ for block math. Do not use (...) or [...] delimiters.**

fucks with it's head

#

#

infinite headpats. this model gets it to not be awkward

#

#

so it is VERY system prompt sensitive and gets hyper focused on it's objective to the point of getting confused

honest oxide Dec 16, 2025, 4:38 PM

#

guys bad news, I just saw a screenshot from the jai server

hardy plume Dec 16, 2025, 4:38 PM

#

safety filter should help if its this sensitive

honest oxide Dec 16, 2025, 4:39 PM

#

honest oxide Dec 16, 2025, 4:39 PM

#

hardy plume safety filter should help if its this sensitive

safety filters save us

brittle notch Dec 16, 2025, 4:42 PM

#

hardy plume safety filter should help if its this sensitive

sir, you want to nurf an already silly model?

#

https://tenor.com/view/my-hero-acadamia-deku-excited-head-desk-wiggling-gif-5497470

Tenor

hardy plume Dec 16, 2025, 4:43 PM

#

brittle notch sir, you want to nurf an already silly model?

well its open weights, providers will probs pick it up once they resolve the license thing

honest oxide Dec 16, 2025, 4:46 PM

#

I like this model for summarizing

quiet onyx Dec 16, 2025, 4:46 PM

#

hardy plume well its open weights, providers will probs pick it up once they resolve the lic...

I made a github issue about that https://github.com/XiaomiMiMo/MiMo-V2-Flash/issues/2

GitHub

Licence clarification · Issue #2 · XiaomiMiMo/MiMo-V2-Flash

The blog mention this model is under the MIT license but this repo has Apache 2.0 licence file. Which is the correct licence for this model?

honest oxide Dec 16, 2025, 4:46 PM

#

overall very happy with it

viscid fog Dec 16, 2025, 4:47 PM

#

brittle notch ohhh, the ``` You are MiMo-V2-Flash (free), a large language model from xiaomi....

I am fixing this now

hardy plume Dec 16, 2025, 4:48 PM

#

based on their gh these are very recommended

visual creek Dec 16, 2025, 4:49 PM

#

Hightly recommended, even

hardy plume Dec 16, 2025, 4:49 PM

#

also seems to be a interleaved thinker

#

like week number maybe?

visual creek Dec 16, 2025, 4:51 PM

#

Maybe it should be {month} ?

hardy plume Dec 16, 2025, 4:51 PM

#

i asked gemini 3 to translate the system prompt

#

thats probably what it would come out to

quiet onyx Dec 16, 2025, 4:53 PM

#

so far this model have been performing my SWE tests competently, such as compiling program from the year 2000 on modern toolchain https://gist.github.com/kth8/0897f24ce7c7bed643291dd6ff658e15

Gist

gist:0897f24ce7c7bed643291dd6ff658e15

GitHub Gist: instantly share code, notes, and snippets.

brittle notch Dec 16, 2025, 4:57 PM

#

my friend sent this, very weird result for multiplication

📎 message.txt

hardy plume Dec 16, 2025, 4:58 PM

#

talk about optimizing, inlining C

#

if only it worked that easily

pastel turret Dec 16, 2025, 4:58 PM

#

I'll be extremely pleasantly surprised if this ends up to be the best open agentic coding model

#

and it seems like it may be that

#

(deepseek 3.2 excluded for being too spiky, slow, and full of hallucinations)

visual creek Dec 16, 2025, 4:59 PM

#

I guess Cursor Composer V2 will just happen to drop a week from now as well then eh

pastel turret Dec 16, 2025, 5:00 PM

#

I was about to say

#

this is like a better composer

#

lmao

visual creek Dec 16, 2025, 5:00 PM

#

I think composer is just GLM 4.5-4.6 finetuned no?

pastel turret Dec 16, 2025, 5:00 PM

#

if they do their postrain regime on this model instead of GLM 4.6 (assuming that's what they're using)

#

yea

#

I think so

#

lines up with the pricing

#

they could run this on GPUs instead of cerebras or groq or whoever, and serve it WAY cheaper

#

one composer's issues has been it's surprisingly expensive, I assume because it's on cerebras/groq to go fast

steel harbor Dec 16, 2025, 5:01 PM

#

is this actually securely better than glm? im having mixed vibes with it

pastel turret Dec 16, 2025, 5:01 PM

#

it seems better to me

#

maybe not design sense

left tree Dec 16, 2025, 5:03 PM

#

If it is flash then the pro model will also be on the way.

brittle notch Dec 16, 2025, 5:13 PM

#

steel harbor is this actually securely better than glm? im having mixed vibes with it

doesn't seem to have real understanding on stuff and seems to be trained on best Q&A from our short test, but it is adorable

#

did not test the agentic coding of it, so can't prove or deny bulbasaur

quiet onyx Dec 16, 2025, 5:23 PM

#

Honestly shocked Xiaomi coming in out of nowhere and making a model this good at agentic tasks

honest oxide Dec 16, 2025, 5:35 PM

#

I am very pleasantly impressed by it

pastel turret Dec 16, 2025, 5:58 PM

#

brittle notch did not test the agentic coding of it, so can't prove or deny bulbasaur

Did better or on par in a couple vibes tests, but the big thing to me is it’s performing significantly better on TerminalBench 2.0

#

Which imo is a quite high quality bench

brittle notch Dec 16, 2025, 5:59 PM

#

no, aider benchmark is imo, but no unofficial results yet

umbral oxide Dec 16, 2025, 6:01 PM

#

Ok Xiaomi, I wasn't familiar with your game...

#

Looks interesting

pastel turret Dec 16, 2025, 6:02 PM

#

brittle notch no, aider benchmark is imo, but no unofficial results yet

I think aider is not relevant anymore, it doesn’t test tool calling or anything

#

it’s been around too long, data is definitely in training sets

#

The benchmark hasn’t even been updated with latest models in over a month it seems like

brittle notch Dec 16, 2025, 6:03 PM

#

in my tests, it is still representative of the model state. i don't think it is a benchmark they can saturate.

brittle notch Dec 16, 2025, 6:03 PM

#

pastel turret The benchmark hasn’t even been updated with latest models in over a month it see...

ppl in discord testing themselves

pastel turret Dec 16, 2025, 6:03 PM

#

shrujj

jovial plinth Dec 16, 2025, 7:01 PM

#

😭 this is so good guys.

pastel turret Dec 16, 2025, 8:17 PM

#

seems pretty good, testing it with Opencode on a large codebase

#

not "amazing" or anything

#

like composer 1

rustic river Dec 16, 2025, 9:01 PM

#

I'm constantly getting

421 {"error":{"code":"421","message":"Moderation Block","param":"The request was rejected because it was considered high risk","type":"content_filter"}}

even if I just say "hello"

#

Not sure how you guys have managed to get it working in your agentic coding tools

hardy plume Dec 16, 2025, 10:13 PM

#

hey indeed.

#

jeez this model is quite good, no longer getting the safety warning atleast, managed to add a codex-like session system to my cli first try, also went around my harness by using shell commands to read files because it didnt like that i didnt have any line range support for reading.

#

this model is quite eager to write test python scripts, then use them to test, very practical and seems to actually delete them after too

#

similar to claude

pastel turret Dec 16, 2025, 11:29 PM

#

First Token Latency is the only issue I have with this model currently

#

it's ~2.5s on average, which cuts down significantly on the benefits of TPS

#

if it was like 500ms like other models it would be amazing

hardy plume Dec 16, 2025, 11:30 PM

#

pastel turret it's ~2.5s on average, which cuts down significantly on the benefits of TPS

probably due to the region and their filter

quiet onyx Dec 17, 2025, 6:18 AM

#

One limitation I found with this model is it can only make 1 tool call per turn. That became really inefficient and troublesome here when I asked to setup a whole cluster and it can only run commands or write file to 1 machine at a time https://gist.github.com/kth8/f2d17b3b8b017055a4daedd03994d2f6

Gist

gist:f2d17b3b8b017055a4daedd03994d2f6

GitHub Gist: instantly share code, notes, and snippets.

#

I've seen Grok in comparison make 5-10 tool calls at once per turn to manage multiple machine in parallel

pastel turret Dec 17, 2025, 3:53 PM

#

quiet onyx One limitation I found with this model is it can only make 1 tool call per turn....

It seems like it’s doing parallel tool calls in OpenCode… are you sure it can’t?

quiet onyx Dec 17, 2025, 4:28 PM

#

pastel turret It seems like it’s doing parallel tool calls in OpenCode… are you sure it can’t?

I haven't seen it. Is there some magic phrase I need to put in the system prompt for it to do it?

last sluice Dec 17, 2025, 4:31 PM

#

Is this model permanently free, or just for a week or so to test, like Grok 4.1 was?

quiet onyx Dec 17, 2025, 4:33 PM

#

last sluice Is this model permanently free, or just for a week or so to test, like Grok 4.1 ...

free time left: https://platform.xiaomimimo.com/#/docs/pricing

Xiaomi MiMo API Open Platform

A simple open-source product documentation platform

last sluice Dec 17, 2025, 4:33 PM

#

Perfect, thank you.

thick niche Dec 17, 2025, 8:36 PM

#

honest oxide 😭

o z o n e

lean maple Dec 18, 2025, 4:40 PM

#

Xaomi has been throwing 500s and 524s

Its not common but it happens

hardy plume Dec 18, 2025, 6:21 PM

#

this model is a good replacement for grok code fast if providers will step that low on price & have caching

#

grok code fast is good but its tool calls are hit or miss, sometimes it tries to call them in reasoning and says it did do the changes but didn't

lean maple Dec 20, 2025, 1:15 AM

#

xiaomi mimo is unsuable now due to rate limits

pastel turret Dec 20, 2025, 2:49 AM

#

@viscid fog btw Novita is hosting this model now, can we get a paid endpoint?

bleak forum Dec 20, 2025, 5:19 PM

#

https://x.com/XiaomiMiMo/status/2002401028722106843

XiaomiMiMo (@XiaomiMiMo)

MiMo-V2-Flash scores 66 on the @ArtificialAnlys Intelligence Index — #2 among open-source models and #8 overall! 🎉🎉🎉

Designed for Agentic AI — now with the benchmarks to prove it: #1 on τ²-Bench Telecom for agentic tool-use among all evaluated models. ⚡⚡⚡

Frontier

proven sequoia Dec 20, 2025, 8:27 PM

#

did anyone manage to get this working with interleaved thinking in agentic coding tool?

naive hedge Dec 20, 2025, 10:31 PM

#

thinking is disabled in their anthropic API for some reason, and can't be turned on

buoyant phoenix Dec 20, 2025, 10:55 PM

#

not doing full testing, but as proxy, not great at chess, bottom 15%

proven sequoia Dec 21, 2025, 12:35 AM

#

naive hedge thinking is disabled in their anthropic API for some reason, and can't be turned...

ah... that might explain when I tried to do integrated thinking with OpenCode via OR API it just put the function call inside the think and printed the XML instead of interceptin it as a function call

#

but their docs say it supports tool integrated thinking

#

wierd

naive hedge Dec 21, 2025, 2:14 AM

#

proven sequoia ah... that might explain when I tried to do integrated thinking with OpenCode vi...

I am getting that too in opencode, but that's slightly unrelated to Anthropic API as OR almost certainly is using OpenAI API format where thinking can be enabled. But not sure what's up with these XMLs here.

#

OpenAI spec doesn't really support interleaved thinking (I don't even think there is a spec?), iirc opencode conditionally turns it on for a few models such as new deepseek 3.2, probably not for this model yet anyway

proven sequoia Dec 21, 2025, 2:20 AM

#

DeepSeek API does it in its own adapted way, so does OpenRouter

#

however I was hoping that it would work using a model via OpenRouter, since the API to the app is the same

pastel turret Dec 21, 2025, 2:33 AM

#

I quite appreciate the coding style of this model

#

unlike something like gpt 5 which I still hate the coding style of

#

this model structures code well and doesn't have weird stuff mixed in like gpt would (e.g if clauses with like 5 && conditionals to validate the type of a parameter)

#

vast mirage Dec 21, 2025, 3:38 PM

#

Not a really useful topic. But this model is good for Rp too

strange bison Dec 21, 2025, 4:07 PM

#

vast mirage Not a really useful topic. But this model is good for Rp too

Compared to what models?
Claude or other models like deepseek?

vast mirage Dec 21, 2025, 4:10 PM

#

strange bison Compared to what models? Claude or other models like deepseek?

It is a bit smarter than deepseek. I havent use claude since i aint paying allat for claude. So this is being compared to v3.2 exp, v3.2, grok 4.1 fast

#

so most of the nsfw models that is normally used in RP.

#

Nah i take my words back

#

it is yet not on the level of v3.2

honest oxide Dec 21, 2025, 4:17 PM

#

vast mirage Not a really useful topic. But this model is good for Rp too

I like deepseek better

vast mirage Dec 21, 2025, 4:18 PM

#

honest oxide I like deepseek better

yep, deepseek is better

stone tangle Dec 21, 2025, 5:56 PM

#

vast mirage Not a really useful topic. But this model is good for Rp too

We've seen a ton of messages above about moderation. You can't really roleplay with constant moderation errors.

vast mirage Dec 21, 2025, 6:29 PM

#

stone tangle We've seen a ton of messages above about moderation. You can't really roleplay w...

huh?. i never got moderation message at all while i am mimo. even for nsfw rp

stone tangle Dec 21, 2025, 6:32 PM

#

Maybe not a "ton," but if you scroll up, you'll see a couple.

strange bison Dec 21, 2025, 9:53 PM

#

Bruh. What kind of role plays are you guys doing that you get constant moderation errors. Genuinely curious

naive hedge Dec 21, 2025, 11:33 PM

#

proven sequoia however I was hoping that it would work using a model via OpenRouter, since the ...

In most recent opencode, it seems you can add this to a model:

"interleaved": {
  "field": "reasoning_content"
},

seems to work here with OR (no more xml errors), though this model tends to get loopy kinda soon

proven sequoia Dec 22, 2025, 4:24 AM

#

naive hedge In most recent opencode, it seems you can add this to a model: ``` "interleaved...

oh thankyou, I will try it again later

silver mortar Dec 23, 2025, 10:14 AM

#

this is sooo fast

rustic river Dec 23, 2025, 1:59 PM

#

The model has been glitching in opencode today with broken outputs and premature stops

proven sequoia Dec 23, 2025, 8:46 PM

#

This model seems pretty smart but I can't get it working properly in opencode

ionic surge Dec 23, 2025, 11:35 PM

#

whats weird to me is the strong recommendation to turn off thimking for agentic stuff

#

i dont really understand WHY

#

also the promise of them not logging prompts i kinda doubt

proven sequoia Dec 23, 2025, 11:54 PM

#

just how it was trained I guess, although I don't know why it also supports interleaved thinking with that being the case

ionic surge Dec 24, 2025, 5:00 AM

#

kinda interesting how no providers have launched support for this model

#

on a paid endpoint

#

and idk when xiaomi will end this (if they will?)

#

@viscid fog do you know anything about this?

quiet onyx Dec 24, 2025, 10:21 AM

#

ionic surge and idk when xiaomi will end this (if they will?)

free time left: https://platform.xiaomimimo.com/#/docs/pricing

Xiaomi MiMo API Open Platform

A simple open-source product documentation platform

strange bison Dec 24, 2025, 2:28 PM

#

ionic surge kinda interesting how no providers have launched support for this model

I know providers like chutes do provide this model , but open router hasn't offered the paid version yet

proven sequoia Dec 24, 2025, 3:29 PM

#

Xiaomi will presumably swap their endpoint to a paid one soon. The model itself seems really smart (I went through a difficult problem with it in the chat interface), but interleaved not working properly in opencode for me right now.

#

this was using the "interleaved": {"field": "reasoning_details"} trick but maybe there is some other stuff that needs to be done for it to work properly.

pastel turret Dec 24, 2025, 3:31 PM

#

ionic surge on a paid endpoint

novita does, openrouter just not routing to it yet

rustic river Dec 27, 2025, 3:54 PM

#

I mostly do brownfield coding, and it solved problems that GPT 5.2 Codex / Gemini 3 could not

#

It's remarkably persistent. One of my favorite models of 2025 so far, given that it's also fast.

slow jay Dec 31, 2025, 1:12 PM

#

no paid verison yet

strange bison Dec 31, 2025, 6:22 PM

#

rustic river I mostly do brownfield coding, and it solved problems that GPT 5.2 Codex / Gemin...

What parameters are you using?

hardy plume Dec 31, 2025, 11:07 PM

#

they extended the free access

pastel turret Jan 1, 2026, 2:04 AM

#

We got a paid endpoint but OR hasn’t put it up yet..

#

(Novita)

rustic river Jan 2, 2026, 10:59 AM

#

strange bison What parameters are you using?

Default!

slow jay Jan 2, 2026, 7:30 PM

#

hardy plume they extended the free access

how much more

hardy plume Jan 2, 2026, 7:33 PM

#

slow jay how much more

till jan 20

slow jay Jan 9, 2026, 7:43 AM

#

hardy plume till jan 20

and then what pricing?

quiet onyx Jan 9, 2026, 8:23 AM

#

slow jay and then what pricing?

https://platform.xiaomimimo.com/#/docs/news/beta-free

Xiaomi MiMo API Open Platform

A simple open-source product documentation platform

fickle mortar Jan 12, 2026, 2:03 AM

#

@viscid fog Novita AI has support for the paid mimo v2 flash, can we please get support on the openrouter gateway

fickle mortar Jan 12, 2026, 2:19 AM

#

i need to use it in a commercial application

strange bison Jan 20, 2026, 1:22 PM

#

The Mimo model has gotten a new snapshot 🥂

#

The free tier continues

#

ionic surge Jan 21, 2026, 7:03 AM

#

yay

#

they finally say that the thinking mode is recommended

slow jay Jan 21, 2026, 1:53 PM

#

strange bison The free tier continues

till when?

clear ember Jan 21, 2026, 3:37 PM

#

why is this model getting deprecated?

viscid fog Jan 21, 2026, 3:55 PM

#

the free model is

strange bison Jan 21, 2026, 5:34 PM

#

slow jay till when?

26th Jan, I assume

clear ember Jan 21, 2026, 8:29 PM

#

viscid fog the free model is

so it will be paid only model? and why?

lean maple Jan 22, 2026, 1:25 PM

#

clear ember so it will be paid only model? and why?

Yes, it will only be paid

The provider that's providing the model for free will stop providing it for free soon

quiet onyx Jan 22, 2026, 1:33 PM

#

abuse it while you still can

ionic surge Jan 22, 2026, 4:29 PM

#

dont have any workloads to abuse it with 🥀

hoary quail Jan 23, 2026, 10:23 AM

#

does anyone here know which AP prompt works best with mimo v2? for RP specifically?

plain coral Jan 23, 2026, 12:06 PM

#

I have a question about the deprecation of this model. I tested it over the last 30 days very extensively and find it very useful for my agentic coding tasks. Other providers hosting this model too and I tried it out on them and have very different behaviors. How do I know that they run the same latest snapshot? Should I ask them all? And what is the latest snapshot at all hosted on OR for the free model?
Except for the thinking loop and leaking tool calls into assistant messages and thinking tokens, the model performs very well.
At least, the paid providers let me choose the seed parameter.

vast mirage Jan 24, 2026, 1:00 PM

#

Anyone got recommendations for free models that is on same level as mimo v2

hoary quail Jan 24, 2026, 1:25 PM

#

vast mirage Anyone got recommendations for free models that is on same level as mimo v2

unfortunately not really. mimo v2 with titi's prompt is probably as close u can get to a decent 3.2 experience as you can right now
and its the cheapest option right now too

vast mirage Jan 24, 2026, 2:51 PM

#

hoary quail unfortunately not really. mimo v2 with titi's prompt is probably as close u can ...

yea I think i might switch to v3.2 with a provider that allows for cache read

analog juniper Jan 25, 2026, 12:20 AM

#

This model is a neat lil guy. I like him, especially for the price.

pure garnet Jan 25, 2026, 1:30 AM

#

Oh, huh 🤔

$0.09/M input tokens
$0.29/M output token
That price is actually pretty low, around which models do you think this performs?

hoary quail Jan 25, 2026, 2:31 AM

#

vast mirage yea I think i might switch to v3.2 with a provider that allows for cache read

although personally, mimo has a 2nd person POV issue
for ideal results you need to generate 1-3 messages with DS 3.2 first, and then you can leapfrog to mimo v2 for a better stable experience that's different from 3.2 and cheaper

i've done extensive testing (like 200 msgs) worth, so give it a try

hoary quail Jan 25, 2026, 3:52 AM

#

vast mirage yea I think i might switch to v3.2 with a provider that allows for cache read

alo another major weakness of mimo v2 is that it really struggles to progress the scene/RP on its own
you have to explicitly prompt it (and that's even with a comprehensive AP prompt) as well

ionic surge Jan 25, 2026, 3:52 AM

#

pure garnet Oh, huh 🤔 > $0.09/M input tokens > $0.29/M output token That price is actually...

grok 4.1 fast

#

and cheaper than it too

#

and i already thought that 4.1 fast was the performance/price goat

#

mimo takes it

hoary quail Jan 25, 2026, 6:25 AM

#

ionic surge grok 4.1 fast

is grok 4.1 similar to deepseek 3.2 or something in the RP it generates?

honest oxide Jan 25, 2026, 6:28 AM

#

hoary quail is grok 4.1 similar to deepseek 3.2 or something in the RP it generates?

no, under no circumstances should you use grok fast for rp

#

it barely speaks english

hoary quail Jan 25, 2026, 6:29 AM

#

honest oxide no, under no circumstances should you use grok fast for rp

aw thats a damn shame. than any other recommendations for something thats as cheap as mimo v2?
i just need something thats on mimo v2 level but actually can advance the plot story on its own
cuz basically all i use right now is DS 3.2 with heavily nerfed context to make it budget affordable

ionic surge Jan 25, 2026, 6:35 AM

#

rp 🥀

#

idk i dont do that

#

mostly for classification tasks/extraction takss etc

honest oxide Jan 25, 2026, 6:38 AM

#

hoary quail aw thats a damn shame. than any other recommendations for something thats as che...

nothing beats deepseek on price per token. GLM has a cheap sub they offer, but that's about it tbh

#

I'm not sure what your budget is, but if you're using something like ST I'd say maybe look into a memory extension to save on context

hoary quail Jan 25, 2026, 6:40 AM

#

honest oxide nothing beats deepseek on price per token. GLM has a cheap sub they offer, but t...

i'd use glm 4.7 everytime, but its damn expensive once i get past 50 msgs
i wish it had a non thinking mode - or at least didn't use as much thinking, cuz frankly, its kinda ridiculous compared to models like r1 0528
i'm not exactly broke - i just prefer as much bang for buck llm model as possible
i use j.ai , just not sure how using chat memory affects how much tokens is consumed and whatnot

honest oxide Jan 25, 2026, 6:41 AM

#

hoary quail i'd use glm 4.7 everytime, but its damn expensive once i get past 50 msgs i wis...

look into the z.ai coding plan then, but yeah as far as PAYG goes nothing beats deepseek

#

(also, 4.7 does have a non thinking mode, j.ai just doesn't support it kek )

quiet onyx Jan 25, 2026, 7:20 AM

#

hoary quail i'd use glm 4.7 everytime, but its damn expensive once i get past 50 msgs i wis...

GLM are hybrid reasoning models mean you can disable thinking. You can do it via code or create a custom https://openrouter.ai/docs/guides/features/presets

OpenRouter Documentation

Presets - Configuration Management for AI Models

Learn how to use OpenRouter's presets to manage model configurations, system prompts, and parameters across your applications.

vast mirage Jan 25, 2026, 7:05 PM

#

hoary quail alo another major weakness of mimo v2 is that it really struggles to progress th...

it hallucinates alot for me. Like many times it would just either say something irrelevant or something that is not within scenario. Interestingly My friend pointed out that this happened in chub.ai more than it happened on Janitor

#

i will switch to r1 0528

hoary quail Jan 25, 2026, 11:32 PM

#

vast mirage it hallucinates alot for me. Like many times it would just either say something...

yea just did some further testing
its a big nothing burger model, just generates responses that don't really move forward, more circling on the spot - no matter the AP prompt
i don't know if this will change but i would suggest to avoid it for now

pure garnet Jan 25, 2026, 11:38 PM

#

GLM 4.7 has a non-thinking mode

hoary quail Jan 25, 2026, 11:47 PM

#

pure garnet GLM 4.7 has a non-thinking mode

i'm aware of that. but i dunno how to easily set it up. i don't got a lick of coding knowledge

pure garnet Jan 25, 2026, 11:52 PM

#

The easiest no code way would be to make a preset that forces no reasoning: https://openrouter.ai/settings/presets

thorn bane Jan 26, 2026, 3:29 AM

#

What does "deprecrating Jan 26, 2026" mean ?

open heart Jan 26, 2026, 3:38 AM

#

thorn bane What does "deprecrating Jan 26, 2026" mean ?

it's being unavailable (correct me if I'm wrong)

open heart Jan 26, 2026, 3:40 AM

#

pastel turret did very well in a personal bench, just as well as gemini 2.5 flash / grok 4 fas...

it is fast but not as good as gemini 2.5 for coding

pure garnet Jan 26, 2026, 3:55 AM

#

2.5 Flash is over 3x the input price and over 8x the output price, though

river lichen Jan 27, 2026, 3:04 AM

#

does anyone know how to integrate mimo v2 to j.ai from the xiaomi site itself? i keep getting network errors

half iris Jan 27, 2026, 4:13 AM

#

does anyone know why this is always rate limited?
I can't get any agentic coding to work with this model
it consistently stops early, before making any file changes

rain veldt Jan 27, 2026, 4:28 AM

#

half iris does anyone know why this is always rate limited? I can't get any agentic coding...

Do you use paid or free version?

half iris Jan 27, 2026, 4:48 AM

#

paid

#

ok, maybe this is a false alarm
I was getting rate limits yesterday
but I just spotted a problem with my opencode config (edit: deny)
giving this another test now

half iris Jan 27, 2026, 5:20 AM

#

yup, it was my config
seems to be working fine now
thanks for responding @ Monkey !

plain coral Jan 27, 2026, 9:15 AM

#

half iris does anyone know why this is always rate limited? I can't get any agentic coding...

The issue with this model is, that it sometimes still leaks tool calls into the thinking and sometimes assistant tokens and this stops the multi-turn inference. To use this model reliably, I needed to sanitize those tokens after streaming each block and ignore the stop_reason for that.
Additional, I suspect that the paid providers are using an older snapshot of this model because it behaves very differently by each of them.
It is such a great model, but without enough transparency hard to use without those workarounds.

rain veldt Jan 27, 2026, 10:10 AM

#

river lichen does anyone know how to integrate mimo v2 to j.ai from the xiaomi site itself? i...

I don't think xiaomi site support jai

half iris Jan 27, 2026, 1:26 PM

#

I happen to be using atlas cloud as a provider
Is there any way to identify which version of the model they provide?

lean maple Jan 28, 2026, 12:35 AM

#

half iris I happen to be using atlas cloud as a provider Is there any way to identify whic...

Is there any way to identify which version of the model they provide?

Nope

river lichen Jan 28, 2026, 5:01 AM

#

rain veldt I don't think xiaomi site support jai

damn, i'd thought it'll work just like the official deepseek site, just by adding & putting its own v1/chat/completions url into j.ai proxy settings like usual

hoary quail Jan 29, 2026, 1:22 PM

#

half iris I happen to be using atlas cloud as a provider Is there any way to identify whic...

nope but i notice certain providers give real subpar responses, sometimes. so i'd recommend blocking them if ur okay with that. atlas cloud is generally terrible

quiet onyx Jan 29, 2026, 2:19 PM

#

I only use official provider when possible. In this case the official xiaomi/fp8 provider is also the cheapest and fastest

half iris Jan 29, 2026, 2:22 PM

#

Oh I only have Atlas Cloud and Novita AI available for this model
Chutes and Xiaomi must be blocked in my OR privacy settings

#

So any opinions on Novita vs Atlas Cloud? 🫠

quiet onyx Jan 29, 2026, 2:23 PM

#

half iris So any opinions on Novita vs Atlas Cloud? 🫠

Novita supports prompt caching which will be the biggest cost saver

pure garnet Jan 29, 2026, 2:53 PM

#

Oh, wow, TIL this has caching, this is dirt cheap

half iris Jan 29, 2026, 6:25 PM

#

Yup it is
And in my testing so far, it does a decent job
One real hallucination about a .gopls.toml file which isn’t a feature of gopls
Otherwise, it’s been nice

fickle mortar Jan 30, 2026, 5:57 PM

#

quiet onyx Novita supports prompt caching which will be the biggest cost saver

Novita has the lowest tps(they are running the model in vllm not sglang) and the lowest e2e latency

quiet onyx Jan 30, 2026, 6:25 PM

#

fickle mortar Novita has the lowest tps(they are running the model in vllm not sglang) and the...

why are you stating to me what I posted in my screenshot?

half iris Jan 31, 2026, 1:20 AM

#

Ya Xiaomi looks like the most performant provider and supports caching
But I prioritize the privacy side above everything else so not an option for me

Atlas cloud is noticeably much faster than Novita
But Novita supports caching 🤷‍♂️

fickle mortar Feb 1, 2026, 12:20 AM

#

half iris Ya Xiaomi looks like the most performant provider and supports caching But I pri...

Xiaomi also has a zdr policy

quiet onyx Feb 1, 2026, 12:29 AM

#

fickle mortar Xiaomi also has a zdr policy

Xiaomi is not listed here unlike AtlasCloud and Novita https://openrouter.ai/docs/guides/features/zdr#zero-retention-endpoints

OpenRouter Documentation

Zero Data Retention - How OpenRouter gives you control over your data

Learn how OpenRouter gives you control over your data

fickle mortar Feb 1, 2026, 12:29 AM

#

quiet onyx Xiaomi is not listed here unlike AtlasCloud and Novita https://openrouter.ai/doc...

You can check out on Xiaomi api page

half iris Feb 1, 2026, 1:38 AM

#

Xiaomi Retained for 30 days ✓ Does not train
^^ open router docs ^^

#

https://platform.xiaomimimo.com/#/docs/welcome

here’s a snippet I’ve read so far

API Services. If you use the API services, we will collect your IP address and the text information you submit to analyze the relevant instructions based on the model you select and to generate the returned content. Xiaomi will not use the text content you provide for model training or any other purposes. When you use prepaid API services, we will collect your top-up information and transaction records**.**

Xiaomi MiMo API Open Platform

A simple open-source product documentation platform

#

I don’t see the number 30 show up in that page 🤷‍♂️

#

Seems to me like openrouter has a different reason for the privacy settings restricting Xiaomi
Or it’s a mistake?

strange bison Feb 10, 2026, 2:51 AM

#

Anyone else having an issue with the xiaomi endpoint where the model thinks forever all the sudden?

Like didn't change the prompt or any of the sampling parameters, yet it's happening consistently in the last few days

fickle mortar Feb 19, 2026, 7:06 PM

#

strange bison Anyone else having an issue with the xiaomi endpoint where the model thinks fore...

you should optimize the hyperparams as per your application

#

refer to xiaomi huggingface

strange bison Feb 19, 2026, 7:09 PM

#

fickle mortar you should optimize the hyperparams as per your application

No, I think Xiaomi had an issue on their end.
It was working fine for 7-8 days, then for 1-2 days it started capping reasoning (65k reasoning tokens) occasionally, even though I didn't change anything. Now it's fine again.
Weird. Also xiaomi end point is the only end point I direct to for the API calls becuz of cache so it wasn't a provider switch either.

celest grail Feb 23, 2026, 2:40 PM

#

this is awesome

#

i build a two agent loop so they get to a consensus with very strict guidelines

#

the result was pretty good

#

crazy value

ionic surge Feb 23, 2026, 3:31 PM

#

yeah it’s peak

fickle mortar Feb 23, 2026, 5:47 PM

#

I just wished we get more providers and xiaomi oss ed the latest mimo v2

visual creek Feb 23, 2026, 6:22 PM

#

This is an impressively capable tool calling agent model, nothing seems to come even close at the price.

ionic surge Mar 9, 2026, 11:05 PM

#

daily goat model reminder

#

when v3 flash ?

#

im so happy w tyhis model

#

xiaomi's a great provider too

#

great caching

pure garnet Mar 9, 2026, 11:06 PM

#

I agree

ionic surge Mar 9, 2026, 11:09 PM

#

if this model had vision i think it would be insane

#

because the price/performance ratio is insanely good

#

i wonder if grok can reclaim its price/performance crown

honest oxide Mar 9, 2026, 11:14 PM

#

would you say it's better than grok fast?

celest grail Mar 9, 2026, 11:41 PM

#

yes

honest oxide Mar 9, 2026, 11:44 PM

#

no structured output support though 😭

celest grail Mar 10, 2026, 1:39 AM

#

well, it follows instructions very well

ionic surge Mar 10, 2026, 5:20 AM

#

honest oxide no structured output support though 😭

?

#

really?

#

it says it does on orca.orb.town

#

i love the instruction following on this model since i know that many other flash-style models (even g3f) dont follow instructions well

hoary quail Mar 10, 2026, 5:36 AM

#

I've gotten mimo V2 flash to work very well
However my only gripe that if it had a reasoning/thinking mode

That would really take it to the next level
Cuz for price to performance - it's quite good already

ionic surge Mar 10, 2026, 5:39 AM

#

it does

hoary quail Mar 10, 2026, 6:40 AM

#

ionic surge it does

Oh? Where is it then?

ionic surge Mar 10, 2026, 6:41 AM

#

https://openrouter.ai/docs/guides/best-practices/reasoning-tokens

#

just set reasoning.enabled = true or set some reasonign effort and itll turn on

#

i dont think you can change the effort level its just off or on

hoary quail Mar 10, 2026, 6:44 AM

#

ionic surge https://openrouter.ai/docs/guides/best-practices/reasoning-tokens

How do I enable on OR?
Do I have to make a preset or something?

ionic surge Mar 10, 2026, 6:46 AM

#

do you use the api or like the chatroom

hoary quail Mar 10, 2026, 6:46 AM

#

I chat using proxy on j.ai
So I guess...Api I think? Sorry, I'm not a tech guy so not sure

ionic surge Mar 10, 2026, 6:47 AM

#

oh, im not sure how to on janitor

#

try searching up how to enable reasoning on janitor

#

but mimo does support reasoning

steel harbor Mar 18, 2026, 10:27 AM

#

https://x.com/Lentils80/status/2034207379445698593?s=20

Lentils (@Lentils80)

🚨 MiMo-V2-Pro by Xiaomi has suddenly appeared on Artificial Analysis before any official announcements.

Mentioned as a proprietary unimodal (text-only) reasoning model, it sports a 1 million context window with an Intelligence Index score of 49, sitting just below GLM-5.

iron solstice Mar 21, 2026, 12:15 AM

#

viscid fog free & not training

Is there a way to tell which ones dont have an rpd limit?

#

I find that paying for anything less than opus costs me more for anything I do since doing it correctly first time = cheapest but having stuff that would be literally free to experiment with would be nice

versed turret Mar 21, 2026, 2:08 AM

#

iron solstice I find that paying for anything less than opus costs me more for anything I do s...

It's not free anymore

iron solstice Mar 21, 2026, 3:48 AM

#

versed turret It's not free anymore

I meant in general, if there's a way to tell which ones are 1000 rpd vs not

versed turret Mar 21, 2026, 4:05 AM

#

iron solstice I meant in general, if there's a way to tell which ones are 1000 rpd vs not

It's 1000 i think

junior narwhal Mar 21, 2026, 7:13 AM

#

Based on my experience, the Mimo V2 Flash offers the best value for money right now at just $0.1/$0.3.

iron solstice Mar 21, 2026, 8:11 PM

#

versed turret It's 1000 i think

Jesus Christ

#

At least read what you're replying to

#Xiaomi MiMo V2 Flash