#MiMo V2 Pro
1 messages · Page 1 of 1 (latest)
wonder what the pricing on this will be
Context window: 1M tokens
➤ Pricing: $1/$3 per 1M input/output tokens, for 256K token input and $2/$6 per 1M input/output tokens for 1M token input
➤ Availability: Xiaomi first-party API only
➤ Modality: Text input and output only (no multimodality)
benchmaxxed or actually good?
Would be actually good if it’s actually Hunter Alpha
will probably be really good the first week then degrade like the others
I hope not 
Bro i hate that 2x increase after certain context lenght
It’s not too stupid, for low context you get cheaper prices - alternative would be to have everyone pay double
Anyone have any clue what’s the long context performance here? 1M is impressive but it’s an OSS model so I’m expecting it to be doo doo
its their first 1m context model no? im not expecting much tbh
Still type of stuff that i will never support
Claude doing a good thing just as of now with their flatten pricing for long context
GPU memory isn’t free sadly
Claude's API is vastly overpriced, so it evens out
bro claude pricing is 3x for any context length 💀
Compare to their previous pricing, their current pricing is much more better
If we compare it to other then that's different story
It's more expensive for sure than other models, but can't lie with that quality
I don't like being bias toward certain company that produce models, but at this moment no one can't really beat the quality antrophic produce.
The quality of their products seems to make the pricing much more bearable
Unfortunate but that's the reality
I wonder what subreddit did they censored 
I don't think this is hunter alpha
mimo v2 flash outperforms hunter alpha in coding afaict
(at least on my specific tests)
wait what??
is this from them?
Yeah their official release website
Please 🙏 let others host this model
what the FUCK
wow
im not gonna lie
this pricing sucks dick ass balls everything
i dont really think many people will use this
esp. since glm 5 is better
Hunter Alpha is an early internal test build of MiMo-V2-Pro.
hopefully the one they release is a better model because this one loops a lot (in agentic at least) to the point that i cant even get it to do like 10+ turns
What temperature?
They recommend a temp of 0.3 and top_p of 0.95 for “vibe coding” and “function calling” and a temp of 0.8 and top_p of 0.95 for “web development” so it seems like it would be okay with a wide range of temps, models don’t really fall apart with wild temps anymore like they used to, but maybe give 0.3 a shot
Api when? 
Use hunter alpha until we get the actual model on OR
yes still looping hella :(
Fireworks provider wen
It’s not even open weight 

Yeah unfortunately I kinda expected that.
Qwen did the same shit .
i was so excited for the new mimo just to get this :(
For their plus model
Is this 3rd time when minimax released model the same day as other and get overshadowed? 
i thought their plus model was just like the regular biggest one but increased context length
m2.7, glm 5 turbo not open source
really hope this doesnt become a trend
Yeah hopefully Deepseek will make the move and wake up the world again
Someone ask MiMo V2 Pro 💀
Lol
Also officially confirmed
🚀 Model Quick Look
1. MiMo-V2-Pro: The Flagship Foundation for the Agent Era
- Anonymous Launch: It was launched anonymously on OpenRouter for one week under the codename "Hunter Alpha," topping the daily charts for multiple days with a total usage volume exceeding 1 trillion tokens.
- Parameters & Context: Featuring a trillion-scale (1T+) parameter architecture, it natively supports an ultra-long context window of 1M tokens.
- Best Scenarios: Capable of reliably handling complex, long-cycle, and multi-step tasks; it performs exceptionally well within Agent frameworks such as OpenClaw.
2. MiMo-V2-Omni: An Omni-modal Agent Foundation Model that Sees, Hears, and Acts
- Anonymous Launch: It was launched anonymously on OpenRouter for one week under the codename "Healer Alpha," garnering widespread attention from users worldwide.
- Parameters & Context: Designed to be lightweight and efficient, it supports a 256K token context window.
- Best Scenarios: Truly realizes a seamless "See-Hear-Think-Act" workflow; it also demonstrates excellence within ecosystems like OpenClaw.
shoutout to whoever used 822 tokens on openclaw
Waittt... What Mimo-Omni has more world knowledge than Mimo-pro. It knows a kinda obscure VN, Stella of the End that Mimo-pro and even opus 4.6 kinda fully hallucinate (well at least the name of the protagonists and the premise)
is iy any good?
no
unlike v2 omni, this model could output syntaxically fine and functional code, but the entire point of the test is asking it to write the fastest possible implementation of a verifiable api. its implementation worked, but was moderately slow (~1.5x slower than gemini 3 pro, 2x slower than kimi k2.5) - not very impressive to me for a 1T+ parameter reasoning model
yeah its not amazing
I think the 1M context is ambitious
in chess omni wins out in almost every metric. with the exception of ~0.3% illegal rate. did a 20match test (faster, cheaper, better):
mimo-v2-pro $1.60 / match, 54% accuracy, 710 elo, 94s/move
mimo-v2-omni $0.65 / match, 64% accuracy, 840 elo, 34s/move
Lowkey it’s kinda looking like Omni might just be a better model than pro
omni is definitely not for agentic coding
Yeah it seems to struggle with looping tool calls
Definitely more for general conversation and question answer and probably openclaw if people want to do that for some reason
It's surprsingly okay with creative writing though
is mimo free from openrouter?
i saw the news from mimo website, it said there will be one week free partner with openrouter. anyone knows any info?
You are late
It's not free now
sort of. Pro and Omni are somehow free to use in OpenClaw through the openrouter provider: #announcements message
until 3/25
uhhhh seems to be stuck in a loop
What two models would you sandwich it between in terms of coding capability?
honeslty i havent hda much luck coding wieth it because it loops in its thinking and gets stuck a lot (at least with opencode)
i cant really plcae it well
Coding aside, how is this model in practice? Better or worse than glm 5/kimi k2.5?
To me it's worse than both in every possible aspect
Definitely worse in terms of being censored
Running into censorship issues with NSFW rp, it really starts to put up a fight whenever you do anything more rough than vanilla
Getting really annoyed with it cuz combined with its high price, it's wasting my time s
I don't think mimo is good choice in almost anything
Just use Kimir or GLM
Up until recently mimo V2 pro was doing really well in terms of rp both NSFW and normal for me
Something happened recently, not sure what or why
they probably read your chats bro 💀
i was referring to planning and research when I said coding aside lol
Tested MiMo-V2:
Xiaomi MoE hybrid reasoning models.
Flash:
open 309B-A15B model
- some issues with Chinese responses and special character loops
- around Mistral Small 4 overall, cheaper
- decent fast coder
- good bang/buck overall
Omni:
multimodal midclass
- decent performance across most fields, all-rounder
- not for coding
- can be verbose, depending on task
- around GLM-4.5 performance
Pro:
large flagship
- generally smarter than omni but not as versatile
- on generic tasks, quite concise for a reasoning model
- didn't make any noteworthy mistakes across my small coding segments, clear focus area
- poor on creative tasks, very censored
- mediocre bang/buck
At Chess, omni outperformed pro in almost all metrics (elo, accuracy, efficiency, speed, cost). At 580|720|850 mixed Elo they placed fairly weak, peaking around oss-20b.
Vision is only supported by Omni and was fairly mediocre, around Qwen3-VL-8b / Qwen3.5-9b.
I think this is a decent family worth a look. The models are quite generic but produced overall competent results.
No comment in censorship since I haven't tested it, but Pro is really smart in creative writing. 70,000+ tokens with 11 characters in the same scene, and it didn't confuse or mess up character voices and physical appearances
Minimax M2.7 kinda starts falling apart in the same case
gooner spotted
thanks for this simplified pros and cons. mainly use flash and v2 pro currently
would you recommend omni for RP on janitor over say v2 pro?
This one has the same strange cleverness as V2-Flash, but it's a lot smarter. Shame it's so expensive, but I'll be doing more testing.
I quite like it so far
Love this model too
On reasoning tasks it seems like it may actually be cheaper than GLM or Kimi due to low verbosity,
it's free on cline
Is anyone else constantly getting "The request was rejected because it was considered high risk"
It was fine for days and now since yesterday it'll hardly respond back
The api is fine for me but I have not used it through or recently
Oh i got this issue too now
Its weird in that it does it mid generation
Suddenly outputs that insteqd
It’s because it’s a Chinese model
If it generates content that goes against certain filters it gets wiped
from my extensive use, i stopped using the v2 pro model cuz the censorship on it was gradually ramping up in real time
i tested it, it particularly gets sensitive to nsfw especially
use omni, slightly less censored and slightly cheaper
underrated model
Liking this one the more I use it. Better understanding of people, in a Claude sort of way.
Agree, I like Pro and Omni
with the pricing???
The pricing is okayish, it caches very aggressively, but you have a point. There’s better models for the money
caching is great but yeah i dont really like to rely on caching. idk how they even manage it but yeah id rather take something smaller like even gemini 3 flash i prefer tbh
I also enjoy Gemini 3 flash, it’s very good
I haven't tried Omni, I just like Pro's vibe and understanding
Relatable
Its output tokens are kinda rough, but sometimes something about Pro is just good in a way I can’t explain
GLM-5 used to be my pseudo-Claude but I'm pretty sure this beats it for me
I'll try hitting it with some of my complex, hard, but subjective tests later today
i still prefer glm 5
even though it technically is a smaller model
it feels a lot bigger than mimo v2
Mimo V2 Pro sometimes gives really good and fresh prose though. In creative writing, that is
I threw some prose to Gemini and asked it to rate the prose, and sometimes Gemini actually prefers Mimo's prose over Opus and GPT-5.4
hmm , mimo started to work really slow for me for some reason i dont know. im using cline. at first speed was okay, after it droped to this :/
mimo wants me to walk to the car wash
I don't like that test. The LLM is going to assume you aren't retarded, and that you are going to schedule something or pay a bill or whatever else
i do like that test. that is a simple test that it should be getting right no matter if you are retarded or whatever
hi
is mimo v2 pro uncensored by this provider?
I got it elsewhere and most of my messages are cut off with "cant generate that, its high risk"
(using it on janitor btw)
the message doesnt even pop up in an error window, it appears in the actual chat messages
There is only one provider for mimo, or just lets you use it so the behaviour should be much the same, unless the particular account being used on janitor has some special settings
It isn't "right" though. You could need to go for other reasons, like picking up cleaning supplies.
This isn't the user being retarded like "How much bleach do I add to my ammonia" where it needs to correct them. This is, no human would ever ask the question if the car needed to be at the car wash. Only a niche reason would be possible.
There are probably other ways to phrase this or questions in the same vein that exclude alternate reasons at the least.
I got mimo from liter*uter (cant say the full name because promoting is against the rules)
You could also say no human is going to legitimately ask how many times “r” appears in the word “strawberry” because it’s readily apparent how many rs are in the word. But I get your point, I just think that all the basic tests we do to determine a base level of critical thinking are pretty arbitrary. Also what car washes are you going to that have more than just the car wash? All the ones around me you drive up to their like self serve kiosk thing and pay there, there’s no like shop part or anything
That's different, the strawberry test has no interpretation or anything. It's like 1+1, an unlikely task but still a straightforward one. There is no reason to ever assume the car wash test is a legit question. It is asking for a dichotomy between a mandatory choice and an impossible choice under standard interpretation.
As for a car wash shop, watch Breaking Bad =P
I also prefer glm ATM
Mainly cuz of censorship
Stuff gets blocked by mimo and that shit wastes credits if I try to regenerate and retry
Glm you can pretty much do whatever
I do prer mimo V2 pro quality tho. But censorship really kills it for me
I’m curious, what are you using MiMo for to trigger refusals? If you don’t mind me asking
Is it just nsfw?
Mimo refuses very randomly, it does nsfw then it randomly outputs nah in the middle of it. It's usually in the middle, not at the start of a request
I do not know what exactly triggers it
Interesting. Because from my very limited testing, it seemed pretty open to me
It is pretty open, like I said, the triggers are very random, and I could not tell you the difference between the cases it refuses and when it does not
If you do not hit the random seeming refusals it will seem open
Very gpt like in taht regard, I also think gpt is very random on refusals, but this just refuses less
Bizarre. I wonder if it’s refusal layers are tuned stupidly because that behavior I haven’t really heard of before as far as I can remember
Yeah I have not had anything quite like Mimo either. Gpt is random but it refuses immediately, mimo refuses partway through the response. Like yeah she was ea.... wait no I am done.
Very bizarre indeed
yea mainly nsfw that's triggering it
Ayo new mimo, this is worth checking
It seems to be quite a small spec boost on the already existing V2 pro and V2 Omni, so I suspect it will face the same censorship issues. I don’t have good benchmarks to test, but I’ll try to see if I can trigger a refusal