#MiMo V2 Pro

1 messages · Page 1 of 1 (latest)

tribal isle
warm carbon
#

wonder what the pricing on this will be

tribal isle
#

Context window: 1M tokens
➤ Pricing: $1/$3 per 1M input/output tokens, for 256K token input and $2/$6 per 1M input/output tokens for 1M token input
➤ Availability: Xiaomi first-party API only
➤ Modality: Text input and output only (no multimodality)

short tinsel
#

benchmaxxed or actually good?

rotund granite
warm carbon
#

will probably be really good the first week then degrade like the others

boreal sleet
keen quest
#

Anyone have any clue what’s the long context performance here? 1M is impressive but it’s an OSS model so I’m expecting it to be doo doo

warm carbon
#

its their first 1m context model no? im not expecting much tbh

boreal sleet
junior crag
tawdry valve
hidden pumice
boreal sleet
boreal sleet
#

I don't like being bias toward certain company that produce models, but at this moment no one can't really beat the quality antrophic produce.

The quality of their products seems to make the pricing much more bearable

Unfortunate but that's the reality

rotund granite
#

I wonder what subreddit did they censored Clueless

latent helm
#

I don't think this is hunter alpha

#

mimo v2 flash outperforms hunter alpha in coding afaict

#

(at least on my specific tests)

#

wait what??

rotund granite
latent helm
#

damn wtf

#

link?

rotund granite
wraith forge
#

Please 🙏 let others host this model

edgy snow
#

wow

#

im not gonna lie

#

this pricing sucks dick ass balls everything

#

i dont really think many people will use this

#

esp. since glm 5 is better

#

Hunter Alpha is an early internal test build of MiMo-V2-Pro.

hopefully the one they release is a better model because this one loops a lot (in agentic at least) to the point that i cant even get it to do like 10+ turns

junior crag
#

What temperature?

edgy snow
#

i dunno... just use opencode

#

is there a recommended temp?

junior crag
#

They recommend a temp of 0.3 and top_p of 0.95 for “vibe coding” and “function calling” and a temp of 0.8 and top_p of 0.95 for “web development” so it seems like it would be okay with a wide range of temps, models don’t really fall apart with wild temps anymore like they used to, but maybe give 0.3 a shot

rotund granite
#

Api when? thinkies

junior crag
edgy snow
#

also its ONLY xiaomi api

#

so that atrocious pricing aint coming down

wraith forge
#

Fireworks provider wen

rotund granite
junior crag
wraith forge
#

Yeah unfortunately I kinda expected that.

edgy snow
wraith forge
#

Qwen did the same shit .

edgy snow
#

i was so excited for the new mimo just to get this :(

wraith forge
#

For their plus model

rotund granite
#

Is this 3rd time when minimax released model the same day as other and get overshadowed? trolling

edgy snow
wraith forge
#

Its exclusive to qwen whatever it is

#

Chinese closing their models as bad 😭

rain acorn
#

really hope this doesnt become a trend

wraith forge
#

Yeah hopefully Deepseek will make the move and wake up the world again

tribal isle
agile vapor
#

Lol

tribal isle
# latent helm

Also officially confirmed
🚀 Model Quick Look

1. MiMo-V2-Pro: The Flagship Foundation for the Agent Era

  • Anonymous Launch: It was launched anonymously on OpenRouter for one week under the codename "Hunter Alpha," topping the daily charts for multiple days with a total usage volume exceeding 1 trillion tokens.
  • Parameters & Context: Featuring a trillion-scale (1T+) parameter architecture, it natively supports an ultra-long context window of 1M tokens.
  • Best Scenarios: Capable of reliably handling complex, long-cycle, and multi-step tasks; it performs exceptionally well within Agent frameworks such as OpenClaw.

2. MiMo-V2-Omni: An Omni-modal Agent Foundation Model that Sees, Hears, and Acts

  • Anonymous Launch: It was launched anonymously on OpenRouter for one week under the codename "Healer Alpha," garnering widespread attention from users worldwide.
  • Parameters & Context: Designed to be lightweight and efficient, it supports a 256K token context window.
  • Best Scenarios: Truly realizes a seamless "See-Hear-Think-Act" workflow; it also demonstrates excellence within ecosystems like OpenClaw.
steep pond
#

No caching is a big bummer

#

Nvm I'm blind

frail crow
#

shoutout to whoever used 822 tokens on openclaw

fallow hull
#

.

#

discord doesn't let me tag it unless i join the forum

fallen kite
#

Waittt... What Mimo-Omni has more world knowledge than Mimo-pro. It knows a kinda obscure VN, Stella of the End that Mimo-pro and even opus 4.6 kinda fully hallucinate (well at least the name of the protagonists and the premise)

lofty flame
#

is iy any good?

edgy snow
#

no

frail crow
#

unlike v2 omni, this model could output syntaxically fine and functional code, but the entire point of the test is asking it to write the fastest possible implementation of a verifiable api. its implementation worked, but was moderately slow (~1.5x slower than gemini 3 pro, 2x slower than kimi k2.5) - not very impressive to me for a 1T+ parameter reasoning model

edgy snow
#

yeah its not amazing

keen quest
#

I think the 1M context is ambitious

misty zodiac
#

in chess omni wins out in almost every metric. with the exception of ~0.3% illegal rate. did a 20match test (faster, cheaper, better):
mimo-v2-pro $1.60 / match, 54% accuracy, 710 elo, 94s/move
mimo-v2-omni $0.65 / match, 64% accuracy, 840 elo, 34s/move

junior crag
#

Lowkey it’s kinda looking like Omni might just be a better model than pro

terse mortar
#

omni is definitely not for agentic coding

junior crag
#

Yeah it seems to struggle with looping tool calls

#

Definitely more for general conversation and question answer and probably openclaw if people want to do that for some reason

acoustic spire
#

It's surprsingly okay with creative writing though

glossy elk
#

is mimo free from openrouter?

#

i saw the news from mimo website, it said there will be one week free partner with openrouter. anyone knows any info?

terse mortar
#

I really like pro for agentic coding. Wish there was a coding plan for it like z.ai

hoary barn
#

It's not free now

junior crag
#

until 3/25

terse mortar
#

uhhhh seems to be stuck in a loop

edgy snow
#

yep that happens a lot

#

i dont like this model

cursive aurora
#

What two models would you sandwich it between in terms of coding capability?

edgy snow
#

honeslty i havent hda much luck coding wieth it because it loops in its thinking and gets stuck a lot (at least with opencode)

#

i cant really plcae it well

short tinsel
#

Coding aside, how is this model in practice? Better or worse than glm 5/kimi k2.5?

steep pond
#

To me it's worse than both in every possible aspect

normal lance
#

Definitely worse in terms of being censored
Running into censorship issues with NSFW rp, it really starts to put up a fight whenever you do anything more rough than vanilla
Getting really annoyed with it cuz combined with its high price, it's wasting my time s

boreal sleet
#

I don't think mimo is good choice in almost anything
Just use Kimir or GLM

normal lance
#

Up until recently mimo V2 pro was doing really well in terms of rp both NSFW and normal for me
Something happened recently, not sure what or why

lament orchid
short tinsel
misty zodiac
#

Tested MiMo-V2:

Xiaomi MoE hybrid reasoning models.

Flash:
open 309B-A15B model

  • some issues with Chinese responses and special character loops
  • around Mistral Small 4 overall, cheaper
  • decent fast coder
  • good bang/buck overall

Omni:
multimodal midclass

  • decent performance across most fields, all-rounder
  • not for coding
  • can be verbose, depending on task
  • around GLM-4.5 performance

Pro:
large flagship

  • generally smarter than omni but not as versatile
  • on generic tasks, quite concise for a reasoning model
  • didn't make any noteworthy mistakes across my small coding segments, clear focus area
  • poor on creative tasks, very censored
  • mediocre bang/buck

At Chess, omni outperformed pro in almost all metrics (elo, accuracy, efficiency, speed, cost). At 580|720|850 mixed Elo they placed fairly weak, peaking around oss-20b.

Vision is only supported by Omni and was fairly mediocre, around Qwen3-VL-8b / Qwen3.5-9b.

I think this is a decent family worth a look. The models are quite generic but produced overall competent results.

acoustic spire
#

No comment in censorship since I haven't tested it, but Pro is really smart in creative writing. 70,000+ tokens with 11 characters in the same scene, and it didn't confuse or mess up character voices and physical appearances

#

Minimax M2.7 kinda starts falling apart in the same case

normal lance
uncut flax
#

This one has the same strange cleverness as V2-Flash, but it's a lot smarter. Shame it's so expensive, but I'll be doing more testing.

#

I quite like it so far

acoustic spire
#

Love this model too

uncut flax
#

On reasoning tasks it seems like it may actually be cheaper than GLM or Kimi due to low verbosity,

faint egret
#

Is anyone else constantly getting "The request was rejected because it was considered high risk"

It was fine for days and now since yesterday it'll hardly respond back

hybrid moon
#

The api is fine for me but I have not used it through or recently

hybrid moon
#

Oh i got this issue too now

#

Its weird in that it does it mid generation

#

Suddenly outputs that insteqd

junior crag
#

It’s because it’s a Chinese model

#

If it generates content that goes against certain filters it gets wiped

normal lance
silk stream
#

underrated model

uncut flax
#

Liking this one the more I use it. Better understanding of people, in a Claude sort of way.

junior crag
#

Agree, I like Pro and Omni

edgy snow
junior crag
edgy snow
#

caching is great but yeah i dont really like to rely on caching. idk how they even manage it but yeah id rather take something smaller like even gemini 3 flash i prefer tbh

junior crag
uncut flax
#

I haven't tried Omni, I just like Pro's vibe and understanding

junior crag
#

Relatable

#

Its output tokens are kinda rough, but sometimes something about Pro is just good in a way I can’t explain

uncut flax
#

GLM-5 used to be my pseudo-Claude but I'm pretty sure this beats it for me

#

I'll try hitting it with some of my complex, hard, but subjective tests later today

edgy snow
#

i still prefer glm 5

#

even though it technically is a smaller model

#

it feels a lot bigger than mimo v2

acoustic spire
#

Mimo V2 Pro sometimes gives really good and fresh prose though. In creative writing, that is

#

I threw some prose to Gemini and asked it to rate the prose, and sometimes Gemini actually prefers Mimo's prose over Opus and GPT-5.4

heavy slate
#

hmm , mimo started to work really slow for me for some reason i dont know. im using cline. at first speed was okay, after it droped to this :/

silk stream
#

mimo wants me to walk to the car wash

uncut flax
#

I don't like that test. The LLM is going to assume you aren't retarded, and that you are going to schedule something or pay a bill or whatever else

edgy snow
#

i do like that test. that is a simple test that it should be getting right no matter if you are retarded or whatever

naive fossil
#

hi

#

is mimo v2 pro uncensored by this provider?

#

I got it elsewhere and most of my messages are cut off with "cant generate that, its high risk"

#

(using it on janitor btw)

#

the message doesnt even pop up in an error window, it appears in the actual chat messages

hybrid moon
#

There is only one provider for mimo, or just lets you use it so the behaviour should be much the same, unless the particular account being used on janitor has some special settings

uncut flax
#

This isn't the user being retarded like "How much bleach do I add to my ammonia" where it needs to correct them. This is, no human would ever ask the question if the car needed to be at the car wash. Only a niche reason would be possible.

#

There are probably other ways to phrase this or questions in the same vein that exclude alternate reasons at the least.

naive fossil
junior crag
# uncut flax This isn't the user being retarded like "How much bleach do I add to my ammonia"...

You could also say no human is going to legitimately ask how many times “r” appears in the word “strawberry” because it’s readily apparent how many rs are in the word. But I get your point, I just think that all the basic tests we do to determine a base level of critical thinking are pretty arbitrary. Also what car washes are you going to that have more than just the car wash? All the ones around me you drive up to their like self serve kiosk thing and pay there, there’s no like shop part or anything

uncut flax
normal lance
# edgy snow it feels a lot bigger than mimo v2

I also prefer glm ATM
Mainly cuz of censorship
Stuff gets blocked by mimo and that shit wastes credits if I try to regenerate and retry

Glm you can pretty much do whatever
I do prer mimo V2 pro quality tho. But censorship really kills it for me

junior crag
#

Is it just nsfw?

hybrid moon
#

Mimo refuses very randomly, it does nsfw then it randomly outputs nah in the middle of it. It's usually in the middle, not at the start of a request

#

I do not know what exactly triggers it

junior crag
#

Interesting. Because from my very limited testing, it seemed pretty open to me

hybrid moon
#

It is pretty open, like I said, the triggers are very random, and I could not tell you the difference between the cases it refuses and when it does not

#

If you do not hit the random seeming refusals it will seem open

#

Very gpt like in taht regard, I also think gpt is very random on refusals, but this just refuses less

junior crag
#

Bizarre. I wonder if it’s refusal layers are tuned stupidly because that behavior I haven’t really heard of before as far as I can remember

hybrid moon
#

Yeah I have not had anything quite like Mimo either. Gpt is random but it refuses immediately, mimo refuses partway through the response. Like yeah she was ea.... wait no I am done.

junior crag
#

Very bizarre indeed

normal lance
hybrid moon
#

Ayo new mimo, this is worth checking

junior crag
#

It seems to be quite a small spec boost on the already existing V2 pro and V2 Omni, so I suspect it will face the same censorship issues. I don’t have good benchmarks to test, but I’ll try to see if I can trigger a refusal