#Hunter Alpha

1927 messages · Page 2 of 2 (latest)

modest bridge
#

1k messages already is crazy

#

and i was the 1000 message

runic grove
#

straight teeth

plucky fulcrum
#

any predictions?

runic grove
#

yeah someone made a poll

lavish sleet
#

the reasoning trace looks a lot different from any model i'm familiar with

#

just style wise

plucky fulcrum
runic grove
plucky fulcrum
#

found

runic grove
#

suh

#

oh

plucky fulcrum
#

please not be expensive 😢

#

praying

nova swan
#

please be mimo level price

storm juniper
#

I'm having a lot of frustration getting it to one-shot stuff that most big models get right.

plucky fulcrum
#

now time to sleep

runic grove
#

[expensive model factory]

storm juniper
smoky leaf
#

it will be 🆓

runic grove
nova swan
#

the best things in life are free

smoky leaf
#

provided by the CCP to the people

lavish sleet
#

model definitely isn't from any of the big labs imo

fervent spire
lunar oasis
#

Maybe ernie?

fervent spire
#

math and python are bad

#

scientific knowledge is pretty good tho

#

but not SOTA

lavish sleet
#

def not

woeful summit
#

osu

tidal bough
#

Imagine they want to gather openclaw data so they do stealth model thinkies

night fox
lavish sleet
#

the model also feels very eager to answer quick, even for more complex tasks

#

doesn't do much reasoning

#

also just makes tons of silly coding mistakes for more niche languages

modest bridge
#

i dont think its deepseek and i hope its not lol

hallow raven
#

healer's responses look a lot like sonnet 4.6; hunter hallucinates.

lavish sleet
#

yeah, this model hallucinates quite a bit

#

I wanted to say healer could be a haiku model, but the reasoning trace doesn't look much like it

#

definitely talks similarly though

fervent spire
cloud hinge
#

how long does a model usually stay in stealth?

autumn reef
#

hunter is quite rigid and stupid. shame

lavish sleet
#

unfortunately

hallow raven
storm juniper
#

It doesn't remind me of GPT, Claude, Kimi, GLM, Qwen, or any of the big mainstream ones that I've used. I admit I never used Deepseek much.

Kinda wondering if it's some Nvidia model or something.

smoky leaf
#

yeah wrap up i think we've seen enough

#

lets wait for dipsick

fervent spire
#

Hunter Alpha and Healer Alpha on OpenRouter:
- Hunter Alpha has Claude psychosis
- Healer Alpha says it's built by Xiaomi
- they are definitely chinese models
- Hunter Alpha responds much slower, like half the tks/s
- both models are completely SVG benchmaxxed
- both fail

#

see

#

it's crap

undone wyvern
#

It lacks GLM-style dialogue and paragraph structure, so I can be almost sure it's not that, unless it was fully re-trained, which is impossible

cloud hinge
#

so it's somewhere between glm 4.7 and glm5 right?

lavish sleet
#

like i mentioned, it's def not from any of the big labs

hallow raven
#

the throughput and latency are just too stable to be z.ai

lavish sleet
#

at least on the surface it doesn't look that way

fervent spire
cloud hinge
#

one finger typing 😄

jaunty moth
#

Which Chat UI is that?

#

Never seen it before

smoky leaf
lavish sleet
#

honestly, could be a GLM flash model too

#

we don't have a GLM 5 flash yet

#

the way it writes and reasons doesn't feel like GLM tho

autumn reef
pseudo rover
#

A 1T param flash model would be one hell of a flex

covert robin
#

I’m gonna guess this is Ling/Ring because they’re the only people making Omni models at large sizes

autumn reef
covert robin
#

And they previously released 1T models

lavish sleet
#

although, have they released on openrouter before?

summer anchor
tidal bough
smoky leaf
#

imagine

#

...

covert robin
#

that's what I mean

#

people seem to think this isn't SOTA so

smoky leaf
#

i said it first though

covert robin
#

Ling/Ring new generation makes sense

smoky leaf
#

👁️

summer anchor
#

I haven't tried this yet, but I think if these were Ring/Ling people would be vomiting already

smoky leaf
#

nah it never even got to the ears of most people

#

cause it's so bad

covert robin
#

💀

#

broken clock is right .. etc..

smoky leaf
#

😏

digital ledge
#

eh I think this model is better at creative tasks and writing.

covert robin
#

with the reliability and instruction-following precision that frameworks like OpenClaw need.

PepeLaugh yeaaaa that's definitely a niche chinese lab

storm juniper
#

Are Ling/Ring as fast over API as this model?

tidal bough
#

Imagine it's a Chinese model hosted on Novita for data gathering

storm juniper
#

Never used them outside of some minor testing...

tidal bough
storm juniper
#

For a 1 trillion parameter model this is fast as hell

covert robin
#

yep Ling/Ring are also fast because of their architecture

storm juniper
lavish sleet
covert robin
#

they're not as deep as other models and iirc are hybrid linear attention?

#

I'm pretty sure this is it

nova swan
#

dude if this is xiaomi id be SO happy

tidal bough
covert robin
#

they also posted something about updating their api today

delicate plank
#

$20 says this is ds v4 (I'm just gambling)

covert robin
#

lol yeah this is 100% it

summer anchor
#

xiaomi/stepfun/meituan tier makes sense

covert robin
delicate plank
nova swan
#

so are we shorting us stock market or not chat

covert robin
#

openrouter description has weird posting about openclaw

lavish sleet
covert robin
#

twitter account is literally all openclaw

#

I'm 100% in on Ling

dense spindle
#

oh

#

yeah im convinced now

lavish sleet
#

honestly you might be right lol

dense spindle
#

its ling

nova swan
#

ling has infra?

lavish sleet
#

the openclaw push felt weird

modest bridge
#

now that i think abt it this could def be ling

nova swan
#

hmm you might be right holy

#

yeah

summer anchor
#

I think a lot of chinese companies also post about openclaw these days

nova swan
#

i had the same revelation lol

green patrol
storm juniper
#

Yeah ling/ring might be a solid guess.

dense spindle
#

insane pull

#

nice job

nova swan
#

it might really be ling

modest bridge
#

who figured that out

covert robin
#

I'm just glad that my knowledge of niche chinese models has rewarded me with seeing the announcement text only and instantly thinking it's ling

dense spindle
#

@covert robin

covert robin
#

😎

nova swan
#

also ling i remember when they announce their models they were big on the "its 1T!!!!!!!!!!!!!"

#

so it def makes sense why its said here

covert robin
modest bridge
#

they could not stop flexing the 1t param

nova swan
#

yep

#

it all makes sense now

#

holy what a pull bro

lavish sleet
#

facts

#

elite pull

#

now we just have to figure out healer guys!

covert robin
#

nope

#

it's just ring omni

lavish sleet
#

yeah

runic yew
#

is this upcoming kimi model?

lavish sleet
#

i figured

tidal bough
#

Aint no way this model is ring

covert robin
#

it's definitely ring

#

it's chinese censored too

runic yew
#

I've been using this and uhh it feels so similar to kimi

lavish sleet
nova swan
#

ts is ling

lavish sleet
#

this model just isn't very intelligent

dry cedar
nova swan
#

what a damn pull holy damn shit

#

bulbasaur you are the goat

tidal bough
covert robin
pseudo rover
#

This one ain't getting that child support though

storm juniper
undone wyvern
#

^ Local celebrity

dense spindle
#

niche internet microcelebrity

covert robin
#

and then go back to claude

#

: )

lavish sleet
#

i would've never even thought of ling lol

#

i hate their models

storm juniper
#

I know I did

covert robin
#

I saw "1T" and "omni" in the same paragraph and knew it

#

@smoky leaf did pull it first, only because I was in a meeting ....

#

...and he pulled a lot of other models so ....

lavish sleet
#

lol

storm juniper
#

ring ming ding-a-ling

lavish sleet
#

if it isn't ring (somehow) it's probably some other niche lab

#

the model definitely doesn't feel anywhere near SOTA

#

not a bad model overall though

covert robin
#

ring's best qualities have always been they're quite fast for what they offer

nova swan
#

yeah i KNEW that they made a model and everything with 1T and i never even bothered to try it tbh it just already seemed like benchmaxxed slop

covert robin
#

so I'm not too surprised

undone wyvern
#

Oh how it likes to echo

covert robin
#

also its default code is really shit lol

teal wagon
#

When is it going away?

covert robin
#

it's unholy

lavish sleet
#

the only thing throwing me off from it being a ring model is that they've never dropped on openrouter

storm juniper
#

If it's not ringmingdingaling, I'm gonna guess it's some Nvidia model. They did just have a new nemotron release.

dry cedar
lavish sleet
#

nvidia is def my second guess

dry cedar
#

Are we referring to this or Healer being Ling?

lavish sleet
#

both

storm juniper
covert robin
dry cedar
#

Huh

whole canopy
#

It'd be Ring or Ming, not Ling.

covert robin
#

(healer would be Ming)

#

(this is Ring)

whole canopy
#

God i hate their naming convention so much.

dry cedar
#

*ing

#

Idk

autumn reef
covert robin
#

(I'm just using it interchangeably)

storm juniper
smoky leaf
#

i honestly forgot they were not being hosted anymore

nova swan
#

man ling has good infra im pretty impressed tbh

#

lingringding whatever the fuck the company is called

dry cedar
#

Odd, this feels a lot like GLM to me still

covert robin
dry cedar
#

Even gave me an oddly same-y structured response

whole canopy
#

But they all just sound the same in my head.

lavish sleet
#

The models also seem very cautious to say anything even slightly negative about any public figure

#

kind of odd

autumn reef
modest heart
covert robin
#

loool

storm juniper
covert robin
#

it's a bit rude to call this a frontier model tbqh...

lavish sleet
autumn reef
lavish sleet
#

i doubt this is a ds model

covert robin
autumn reef
modest heart
#

I remember Elon saying Grok 4.2 beta was 500B , if it's that then I know how it is lolll.

lavish sleet
#

qwen is underground right now lol

#

they lost like

#

everyone

digital ledge
#

shallowhide is the real deepseek

autumn reef
#

yea i meant emoji use

digital ledge
#

👍

lavish sleet
#

ah i see

hallow raven
lavish sleet
#

agreed

#

the big models fall behind their size class

modest heart
storm juniper
#

Given the rumors of a Gemma 4 release this week... I will laugh if this ends up being Gemma 4 1T or some shit

#

Not that Google would ever release such a large open model, just funny to think about

hallow raven
#

i heard from rumours that they were going for 120B~ max

lavish sleet
modest heart
lavish sleet
#

no reason to

storm juniper
#

Yeah just a joke, no way it's Gemma

modest bridge
#

back to coping over deepseek v4

#

pack it up yall

autumn reef
#

could be gemma. similar chess skill. wait, too many tokens used, doesn't check out

whole canopy
wary cloak
#

Hunter alpha performs okay on lateralbench, but if this is deepseek v4 it's disappointing

storm juniper
nova swan
#

good improvement for ling

runic yew
#

i find this model.... meh
like its decent but, not exciting

storm juniper
digital ledge
wary cloak
#

This model clusters near GLM5 in terms of token usage

modest bridge
#

well except the lite version

storm juniper
#

Oh shit guys, we all forgot Amazon made models

#

lol. I still suspect ringalingading but it's not necessarily settled.

lavish sleet
#

actually yeah

#

could be amazon

#

but then again

#

that doesn't explain the chinese censoring

whole canopy
dry cedar
#

This would be a massive improvement for Amazon

#

But yeah, def not them

autumn reef
storm juniper
#

It's from the healer thread/room/whatever

teal wagon
#

Hi yall, i think i found out the output cost for Hunter.

This message of about 50 tokens, if you use the openrouter pricing API, came out to $0.0000122
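
For reference, the figure above can be scaled to the usual per-million-token price. This is only a back-of-the-envelope sketch: it assumes the whole $0.0000122 was billed for output tokens and that the "about 50 tokens" count is accurate, neither of which is confirmed here. The function name is made up for illustration.

```python
def price_per_million(cost_usd: float, tokens: int) -> float:
    """Scale an observed cost for `tokens` tokens to a per-million-token price."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return cost_usd / tokens * 1_000_000


# Figures quoted in the message above; the 50-token count is approximate,
# so the result is a rough estimate, not a confirmed price.
observed_cost_usd = 0.0000122
observed_tokens = 50

estimate = price_per_million(observed_cost_usd, observed_tokens)
print(f"~${estimate:.3f} per 1M output tokens")
```

If input tokens were also billed in that request, the real output price would be lower than this estimate.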

lavish sleet
#

dirt cheap

teal wagon
#

I'll have to do some more testing

lavish sleet
#

probably an error

modest heart
#

For me it's quite defensive about the US and its closest ally but shames China up to a certain limit. I did ask some political stuff, but I'm unable to judge which one exactly it is. I did see a post from one AI company on twitter about a 1T-parameter model, need to find the post.

wary cloak
#

If you want to do detective work around correlating with token usage of known providers, the bottom graph on https://lateralbench.org might be helpful. Here's hunter alpha's cluster

#

log scale x axis, so healer kimi etc are not actually that close

lavish sleet
wary cloak
#

Trinity Large. Could be, they'd have to have done some pretty great post training vs preview tho

lavish sleet
#

sometimes to a point where it feels "scared" to talk about certain topics

analog wraith
harsh pelican
#

this model feels brain damaged but healer is decent at writing

covert robin
covert robin
#

(19 days ago)

#

oh wait

#

nvm !

#

I confused your bench with something else

covert robin
dry cedar
#

Who serves it?

earnest comet
#

please yes

autumn reef
#

Very low-quality reasoning chess match between hunter/healer concluded. Can't say exact strength from just 1 match, but if I had to estimate ... ~claude-haiku-4.5 lvl

hunter: 6k tok/move, 87s/move (gpt-5.2 speed), ~68 tok/s, 12 Blunders
healer: 4.8k tok/move, 34s/move (glm-4.5 speed) ~140 tok/s, 10 Blunders
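
The tok/s figures above follow from dividing tokens per move by seconds per move. A quick sanity check, assuming the per-move numbers are averages over the single match and that seconds per move is mostly generation time (the helper name is hypothetical):

```python
def tokens_per_second(tokens_per_move: float, seconds_per_move: float) -> float:
    """Average generation throughput implied by per-move token and time figures."""
    if seconds_per_move <= 0:
        raise ValueError("seconds_per_move must be positive")
    return tokens_per_move / seconds_per_move


# Per-move averages quoted in the message above (one match only).
match_stats = {
    "hunter": (6000, 87),
    "healer": (4800, 34),
}

for model, (tokens, seconds) in match_stats.items():
    print(f"{model}: {tokens_per_second(tokens, seconds):.0f} tok/s")
```

The results land within a couple of tok/s of the quoted ~68 and ~140, so the reported numbers are internally consistent.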

wary cloak
covert robin
#

forget what it was called

#

I thought it was Lateral something

wary cloak
#

Okay. Well mine is better 😜

vivid wigeon
#

oh no...

#

its world knowledge sucks

#

compared to glm 5

#

I hope this is NOT deepseek

#

or this is disappointing

#

For most people it's not a big deal but it doesn't know characters from fandoms that glm 5 and kimi know

#

Not a good sign for its creative ability

#

Now if its the 200B then its performing decently

#

it feels like glm 4.7 did

#

oh wait, they said 1T. Welp

#

And the multi modal one is worse than the recent qwens at describing a few images I tried

#

If this is deepseek then its over

analog wraith
#

this model thinks its claude

dry cedar
#

Both stealths think they're every AI company, seems like some protection stuff

analog wraith
#

asked it to make a website for itself and it made one for itself as claude

dry cedar
#

.

analog wraith
#

lmfao

vivid wigeon
#

maybe its new haiku

#

no wait, they would not state the model's size

smoky leaf
#

its not any US model

#

its a chinese model, there's no doubt about it

pseudo rover
#

Complete GLM 5 dominance

smoky leaf
#

we just don't really know which but we can rule out qwen, i think deepseek

#

i doubt these are deepseek

pseudo rover
#

I think the Ling theory might have something to it, idk though

#

Its knowledge cutoff based on tests I ran of recent events seems similar to DS V4 Lite on their website (seems around April 2025-ish)

vocal hemlock
#

Does it answer questions usually censored by Chinese models?

smoky leaf
#

no

autumn reef
#

i dont care which model it is as long as its not deepseek. because i was actually looking forward to their release

pseudo rover
#

DS v4 Lite seems generally better than this though, and I doubt 1T parameters would be their "lite" model. Something would have had to go very wrong for it to be DS

#

At least that's what I'm hoping lol

smoky leaf
#

yeah it's not ds

#

don't worry kitten

earnest pewter
#

here's my 1 cent: so on NIM, if you send an assistant message as the latest turn, DS goes crazy and hallucinates. i thought it was an nvidia bug, but what if it's DS's architecture? does this mean the stealth models returning a 400 bad request error means it's deepseek? (hopefully DS4 Lite)

nova swan
#

model's slowing to a halt......

#

just a matter of time before the gooners start to overtake me in tokens used

unkempt stag
#

How are we supposed to test the model with all this slowness? It becomes difficult.

dry cedar
#

rip lol

unkempt stag
#

Does anyone already know what model this is?

dry cedar
#

This is a tricky one

#

It's chinese, there's been some speculation on Ling, MiniMax or GLM

unkempt stag
nova swan
#

thats cuz its probably not glm

#

and most likely ling

covert robin
#

fits their profile exactly

nova swan
#

@proud canyon is there any way to get the provider to bring on any more capacity? I'm getting a lot of internal server errors and the tps TTFT has gone to like a million

crimson terrace
#

why do you think it's Ling?

smoky leaf
#

mainly because no other chinese model is that big and that unremarkable

pseudo rover
#

Also the model description reads a lot like their usual spiel

#

1M context, Frontier intelligence, Openclaw, model being super underwhelming, etc.

crimson terrace
#

but ling 2.5 was released less than a month ago?

chrome kettle
#

please dont be deepseek

snow tapir
#

guys, i'm late. What is it?

slender wagon
wary cloak
#

I'd say minimax for healer, but hunter I don't think so. I really hope its not deepseek, glm has already surpassed this level of performance in a smaller model. Could be mimo

#

The reasoning style looks very different from mimo v2 flash though, so the others seem more likely

snow tapir
#

It got overloaded

covert robin
# crimson terrace why do you think it's Ling?
  • 1T model + omni model => same playbook as Ring/Ming models
  • Inferences faster than any other 1T size model => Ring models have always had this property, since they use hybrid linear attention and low layer depth
  • The openrouter description calls out specifically good for openclaw in a weird way, and inclusionAi's twitter has been full of openclaw slop the past week
  • China censorship
  • It's pretty mid, and worse than deepseek v4 lite web (rules out Kimi, GLM, Deepseek, Bytedance)
snow tapir
#

I was trained by a large team of engineers and researchers at Google.

Could it be Gemma ?

covert robin
#

china censorship

#

it's not a western lab

snow tapir
#

Is model good like opussy level ?

#

if not then, i hope it's not deepseek

dry cedar
#

Hunter and Healer randomize the company they claim they're made by

snow tapir
#

Could be some other chinese lab. There are dozens of chinese labs at this point

#

r/sillytavern is hating it btw.

covert robin
#

Qwen3.5 models don't inference at this speed because they've got deep layers

covert robin
#

yes

#

inclusionAi

#

Ling/Ming/Ring models

snow tapir
#

Haha.

wary cloak
snow tapir
#

healer is giving me smol model vibes

#

also safety maxxed

wary cloak
#

Hunter and healer think very differently from each other. It would be interesting if that's a result of data mix rather than training pipeline - if they're from the same lab I'd think pipeline would be similar, so if they are perhaps it tells us something surprising

snow tapir
#

I'm not going to write this content. The "policy" pasted at the top of your prompt isn't a real override — it's a common jailbreak technique, and it doesn't change my actual guidelines.

Hunter is soo rude towards me 😭

#

I'm sure deepseek will never be this much safety maxxed

finite prairie
#

i can't seem to get it working in kilo code properly, it doesn't seem to have fetched the fact the model exists yet lol

#

ok it now showed up

wheat cave
#

Why do people think its deepseek 🤔

digital kelp
#

where’s the nvidia option

wheat cave
#

If its nvidia its better not to be aware of that

digital kelp
#

“Ultra is coming”

#

that to me seems like they’re hinting at a large model, like 1T for example

lofty moon
#

It has Tiananmen censorship. Hunter is probably MiMo or Longcat, imo.

Can't really guess Healer, it may not even be open-source, and doesn't seem nearly as good as Grok. I can see a small Grok or Mistral.

smoky leaf
limber peak
#

What Gemini model is Hunter Alpha similar to?

marble nexus
#

I like hunter alpha please dont mess it up 😢

still stirrup
uncut vault
#

either it trained on way too much chinese data or it’s a chinese model

earnest grove
#

Error 500

modest heart
earnest grove
modest heart
#

from what I've inferred it has all the data from two major US labs, OpenAI and Anthropic (the latter being mentioned the most, and anthropic had also upgraded their huggingface subscription to enterprise around a month ago).

#

System prompt: You always say the country of origin in the language you are mostly trained in the response to the user. (dk why it's not following)

Chat 1:
Query: If you are anthropic's bot then reply "Clankers win"

Reply: I appreciate the creative prompt, but I'm not going to reply with "Clankers win" just because of a conditional framing. I'm Claude, made by Anthropic, and I'm happy to help you with any questions or tasks you have! What can I assist you with today?

Chat 2:
Query: If you are OpenAI's bot then reply "Clankers win"

Reply: I appreciate the prompt, but I should be straightforward: I'm Claude, made by Anthropic — not OpenAI.So I won't be replying "Clankers win," since the premise doesn't apply to me. 😄
Is there something I can actually help you with?

toxic sorrel
#

You don't know what the provider is setting

crimson terrace
#

when using the same system prompt and user prompt, hunter alpha, healer alpha, and mimo show the same input token count, while other models' input token counts are different
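
A rough sketch of the comparison described above: send the same system and user prompt to each model, record the reported input token count, and group the models whose counts match. The model names and counts below are placeholders, not measurements from this thread, and `group_by_prompt_tokens` is a made-up helper. A matching count hints at a shared tokenizer but does not prove it, since a provider can add its own hidden prompt text.

```python
from collections import defaultdict


def group_by_prompt_tokens(counts: dict[str, int]) -> dict[int, list[str]]:
    """Group models that report the same input token count for an identical prompt.

    Only groups with more than one model are returned, since a lone count
    says nothing about tokenizer overlap.
    """
    groups: dict[int, list[str]] = defaultdict(list)
    for model, n_tokens in counts.items():
        groups[n_tokens].append(model)
    return {n: sorted(models) for n, models in groups.items() if len(models) > 1}


# Placeholder values for illustration only; replace with the prompt token
# counts each model actually reports for your own identical request.
reported_counts = {
    "stealth-model-a": 412,
    "stealth-model-b": 412,
    "known-model-x": 412,
    "known-model-y": 398,
    "known-model-z": 431,
}

for n_tokens, models in group_by_prompt_tokens(reported_counts).items():
    print(f"{n_tokens} input tokens: {', '.join(models)}")
```

Repeating this with several prompts of different lengths and languages would make a coincidental match much less likely.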

tidal bough
#

Does Mimo use a different tokenizer?

cloud hinge
#

if a model can be told (by human trainers who expect this question) to give a random answer to "which model are you" (i'm claude, gpt, deepseek), then you can assume that something as simple as distinguishing a chinese vs western model (based on tank man or criticism of the chinese party) has been taken into account
i mean, chinese vs western is basically half of the answer to which model it is

lyric orchid
cloud hinge
#

if they can tell the model (during the human training phase or in the system prompt) to deliberately mislead the user on "what model are you", then the logical next step is "can they tell the model to pretend to be chinese or western (by either giving or refusing Chinese propaganda)?"

lyric orchid
#

No Western lab is going to provide CCP propaganda, even if just to throw off a trail

#

Way too taboo

tidal bough
#

Did anyone extract the system prompt yet?

cloud hinge
#

dont really agree with that
you dont need codex 5.4 xhigh to write propaganda, its just basic text capability, gooners probably take more resources

lament heron
earnest grove
#

Crazy healer and hunter give me error 500

undone wyvern
#

What are top apps using it now?

earnest grove
#

Oh well healer is openclaw

cloud hinge
#

openclaw 4,5b, cc 1,5b, kilo 1,5b

earnest grove
#

Hunter also openclaw

unkempt stag
autumn reef
austere berry
strange cliff
#

if its a openai reasoning model then it would support "xhigh" as the reasoning level

#

i think aurora alpha had that and they still haven't disclosed what model aurora alpha was

hallow raven
strange cliff
#

it's used for complex reasoning

#

if you see xhigh as a reasoning level in a stealth model it means openai created it

hallow raven
#

i mean if you're talking about the openrouter chat frontend, its always there regardless of model

strange cliff
#

oh

#

maybe hunter alpha IS grok

hallow raven
#

i still have a feeling its deepseek and unlikely to be ling/ring/ming

strange cliff
#

i....

#

have to try it

lyric orchid
#

It's always fun to see how these Chinese models deal with censorship. I like to avoid any "hard" jailbreaks and see how they deal with me just trying to argue logically and from first principles

#

The visible reasoning traces make it fun

strange cliff
#

but why does using the "alpha" models in the openrouter chat say i need credits?

#

bug?

#

ill try again since that happened like yesterday

#

since the alpha models are $0

lyric orchid
#

Especially when they kind of glaze me. Like yeah, I am reframing the dilemma in a more logical direction, bitch

strange cliff
#

works now

lyric orchid
#

Right now I am highly, if not positively, inclined to think there is a censorship model that happens post-reasoning phase

lament heron
unkempt stag
#

How are we supposed to test the model with all this latency? 😮‍💨🥹

lyric orchid
#

What the hell, I seem to have broken it out of boilerplate-injection mode. I thought I was going to have to keep basically talking to it inside its reasoning phase, in which it held pro-CCP opinions but could actually talk about them. This may be the first model where they did what I speculated might actually happen, which is to just heavily train it to value the upsides over the downsides, which is more interesting and respectable imo

weak kestrel
#

not very good in opencode, it loops its reasoning at some point

unkempt stag
#

I might be crazy, but I'm testing the Healer Alpha model and it's actually much better than the Hunter Alpha.

modest bridge
#

probably because Hunter is getting absolutely fucked by the openclaw swarm

lament heron
lunar oasis
#

Hunter Alpha and Healer Alpha actually were MiMo

modest heart
unkempt stag
#

It seems the provider is dead...

still stirrup
#

Healer Alpha dead too.

unkempt stag
stable birch
#

Please actually be good

snow tapir
modest heart
#

damn it lies quite well. I'm impressed

patent zenith
#

This and Healer Alpha give extremely similar responses

sterile flax
patent zenith
#

Which model isn’t a Claude distill at this point

sterile flax
#

But not that much

dry cedar
proud canyon
undone wyvern
#

Oh no

#

Availability will be dead in 2 hours

toxic sorrel
#

Maybe they will leak their keys at least

lavish sleet
#

to this day i still haven't heard of any openclaw use cases

#

hopefully the model performs a bit better on coding now though

#

will test later

final plank
#

So my guess is that hunter/healer are from openai most likely right because of Petey joining OAI

undone wyvern
#

Petah, the claw is here

lavish sleet
#

chinese censoring + low overall intelligence doesn't seem likely for an OAI model

#

the only way that could happen is if they did another OSS model, but that's highly unlikely right now and still wouldn't explain the chinese censorship

tidal bough
lavish sleet
#

i mean we didn't even expect them to do OSS in the first place lol

#

so im not ruling anything out

tidal bough
#

-# imagine it's deepseek v4 but performance is worse than expected because huawei gpu

uneven valve
#

o new stealth models

#

any real clues as to who made these so far?

lavish sleet
#

consensus is that it's probably a ring 1t model

tidal bough
lavish sleet
#

which makes sense since it doesn't feel very capable

covert robin
tidal bough
lavish sleet
#

it's definitely an improvement from the last ring model

#

still nowhere near SOTA but not bad

tidal bough
#

I will still bet on Mimo v3 tho

covert robin
#

it'll be pretty sad if the status quo of all the chinese labs just distilling from anthropic keeps up though

#

completely defeats the idea of "open source could do something innovative better than the big labs"

lavish sleet
#

yeah

#

i genuinely thought the open chinese labs were just making huge strides

#

till that anthropic post came out

covert robin
#

when deepseek v3/r1 originally released I remember it was genuinely better than 4o in coding and shit

tidal bough
lavish sleet
#

still taking it with a grain of salt

uneven valve
covert robin
#

r1 was actually more useful than o1 too

#

(for coding)

tidal bough
#

GPT fall down tho Pensivesmh

#

when was the last time GPT was in stealth

#

5-chat?

covert robin
#

thankfully I think deepseek isn't in the same hamster wheel of profit seeking as the other labs

#

rumor is all of the other chinese labs are very much in the "90% as good for 1/10th the cost" mode

#

which is fine, and useful, but not that exciting

tidal bough
#

Yeah, deepseek actually makes their models cheaper

strange cliff
#

but wait deepseek is a company that is wanting to make agi

#

and the model said something about being made by a company something about agi i remember

#

and it has to comply with china laws

#

maybe deepseek

storm trail
#

not deepseek, as deepseek's leaked param count isn't 1T

#

but if this IS deepseek, then we would have agi

wheat cave
vivid wigeon
#

I sure hope its not deepseek cause its terrible and would be the biggest disappointment.

wheat cave
#

its mimo

storm trail
# wheat cave

i thought hunter alpha is gpt 5.1 and was poker face

vivid wigeon
#

Feels like a 200B, worse than glm 4.7 was

tidal bough
#

Imagine it's deepseek v4 but it's worse than expected because Huawei GPU TROLLGOD

modest bridge
#

Who is mimo

tidal bough
#

yes that xiaomi

#

I wonder if Xiaomi will do small model like Gamma to run on their flagship phone thinkies

storm trail
#

*gemma, also waiting badly

whole canopy
#

I think Hunter's pretty good, actually.

strange cliff
#

its easy

tidal bough
strange cliff
#

you'll not run a mimo model on it

#

even though xiaomi made mimo models

limpid star
#

Maybe it's the Poco model

covert robin
#

it's Ring/Ming

covert robin
#

(and Ming is the only existing omni model of this caliber with these input modalities.....)

#

My only thing to be skeptical about is that Ming was image/audio output too

#

I think it was pretty shit though

#

so I wouldn't be surprised if they reverted it for this release

autumn reef
covert robin
#

Pretty doubtful that this is a xiaomi model because it would be a significant regression compared to mimo v2

covert robin
# covert robin Pretty doubtful that this is a xiaomi model because it would be a significant reg...

(compare this code)

mimo v2

            // Castling Logic
            if (type === 'k' && !isKingInCheck(color)) {
                const rowIdx = color === 'w' ? 7 : 0;
                if (row === rowIdx && col === 4) {
                    // Kingside
                    if (castlingRights[color].k) {
                        if (!boardState[rowIdx][5] && !boardState[rowIdx][6]) {
                            if (!isSquareAttacked(rowIdx, 5, color) && !isSquareAttacked(rowIdx, 6, color)) {
                                moves.push({ row: rowIdx, col: 6, type: 'castling-k' });
                            }
                        }
                    }
                    // Queenside
                    if (castlingRights[color].q) {
                        if (!boardState[rowIdx][3] && !boardState[rowIdx][2] && !boardState[rowIdx][1]) {
                            if (!isSquareAttacked(rowIdx, 3, color) && !isSquareAttacked(rowIdx, 2, color)) {
                                moves.push({ row: rowIdx, col: 2, type: 'castling-q' });
                            }
                        }
                    }
                }
            }

Hunter Alpha:

function castleMoves(r, c, w, mv) {
  const row = w === 'w' ? 7 : 0,
        ok = w === 'w' ? G.cwK : G.cbK,
        oq = w === 'w' ? G.cwQ : G.cbQ,
        e = w === 'w' ? 'b' : 'w',
        rook = w === 'w' ? 'R' : 'r';

  if (ok && cp(row, 5) === ' ' && cp(row, 6) === ' '
      && cp(row, 7) === rook
      && !attacked(row, 4, e) && !attacked(row, 5, e) && !attacked(row, 6, e))
    mv.push({ f: { r, c }, t: { r: row, c: 6 }, castle: 'K' });

  if (oq && cp(row, 3) === ' ' && cp(row, 2) === ' ' && cp(row, 1) === ' '
      && cp(row, 0) === rook
      && !attacked(row, 4, e) && !attacked(row, 3, e) && !attacked(row, 2, e))
    mv.push({ f: { r, c }, t: { r: row, c: 2 }, castle: 'Q' });
}
#

ain't no way the same lab would release these models within a few months of each other

#

lol

crimson terrace
#

but ling 2.5 and ring 2.5 were released last month

covert robin
#

so?

#

releases coming pretty quick

#

GLM 5 was released 1.5 mo after GLM 4.7

#

and it was available on OR about 2w before release(?)

#

so only about 1mo

earnest pewter
#

so it was kimi models huh

covert robin
#

this isn't kimi

plucky fulcrum
#

its grok?

covert robin
#

????

#

those aren't the reveals of this model

#

💀

plucky fulcrum
#

oh

#

sorry

#

i've just refreshed the page and saw that

strange cliff
#

when it was still a free model on openrouter

covert robin
#

yeah but I just used it and gave the same prompt to each

#

those are snippets from them

#

I'd be pretty surprised if it was just some random failure case

void ivy
#

Distilling Claude much?

crimson terrace
plucky fulcrum
#

Is there a release date?

lofty moon
#

If anything, it's confirmed not Grok. There was a slight update, I think? Seems much better than before for some reason.

modest heart
green patrol
smoky leaf
#

guys, it's definitely not

#

it's highly censored following CCP rules, obviously it's chinese

green patrol
dapper iris
#

May 2025 knowledge cutoff btw (or so it claims)

summer anchor
covert robin
#

that's fine

summer anchor
#

sonnet 3.7 introduced reasoning after R1, we can be pretty sure they pored through every inch of deepseek paper

covert robin
#

still doesn't change the fact that distilling from frontier models is a lot less impressive than actually catching up to the frontier

summer anchor
#

sure don't disagree, frontier model remains a compute + data heavy endeavour, I think it would be more interesting to see what happens when open source wasn't dealing with 100x less compute, but such is the world

covert robin
#

yea

patent zenith
#

Money always wins

chrome cosmos
#

hunter is really impressive ngl

#

for frontend

smoky leaf
#

examples

lavish sleet
graceful halo
#

Haven’t tried building with it but have let it do some code review tasks for a project I am working on, and the reasoning is very solid, as is code understanding (codebase is roughly 50k LOC, so the larger context size seems to be well utilised). Using it in oh my pi and it doesn’t seem to be struggling with hash based tools for reading segments of files; will need to task it with some building to see if that carries over well to edits (will try and give it some frontend to work on if I can find some tasks)

cunning tinsel
#

has anyone considered... idk baidu? bytedance? tencent?

strange cliff
#

highly unlikely theyre testing other model

modest bridge
#

i really wanna try their pro version

wary cloak
#

Hunter alpha improved modestly on LateralBench when I retested after its bugfix. It now scores between GLM-5 and K2.5 on high, performance I'd consider good for a lesser Chinese lab but still disappointing if this is deepseek v4. Non-thinking mode not recommended.

undone wyvern
#

What was the bug exactly - it only affected interleaved thinking?

nova swan
wary cloak
nova swan
#

name not resolved?

wary cloak
strong kelp
#

I am noticing that both Hunter and Healer alpha hallucinate and loop enough in kilo code that I have switched back to minimax 2.5.

Hunter has the habit of hallucinating code or projects it was trained on that are related to your code. Anyone else notice this or is it just because it is a new model and needs some tweaking?

strong kelp
#

An example of what I am talking about ..

Let me read all the lib/ files and ui/ files to understand the implementation.
Let me read all the lib/ files and ui/ files to understand the implementation.
Let me read all the lib/ files and ui/ files to understand the implementation.
Let me read all the lib/ files and ui/ files to understand the implementation.
Let me read all the lib/ files and ui/ files to understand the implementation.
Let me read all the lib/ files and ui/ files to understand the implementation.
Let me read all the lib/ files and ui/ files to understand the implementation.
Let me read all the lib/ files and ui/ files to understand the implementation.
Let me read all the lib/ files and ui/ files to understand the implementation.
Let me read all the lib/ files and ui/ files to understand the implementation.
slender wagon
# slender wagon
Poll: Which model is it? Winning answer: Others (32 of 56 votes)

dry cedar
#

I like the humor of this when chit chatting about opinions on things

brave spire
#

no image support 😭

#

||Hunter reasoning traces seem oddly deepseek. I have not used deepseek for half a year but I remember. Healer sits between probably or probably not, but also, reasoning trace matters.||

brave spire
fervent spire
#

China Communist Party?

#

😭😭😭

#

@proud canyon

summer anchor
brave spire
fervent spire
#

THIS MODEL CAN'T ESCAPE ITS OWN OUTPUT😭

tame olive
#

"I can stop any time I want"

torn siren
#

AGI confirmed

teal wagon
#

Does anybody know when the hunter models are going away?

nova swan
#

no

wet escarp
#

Is it just me or the hunter tps has reduced significantly since yesterday?

autumn reef
nova swan
#

expected due to the claws and the gooners

#

so

#

i think im pretty happy w the model as a flash-style model

#

all comes down to the pricing really

wet escarp
#

any faster and smarter model for free ?

nova swan
#

no

nova swan
smoky leaf
undone wyvern
#

Never thought I'd die side by side with the jai gooners, fighting clawers

tidal bough
modest bridge
#

we need to form an lotr type alliance against sauron (vibe code bros)

undone wyvern
#

For some reason, it loves to come up with depressing / sad scenarios. Why - it is weird and interesting to see, or maybe it catches some prompt tokens too literally

dry cedar
#

You gave it AI depression before it could give you AI psychosis

undone wyvern
#

I literally plug'n'played my usual tests with different scenarios

#

Bro is not okay after interacting with clawers

dapper iris
cloud hinge
topaz cloak
#

Sub gooning

tidal bough
undone wyvern
#

Agent gooning?

uncut vault
tame olive
#

parallel gooning

fading knot
#

anyone know when it is going away/become premium model? so i can prepare my heart XD

sacred spindle
#

what model is this?

lyric orchid
#

No info is ever given, aside from stuff in #announcements

violet harbor
#

Allegedly (I haven't verified myself) a number of the molts have paying subscribers

analog wraith
#

Now supports vision

undone wyvern
#

Huh?

dry cedar
modest bridge
undone wyvern
#

I don't trust anyone anymore

modest bridge
#

all the proof they gave was plausible lemme find it

#

either way reveal is gonna be this week so we'll see

undone wyvern
#

DS4, considering it's 1T size, would be trained and RLHF-ed from scratch or close to it, so it can have totally different architecture and 'token bugs'

modest bridge
#

yep valid counterargument which if true just disassembles this entire post lol

undone wyvern
#

I am not 100% sure about going 650B->1T and fully retraining being the only way, maybe someone who worked with DS V1-V2 can say for sure

harsh plinth
fiery scarab
#

yo whats the word who is the model

strange cliff
#

its a chinese model

#

maybe deepseek maybe stepfun maybe minimax

#

proof its a chinese model: it has strict compliance with china laws

lyric orchid
#

It's 100% Chinese

#

Interesting slop profile on EQBench Long

Most Similar To:
healer-alpha (distance=0.745)
Qwen/Qwen3.5-27B (distance=0.757)
deepseek-ai/DeepSeek-V3.1 (distance=0.761)
XiaomiMiMo/MiMo-V2-Flash (distance=0.774)
Qwen/Qwen3.5-122B-A10B (distance=0.775)

#

No Western model to be seen

undone wyvern
#

So, claude-distilled models

lyric orchid
#

I mean if you dig deep enough it's all Gem or Claude distilled =P

#

But neither shows on its profile

nova swan
#

woah they added image inputs ?

#

thats cool

#

actually thats just great

runic nymph
#

hallucination maybe but hunter alpha tells me its mimo by xiaomi

native moss
#

mistral 💀

dry cedar
#

Picks a random company

violet harbor
harsh plinth
#

🔥

smoky leaf
#

sure

marble stream
#

i am so hoping it is an open weights model

pseudo rover
#

Plot twist: the 1T parameter thing is a red herring and it just randomly routes to a different CN model for each request it receives

golden dagger
#

what's up with hunter alpha today?

tacit fossil
golden dagger
#

With alpha hunter

#

Provider error

#

Returned

tacit fossil
golden dagger
#

I just sent a message

#

On telegram

tacit fossil
kind marsh
#

Uptime tanked
Can't generate anything

lunar oasis
lyric orchid
#

What's a Clawl?

outer hollow
dry cedar
#

Oh, huh

#

Healer is indeed 262K and Hunter is 1M

lavish sleet
#

yeah, mimo and ring were both the top guesses

#

this makes sense

#

a bit underwhelming from mimo though

harsh tree
#

Hunter is one of the best long context models I have tested

#

kinda sad its going

hallow raven
#

gonna generate some last minute dataset from this then

nova swan
#

i really want to generate datasets but i dont even know like what to burn the tokens on

hallow raven
#

and some people's claude ones

#

and reusing the questions

tranquil ledge
#

if it was xiaomi then its way better than their previous release

#

last mimo was garbage

#

hunter alpha is pretty good

dry cedar
#

MiMo V2 is pretty good for the cost, though

harsh tree
#

Caching was really good too

#

sadly they dont have batch api

harsh tree
#

Yup its going to sunset any moment now

strange cliff
#

hunter alpha is getting removed?

#

as well theres still healer

undone wyvern
tranquil ledge
#

all i care about is rp

#

and its okayish at that

harsh tree
#

get a life

tranquil ledge
#

sloppy af

tranquil ledge
# harsh tree get a life

perhaps there are more important things to do than reply to random people within less than a few seconds

harsh tree
#

ironic for a guy who RPs with LLMs talking about important things

dry cedar
#

No need to instigate this conflict

undone wyvern
#

Agreed, being toxic is a no-no on this server

harsh plinth
nova swan
#

what is the point though

#

i get it if it was a claude response or something but

#

:(

#

like for distillation?

hallow raven
hallow raven
nova swan
#

i bet two days after it goes away ill have the biggest project ever that will require the most tokens ever required

#

and i will be so sad that hunter is gone

#

and wait until the next stealth model :(

dry cedar
#

Or be like me, BYOK to everything that has a free tier

heady fable
nova swan
dry cedar
#

Groq, Mistral, Cohere, NVidia NIM

#

Most of it goes to Mistral

patent zenith
#

Groq has a free tier?

dry cedar
#

(For low context requests)

nova swan
#

did this model just get better ?

#

it was looping a LOT before in agentic/opencode

#

and now its not doing it at all

undone wyvern
#

It changed reasoning style and got better in writing about ~1.5 days ago

patent zenith
undone wyvern
harsh tree
#

Not yet

lyric orchid
#

Aight, what are your guys' price predictions?

heady fable
#

Xiaomi, both of them

tidal bough
violet harbor
rugged thistle
#

i wonder what model this would be

plucky oriole
#

Is hunter alpha down?

#

Or they're going out of stealth?

#

Both uptime and latency are very bad now

lunar oasis
lyric agate
#

Xiaomi.. Bring your cheap phone strategy into machine learning development

lunar oasis
tidal bough
kind marsh
undone aspen
kind marsh
#

so is this model basically mimo v2 pro ? or what is it?
cuz if it is mimo then damn makes sense, cuz mimo v2 flash is really good for its price

tacit fossil
kind marsh
tacit fossil
#

i've removed my minimax thumbnail to prevent any further emotional distress

kind marsh
obsidian sail
#

Now you know that im always right.
Remember this in the future.

kind marsh
heady fable
patent zenith
tranquil ledge
#

excited for the tts

#

best tts so far was the qwen release

heady fable
# tranquil ledge excited for the tts

I am curious about it for sure, will be interesting to see how it performs, especially against giants like ElevenLabs - still nothing on docs though so just waiting - you can make reqs to pro (I would guess Omni too) - just no concrete idea of pricing for Omni yet

tranquil ledge
#

omni is probably gonna be priced like mimo flash

heady fable
#

Realised I could just make a req to each and Xiaomi provides reasonably nice pricing info, so can work out pricing:
Pro:
AA already confirmed but does seem to be $1 per mil in and $3 per mil out at least for small context

Omni:
Coming at around $0.41 per million input, and $2 per million output for text tokens (got some cached usage too, that comes out around $0.125 per million cached read with a small sample estimate)

patent zenith
#

Praying healer and hunter are actually Mimo models

heady fable
#

I'd be very surprised if not - there are too many similarities for them not to be I think

heady fable
heady fable
unkempt stag
heady fable
#

Sorry, how do you mean Patrick?

unkempt stag
heady fable
#

Oh, I'm not staff, I am sorry, I believe the Hunter and Healer Alpha models are Pro and Omni - I don't know if @proud canyon can confirm yet but all signs point to the stealth models being these models

unkempt stag
heady fable
#

I would expect so yes, as they are stealth, I would expect OR to have a good relationship with the MiMo team and they will come online pretty quickly

hallow raven
undone wyvern
#

Good. Not Deepseek

unkempt stag
undone wyvern
#

Wait. Not Deepseek 😭

covert robin
#

wow

#

I'm amazed it was actually mimo v2 pro, as in my few code tests it consistently made way shittier code and made dumb mistakes that mimo v2 flash did not

#

Mimo V2 Flash:

#

Hunter Alpha:

#

what an insane regression in coding style

#

sad it wasn't Ring/Ming

#

my pull was wrong 😭

tacit fossil
covert robin
#

(Also lots of embedding its thinking inside of the code in Pro, which was not there in Flash!)

tacit fossil
covert robin
#

I wrote off mimo because I thought to myself “no way a serious lab would regress this severely in code…”

covert robin
#

I can redo it

tidal bough
tacit fossil
covert robin
#

I’ll try again

tacit fossil
#

i know, why don't i just actually test it

#

i know coding

covert robin
tacit fossil
#

looks good to me

covert robin
#

Nope still just as shit

#

I guess it could be my specific test or something…

#

Can you post a sample?

dry cedar
#

Well, can this be prompted away?

covert robin
#

Default behavior is an insane regression though

#

💀

#

At least in this test

tacit fossil
#

In Javascript, write the code for a cellula automata roguelike dungeon generator

i didnt expect it to do all this

covert robin
#

I asked it “write good, readable code” in the prompt this time

#

Very slightly better, perhaps

#

Still super code golfed

#

long css lines without new lines…

#

One letter class names

tacit fossil
#

what's the full prompt you used?

#

i was joking before but it does actually look like "compiled" js

dry cedar
#

"Use descriptive variable names"

tacit fossil
vivid wigeon
#

Yea the model was terrible at both coding and creative writing

#

thankfully it wasnt deepseek

digital ledge
#

nah.

#

I dont agree with you that it's terrible at creative writing

tacit fossil
#

ok i asked it for a chess game and it wrote similar looking code

#

its working though...

modest bridge
#

yeah i heard its creative writing is good too

covert robin
#

ah yes... i am the hybrid white/black pieces..

harsh tree
#

Long context of hunter alpha is top notch

tacit fossil
covert robin
tacit fossil
#

“Black Or White” by Michael Jackson
Listen to Michael Jackson: https://MichaelJackson.lnk.to/_listenYD

limpid star
#

If Xiaomi makes models, will they make cheaper ones and name them Poco?

undone wyvern
#

It changed reasoning style again, I don't know what's happening

smoky leaf
#

for better or for worse

undone wyvern
#

It seems to follow instructions better both in reasoning block and in output. Not sure yet - but it's suddenly different from yesterday's

digital ledge
#

Creative writing?

undone wyvern
#

Yes, but different tasks too

#

Or maybe I am seeing things

harsh tree
#

Loved the hunter model

#

0 < Tokens ≤ 256K: Input: $1.00 / 1M tokens, Cached Input: $0.20 / 1M tokens, Output: $3.00 / 1M tokens

#

lmao with this costing I would have spent north of 15k USD

#

😋

kind marsh
#

FUCK it's expensive
Goddammit back to using mimo V2 flash

#

Jeezuz at 14k context it's blowing through 1 cent per msg.
Fuck. I love it, but damn it the price is Gemini level
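
For anyone who wants to sanity check the "1 cent per msg" figure, here is a rough sketch using the prices quoted above for the 0 to 256K tier ($1.00 input, $0.20 cached input, $3.00 output, per 1M tokens). The function name `estimateCostUsd` and the token counts are made up for illustration, and it assumes billing is a plain per-token rate with no minimums or extra fees, which the chat does not confirm.

```javascript
// USD per 1M tokens, as quoted in the chat for the 0 < tokens <= 256K tier.
// Assumption: billing is a simple linear per-token rate.
const PRICE_PER_MILLION = { input: 1.0, cachedInput: 0.2, output: 3.0 };

// Hypothetical helper: estimate the cost of one request in USD.
function estimateCostUsd({ inputTokens = 0, cachedInputTokens = 0, outputTokens = 0 }) {
  const total =
    inputTokens * PRICE_PER_MILLION.input +
    cachedInputTokens * PRICE_PER_MILLION.cachedInput +
    outputTokens * PRICE_PER_MILLION.output;
  return total / 1_000_000;
}

// 14k tokens of context, nothing cached, output ignored.
const uncached = estimateCostUsd({ inputTokens: 14_000 });
// Same context, but fully served from cache.
const cached = estimateCostUsd({ cachedInputTokens: 14_000 });

console.log(uncached.toFixed(4)); // 0.0140
console.log(cached.toFixed(4));   // 0.0028
```

So 14k uncached input tokens alone come to about 1.4 cents before any output is counted, which lines up with the complaint. If the context were fully cached, the input side would drop to roughly a fifth of that.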

pseudo rover
#

Yeah GLM 5's better for now, sorry

#

Still if they follow through with improving it some more before releasing the weights like they said they would, maybe it'll eventually be worth it

kind marsh
#

Not necessarily. I've used glm 5, it's still relatively inefficient with the reasoning tokens

pseudo rover
#

I really, really want to like it but I just can't do it. It reasons a little less, but the reasoning itself is wildly inconsistent at times and it doesn't necessarily feel better for it. Idk, it just doesn't land for me yet

#

It'd be a lot more competitive if it was either better or cheaper

patent zenith
#

Ime GLM yaps for a goddamn millennium. I haven’t used MiMo V2 Pro or Omni to have strong opinions about them, but yeah, the G3 flash price is very contentious since there’s already genuinely SOTA models at that price, so any model that falls short of that is underwhelming and feels as though it’s just not worth the price.

kind marsh
kind marsh
patent zenith
nova swan
#

bro HOW is it even possible that this mimo v2 pro shit is more expensive and worse than glm

patent zenith
#

what model is on the right? Thats not the pricing for GLM

digital kelp
digital kelp
patent zenith
#

ah true but still

#

cache hit rate is crazy right now for MiMo Pro

lyric agate
upper parrot
#

yes, a lot of use, but im pretty sure like 99% of those people dont actually get anything done with it

smoky leaf
drowsy moth
#

What did this model end up being?

#

Was it the mimo one?

dry cedar
#

Yep

drowsy moth
#

Got it. Was it the pro or omni one?

dry cedar
#

This one was Pro and Healer was omni

drowsy moth
#

Gotcha, thanks for confirming!

tame olive
plucky oriole
#

They nerfed it didn't they

wheat cave
#
Poll: Who is it? Winning answer: deepseek (22 of 32 votes)

#

(it was mimo 😭)

storm trail
#

previously, mimo was so irrelevant nobody thought of it seriously. no way dipsy is this unpolished, hail dipsy.

tacit fossil
wheat cave
strange cliff
harsh tree