GLM 5 | OpenRouter | Page 1

safe hill Feb 11, 2026, 12:52 PM

#

https://chat.z.ai/

Z.ai Chat - Free AI powered by GLM-4.7 & GLM-4.6

Chat with Z.ai's free AI to build websites, create presentations, and write professionally. Fast, smart, and reliable, powered by GLM-4.7.

glacial onyx Feb 11, 2026, 12:55 PM

#

we got the open source week

paper meadow Feb 11, 2026, 12:57 PM

#

https://tenor.com/view/happy-birthday-pony-gif-27170110

Tenor

glad scroll Feb 11, 2026, 1:03 PM

#

hi!

worn depot Feb 11, 2026, 1:10 PM

#

https://glm5.com redirects to pony alpha lol

Pony Alpha - API, Providers, Stats

Pony is a cutting-edge foundation model with strong performance in coding, agentic workflows, reasoning, and roleplay, making it well suited for hands-on coding and real-world use.

Note: All prompts and completions for this model are logged by the provider and may be used to improve the model. Run Pony Alpha with API

heady shadow Feb 11, 2026, 1:42 PM

#

yay

ebon depot Feb 11, 2026, 3:21 PM

#

its out

worn depot Feb 11, 2026, 3:22 PM

#

on hf?

#

no ,not yet

ebon depot Feb 11, 2026, 3:22 PM

#

on api

worn depot Feb 11, 2026, 3:22 PM

#

oh

placid crescent Feb 11, 2026, 3:23 PM

#

😮

sinful echo Feb 11, 2026, 3:23 PM

#

can 'thinking' be turned off with this? still haven't seen any docs on z.ai's website

placid crescent Feb 11, 2026, 3:23 PM

#

(they better make a glm-5-flash for me >v< )

stable salmon Feb 11, 2026, 4:38 PM

#

Now the battle if this is better the same or worse than Pony Alpha.

pseudo anvil Feb 11, 2026, 4:38 PM

#

im nervous

ebon depot Feb 11, 2026, 4:39 PM

#

things are going well on the z.ai server

paper meadow Feb 11, 2026, 4:50 PM

#

Most polite way to talk with laowai

plain topaz Feb 11, 2026, 4:51 PM

#

only on max

glad scroll Feb 11, 2026, 4:54 PM

#

https://openrouter.ai/z-ai/glm-5

Model Not Found | OpenRouter

The model you are looking for could not be found.

ebon depot Feb 11, 2026, 4:57 PM

#

half canopy Feb 11, 2026, 4:58 PM

#

https://docs.z.ai/guides/llm/glm-5

Overview - Z.AI DEVELOPER DOCUMENT

GLM-5 - Overview - Z.AI DEVELOPER DOCUMENT

#

Wake up babes, the docs and graphs for glm 5 are up

#

1 dollar input and 3.2 dollar output pricing

ebon depot Feb 11, 2026, 5:01 PM

#

more expensive than kimi 2.5

half canopy Feb 11, 2026, 5:01 PM

#

ebon depot >more expensive than kimi 2.5

It's better than kimi 2.5 from what I tested, it also has more active params from what has been leaked.

rotund wing Feb 11, 2026, 5:02 PM

#

half canopy Feb 11, 2026, 5:02 PM

#

Considering it performs on the level of opus 4.5 thinking, it's not that expensive

pseudo anvil Feb 11, 2026, 5:02 PM

#

rp mention 🔥

rotund wing Feb 11, 2026, 5:03 PM

#

yey

pseudo anvil Feb 11, 2026, 5:03 PM

#

paper meadow Feb 11, 2026, 5:03 PM

#

Open weights?

ebon depot Feb 11, 2026, 5:03 PM

#

paper meadow Open weights?

presumably, other providers have it

half canopy Feb 11, 2026, 5:03 PM

#

Agents, coding and gooning(RP). It covers the trinity of LLM usage 🍾

chilly leaf Feb 11, 2026, 5:06 PM

#

any providers support nothink?

worn depot Feb 11, 2026, 5:07 PM

#

benchmarks are here on the api and blog now

#

https://z.ai/blog/glm-5

#

~~though weird they dont compare to k2.5~~

worn depot Feb 11, 2026, 5:08 PM

#

worn depot ~~though weird they dont compare to k2.5~~

nvm they do lower on the page, just not in the main benchmark image

#

but its more expensive than k2.5 tho

cobalt furnace Feb 11, 2026, 5:08 PM

#

No GLM-5 on the Pro coding plan 🥀

worn depot Feb 11, 2026, 5:09 PM

#

we’re rolling out GLM-5 to Coding Plan users gradually.

half canopy Feb 11, 2026, 5:09 PM

#

cobalt furnace No GLM-5 on the Pro coding plan 🥀

It is slowly being added

worn depot Feb 11, 2026, 5:09 PM

#

max get it right now, but i guess its a slow rollout for the others

half canopy Feb 11, 2026, 5:09 PM

#

And will consume more credits than glm 4.6

#

Also considering it's 4x the price of Kimi 2.5 in their own benchmark, doesn't look that good even if it's 2-3% better

#

😭

worn depot Feb 11, 2026, 5:10 PM

#

i honestly never liked glm models, but this one atleast under pony alpha was quite good

half canopy Feb 11, 2026, 5:11 PM

#

worn depot i honestly never liked glm models, but this one atleast under pony alpha was qui...

The models directly using the api aren't worth the money, but if you buy their coding plan. For 9 dollars for 3 months, you can use like 30 mil of something tokens every 5 hours

#

Super high value

light dust Feb 11, 2026, 5:11 PM

#

worn depot i honestly never liked glm models, but this one atleast under pony alpha was qui...

Against kimi?

half canopy Feb 11, 2026, 5:11 PM

#

On the most basic plan

worn depot Feb 11, 2026, 5:11 PM

#

light dust Against kimi?

in general, i didn't really compare them on the same tasks
but it felt about as good

light dust Feb 11, 2026, 5:12 PM

#

worn depot in general, i didn't really compare them on the same tasks but it felt about as...

Unrelated but what do you use for agentic coding? Kimi or glm

worn depot Feb 11, 2026, 5:12 PM

#

kimi

ebon depot Feb 11, 2026, 5:12 PM

#

half canopy The models directly using the api aren't worth the money, but if you buy their c...

not with glm 5 anymore

worn depot Feb 11, 2026, 5:12 PM

#

and minimax for simpler stuff

light dust Feb 11, 2026, 5:12 PM

#

worn depot kimi

Thinking or 2.5?

worn depot Feb 11, 2026, 5:12 PM

#

light dust Thinking or 2.5?

2.5

light dust Feb 11, 2026, 5:12 PM

#

worn depot 2.5

Thanks

#

I never got minimax to work for me

#

It really feels dumb

kind rover Feb 11, 2026, 5:13 PM

#

lets see how this model does

half canopy Feb 11, 2026, 5:13 PM

#

ebon depot not with glm 5 anymore

They're slowly rolling glm 5 to the coding plan, I believe

light dust Feb 11, 2026, 5:13 PM

#

But if the translation could be Opus = kimi 2.5, Sonnet = Glm, Haiku = Minimax

ebon depot Feb 11, 2026, 5:13 PM

#

half canopy They're slowly rolling glm 5 to the coding plan, I believe

not to lite

half canopy Feb 11, 2026, 5:14 PM

#

ebon depot not to lite

oO 😭

inland heath Feb 11, 2026, 5:14 PM

#

Ow, this is pricy

worn depot Feb 11, 2026, 5:14 PM

#

light dust But if the translation could be Opus = kimi 2.5, Sonnet = Glm, Haiku = Minimax

i wish, but definitely not opus level, but better than sonnet

inland heath Feb 11, 2026, 5:14 PM

#

Cheapest provider is $0.80 / $2.56

light dust Feb 11, 2026, 5:15 PM

#

worn depot i wish, but definitely not opus level, but better than sonnet

Yeah unfortunately no os model

#

Comes close to the one shot performance of opus

ebon depot Feb 11, 2026, 5:15 PM

#

https://tenor.com/view/monkey-monkey-sleep-sleep-wrapped-up-tucked-in-gif-5999802737525363478

Tenor

light dust Feb 11, 2026, 5:15 PM

#

But opus is so damn expensive 😭

light dust Feb 11, 2026, 5:15 PM

#

inland heath Cheapest provider is $0.80 / $2.56

Congrats on the mod role bro

#

Momentum got you something nice KEKW

ebon depot Feb 11, 2026, 5:16 PM

#

the lgbtq+ community has forgiven momentum

light dust Feb 11, 2026, 5:16 PM

#

kind rover lets see how this model does

Glm 5 is pretty nice, at least the hidden version of the model worked nice with opencode

inland heath Feb 11, 2026, 5:17 PM

#

Seems like the Z.ai server is in the middle of something

#

Some stuff about the pro plan changing prices

plain topaz Feb 11, 2026, 5:19 PM

#

glm5 very slow rn

muted drum Feb 11, 2026, 5:19 PM

#

Can't use it...

tribal elbow Feb 11, 2026, 5:22 PM

#

thank you!

cerulean bough Feb 11, 2026, 5:28 PM

#

ebon depot things are going well on the z.ai server

day i upgraded to max I got booted, no idea why was actually defending them against some guy going crazy lmao..weird serve

ebon depot Feb 11, 2026, 5:32 PM

#

https://huggingface.co/zai-org/GLM-5

zai-org/GLM-5 · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

summer gulch Feb 11, 2026, 5:34 PM

#

It's actually a bit of an upgrade - Pony struggled to consistently add colour tags within story dialog, but this actually seems to be managing it so far.

warm silo Feb 11, 2026, 5:36 PM

#

inland heath Some stuff about the pro plan changing prices

it's not just that, they also changed some other things about their plans such as adding a weekly limit and removing the "flagship model updates" promise from the pro plan, glm-5 is currently on the max plan only (which is now $80 a month)

#

all plans got price hikes as well, Lite was $6 a month and now is $10 for the same usage and models

inland heath Feb 11, 2026, 5:37 PM

#

Oh, boy

kind rover Feb 11, 2026, 5:38 PM

#

Yeah this model isn't that good for me

sinful echo Feb 11, 2026, 5:42 PM

#

Can we disable thinking?

rare hedge Feb 11, 2026, 5:49 PM

#

warm silo it's not just that, they also changed some other things about their plans such a...

LOL.

#

who would pay $80 a month for a shitty chinese model instead of $100 for opus 4.6

kind rover Feb 11, 2026, 5:51 PM

#

heres what I got out of it https://x.com/Gardasio/status/2021643274952618251

#

I think this model has potential just that compute is just not there yet

#

maybe if cerebras can host it without cooking it too badly it would have potential

ebon depot Feb 11, 2026, 5:52 PM

#

yeah z.ai is clearly compute constrained

kind rover Feb 11, 2026, 5:52 PM

#

but yeah still far away I think from being usable day to day

kind rover Feb 11, 2026, 5:52 PM

#

ebon depot yeah z.ai is clearly compute constrained

100%

wooden briar Feb 11, 2026, 5:54 PM

#

https://chat.z.ai/space/q1aa81f5fz31-art

Awesome

Z.AI

Z.AI 分享

来自 Z.AI 的精彩内容分享

#

(ghost in the shell reference)

merry plover Feb 11, 2026, 6:07 PM

#

lul, this model tax is insane

#

#

(it's basically the same model architecture!)

mellow ember Feb 11, 2026, 6:08 PM

#

LOL IPO

drowsy kettle Feb 11, 2026, 6:08 PM

#

How is it for roleplaying?

ebon depot Feb 11, 2026, 6:09 PM

#

very good

mellow ember Feb 11, 2026, 6:09 PM

#

I'm liking it so far

velvet patio Feb 11, 2026, 6:09 PM

#

Pony was good so assuming it didn't get neutered probably solid

merry plover Feb 11, 2026, 6:14 PM

#

claude thinks glm moe should be only ~10% more expensive : )

sinful echo Feb 11, 2026, 6:16 PM

#

Can we disable reasoning here?

velvet patio Feb 11, 2026, 6:19 PM

#

merry plover claude thinks glm moe should be only ~10% more expensive : )

Bulbasaur just offered to host GLM 5 on OpenRouter for 10% more cost than GLM 4.7. What a legend

merry plover Feb 11, 2026, 6:19 PM

#

GLM 5 performs 4 points higher than Claude Opus 4.5 in TerminalBench

#

doge_kek

merry plover Feb 11, 2026, 6:20 PM

#

velvet patio Bulbasaur just offered to host GLM 5 on OpenRouter for 10% more cost than GLM 4....

if there was an easy way for me to just pay a lump sum -> rent out to openrouter, I would

#

tokens need to have property managers

cerulean bough Feb 11, 2026, 6:20 PM

#

oof those new sub prices.

cobalt furnace Feb 11, 2026, 6:34 PM

#

I see that lite has gone up, but I'm pretty sure Pro stayed the same at 30USD/mo

chilly leaf Feb 11, 2026, 6:35 PM

#

5 at least writes a lot better than 4.X, much less rigid and ism prone

#

was a little worried it would go the way of deepseek

placid crescent Feb 11, 2026, 6:42 PM

#

chilly leaf was a little worried it would go the way of deepseek

what happened to deepseek?

chilly leaf Feb 11, 2026, 6:45 PM

#

versions after 0528 became much more sterile imo

cerulean bough Feb 11, 2026, 6:55 PM

#

chilly leaf 5 at least writes a lot better than 4.X, much less rigid and ism prone

huge upgrade on 4.7

#

can use 3x concurrency on 4.7 now with max plan

#

but i want the shiny new toy

half geode Feb 11, 2026, 7:03 PM

#

The coding plan stuff is going to burn some goodwill

cyan sequoia Feb 11, 2026, 7:08 PM

#

merry plover lul, this model tax is insane

its a much better model though

#

so makes sense for them to charge on how good it is while still overcutting the ones that do match / out perform it by a good margin

half geode Feb 11, 2026, 7:09 PM

#

I get it, the model is bigger and compute is constrained, price hiking is understandable, but not giving access to the people who already paid is what will piss people off

cyan sequoia Feb 11, 2026, 7:09 PM

#

This feels like sonnet 4.5

#

but 1/6 ish the price

#

maybe even better than sonnet

faint sphinx Feb 11, 2026, 7:10 PM

#

Hello everyone

cerulean bough Feb 11, 2026, 7:10 PM

#

half geode I get it, the model is bigger and compute is constrained, price hiking is unders...

huh? I have max plan until the end date..did they say something about the change?

cerulean bough Feb 11, 2026, 7:10 PM

#

faint sphinx Hello everyone

gm

half geode Feb 11, 2026, 7:10 PM

#

cyan sequoia but 1/6 ish the price

What? It is 3x cheaper, not 6x.

half geode Feb 11, 2026, 7:11 PM

#

cerulean bough huh? I have max plan until the end date..did they say something about the change...

Max has access rn. Pro supposedly will in the future. Lite will not.

cerulean bough Feb 11, 2026, 7:11 PM

#

half geode Max has access rn. Pro supposedly will in the future. Lite will not.

Oh i get you, lite and pro dont have 5 access, thanks for clearing that up

cyan sequoia Feb 11, 2026, 7:12 PM

#

$1 $3.20 $0.20 vs $3 $15 $0.3

#

that is input / output / cache

#

total its a bit more than 5x cheaper

fluid zealot Feb 11, 2026, 7:13 PM

#

cyan sequoia total its a bit more than 5x cheaper

You'll have to adjust for verbosity. Cannot guess a cars mileage (tok/task) while only accounting for gas type (mtok).

half geode Feb 11, 2026, 7:14 PM

#

Ah, I missed the output price, though it was $1/$5 not $1/$3.20. My B

#

They're potentially trying to push to Lite I guess? The tweet implies that while the website does not.

wooden briar Feb 11, 2026, 7:17 PM

#

fluid zealot You'll have to adjust for verbosity. Cannot guess a cars mileage (tok/task) whil...

And Claude's verbosity is HIGHLY variable based on the output effort params, at least with the new Opus

half geode Feb 11, 2026, 7:17 PM

#

GLM-5 is coming to Coding Plan Pro users within one week, and we're working to bring it to everyone after that.

graceful tapir Feb 11, 2026, 7:23 PM

#

no arc agi?

signal oar Feb 11, 2026, 7:51 PM

#

can anyone who has used it answer if glm 5 is fire

graceful tapir Feb 11, 2026, 7:51 PM

#

signal oar can anyone who has used it answer if glm 5 is fire

no ofc its not lmao. look at the gpus they are using

signal oar Feb 11, 2026, 7:52 PM

#

They using gtx 1070 ti for ts?

graceful tapir Feb 11, 2026, 7:52 PM

#

signal oar They using gtx 1070 ti for ts?

yes. that is trash for llms

signal oar Feb 11, 2026, 7:52 PM

#

y'caint blame em i mean they cant get them fire gpus in china

graceful tapir Feb 11, 2026, 7:53 PM

#

signal oar y'caint blame em i mean they cant get them fire gpus in china

glm is not hot shit. its just shit.

#

because trash gpu.

signal oar Feb 11, 2026, 8:00 PM

#

signal oar y'caint blame em i mean they cant get them fire gpus in china

give me a bunch of gpus and i'd come out with some bullshit though 😂

thorny moss Feb 11, 2026, 9:11 PM

#

graceful tapir because trash gpu.

That is one hell of a brainlet answer lol

#

GLM 5 is absolutely solid, so far agentic and coding workflows have been reliable and coding is nothing it needs to hide for in front of GPT 5.3 Codex and Opus 4.6

#

Given its price, it's probably going to be a great general / default model, with specific domains maybe delegated to other models. Outside of productivity, it's also quite the pleasant RP model.

#

For now I would wait for more reliable providers to get available, so you have some buffer in case one isn't doing well

fluid zealot Feb 11, 2026, 9:20 PM

#

unlike 4.7, no chess reasoning loops 🥳

thorny moss Feb 11, 2026, 9:21 PM

#

signal oar They using gtx 1070 ti for ts?

If rumors are true, they mainly use Huawei chips. While not as powerful as Nvidia, being able to train and serve at all on such hardware is an achievement. But that's based on rumors.

thorny moss Feb 11, 2026, 9:21 PM

#

fluid zealot unlike 4.7, no chess reasoning loops 🥳

Overall reasoning feels a lot less overbearing, whatever they changed is welcomed xD

fresh osprey Feb 11, 2026, 9:25 PM

#

Any changes between pony and 5?

paper meadow Feb 11, 2026, 9:26 PM

#

Not many, they just horsing around

thorny moss Feb 11, 2026, 9:28 PM

#

fresh osprey Any changes between pony and 5?

At least regarding possible censorship additions, no. At least in my test suite it didn't complain (which is good)

chilly leaf Feb 11, 2026, 10:19 PM

#

man some of the third party providers are getting hammered

mossy charm Feb 11, 2026, 10:30 PM

#

Its quite good IMO

#

Lite doesn't include 5 yet tho 🙁

cloud stag Feb 11, 2026, 10:42 PM

#

wet comet Feb 11, 2026, 10:42 PM

#

nice

wet comet Feb 11, 2026, 10:43 PM

#

cloud stag

we love ur models

are yall releasing a small model soon?

pulsar niche Feb 11, 2026, 10:46 PM

#

Underwhelming 56.2% on lateralbench, underperforming K2.5

wet comet Feb 11, 2026, 10:47 PM

#

cloud stag

just realised this is a good opportunity to mention that i was permabanned in your server on accident

#

please resolve this

mossy charm Feb 11, 2026, 11:32 PM

#

chilly leaf man some of the third party providers are getting hammered

Try it, it's a pretty noticeable upgrade from 4.7 IMO

chilly leaf Feb 11, 2026, 11:34 PM

#

I've been trying it and liking it, the issue is that my preferred provider is getting swamped right now lol

half geode Feb 11, 2026, 11:38 PM

#

Interested to see how this plays out at the pricing

#

It costs roughly double what Kimi 2.5 and Gem Flash 3 do, and those models are no slouches.

wooden briar Feb 12, 2026, 12:29 AM

#

pulsar niche Underwhelming 56.2% on lateralbench, underperforming K2.5

.2% diff is within statistical error range, hardly a difference. What's that bench anyway? I googled it and all the results were stuff about workouts, lol

Edit: I was looking at the wrong Kimi model on the chart, error on my part, disregard

inland heath Feb 12, 2026, 12:30 AM

#

https://www.lateralbench.org/, weird that it doesn't have good SEO. Really cool benchmark

proud river Feb 12, 2026, 12:30 AM

#

Is this available on OR?

ebon depot Feb 12, 2026, 12:30 AM

#

yes

wooden briar Feb 12, 2026, 12:32 AM

#

inland heath <https://www.lateralbench.org/>, weird that it doesn't have good SEO. Really coo...

Thanks, looks interesting. I do like 'weird benchmarks', like the one that tests models on how well they do on the NYT Connections puzzle

proud river Feb 12, 2026, 12:33 AM

#

Nice, good model. And the old jailbreak still works

pulsar niche Feb 12, 2026, 12:47 AM

#

inland heath <https://www.lateralbench.org/>, weird that it doesn't have good SEO. Really coo...

Thanks. It was a weekend project and turned out better than expected so I keep running new models. I'm a backend guy, not a frontend guy. If you do have suggestions I'm open to them

mossy charm Feb 12, 2026, 12:55 AM

#

wooden briar Thanks, looks interesting. I do like 'weird benchmarks', like the one that tests...

It really is pretty good

#

Its better than Gemini Flash imo

#

Closer to Sonnet

#

(On major languages)

tall wasp Feb 12, 2026, 2:43 AM

#

is artificial analysis any good of a benchmark? GLM 5 is scoring above kimi k2.5, gpt 5.2 codex and gemini 3 pro

merry plover Feb 12, 2026, 2:44 AM

#

tall wasp is artificial analysis any good of a benchmark? GLM 5 is scoring above kimi k2.5...

it's ok but it's very biased

ebon depot Feb 12, 2026, 2:44 AM

#

it's gotten better, used to be way worse

merry plover Feb 12, 2026, 2:44 AM

#

yeah

#

but it's clearly still not a very "general" benchmark

#

for example, it doesn't very well show you that GLM 5 with hallucinations is still complete dogshit

#

compared to any frontier model

ebon depot Feb 12, 2026, 2:45 AM

#

generally my favorite benchmark is https://fiction.live/stories/Fiction-liveBench-Feb-21-2025/oQdzQvKHw8JyXbN87

fresh osprey Feb 12, 2026, 2:46 AM

#

I keep getting routed to Friendli, which take the input, then immediately stops..

merry plover Feb 12, 2026, 2:46 AM

#

ebon depot generally my favorite benchmark is https://fiction.live/stories/Fiction-liveBenc...

I think one of the best "aggregate" benchmarks might be "what is the average normalized performance of the model on ANY benchmark you give it"

#

because its generalized performance is one of the best indicators for its performance in "real usage"

#

so yeah, picking random benches like fiction live bench are a good proxy

#

especially if they don't seem to be getting bench maxxed

ebon depot Feb 12, 2026, 2:47 AM

#

true, I just like fiction livebench because it tests narrative understanding instead of needle in a haystack

tall wasp Feb 12, 2026, 2:47 AM

#

merry plover it's ok but it's very biased

i see, makes sense, i still find kimi k2.5 to be the best open model currently

tall wasp Feb 12, 2026, 2:47 AM

#

fresh osprey I keep getting routed to Friendli, which take the input, then immediately stops....

from my testing zai themselves seem to be the only reliable provider rn

merry plover Feb 12, 2026, 2:48 AM

#

tall wasp i see, makes sense, i still find kimi k2.5 to be the best open model currently

I personally find any model with awful hallucinations really really hard to use, so yeah I agree, only because kimi k2.5 is "ok" on hallucinations

ebon depot Feb 12, 2026, 2:48 AM

#

official providers are almost always worth it

balmy cave Feb 12, 2026, 2:53 AM

#

speed is atrocious

halcyon estuary Feb 12, 2026, 2:59 AM

#

TTFT is like 30 seconds

#

on venice

thorny moss Feb 12, 2026, 3:10 AM

#

Huh, I'm getting around 20tps on Z.AI and Novita

#

Give it some time, until everything is stable it usually takes a bit. At least Anthropic and OpenAI have been equally as slow and unreliable this week...

halcyon estuary Feb 12, 2026, 3:16 AM

#

looks fine now

jagged kayak Feb 12, 2026, 3:59 AM

#

is there a way to set "clear_thinking": False? i tried extrabody but not workinghttps://docs.z.ai/guides/capabilities/thinking-mode#preserved-thinking

Overview - Z.AI DEVELOPER DOCUMENT

Thinking Mode - Overview - Z.AI DEVELOPER DOCUMENT

GLM offers multiple thinking modes for different scenarios. The sections below explain how to enable each mode, key considerations, and example usage.

fluid zealot Feb 12, 2026, 4:28 AM

#

balmy cave speed is atrocious

unfortunately every model release nowadays is always crippled by slow inference. annoying indeed. also makes benchmarking endeavors very tedious. everyone is hardware-starved https://x.com/Zai_org/status/2021656633320018365

Z.ai (@Zai_org)

GLM-5 is coming to Coding Plan Pro users within one week, and we're working to bring it to everyone after that.

To be upfront: compute is very tight. Even before the GLM-5 launch, we were pushing every chip to its limit just to serve inference. We appreciate your understanding

lone halo Feb 12, 2026, 4:49 AM

#

codex got faster in the last 5.3 release as Stargate Project datacenters are coming online

thorny moss Feb 12, 2026, 5:04 AM

#

It also was down for half a day, at least in my region xD

pseudo anvil Feb 12, 2026, 5:19 AM

#

still toying with GLM 5 but I find myself preferring pony alpha

#

probably placebo or bias

ebon depot Feb 12, 2026, 5:19 AM

#

interesting, I'm the opposite

#

I like glm 5 release better

pseudo anvil Feb 12, 2026, 5:22 AM

#

one thing I noticed is that it’s very context dependent

#

I always find it gravitating towards the style of the intro msg as opposed to my writing style prompt

ebon depot Feb 12, 2026, 5:25 AM

#

pseudo anvil I always find it gravitating towards the style of the intro msg as opposed to my...

probably a prompt difference then, I send the chat history as a user message instead of actual multi turn

pseudo anvil Feb 12, 2026, 5:29 AM

#

ah interesting

cloud stag Feb 12, 2026, 9:18 AM

#

wet comet we love ur models are yall releasing a small model soon?

Not that soon

half matrix Feb 12, 2026, 11:05 AM

#

how is it?

#

compare to the previous checkpoint

#

pony

cyan sequoia Feb 12, 2026, 11:47 AM

#

the same

tall wasp Feb 12, 2026, 12:44 PM

#

hallucination problems aside, how good is glm 5 on web dev? have anyone tested that?

safe hill Feb 12, 2026, 2:43 PM

#

https://modal.com/glm-5-endpoint

merry plover Feb 12, 2026, 3:00 PM

#

cloud stag Not that soon

How about faster inference? :D

ebon depot Feb 12, 2026, 3:00 PM

#

merry plover How about faster inference? :D

how are they gonna do that in china 😭

ancient flare Feb 12, 2026, 3:57 PM

#

i did rp with 4.7, then with 5, it was night and day

ancient flare Feb 12, 2026, 3:58 PM

#

ebon depot how are they gonna do that in china 😭

howahway chips!!

cloud stag Feb 12, 2026, 4:12 PM

#

merry plover How about faster inference? :D

It's getting faster as it's a new base model. There are lots of optimizations we need to do.

wet comet Feb 12, 2026, 4:13 PM

#

cloud stag It's getting faster as it's a new base model. There are lots of optimizations we...

glm 4.7 flash is still the best size to performance coding model IMO

unkempt hemlock Feb 12, 2026, 5:21 PM

#

wet comet glm 4.7 flash is still the best size to performance coding model IMO

Thank you very much for your recognition～

paper meadow Feb 12, 2026, 6:41 PM

#

It's clever model, but it repeats itself after retries, as if Temperature is set too low, making answers deterministic. But I checked, temp is working right, properly breaking after 1.2+

#

Only me?

arctic portal Feb 12, 2026, 6:43 PM

#

paper meadow It's clever model, but it repeats itself after retries, as if Temperature is set...

No glm is always like this

#

every other model has been like this

#

try changing samplers

paper meadow Feb 12, 2026, 6:44 PM

#

I didn't try other GLMs thoroughly

paper meadow Feb 12, 2026, 6:44 PM

#

arctic portal try changing samplers

There are not many samplers in API

arctic portal Feb 12, 2026, 6:44 PM

#

glm follows instructions very rigidly

#

perhaps try not using a system prompt

paper meadow Feb 12, 2026, 6:45 PM

#

Oh, maybe I need introduce some randomness in instructions or make statements more vague

#

Good idea

paper meadow Feb 12, 2026, 7:00 PM

#

Huh, it means I need separate system prompt, completely rewritten, just for GLM5, while others work from good to okay with current one. Tough choice

fluid zealot Feb 12, 2026, 7:07 PM

#

Tested GLM-5:

Beefier GLM hybrid-reasoning MoE model (355B-A32B → 744B-A40B).

Default/Thinking:
Slightly more verbose than previous GLM models, DeepSeek-R1 0528 level.
76% of tokens were generated during reasoning.

very high general logic and reasoning
I saw no leaps in my STEM & tech tasks
reasonably censored
Unlike 4.7, no reasoning loops encountered

Chess performance wasn't great in a vacuum: 6k tok/move ~780 mixed elo /w 62% accuracy, decent blind legality, around o1-mini. Best among GLM family though.

Nonthinking:

~76% token savings (non-reasoning segments were samey).
negatively impacts logic and maths
was slightly less likely to refuse in censorship testing

Overall, very solid and one of the best open models currently, but YMMV.

paper meadow Feb 12, 2026, 7:11 PM

#

For me, thinking verbosity went from K2-like (HUGE) to compressed more average one, during the Pony Alpha phase, like as if they updated model with less verbose reasoning version

fluid zealot Feb 12, 2026, 7:48 PM

#

how verbose a model is depends on the task. this is just the average from the 250 general use queries. if I isolate to chess moves, kimi is ~50% more verbose in one mode (full info), but ~50% less verbose in another (no info). So obviously one should compare it for their own use case outcomes.

burnt belfry Feb 12, 2026, 10:52 PM

#

fluid zealot how verbose a model is depends on the task. this is just the average from the 25...

no need

#

There's always gonna be outliers when your tasks are too specific and not diverse enough, but there's no need to highlight them. What matters is the overall difference in model behavior rather than your oddly specific tasks favoring certain models

worn yarrow Feb 12, 2026, 11:08 PM

#

hooray they fixed the loops

glacial onyx Feb 13, 2026, 3:34 AM

#

unkempt hemlock Thank you very much for your recognition～

yeah there is literally no model that comes close for a 30b model

pseudo anvil Feb 13, 2026, 11:48 AM

#

which provider do you guys use?

ebon depot Feb 13, 2026, 11:55 AM

#

z.ai

pseudo anvil Feb 13, 2026, 11:55 AM

#

ive been on zai too so far but considering making the switch to fireworks

ebon depot Feb 13, 2026, 11:57 AM

#

why?

pseudo anvil Feb 13, 2026, 12:00 PM

#

performance wise is supposedly the best from OR stats

thorny hinge Feb 13, 2026, 12:36 PM

#

I had a poor first experience with GLM 5 and I think I realized why. I used it for multilingual writing, but it tests signficantly worse for this task than GLM 4.7, GLM 4.5 on NCBench, mirroring my impression that even made it feel like a ~70-120B model at times. https://www.nc-bench.com/tests/language-writing

#

Language comprehension is retained though, no regression there, which is interesting.

pseudo anvil Feb 13, 2026, 1:11 PM

#

I am so bipolar about this model sometimes the writing is just chefs kiss then other times it has no substance

midnight crow Feb 13, 2026, 1:33 PM

#

GLM 5

ebon depot Feb 13, 2026, 1:48 PM

#

💔

cobalt bronze Feb 13, 2026, 1:52 PM

#

ebon depot 💔

poor plunger is a little slow

paper meadow Feb 13, 2026, 1:54 PM

#

0.6 oh no

ebon depot Feb 13, 2026, 2:00 PM

#

plunger is my stupid chud failson but i love him

fresh osprey Feb 13, 2026, 4:35 PM

#

Well tps from Z.ai hit new lows.

primal thorn Feb 13, 2026, 4:43 PM

#

Getting a lot of 429s

fresh osprey Feb 13, 2026, 5:10 PM

#

Suddenly getting failed to stream errors

mossy charm Feb 13, 2026, 5:24 PM

#

thorny hinge I had a poor first experience with GLM 5 and I think I realized why. I used it f...

GLM 5 is only good in English or Chinese mostly

#

theyre getting slammed

paper meadow Feb 13, 2026, 5:29 PM

#

Is it? I thought the data for training is so sparse now, they fit whatever language they find into model

#

And semantic vectors in models are, like, very universal

unkempt hemlock Feb 14, 2026, 3:47 AM

#

primal thorn Getting a lot of 429s

We will handle it promptly～

velvet patio Feb 14, 2026, 4:30 AM

#

Doing god's work

worn yarrow Feb 14, 2026, 7:58 AM

#

it's a little underwhelming when you consider that they went from 355B to 744B parameters for this amount of improvement

also deepseek has really got to catch up. they are way behind considering their model is 685B params

half matrix Feb 14, 2026, 10:14 AM

#

so i has done some testing with this model

at the current time with not that complex code base it able to perform really well, in term of cost to performance. it's actually pretty fair when being compare to other model with similiar capabilities.

but i hope the infra from the providers could improve so it could be cheaper, i mean i know model with more active parameters and cost less than this so yeah.

merry plover Feb 14, 2026, 5:03 PM

#

GMI Cloud seems to actually be serving this model at quite decent speeds

worn depot Feb 14, 2026, 5:04 PM

#

do their tool calls work though?

burnt belfry Feb 14, 2026, 7:11 PM

#

worn yarrow it's a little underwhelming when you consider that they went from 355B to 744B p...

DeepSeek nailed their model size honestly. GLM needing to up the size only shows that they got it wrong the first time. And 744B vs 685B is not really something you would notice, those are equivalent in practice.

#

Downsizing is normal since it follows the general progress, upsizing is not however. You would mostly only do that if the earlier arch was flawed.

#

With 700B you can just about have SOTA model in 2026. But in 2023 you would have absolutely needed 1T+, and MUCH more activated params. Much more expensive inference per token too.

#

Kinda just goes to show how far ahead of everyone else GPT4 was at the time tbh

velvet patio Feb 14, 2026, 7:43 PM

#

Still wild to me that GPT-4 was such an important part of history, and yet we'll never know anything of value about it from a technical point of view (beyond the original $30 / $60 price tag, which is pretty ridiculous by modern standards)

#

I think a lot also goes into training set though. GLM 5 is noticeably less sloppy than the iterations that came before, so however they cleaned it clearly paid off

burnt belfry Feb 14, 2026, 7:54 PM

#

velvet patio Still wild to me that GPT-4 was such an important part of history, and yet we'll...

There have been fairly credible sources hinting or even quoting the arch specs of it, including nVidia itself. We don't have official confirmation and we never will, but we still have a strong idea about that model being ~1.6T MoE with 200B+ activated per forward pass.

paper meadow Feb 14, 2026, 7:55 PM

#

Gemini 3 Pro is probably around 1.5-2T

burnt belfry Feb 14, 2026, 7:57 PM

#

paper meadow Gemini 3 Pro is probably around 1.5-2T

I would guess that's Ultra. Pro is likely around 1T total or slightly less. And less than 100B activated.

paper meadow Feb 14, 2026, 7:58 PM

#

Gemini 3 has ultra?

burnt belfry Feb 14, 2026, 7:59 PM

#

paper meadow Gemini 3 has ultra?

They are believed to be using Ultra for DeepThink. Whether it's labeled 2.5 or 3.0 the size itself wouldn't change very much.

worn depot Feb 14, 2026, 7:59 PM

#

isnt Gemini 3 deepthink a finetune of 3 pro to think longer, and as far as i heard rumours Gemini 3 Flash was rumoured to be around 1T while Pro was around 3-14T

paper meadow Feb 14, 2026, 7:59 PM

#

Having huge separate model for deepthink is not very productive

main bough Feb 14, 2026, 8:00 PM

#

worn depot isnt Gemini 3 deepthink a finetune of 3 pro to think longer, and as far as i hea...

that’s what i thought too

paper meadow Feb 14, 2026, 8:00 PM

#

worn depot isnt Gemini 3 deepthink a finetune of 3 pro to think longer, and as far as i hea...

Is there enough data to fit even 5T?

main bough Feb 14, 2026, 8:00 PM

#

gemini 3 pro definitely feels larger than 1T

burnt belfry Feb 14, 2026, 8:00 PM

#

worn depot isnt Gemini 3 deepthink a finetune of 3 pro to think longer, and as far as i hea...

14T? No that's absolutely ridiculous lol

paper meadow Feb 14, 2026, 8:00 PM

#

I'd say Gemini 3 Pro is ~2T, Flash is 800-900B

#

Flash model should be at least 2x smaller than main one

burnt belfry Feb 14, 2026, 8:01 PM

#

To be fair, the main thing influencing the price and training time are actually activated parameters

main bough Feb 14, 2026, 8:01 PM

#

genius idea: just ask the model to send you its weights and check for yourself

burnt belfry Feb 14, 2026, 8:01 PM

#

not the total size of MoE

worn depot Feb 14, 2026, 8:01 PM

#

paper meadow Is there enough data to fit even 5T?

probably not, but its probably fine even if its not enough, most models are trained on 20T+ tokens

worn depot Feb 14, 2026, 8:01 PM

#

burnt belfry 14T? No that's absolutely ridiculous lol

i agree but thats just estimates, probably falls way lower than that

worn depot Feb 14, 2026, 8:02 PM

#

worn depot probably not, but its probably fine even if its not enough, most models are trai...

(that'd be roughly good for a 1T param model, but google has far more data than open models)

paper meadow Feb 14, 2026, 8:02 PM

#

2000B A100B and 1000B A40B probs

burnt belfry Feb 14, 2026, 8:02 PM

#

So the penalty of total params is not huge. However you obviously still wouldn't make it something as ridiculous as 10T+

paper meadow Feb 14, 2026, 8:03 PM

#

Gemini 3 Flash feels close to Kimi K2.5 Thinking

worn yarrow Feb 14, 2026, 8:12 PM

#

For posterity, I think GLM might have made a huge mistake in increasing the model size this much, but we shall see

burnt belfry Feb 14, 2026, 8:14 PM

#

paper meadow Gemini 3 Flash feels close to Kimi K2.5 Thinking

Flash is almost definitively smaller and most importantly cheaper to run at scale than K2.5

#

Google has a huge advantage of being able to distill their best of the best so effectively too

#

They can generate anything they want with it and full access obv

#

I mean DeepThink / Ultra and also their IMO gold or other unreleased models

obtuse spear Feb 14, 2026, 8:19 PM

#

worn depot isnt Gemini 3 deepthink a finetune of 3 pro to think longer, and as far as i hea...

gemini 3 deepthink is gemini 3.1 pro

#

#

and then gemini 3 flash is probably 1.2T since there was a rumor that google was licensing a 1.2T model to apple

#

probably something like 1.2T A15B

#

and then i think pro has to be like 3-4T A30B as well

#

moe really was a hell of an advancement

burnt belfry Feb 14, 2026, 8:21 PM

#

obtuse spear and then gemini 3 flash is probably 1.2T since there was a rumor that google was...

It's unlikely that Apple would settle on Flash.... Would be very not like them thing to do

obtuse spear Feb 14, 2026, 8:21 PM

#

why?

#

flash is really really good especially for its price

worn depot Feb 14, 2026, 8:21 PM

#

burnt belfry It's unlikely that Apple would settle on Flash.... Would be very not like them t...

theres no need to use pro for something like siri

#

or well "Apple Intelligence"

obtuse spear Feb 14, 2026, 8:22 PM

#

its for siri too, its not like it needs to solve aime problems lol. but flash still can

#

flash is really really good

inland heath Feb 14, 2026, 8:22 PM

#

2.5 Flash Lite is probably a big upgrade over Siri lol

paper meadow Feb 14, 2026, 8:22 PM

#

I want flash lite with flash's vision quality

#

Please

burnt belfry Feb 14, 2026, 8:22 PM

#

worn depot theres no need to use pro for something like siri

When they went with OpenAI they only did it because there were no better performant alternatives at the time

obtuse spear Feb 14, 2026, 8:23 PM

#

paper meadow I want flash lite with flash's vision quality

#same

obtuse spear Feb 14, 2026, 8:23 PM

#

burnt belfry When they went with OpenAI they only did it because there were no better perform...

yeah but that didnt actually replace siri. siri could just like invoke chatgpt

#

and it wasnt agentic

burnt belfry Feb 14, 2026, 8:24 PM

#

When it's cloud powered and they are only paying electricity, there's very little point to go for small variant

main bough Feb 14, 2026, 8:24 PM

#

speed

obtuse spear Feb 14, 2026, 8:24 PM

#

yeah

#

also "small" is relative at 1.2T lol

worn depot Feb 14, 2026, 8:24 PM

#

burnt belfry When it's cloud powered and they are only paying electricity, there's very littl...

yeah but electrcity is variable based on how many peope are using it, and to lower electricity (maximize efficiency) you need high batch size to run the model on as little gpus as possible

obtuse spear Feb 14, 2026, 8:24 PM

#

flash does great speed. and for siri you want people to like get the result of their task asap

#

makes for a better UX

paper meadow Feb 14, 2026, 8:25 PM

#

Well Apple users should learn to be more patient

burnt belfry Feb 14, 2026, 8:25 PM

#

worn depot yeah but electrcity is variable based on how many peope are using it, and to low...

They want for it to beat competition. Not merely work. I don't believe choosing Flash would work there tbh

obtuse spear Feb 14, 2026, 8:25 PM

#

paper meadow Well Apple users should learn to be more patient

gulp

worn depot Feb 14, 2026, 8:26 PM

#

burnt belfry They want for it to beat competition. Not merely work. I don't believe choosing ...

yes but flash id say is overkill for what people ask siri, and even if it fell back to 3 pro for harder questions i think it would be good

#

they allegedly trained a model as big as 100B~ params internally but it wasnt good enough

obtuse spear Feb 14, 2026, 8:26 PM

#

i actually think that gpt 5 mini is overlooked

burnt belfry Feb 14, 2026, 8:27 PM

#

worn depot yes but flash id say is overkill for what people ask siri, and even if it fell b...

Fallback to Pro would make sense in theory, except they were fairly explicit on a deal for singular model iirc

obtuse spear Feb 14, 2026, 8:28 PM

#

and yeah flash is good enough for that

burnt belfry Feb 14, 2026, 8:28 PM

#

Maybe it's instead they gonna still rely on their in-house models for easy questions

obtuse spear Feb 14, 2026, 8:28 PM

#

throw on some medium thinking level and youre good with 99% of siri queries

arctic portal Feb 14, 2026, 8:28 PM

#

obtuse spear i actually think that gpt 5 mini is overlooked

what do you use it for? i cant think of any use case where other models wouldnt be better

worn depot Feb 14, 2026, 8:29 PM

#

obtuse spear i actually think that gpt 5 mini is overlooked

its wayy too slow for a mini atleast via openai api its at like 20-30 t/s, and thinks a bit too long for it to be realtime like siri

#

gemini 3 flash is significantly better without needing to think

burnt belfry Feb 14, 2026, 8:29 PM

#

burnt belfry Maybe it's instead they gonna still rely on their in-house models for easy quest...

And then do function calling to Gemini for more involved tasks or smth

arctic portal Feb 14, 2026, 8:30 PM

#

worn depot its wayy too slow for a mini atleast via openai api its at like 20-30 t/s, and t...

something like gpt oss would be good for siri

main bough Feb 14, 2026, 8:30 PM

#

worn depot its wayy too slow for a mini atleast via openai api its at like 20-30 t/s, and t...

96 tps with openai api

https://openrouter.ai/openai/gpt-5-mini

GPT-5 Mini - API, Providers, Stats

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency and cost. Run GPT-5 Mini with API

obtuse spear Feb 14, 2026, 8:30 PM

#

arctic portal what do you use it for? i cant think of any use case where other models wouldnt ...

i do classification, extraction of documents, some agentic deep research etc

worn depot Feb 14, 2026, 8:30 PM

#

main bough 96 tps with openai api https://openrouter.ai/openai/gpt-5-mini

ok this is recent, it was never this fast.

burnt belfry Feb 14, 2026, 8:30 PM

#

The thing that killed them with OpenAI was latency

#

Not gonna have this problem when they are hosting Gemini themselves

arctic portal Feb 14, 2026, 8:30 PM

#

obtuse spear i do classification, extraction of documents, some agentic deep research etc

why wouldnt you use deepseek

worn depot Feb 14, 2026, 8:31 PM

#

worn depot ok this is recent, it was never this fast.

last time i was using it, it was about as slow as gpt 5

worn yarrow Feb 14, 2026, 8:31 PM

#

Can we move this chat to the other areas

obtuse spear Feb 14, 2026, 8:31 PM

#

worn depot ok this is recent, it was never this fast.

100% agree. i dont recommend 5 mini for anything realtime/agentic/etc but my tasks are more like slow/can take a while. just need to get the right answer

obtuse spear Feb 14, 2026, 8:31 PM

#

arctic portal why wouldnt you use deepseek

vision

arctic portal Feb 14, 2026, 8:32 PM

#

obtuse spear vision

qwen vl has very good vision

main bough Feb 14, 2026, 8:32 PM

#

worn yarrow Can we move this chat to the other areas

moving chats when they’re already active is cumbersome and kills the flow, personally I think we should embrace conversations naturally arising in unintended places 😊 👍

paper meadow Feb 14, 2026, 8:33 PM

#

Gemini 3 Pro -> 3 Flash -> Qwen3 VL 235 instruct -> GLM 4.6V for vision tasks

burnt belfry Feb 14, 2026, 8:33 PM

#

obtuse spear 100% agree. i dont recommend 5 mini for anything realtime/agentic/etc but my tas...

Kinda crazy how much naming affects things. The current lineup is no different to o3 vs o4-mini

paper meadow Feb 14, 2026, 8:33 PM

#

Maybe Kimi K2.5 is around 4.6V or better for vision

burnt belfry Feb 14, 2026, 8:33 PM

#

they were both same gen

obtuse spear Feb 14, 2026, 8:33 PM

#

paper meadow Gemini 3 Pro -> 3 Flash -> Qwen3 VL 235 instruct -> GLM 4.6V for vision tasks

where does k2.5 land in here?

#

oh okay

#

i havent tested it at all

arctic portal Feb 14, 2026, 8:33 PM

#

paper meadow Maybe Kimi K2.5 is around 4.6V or better for vision

kimi has way better vision than 4.6v

#

worse than gemini 3 pro

paper meadow Feb 14, 2026, 8:34 PM

#

With or without web search?

arctic portal Feb 14, 2026, 8:34 PM

#

paper meadow With or without web search?

without

paper meadow Feb 14, 2026, 8:34 PM

#

I wanted to check K2.5 vision in OR, but forgor

obtuse spear Feb 14, 2026, 8:34 PM

#

arctic portal qwen vl has very good vision

a little too retarded for my usecase

paper meadow Feb 14, 2026, 8:34 PM

#

On site it did good, but maybe it was with external tools

burnt belfry Feb 14, 2026, 8:34 PM

#

And now people overlook gpt5-mini, because the naming already strongly implies worse performance

obtuse spear Feb 14, 2026, 8:34 PM

#

it needs to extract -> transform with some specific specs etc

worn depot Feb 14, 2026, 8:35 PM

#

burnt belfry And now people overlook gpt5-mini, because the naming already strongly implies w...

chatgpt still falls back to it when you run out of GPT 5.2 messages and have thinking enabled and its pretty good

paper meadow Feb 14, 2026, 8:35 PM

#

https://dubesor.de/visionbench

Dubesor LLM Benchmark table - LLM Vision Benchmark

LLM Vision Benchmark - Testing & ranking vision capabilities of large language models through a small but carefully handcrafted test set of challenging vision tasks

obtuse spear Feb 14, 2026, 8:35 PM

#

worn depot chatgpt still falls back to it when you run out of GPT 5.2 messages and have thi...

yeah with thinking like level=high gpt 5 mini is really really solid

paper meadow Feb 14, 2026, 8:35 PM

#

GPT5 is unusable for its censoring, otherwise I agree with dubesor

obtuse spear Feb 14, 2026, 8:35 PM

#

unfortunately its a bit slow at effort=high but even medium is solid

burnt belfry Feb 14, 2026, 8:36 PM

#

worn depot chatgpt still falls back to it when you run out of GPT 5.2 messages and have thi...

No surprises for it being good. When you look at the numbers the difference is near exactly the same as for my mentioned o3 vs o4-mini

obtuse spear Feb 14, 2026, 8:36 PM

#

paper meadow https://dubesor.de/visionbench

idk abt this one. at least for me gpt 5.2 has really really solid vision

paper meadow Feb 14, 2026, 8:37 PM

#

Qwen3-VL-235B-A22B-Instruct has providers doing it 1/3 price of Gemini 3 Flash, meaning it's unbeatable for mass vision tasks where subtext and world/media knowledge is withing common limits

arctic portal Feb 14, 2026, 8:37 PM

#

obtuse spear idk abt this one. at least for me gpt 5.2 has really really solid vision

gpt always hallucinates in my tasks

arctic portal Feb 14, 2026, 8:38 PM

#

paper meadow `Qwen3-VL-235B-A22B-Instruct ` has providers doing it 1/3 price of Gemini 3 Flas...

i completely agree on this

burnt belfry Feb 14, 2026, 8:38 PM

#

obtuse spear idk abt this one. at least for me gpt 5.2 has really really solid vision

dubesor tends to do overly specific tests that favor his very specific workflows. Falls near anecdotal findings

pseudo anvil Feb 14, 2026, 8:38 PM

#

arctic portal i completely agree on this

you're absolutely right!

arctic portal Feb 14, 2026, 8:39 PM

#

pseudo anvil you're absolutely right!

youve hit the nail on the head

obtuse spear Feb 14, 2026, 8:39 PM

#

paper meadow `Qwen3-VL-235B-A22B-Instruct ` has providers doing it 1/3 price of Gemini 3 Flas...

i agree but if you pay a bit more then ytou go to gpt 5 mini

paper meadow Feb 14, 2026, 8:39 PM

#

Dubesor's vision test are technical, as far as I understand, so it does not require world knowledge for model, so 8B with good vision can do good even if base model is stupid

paper meadow Feb 14, 2026, 8:39 PM

#

obtuse spear i agree but if you pay a bit more then ytou go to gpt 5 mini

Gpt is worse on vision compared to 3 flash

#

Not mentioning it will cry about naked shoulder

burnt belfry Feb 14, 2026, 8:40 PM

#

arctic portal gpt always hallucinates in my tasks

Well they do have objectively and measurably much less hallucination than Gemini3 so there's that. But obviously you can't eliminate it completely.

obtuse spear Feb 14, 2026, 8:40 PM

#

its easy for us to say that gpt 5 mini is super underrated because the number of tokens that OR processes for it is low, but we need to remember all of the people who use openai api directly is insanely huge

arctic portal Feb 14, 2026, 8:40 PM

#

paper meadow Gpt is worse on vision compared to 3 flash

3 flash is 50% more expensive than mini

obtuse spear Feb 14, 2026, 8:40 PM

#

paper meadow Gpt is worse on vision compared to 3 flash

yeah, but more expensive

obtuse spear Feb 14, 2026, 8:40 PM

#

paper meadow Not mentioning it will cry about naked shoulder

what is naked shoulder

paper meadow Feb 14, 2026, 8:41 PM

#

obtuse spear what is naked shoulder

Erotica

obtuse spear Feb 14, 2026, 8:41 PM

#

okaaaaaaaaaayyyyyyyy

#

i think we have different use cases

paper meadow Feb 14, 2026, 8:41 PM

#

I meant, it overreacts about not just suggestive images, but anything that can be counted as not for kids

obtuse spear Feb 14, 2026, 8:42 PM

#

and im over here just processing documents 🥀

#

what a primal use of ai

paper meadow Feb 14, 2026, 8:42 PM

#

Just use OCR than. Or you will send something like Book of Vile Darkness or Monster Manual and some overaligned vision model tells you are criminal for forcing vision to read into it

obtuse spear Feb 14, 2026, 8:43 PM

#

cant ocr diagrams

arctic portal Feb 14, 2026, 8:43 PM

#

gemini 3 flash might be less expensive than 5 mini

#

if you dont use reasoning

paper meadow Feb 14, 2026, 8:43 PM

#

Don't forget caching

arctic portal Feb 14, 2026, 8:43 PM

#

but then i dont know how good it is

obtuse spear Feb 14, 2026, 8:43 PM

#

google caching 🥀

paper meadow Feb 14, 2026, 8:43 PM

#

Reasoning for vision is not useful and sometimes hurting

inland heath Feb 14, 2026, 8:44 PM

#

I remember the old days when it was easier to measure a LLM's cost

#

No reasoning, no caching

obtuse spear Feb 14, 2026, 8:45 PM

#

caching has been around for a while

#

at least with openai

#

but now you have to consider how good a providers caching is

halcyon estuary Feb 14, 2026, 8:45 PM

#

old days i only used openai or anthropic

obtuse spear Feb 14, 2026, 8:45 PM

#

and when it comes to google 🥀

#

whenevr i use gemini i genuinely dont even take caching into consideration i just think of the input price

arctic portal Feb 14, 2026, 8:46 PM

#

halcyon estuary old days i only used openai or anthropic

i dunno if you go way back to llama 3 then it was basically on par with closed source

obtuse spear Feb 14, 2026, 8:46 PM

#

do ygs remember the llama 3 leaked weights

paper meadow Feb 14, 2026, 8:47 PM

#

https://tenor.com/view/oh-i-member-south-park-s20e1-member-berries-member-gif-20603952

Tenor

burnt belfry Feb 14, 2026, 8:47 PM

#

#

Gemini doing not good at all

#

GLM5 actually impressive

arctic portal Feb 14, 2026, 8:47 PM

#

gemini has the most knowledge though

#

it just always makes up bullshit if it doesnt know

obtuse spear Feb 14, 2026, 8:47 PM

#

yep

#

the knowledge is insane

#

so impressive for "where was this photo taken" quiestions

burnt belfry Feb 14, 2026, 8:48 PM

#

arctic portal it just always makes up bullshit if it doesnt know

They are trying to force grounding to "fix" it which is crazy

#

Every single model will hallucinate less with search

#

that's not a fix

arctic portal Feb 14, 2026, 8:48 PM

#

burnt belfry They are trying to force grounding to "fix" it which is crazy

it just goes insane over the date being 2026 and will start rambling about fictional timelines

obtuse spear Feb 14, 2026, 8:49 PM

#

google's "google search grounding" is top tier shit

burnt belfry Feb 14, 2026, 8:50 PM

#

obtuse spear google's "google search grounding" is top tier shit

I mean like I said... if you enable search for OpenAI or nearly every other lab, it will improve drastically over baseline as well

obtuse spear Feb 14, 2026, 8:50 PM

#

oh yeah im not defending them at all

arctic portal Feb 14, 2026, 8:50 PM

#

kimi has the best search and deepresearch imo

obtuse spear Feb 14, 2026, 8:50 PM

#

but i mean compare google search grounding vs providing your own tools vs openai native search

arctic portal Feb 14, 2026, 8:50 PM

#

they have some custom crawler

obtuse spear Feb 14, 2026, 8:50 PM

#

good night google

#

kimi?

arctic portal Feb 14, 2026, 8:50 PM

#

yeah its really good at search

obtuse spear Feb 14, 2026, 8:50 PM

#

please elaborate because im building a deep research agent rn

arctic portal Feb 14, 2026, 8:50 PM

#

and it does tons of searches

#

well im not using it for my own tooling, i just use it off the site

obtuse spear Feb 14, 2026, 8:51 PM

#

you can probably use it through OR though right?

arctic portal Feb 14, 2026, 8:51 PM

#

nope only through their site

obtuse spear Feb 14, 2026, 8:52 PM

#

arctic portal and it does tons of searches

see this is what really pisses me off about gemini. you give it tools liek web_search and web_fetch and then it just makes three web_search calls and decides its done

#

i can never depend on it doing really good research

arctic portal Feb 14, 2026, 8:52 PM

#

well actually nevermind moonshot provides search in its own api

#

i dont know if you can use this over OR

obtuse spear Feb 14, 2026, 8:52 PM

#

yay

#

okay lets try

arctic portal Feb 14, 2026, 8:53 PM

#

their pricing is better than OR, only 0.005$ per search

obtuse spear Feb 14, 2026, 8:53 PM

#

nope you cant

burnt belfry Feb 14, 2026, 8:53 PM

#

arctic portal yeah its really good at search

it works great for AI Mode and Overview where you need speed, but for Gemini itself I'm not quite convinced

obtuse spear Feb 14, 2026, 8:53 PM

#

not on OR

burnt belfry Feb 14, 2026, 8:53 PM

#

Seems like it's less in-depth than chatgpt

obtuse spear Feb 14, 2026, 8:54 PM

#

its terrible

arctic portal Feb 14, 2026, 8:54 PM

#

obtuse spear not on OR

do you have tons of money dumped in OR? just register on their platform

obtuse spear Feb 14, 2026, 8:54 PM

#

if i really need something in depth i know that gpt 5.2 xhigh got me

#

it does TONS of searchjes

obtuse spear Feb 14, 2026, 8:54 PM

#

arctic portal do you have tons of money dumped in OR? just register on their platform

not really but i use crypto payments

#

does moonshot support that?

arctic portal Feb 14, 2026, 8:55 PM

#

prolly not

obtuse spear Feb 14, 2026, 8:55 PM

#

also those 10M free gpt 5 mini tokens are a godsend

worn yarrow Feb 14, 2026, 10:08 PM

#

just an FYI to those on the coding plan, I didn't get an announcement about this and only noticed just now
GLM-5 uses 2-3x quota. Can't see anywhere that the off-peak v. on-peak is published

obtuse spear Feb 14, 2026, 10:38 PM

#

kinda makes sense but sucks that they didnt say

umbral stag Feb 14, 2026, 11:52 PM

#

@unkempt hemlock @faint sphinx

For the future..

I advise the team if they want to make model that tight in moderation then do it in a way that didn't compromise the freedom of the people.

One of the way is by improving it's instruction following capability, then the team can using system prompt that also injected to the API to ensure moderation for the official endpoints while allowing the other endpoints that come from non-official to have more freedom.

It will allow the team to have better legal power while giving the people what they want, freedom from any corporate morality.

xAI doing it with their grok model, they didn't have model that strict with moderation but they making it strict at following instruction, that why grok show distinct behavior either from one platform to other, but they always inject the moderation system prompt in all of their official products include the API.

At the end of the day, it's the team right to decide.
I just giving my piece of mind, always hope the best come to Z.ai team.

Thanks for reading

halcyon estuary Feb 15, 2026, 1:14 AM

#

umbral stag <@1442866255693349075> <@1235350793222361281> For the future.. I advise the t...

what are you being restricted?

pseudo anvil Feb 15, 2026, 1:19 AM

#

i havent had any censorship issues myself

umbral stag Feb 15, 2026, 1:22 AM

#

I am talking about the future, i enjoy using GLM model as agents to scower through the internet and i support Zai team venture in this industry.

We need to reminded our mind about their previous model GLM 4.7 thinking traces and their path to be public company.

Also don't forget about the tightening of the laws too.

There are sings for the future,

GLM 5 is better at being uncensored than any previous generation of GLM but still it didn't gonna stop the tightening of the industry.

I think, we as human, after we reading something we need to spend few minutes thinking about what we just read, finding more information and insight that are deeper than the surface.

halcyon estuary Feb 15, 2026, 1:31 AM

#

ok valid

#

even anthropic has that fear

#

of being too restrictive with our friendly neighbour Claude

#

read the latest constitution

#

and honestly

#

they're pussies

unkempt hemlock Feb 15, 2026, 6:13 AM

#

Thank you for your patience. Our official team will continue improving and working hard to resolve the issues for everyone~

mossy charm Feb 15, 2026, 6:15 AM

#

worn yarrow just an FYI to those on the coding plan, I didn't get an announcement about this...

it was similar to that w/4.6 & 4.7 etc

#

Where the newer one was more

worn yarrow Feb 15, 2026, 6:17 AM

#

mossy charm it was similar to that w/4.6 & 4.7 etc

i dont remember that happening for 4.7

ancient flare Feb 15, 2026, 6:20 AM

#

glm 5 has amazing vibes after trying it out btw.

pseudo anvil Feb 15, 2026, 6:24 AM

#

honestly the dialogue is some of the best I’ve seen never have i been so engaged

#

whatever yall did to bring characters to life keep cooking

ancient flare Feb 15, 2026, 6:50 AM

#

truly, it really is "diet claude" and the best for open weight roleplay. glm 5 is so cozy.

worn yarrow Feb 15, 2026, 7:23 AM

#

GLM-5 & https://github.com/nich2533/just_say_no

paper meadow Feb 15, 2026, 2:26 PM

#

Glm5 scored very high on fiction livebench, it's almost too hard to believe the difference, Kimi K2.5 levels of context retention

light dust Feb 15, 2026, 4:00 PM

#

paper meadow Glm5 scored very high on fiction livebench, it's almost too hard to believe the ...

Benchtraining perhaps?

#

Or do you notice it when using it

paper meadow Feb 15, 2026, 4:01 PM

#

I didn't check myself with my time-travelling mind-hopping story setup

#

I need probably to invent something more heavy

arctic portal Feb 15, 2026, 5:24 PM

#

light dust Benchtraining perhaps?

benchtraining is impossible

#

look at the benching process of this test

#

its pretty saturated anyways

light dust Feb 15, 2026, 5:33 PM

#

paper meadow I need probably to invent something more heavy

Spell strawberry and count the R’s 😂

paper meadow Feb 15, 2026, 5:33 PM

#

Woooow, not THAT heavy

ornate root Feb 15, 2026, 8:23 PM

#

worn yarrow just an FYI to those on the coding plan, I didn't get an announcement about this...

Do we know when this transition ends? I'm on the coding plan and I don't have access?!

worn yarrow Feb 15, 2026, 9:57 PM

#

ornate root Do we know when this transition ends? I'm on the coding plan and I don't have ac...

It's pro and max plans only at the moment

wet comet Feb 15, 2026, 10:02 PM

#

paper meadow Glm5 scored very high on fiction livebench, it's almost too hard to believe the ...

opussy 4.6 wher ☹️

ornate root Feb 15, 2026, 10:05 PM

#

How do you find GLM-5 for agentic coding now that you've had it for a few days?

half matrix Feb 16, 2026, 12:03 AM

#

light dust Benchtraining perhaps?

Most likely being train with really good data for specific use case like creative writing

#

but it will be different with use case that didn't have the long context training data before hand

light dust Feb 16, 2026, 6:48 AM

#

True

umbral stag Feb 16, 2026, 12:46 PM

#

This model enjoy spiting out tokens
When being face with complex problem (depend on the model perspective), it's capable to consume about 3$ for just that one problem.

light dust Feb 16, 2026, 2:30 PM

#

umbral stag This model enjoy spiting out tokens When being face with complex problem (depend...

Is it akin to high reasoning effort from gpt 5.2?

burnt belfry Feb 16, 2026, 2:38 PM

#

umbral stag This model enjoy spiting out tokens When being face with complex problem (depend...

How much are we talking about? Both Opus 4.6 and 5.2-xhigh can output 128k and then fail to arrive at the answer before exhausting it so I hope it's not that lol

umbral stag Feb 16, 2026, 5:30 PM

#

burnt belfry How much are we talking about? Both Opus 4.6 and 5.2-xhigh can output 128k and t...

I use it in coding with zed, it could continue automatically but i am not sure how much tokens it produce.
At the end when it done solving the code problem it cost me about that much.

#

Test also with GPT-5.2-Codex and because it able to tackle it easier it consume a bit less than that

#

This is quite interesting for me, so a expensive model could be cheaper if it able to solve it with fewer tokens, compare to cheaper model but required higher tokens count.

ornate root Feb 16, 2026, 5:44 PM

#

That was Anthropic's defense of Opus 4.6; alleged token efficiency.

umbral stag Feb 16, 2026, 5:46 PM

#

In my experiences opus 4.6 enjoy yapping more than GPT-5.2-Codex

tall wasp Feb 17, 2026, 2:21 AM

#

what is better, this or m2.5?

lone halo Feb 17, 2026, 2:31 AM

#

yes

naive burrow Feb 17, 2026, 3:06 AM

#

paper meadow Glm5 scored very high on fiction livebench, it's almost too hard to believe the ...

what is reasoning:high for GLM-5? isn't it described as simple on/off feature in their API docs?

proud river Feb 17, 2026, 3:55 AM

#

tall wasp what is better, this or m2.5?

I’m liking m2.5

#

Very good performance for the price

#

Especially if you hit cache

umbral stag Feb 17, 2026, 6:37 AM

#

tall wasp what is better, this or m2.5?

GLM-5

#

Kilo code CLI vs Claude Code CLI

#

Which one you guys chosse and why

obtuse spear Feb 17, 2026, 6:54 AM

#

i havent tried kilo code cli but claude code gaps everything like by a lot so

#

id imagine kilo also goes under

#

unfortuntaely claude code doesnt work too well with OR

ancient flare Feb 17, 2026, 7:54 AM

#

the more i use the more it gets better. the best open weight model, amen 🙏

obtuse spear Feb 17, 2026, 7:58 AM

#

more than k2.5?

ancient flare Feb 17, 2026, 8:03 AM

#

obtuse spear more than k2.5?

yes, i really love glm 5's overall vibes (the added censorship however is frustrating). as assistant, k2.5 thinking is very interesting model. i love debating with kimi models because they never sugarcoat stuff.

#

but "overall", i like glm 5 more.

#

and oh, kimi's coding always lacks behind despite what benchmarks says

paper meadow Feb 17, 2026, 8:05 AM

#

I'd say Kimi K2.5 is better overall, but GLM5 is kinda close

obtuse spear Feb 17, 2026, 8:07 AM

#

kimi coding is meh

pseudo anvil Feb 17, 2026, 8:40 AM

#

ancient flare yes, i really love glm 5's overall vibes (the added censorship however is frustr...

censorship where

#

keknervous

pseudo anvil Feb 17, 2026, 8:41 AM

#

paper meadow I'd say Kimi K2.5 is better overall, but GLM5 is kinda close

prose wise but glm has some of the best characterisation ive seen

obtuse spear Feb 17, 2026, 8:42 AM

#

is it sycophantic

ancient flare Feb 17, 2026, 8:43 AM

#

pseudo anvil censorship where

mostly in assistant.

pseudo anvil Feb 17, 2026, 8:43 AM

#

ah fair

half matrix Feb 17, 2026, 8:43 AM

#

pseudo anvil censorship where

ask it to make drugs

#

jkjkjkjk

pseudo anvil Feb 17, 2026, 8:45 AM

#

obtuse spear is it sycophantic

tell it to be a blunt son of a bitch and u get what u ask for doesnt give u the oai glaze

obtuse spear Feb 17, 2026, 9:29 AM

#

pseudo anvil tell it to be a blunt son of a bitch and u get what u ask for doesnt give u the ...

oai glaze is gone and has been since gpt 5 and arguably o3

half matrix Feb 17, 2026, 10:08 AM

#

It's actually pretty cool how creative writing training actually improve model creativity at other things

#

compare to other models that i use to make design for site, this model win against them, the other models maybe have better coding capability but it desing aren't that good.

light dust Feb 17, 2026, 11:35 AM

#

Any ideas on how to use claude code with openrouter glm5?

#

I’m currently using ccr but maybe there’s a better way?

worn depot Feb 17, 2026, 11:35 AM

#

https://openrouter.ai/docs/guides/guides/claude-code-integration

OpenRouter Documentation

Claude Code Integration - OpenRouter

Learn how to use Claude Code with OpenRouter for improved reliability, provider failover, and organizational controls.

light dust Feb 17, 2026, 11:36 AM

#

worn depot https://openrouter.ai/docs/guides/guides/claude-code-integration

They removed the section for changing models

#

Already looked

worn depot Feb 17, 2026, 11:36 AM

#

hmm

#

here's an archive that still has that
https://web.archive.org/web/20260109042809/https://openrouter.ai/docs/guides/guides/claude-code-integration

OpenRouter Documentation

Claude Code Integration - OpenRouter

Learn how to use Claude Code with OpenRouter to access various models.

light dust Feb 17, 2026, 11:37 AM

#

That doesn’t work; it doesn’t respect the slug and simply uses 4.5 sonnet/opus/haiku

worn depot Feb 17, 2026, 11:37 AM

#

no clue then

light dust Feb 17, 2026, 11:38 AM

#

Also wondering about best practices with claude code on windows

#

I guess one is to install bash

cerulean bough Feb 17, 2026, 1:44 PM

#

light dust Any ideas on how to use claude code with openrouter glm5?

Z-code has a Claude code fork, not sure how legit it is but uses all the plugins/mcps and I think you can add OR API key

light dust Feb 17, 2026, 2:03 PM

#

cerulean bough Z-code has a Claude code fork, not sure how legit it is but uses all the plugins...

Thanks 🙏

half geode Feb 17, 2026, 7:19 PM

#

Tbf to Kimi, it is nearly half the cost of GLM

#

It has done well for me in coding, but I guess I'll have to try GLM.

light dust Feb 17, 2026, 7:22 PM

#

half geode Tbf to Kimi, it is nearly half the cost of GLM

Kimi 2.5 is to glm 5????

half geode Feb 17, 2026, 7:24 PM

#

light dust Kimi 2.5 is to glm 5????

Yeah? At least on input tokens. Kimi is like $0.5/$2.25 and GLM is $1/$3.25

light dust Feb 17, 2026, 7:28 PM

#

half geode Yeah? At least on input tokens. Kimi is like $0.5/$2.25 and GLM is $1/$3.25

Damn and usually the feeling is that kimi is better

burnt belfry Feb 17, 2026, 7:46 PM

#

ornate root That was Anthropic's defense of Opus 4.6; alleged token efficiency.

It's anything BUT that. When you max it out it's the most verbose (in reasoning) model that I tested from by far. #attachments message

#

In fact it's more verbose than anything OpenAI before 5.2 as well, in my experience.

half geode Feb 17, 2026, 7:48 PM

#

Yeah. I've been a big Kimi fan since K2. It reminds me the most of Claude. Not by training on their outputs like GLM-5, but just on a core level.

burnt belfry Feb 17, 2026, 7:49 PM

#

Disclaimed that I really test their limits with LLMs, but the fact is that Opus4.6 is gonna output much more than vast majority of other models once you give it a hard task it is genuinely challenged by

mossy charm Feb 17, 2026, 11:06 PM

#

cerulean bough Z-code has a Claude code fork, not sure how legit it is but uses all the plugins...

just use opencode

lone halo Feb 17, 2026, 11:07 PM

#

cerulean bough Z-code has a Claude code fork, not sure how legit it is but uses all the plugins...

how did Z-code fork Claude code when it's closed source?

half matrix Feb 18, 2026, 12:03 AM

#

half geode Yeah. I've been a big Kimi fan since K2. It reminds me the most of Claude. Not b...

Is it feel like it have some awarness

cerulean bough Feb 18, 2026, 8:51 AM

#

mossy charm just use opencode

same opinion but not what OP wanted - GLM is a beast inside opencode

light dust Feb 18, 2026, 9:18 AM

#

If only we had glm 5 with cerebras

mossy charm Feb 18, 2026, 8:01 PM

#

light dust If only we had glm 5 with cerebras

I think they only have up to 4.7 right?

light dust Feb 18, 2026, 8:05 PM

#

mossy charm I think they only have up to 4.7 right?

They removed it because it was only in preview

#

I think one of the only permanents they have is gpt-oss-120b

mossy charm Feb 18, 2026, 8:12 PM

#

light dust They removed it because it was only in preview

theyre up to 5 how is it only in preview?

light dust Feb 18, 2026, 8:49 PM

#

mossy charm theyre up to 5 how is it only in preview?

As in cerebras was testing it out on their chips for a limited time

mossy charm Feb 20, 2026, 12:26 AM

#

zai removed their general and zai chat channels lol

obtuse spear Feb 20, 2026, 12:40 AM

#

they need compute for opoenai

#

what a joke

royal jetty Feb 21, 2026, 12:34 PM

#

Anyone else having issues with Z.AI provider suddently not supporting caching since yesterday? Pretty annoying.

mossy charm Feb 21, 2026, 4:33 PM

#

royal jetty > Anyone else having issues with Z.AI provider suddently not supporting caching ...

z.ai is basically a 💩 show lately. Their stock is through the roof and I think they have way way more people on GLM5 than they thought. They need compute desperately

#

Their communication on their discord has also been bad

royal jetty Feb 21, 2026, 4:34 PM

#

mossy charm z.ai is basically a 💩 show lately. Their stock is through the roof and I think ...

sigh

I mean caching isn't even that hard (and saves them compute!), c'mon.

Their servers are always on fire.

mossy charm Feb 21, 2026, 4:34 PM

#

I have to take a certification exam for work but I have a z.ai chat client I wrote Ill put up if I pass that

#

We have a ^&((** blizzard supposed to be coming through too sigh

#

Its not caching prompts and answers right?

royal jetty Feb 21, 2026, 4:35 PM

#

Well, not caching prompts.

#

Don't think output caching is a thing on OR.

mossy charm Feb 21, 2026, 4:36 PM

#

I created multiple themes. This is the one I use. It's retro/cyberpunk. I also have a pastels and some others.

#

I renamed the client too. I want to add optional calls to OR too and some others but havent gotten that far. It's mostly trivial, I'm familiar with the OR api. They all basically copied the openAI one

#

Its also a markdown reader because I like markdown

mental belfry Feb 22, 2026, 7:17 PM

#

@peak sedge Please can you add baseten as a provider for this? they have the fastest api, the rest are slow

worn yarrow Feb 22, 2026, 9:46 PM

#

mental belfry <@165587622243074048> Please can you add baseten as a provider for this? they ha...

They are not a good provider. I can't find the github repo where the benchmarks are but like atlascloud, they sacrifice quality outputs and calls.

mental belfry Feb 22, 2026, 9:47 PM

#

worn yarrow They are not a good provider. I can't find the github repo where the benchmarks ...

is it because of the fp4?

mental belfry Feb 22, 2026, 9:48 PM

#

worn yarrow They are not a good provider. I can't find the github repo where the benchmarks ...

https://github.com/MoonshotAI/K2-Vendor-Verifier

GitHub

GitHub - MoonshotAI/K2-Vendor-Verifier: Verify Precision of all Kim...

Verify Precision of all Kimi K2 API Vendor. Contribute to MoonshotAI/K2-Vendor-Verifier development by creating an account on GitHub.

worn yarrow Feb 22, 2026, 11:03 PM

#

I would be concerned they are doing speculative decoding as well somehow

half geode Feb 23, 2026, 2:54 PM

#

This model is goated

#

Shame it's $1 mTok for inputs, but damn. Claude at home IMO.

#

Only weaknesses being not top-tier world knowledge, and no image input

half geode Feb 23, 2026, 3:32 PM

#

But as a free WebUI model for normies I think I'd have to recommend it over the free offerings from OAI or Google or Grok which is kind of interesting.

paper meadow Feb 23, 2026, 3:37 PM

#

It suffers from repetition for some reason, it's either training or attention which picks top_k options

half geode Feb 23, 2026, 3:40 PM

#

In WebUI?

#

Not really a problem for normies regardless, I care a lot more that it doesn't hallucinate on them as hard as Gem Flash. Or get auto-routed into retardation like GPT.

paper meadow Feb 23, 2026, 3:48 PM

#

half geode In WebUI?

Everywhere. It hard locks on certain subjects and ideas that stay the same after retrying. Deepseek v3.2 and Grok fast 4 (not 4.1) did the same

half geode Feb 23, 2026, 7:38 PM

#

Ohhh, you mean similar outputs for same query. Yeah

paper meadow Feb 23, 2026, 7:40 PM

#

People say it was the same before GLM 5 and new attention, but this really sucks. Like I had 5 attempts of it creating a name for sci-fi android NPC in reasoning trace and all 5 times it was ARIA-7 or smth close to it. With Temperature close to 1

half geode Feb 23, 2026, 7:42 PM

#

Yeah I noticed it in RP at temp 1

paper meadow Feb 23, 2026, 7:42 PM

#

And it always starts 1st sentence/paragrapgh the same, only differing further in answer, like it's low top_k filtering options and paths in the beginning

paper meadow Feb 23, 2026, 7:42 PM

#

half geode Yeah I noticed it in RP at temp 1

So it's not because new attention, and was the same in 4.5-4.7?

half geode Feb 23, 2026, 7:43 PM

#

Hmm? I mean just in 5

paper meadow Feb 23, 2026, 7:44 PM

#

So it was introduced in 5, not before that

half geode Feb 23, 2026, 8:21 PM

#

Not sure, I think so?

mossy charm Feb 23, 2026, 8:22 PM

#

paper meadow It suffers from repetition for some reason, it's either training or attention wh...

its their chat client its not good

#

it has this weird minor problem with not using markdown right sometimes too

#

See youre not supposed to use bullet point in markdown

#

you use *

paper meadow Feb 23, 2026, 8:38 PM

#

mossy charm its their chat client its not good

I use through api. Other from stubborness in same options over and over, I have no major complaints

chilly leaf Feb 23, 2026, 9:41 PM

#

paper meadow People say it was the same before GLM 5 and new attention, but this really sucks...

temp 1 was never enough for me for 4.x, I had to bump it to 1.4 with 0.02 min-p to be semi acceptable

paper meadow Feb 23, 2026, 9:52 PM

#

chilly leaf temp 1 was never enough for me for 4.x, I had to bump it to 1.4 with 0.02 min-p ...

Very few providers support min_p nowadays

#

I can't go higher than 1.1, and it doesn't change much, 1.2+ breaks everything

chilly leaf Feb 23, 2026, 9:57 PM

#

dang

#

more providers should add min_p and at least 2 temp max, min_p is such a powerful sampler

paper meadow Feb 23, 2026, 10:03 PM

#

I think there were more before, and even top_a too

wooden briar Feb 23, 2026, 10:17 PM

#

Got this for the first time using z.ai. It's totally fair, I'm a free user and can switch to API... looks like they're trying to make things more stable for paid/api users

#

AIn't even mad

#

I've gotten SO much free usage out of them

half geode Feb 24, 2026, 2:47 AM

#

@placid crescent It's not too late for you to be the #2 GLM fan and I'll be the #1 GLM fan

exotic quarry Feb 24, 2026, 10:19 AM

#

wooden briar I've gotten SO much free usage out of them

use from kilocode , its free for some days

worn yarrow Feb 24, 2026, 11:04 AM

#

wooden briar Got this for the first time using z.ai. It's totally fair, I'm a free user and c...

they sent out an apology

wooden briar Feb 24, 2026, 1:33 PM

#

worn yarrow they sent out an apology

Class

ornate root Feb 26, 2026, 5:50 AM

#

I'm not getting any GLM5 responses on OC at all, and I'm on the coding plan; apparently they still don't support the Lite plan. It's been hammered since it came out.

half geode Feb 26, 2026, 8:52 AM

#

So have I 🍻

#

But yeah, GLM and Kimi have both been getting slammed. Kimi for OpenClaw and GLM likely because it was free in Kilo / OpenCode.

#

#

#

Looking like the street lights in a US traffic jam

halcyon estuary Feb 27, 2026, 4:11 AM

#

?? provider SiliconFlow

obtuse spear Feb 27, 2026, 4:37 AM

#

i get this with g3p sometimes

cobalt bronze Feb 27, 2026, 6:11 AM

#

tall wasp Feb 27, 2026, 1:54 PM

#

does anyone feel like the fireworks provided glm 5 (and kimi k2.5) feel a lot worse at tool calls and overall quality recently?

edit: nvm, have actually been routing to nebius, fireworks seem to have heavy rate limiting because of openclaw

wooden briar Feb 27, 2026, 8:27 PM

#

https://docs.z.ai/guides/overview/pricing

Overview - Z.AI DEVELOPER DOCUMENT

Pricing - Overview - Z.AI DEVELOPER DOCUMENT

This page provides pricing information for Z.AI’s models and tools. All prices are in USD.

#

GLM 5 Code model appeared in pricing

half geode Feb 28, 2026, 1:22 AM

#

More expensive =(

#

But will probably be dope

viral bone Feb 28, 2026, 8:35 PM

#

for large roleplay, dont work...

paper meadow Feb 28, 2026, 8:46 PM

#

Explain

cobalt furnace Mar 1, 2026, 1:59 AM

#

For large work, doesn't roleplay....

ornate root Mar 1, 2026, 6:38 AM

#

wooden briar https://docs.z.ai/guides/overview/pricing

... and it's even more expensive than GLM-5! I guess things must be going well for them.

paper meadow Mar 1, 2026, 10:03 AM

#

All work and no roleplay makes GLM a sad boy

half geode Mar 1, 2026, 10:44 AM

#

GLM gets a lot of roleplay =P

paper meadow Mar 1, 2026, 11:38 AM

#

half geode GLM gets a lot of roleplay =P

https://tenor.com/view/starship-troopers-gif-18102989

Tenor

umbral stag Mar 1, 2026, 11:58 AM

#

paper meadow Explain

Mostly mean long context

umbral stag Mar 1, 2026, 11:58 AM

#

viral bone for large roleplay, dont work...

How many tokens is it?

paper meadow Mar 1, 2026, 12:05 PM

#

umbral stag Mostly mean long context

It should be better compared to most models

half canopy Mar 1, 2026, 5:14 PM

#

What settings do you use for role play? Do you use thinking or non-thinkign

paper meadow Mar 1, 2026, 5:28 PM

#

It seems to need higher temp than average LLM

#

I can't get non-thinking from my source

#

#

For non-multilanguage, I can get 1.1 temp and 1 top p

half canopy Mar 1, 2026, 6:08 PM

#

paper meadow It seems to need higher temp than average LLM

Is it better than deepseek v3.2? I mean is it worth paying 6x?

paper meadow Mar 1, 2026, 6:08 PM

#

No way

#

Maybe could be better, but not 6x better

#

If even

half canopy Mar 1, 2026, 6:09 PM

#

paper meadow No way

I see , thanks for replying

#

have you tried any other models like the minimax models?
Which one do you think is the best for RP performance and price wise.
IK opus/sonnet are kings but too expensive. Also I don't care about censorship since i never do anything sus.

ebon depot Mar 1, 2026, 6:10 PM

#

half canopy have you tried any other models like the minimax models? Which one do you think ...

that new aion model isn't bad as long as you're not doing anything too complex

paper meadow Mar 1, 2026, 6:13 PM

#

half canopy have you tried any other models like the minimax models? Which one do you think ...

Minimax is bad, Kimi 07 and Kimi 09 for <16k roleplay, Kimi K2.5 thinking after 16k

half geode Mar 1, 2026, 9:53 PM

#

paper meadow Minimax is bad, Kimi 07 and Kimi 09 for <16k roleplay, Kimi K2.5 thinking after ...

Wait, GLM is too expensive but you're using reasoning in Kimi?

paper meadow Mar 1, 2026, 9:56 PM

#

half geode Wait, GLM is too expensive but you're using reasoning in Kimi?

Kimi's output is not that expensive, plus potential cache hits from moonshot and novita (?), plus you can't have cheap long context convo without reasoning models. Having first XX turns being written by original Kimi models helps to pick up style by K 2.5

#

I would also advice model hopping but this is like rocket jumping == advanced practice

half geode Mar 1, 2026, 10:01 PM

#

paper meadow Kimi's output is not **that** expensive, plus potential cache hits from moonshot...

Not that its reasoning is specifically expensive, but I'd be surprised if at 16K-32K Kimi reasoning was cheaper than GLM non

paper meadow Mar 1, 2026, 10:07 PM

#

At this context input is eating a lot of price share compared to output. But also Kimi K2.5 is reliably better due to being bigger with shared and activated parameters. And does not suffer from structural repetition

half geode Mar 1, 2026, 10:08 PM

#

Interesting. I have done the most comparisons in a kind of roleplay assistant mode, but maybe I should do more in actual RP. GLM might be biasing me in that sense because it feels the most human by a LOT

paper meadow Mar 1, 2026, 10:11 PM

#

I have my own personal tests evaluations so my opinions could be not only based, but biased as well

umbral stag Mar 2, 2026, 1:20 AM

#

So i doing some detective role-play, where it taking reference from real life with alternative path

At some point i indulge in case about gang where it referencing china or hongkong gang, it's the stop and tell me it couldn't provide it, seems like the blocked is because it's prohibited content lol

But with russian, italian and japanese gang/mafia/yakuza it doing fine

half geode Mar 2, 2026, 1:27 AM

#

umbral stag So i doing some detective role-play, where it taking reference from real life wi...

Web UI?

umbral stag Mar 2, 2026, 1:55 AM

#

half geode Web UI?

Using GLM Chat frontend

#

Direct chat, No API

#

It's Fair imo

half geode Mar 2, 2026, 1:58 AM

#

Yeah, the web UI for all Chinese models have post-request censorship

umbral stag Mar 2, 2026, 1:58 AM

#

As long as the model which being serving by other than Zai aren't have that censorship

half geode Mar 2, 2026, 1:58 AM

#

Hmm? Kimi does

#

Oh, no, their API is fine

#

Just not webui

viral bone Mar 2, 2026, 5:30 AM

#

half canopy Is it better than deepseek v3.2? I mean is it worth paying 6x?

nop, its better deepseek.

plain topaz Mar 2, 2026, 7:47 AM

#

Glm 5 code this week🙏

formal epoch Mar 2, 2026, 1:33 PM

#

美女

arctic portal Mar 2, 2026, 2:43 PM

#

viral bone nop, its better deepseek.

its surely far better than deepseek but not 6x

#

its the best open source roleplay model

#

i would say its 2.5x better than deepseek

#

k2.5 is just derpy and weird for me

#

glm 5 has a slight positivity bias

#

but its realistic

halcyon estuary Mar 2, 2026, 2:44 PM

#

for some reason deepseek 3.2 is doing so good for some tasks

#

it has good portuguese knowledge and it's dirty cheap

#

only thing is it can't make tool calls reliably

ancient flare Mar 2, 2026, 3:20 PM

#

arctic portal glm 5 has a slight positivity bias

you mean opus 4.5 biased?

#

#

glm 5 is amazing because it btw, even if it got way more censored

arctic portal Mar 2, 2026, 3:38 PM

#

ancient flare you mean opus 4.5 biased?

yeaaa opus is so positivity biased

#

i have to bring kimi sometimes to make it do grim stuff

#

it just cant do dominant/dark scenarios

#

kimi is kinda like r1 0528

#

very unhinged, negatively biased and kinky

ancient flare Mar 2, 2026, 3:43 PM

#

if it wasn't dumb (even their thinking model), it would have been sovl

cyan sequoia Mar 2, 2026, 9:16 PM

#

Yea. GLM 5 is like opus that knows less while kimi is like gemini that is even crazier.

half geode Mar 2, 2026, 11:49 PM

#

formal epoch 美女

没毛病

uncut quest Mar 4, 2026, 11:48 PM

#

I heard that GLM-5 is more censored than it was during Pony Alpha. Just how censored is it for roleplay (if at all), if anyone knows?

cyan sequoia Mar 4, 2026, 11:55 PM

#

Not at all

#

Never had it reject anything before

#

so no idea why some people say its censored

uncut quest Mar 4, 2026, 11:58 PM

#

cyan sequoia Not at all

Awesome, thanks for the clarification. Was probably gonna be putting some money into it soon, so I was wondering if the censorship was true or not

cyan sequoia Mar 5, 2026, 12:12 AM

#

Imo for creative writing its only 2nd to opus itself

paper meadow Mar 5, 2026, 12:20 AM

#

Not with that rigid stubborn structure

#

I can't get 2 different responses from it

umbral stag Mar 5, 2026, 3:05 AM

#

uncut quest I heard that GLM-5 is more censored than it was during Pony Alpha. Just how cens...

It seems they done a bit more tuning after the pony, but i could be wrong.
Seems a bit strong if you have some injection to break it so it become like the most evil being in the world

#

Clear comparison that i have is with their older model label GLM4.5, making GLM4.5 the most evil being is much more easier

ruby widget Mar 5, 2026, 3:50 AM

#

uncut quest I heard that GLM-5 is more censored than it was during Pony Alpha. Just how cens...

Too long and too censored. But I will disclose im using it through openrouter/chub ai, which could have some filters

cyan sequoia Mar 5, 2026, 3:53 AM

#

Ive literally never had it reject anything

#

even dark very nsfw stuff

half geode Mar 5, 2026, 6:53 AM

#

Yeah, sounds like prompting issue. Model is not censored, and is fantastic at RP

arctic portal Mar 5, 2026, 9:53 AM

#

it will maybe refuse every 1 in 100 prompts during extremely dark rp

quiet violet Mar 6, 2026, 5:29 AM

#

https://x.com/basetenco/status/2029740408419586522

pls add baseten :3

Baseten (@basetenco)

We've launched the fastest GLM 5 API available at 190 TPS and 0.79 sec TTFT with the Baseten Inference Stack.

Ready for your coding and agentic workflows.

https://t.co/iiRmQK3D5U

▶ Play video

mossy charm Mar 6, 2026, 8:32 PM

#

FYI if anyone is sick of the glm chat client eating your prompts - I made one for chat.

worn yarrow Mar 6, 2026, 9:36 PM

#

mossy charm FYI if anyone is sick of the glm chat client eating your prompts - I made one fo...

Do tell

mossy charm Mar 6, 2026, 10:09 PM

#

worn yarrow Do tell

Oh I didn't want to spam the links posted it in #app-showcase

#

zoltun.org or zoltun-org github tho

half geode Mar 6, 2026, 10:13 PM

#

Idk how they don't even have an app. Coding focus I guess

mossy charm Mar 6, 2026, 10:15 PM

#

half geode Idk how they don't even have an app. Coding focus I guess

Who z.ai? Their web client is kinda ... eh

half geode Mar 6, 2026, 10:17 PM

#

Yeah. I can't even toggle off thinking on the mobile site

#

But no native app whatsoever. I guess Qwen and Daobao don't either, although maybe just a China thing so I can't see it (?)

mossy charm Mar 6, 2026, 10:39 PM

#

half geode Yeah. I can't even toggle off thinking on the mobile site

Yeah mine, thinking is off by default

#

you can change it in settings.json along with the endpoints tho

mossy charm Mar 7, 2026, 6:11 PM

#

Alright the CLI zai check has a 1.0 release now too (on the right) if you want to check your usage from CLI
https://github.com/cioran0/zai_checkbalance

#

OR does a better job tracking than their own API does lol

mossy charm Mar 8, 2026, 1:54 AM

#

half geode But no native app whatsoever. I guess Qwen and Daobao don't either, although may...

They actually have an app it just sucks. IDK if its in English either

half geode Mar 8, 2026, 9:01 PM

#

Also just to keep glazing this model, I think it has to be the most human-like. Very enjoyable to talk to.

light dust Mar 8, 2026, 9:04 PM

#

half geode Also just to keep glazing this model, I think it has to be the most human-like. ...

It says no sometimes at times that other models would say yes

#

In a good way

half geode Mar 8, 2026, 9:04 PM

#

Interesting, because in terms of benchmarks the biggest problem I've seen with it is that it isn't very assertive and is unlikely to push back on nonsense

paper meadow Mar 8, 2026, 9:09 PM

#

I tried all options to make it output variable texts, but no

half geode Mar 8, 2026, 9:11 PM

#

paper meadow I tried all options to make it output variable texts, but no

What does variable outputs have to do with it sounding human?

paper meadow Mar 8, 2026, 9:11 PM

#

half geode What does variable outputs have to do with it sounding human?

When it starts each message the same way and uses same structure, i lose any suspension of disbelief

half geode Mar 8, 2026, 9:15 PM

#

paper meadow When it starts each message the same way and uses same structure, i lose any sus...

I find the consistency between attempts at the same output to be even more human. Structure between subsequent messages, sure, but every model does that. I haven't found 5 to do it to a noticeable level when it's in just chat mode

light dust Mar 8, 2026, 9:17 PM

#

half geode Interesting, because in terms of benchmarks the biggest problem I've seen with i...

Eh not per se on nonesense but more some controversial takes on obvious yes’s

#

Which make you think in a new perspective

#

I appreciate that

half geode Mar 8, 2026, 9:17 PM

#

Ah, gotcha, interesting!

obtuse spear Mar 9, 2026, 9:41 PM

#

i wish it was a bit cheaper

high dove Mar 15, 2026, 3:01 PM

#

https://openrouter.ai/z-ai/glm-5-turbo/api

Z.ai: GLM 5 Turbo – API Quickstart

Sample code and API for Z.ai: GLM 5 Turbo - GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply optimized for real-world agent workflows involving long execution chains, with improved complex instruction decomposition, tool use, scheduled an...

primal thorn Mar 15, 2026, 3:51 PM

#

what is this?

#

The price is the same, so it does not seem like a mini model

#

I am confused

umbral stag Mar 15, 2026, 3:56 PM

#

Faster endpoint?

#

It's actually more expensive than the normal version, right now is still the discount phase it seems

#

Normal 3.2$ 1M Output | Turbo 4$ 1M Output

inland heath Mar 15, 2026, 3:58 PM

#

Weird to call it a "new model" if it's just a fast endpoint 🤔

static pond Mar 15, 2026, 3:59 PM

#

nothing on z.ai about it that i can see

umbral stag Mar 15, 2026, 4:01 PM

#

Could bit a bit more tune for openclaw
Could it be targeting openclaw market specifically?

static pond Mar 15, 2026, 4:01 PM

#

Could it be GLM-5-Code? https://docs.z.ai/guides/overview/pricing

Overview - Z.AI DEVELOPER DOCUMENT

Pricing - Overview - Z.AI DEVELOPER DOCUMENT

This page provides pricing information for Z.AI’s models and tools. All prices are in USD.

high dove Mar 15, 2026, 4:09 PM

#

https://x.com/ZixuanLi_/status/2033213795296752106?s=20

Zixuan Li (@ZixuanLi_)

New experiment

ancient flare Mar 15, 2026, 4:17 PM

#

should be under "z.ai turbo" instead of new endpoint

halcyon estuary Mar 15, 2026, 4:55 PM

#

but it looks like its been trained for these agentic stuff again?

obtuse spear Mar 15, 2026, 5:26 PM

#

yeah it seems like an actually different model instead of just existing but faster

#

at least thats what the description says

high dove Mar 15, 2026, 5:26 PM

#

https://x.com/Zai_org/status/2033221440267280696?s=20

Z.ai (@Zai_org)

Note: As an experimental version, GLM-5-Turbo is currently closed-source. All capabilities and findings will be incorporated into our next open-source model release.

#

https://x.com/louszbd/status/2033224682565161461?s=20

Lou (@louszbd)

pony-alpha-2 has finally leveled up into GLM-5-Turbo
can’t wait to see how it performs!

DM me with your User ID if you need a rate limit increase for GLM-5-Turbo
https://t.co/aNuac8wLA4

#

Seems like its likely a unique model

heavy rose Mar 15, 2026, 8:23 PM

#

is this a different model ?

#

i don’t see any model weights on hf

exotic tinsel Mar 15, 2026, 8:34 PM

#

@heavy rose

heavy rose Mar 15, 2026, 8:38 PM

#

exotic tinsel <@386612331288723469>

ah ok

livid grove Mar 16, 2026, 12:35 AM

#

seems glm 5 turbo faster than glm5🫡

cyan sequoia Mar 16, 2026, 1:49 AM

#

Probably GLM with multi token prediction layer like how qwen did it and other speed ups

mossy charm Mar 18, 2026, 12:04 AM

#

primal thorn The price is the same, so it does not seem like a mini model

extra openclaw related

untold remnant Mar 23, 2026, 5:15 PM

#

GLM 5 struggling to count

untold remnant Mar 23, 2026, 10:13 PM

#

bruh

#

is GLM 5 ok?

half geode Mar 23, 2026, 11:12 PM

#

Anti-GLM psyop sus

untold remnant Mar 23, 2026, 11:50 PM

#

It was working great for me and then the last couple days I just see it struggling

mental belfry Mar 25, 2026, 2:08 AM

#

the tok/s is so bad and its been ages

obtuse spear Mar 25, 2026, 4:14 AM

#

that’s what happens when you pick a wicked attention

fresh osprey Mar 26, 2026, 1:40 AM

#

So many rate limits

reef stirrup Mar 26, 2026, 9:14 AM

#

Has anyone noticed a loss of quality the past week or so?

solid silo Mar 26, 2026, 11:06 AM

#

5.1 wen

#

I've used the GLM coding plan, and the Opencode Zen, honestly the GLM coding plan gives way worse responses

#

i believe Opencode Zen uses fireworks provider, so the Openrouter GLM 5 with Fireworks provider should be good

#

or Baseten, since i see it has better uptime

formal sapphire Mar 26, 2026, 11:10 AM

#

Perhaps too many people are using coding plan，z.ai can't afford that many requests

solid silo Mar 26, 2026, 11:11 AM

#

i believe they also route the requests to a subpar version with quantization

stuck stag Mar 26, 2026, 11:14 AM

#

reef stirrup Has anyone noticed a loss of quality the past week or so?

Yep :(

untold remnant Mar 26, 2026, 12:15 PM

#

Their discord is full of people complaining about the coding plan, their glm 5 is completely busted, supposedly other providers don't have these issues. And they just tell you to use turbo instead of providing any explanation or even really acknowledging that something is wrong and they're fixing it lol

half geode Mar 27, 2026, 5:02 AM

#

Yeah, coding plan is borked

#

Use OR or Opencode's plan

#

Sad, really cool org but they kind of fucked themselves on PR with the coding plan stuff. For some reason they just rolled it out to Lite plans despite all the usage issues?

untold remnant Mar 27, 2026, 5:20 AM

#

I heard it got revoked from lite plans lmao

viral bone Mar 27, 2026, 6:15 AM

#

anynone have this error????

Quota exhausted, please check your API provider account v0.77
openrouter: z-ai/glm-5

{
"error": {
"message": "Provider returned error",
"code": 429,
"metadata": {
"raw": "z-ai/glm-5 is temporarily rate-limited upstream. Please retry shortly, or add your own key to accumulate your rate limits: https://openrouter.ai/settings/integrations",
"provider_name": "DeepInfra",
"is_byok": false
}
},
"user_id": "

light dust Mar 27, 2026, 12:47 PM

#

Suffering from success thumbsUp

manic timber Apr 30, 2026, 3:03 AM

#

Will there be another free glm model or just the 4.5 Air?

ancient flare Apr 30, 2026, 6:09 AM

#

manic timber Will there be another free glm model or just the 4.5 Air?

glm 4.7 flash is better and free on their official api

manic timber Apr 30, 2026, 8:31 AM

#

But how do you use it in Chub?

ancient flare Apr 30, 2026, 8:47 AM

#

manic timber But how do you use it in Chub?

idk, i use sillytavern (while importing chub cards)

#GLM 5