#z.ai (GLM)
be aware that on lite you cannot use glm5 -.- i learned the hard way
How fast is using GLM 4.7? I'm using the Pro plan but I do experience some slowness
No lie, z.ai's infrastructure is not the greatest. They make a killer model, but their cloud speeds are kinda low. I am okay with it, because of how I use it, but if it's a deal-breaker I understand. GLM-4.7 should be faster than GLM5, but it's also possible they've transitioned a lot of their inference hardware to serving GLM5.
Yea I'm debating switching to another provider for a sub. I use Claude as my daily driver but don't use it with openclaw because I don't want that account banned so I'm searching for other providers and don't want to shed 200 for ChatGPT lol
I'm on the pro plan but the API is very unstable, constant timeouts. The model itself is good, but z.ai's infra is clearly unreliable. I'm thinking of switching to the chatgpt plan and gpt 5.4. It's a shame though because GLM 5.0 is a good model.
Today's been really bad with network disconnects. Not 100% sure it's OC -> z.ai or Discord -> OC, so I'll hold off on blaming them.
It's a damn good model, and it's been incredibly smart. It's handled tasks that I tried giving to other models to test, and they just...flubbed it entirely, and stopped responding after a while. GLM5 Tries Harder. 🤣
can we use thinking with 4.7 in lite plan?
You should be able to; 4.7 is a good model, but it's not as solid as 5 for agentic work, IMO. They just announced 5-Turbo which might be faster for OpenClaw. They're definitely pushing it as capable for OC.
I just moved to GLM 5. It's a good model but slow. I'll give turbo a try
So far so good
I just learnt that too
I was able to upgrade to the Pro plan
GLM 5 is slow but rock solid!
they also sent an email i think yesterday that glm5 will be also available for the lite plan
end of march
Amazing! I'll give lite a try later
It seems the Pro plan will get turbo later this month. If it's faster and gives me the same quality that I'm getting with GLM 5, this will be my primary openclaw model.
anyone on max tried the turbo version already? supposedly toolcalls are also improved...
I'm on Pro and applied for Early Access; it's not going to be open weights, which bums me out, but I'm definitely interested in trying it.
same, but looking at current Turbo speeds on openrouter, the speed benefit is already gone, so i am at least hoping for better consistency...
speed benefit is already gone
Slowed down because of massive demand?
i dont know, but openrouter lists only 33tps for turbo and 31 for regular glm-5, latency is still better though (https://openrouter.ai/z-ai/glm-5-turbo)
yeah the "turbo" part of the name is strange
but if it's better at tool calls, I'll buy
applied for early access (i have the pro sub)
is ZAI coding plan still worth it in comparison with Kimi? I am still using the 4.7 model as I thought 5 is not there yet? but i am happy to be corrected
i dont think its too bad, but the biggest issues are reliability and speed, which makes me use it little...
i think i got approved, didnt receive mail, but after just trying to reach it, it works.
| | GLM-5 (Standard) | GLM-5 Turbo |
| -------- | ----------- | ------------ |
| TTFT | 3,87 s | 2,10 s ⚡️ |
| t/s | 52,15 t/s | 70,27 t/s |
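To put the table in relative terms, here is a minimal Python sketch of the arithmetic. The figures are the ones posted above, with decimal commas converted to points. The `total_time` helper is a hypothetical back-of-the-envelope model (time to first token plus steady generation), not anything z.ai publishes, and a single measurement like this will vary with load.

```python
# Figures copied from the table above (decimal commas converted to points).
standard = {"ttft_s": 3.87, "tokens_per_s": 52.15}
turbo = {"ttft_s": 2.10, "tokens_per_s": 70.27}


def relative_change(old: float, new: float) -> float:
    """Return the change from old to new as a fraction of old."""
    return (new - old) / old


def total_time(run: dict, output_tokens: int) -> float:
    """Rough wall-clock estimate: time to first token plus generation time.

    Assumption: throughput stays constant for the whole reply.
    """
    return run["ttft_s"] + output_tokens / run["tokens_per_s"]


throughput_gain = relative_change(standard["tokens_per_s"], turbo["tokens_per_s"])
ttft_change = relative_change(standard["ttft_s"], turbo["ttft_s"])

print(f"throughput change: {throughput_gain:+.1%}")
print(f"TTFT change:       {ttft_change:+.1%}")
print(
    "estimated 1000-token reply: "
    f"{total_time(standard, 1000):.1f}s standard vs {total_time(turbo, 1000):.1f}s turbo"
)
```

On these numbers Turbo comes out roughly a third faster in tokens per second and a bit under half the wait for the first token.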
thanks, also Turbo is working for me now (Pro plan)
wow it's fast, if it stays like this I'd make this my primary model for openclaw, I'm going to give it a few more days
Yeah... It's working for me also (Pro also) and it's a really nice speed boost, and so far it hasn't had any problems doing tool calls and recognizing skills, even custom ones, to run.
yeah same. besides the rate limit to 1 connection, i also dont have any problems so far, maybe i do have to extend my sub...
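If the one-connection limit is the sticking point, a client-side gate keeps an agent from opening more requests than the plan allows. This is a minimal sketch using `asyncio.Semaphore`; `fake_model_call` is a hypothetical stand-in for whatever client you actually use, and `MAX_CONCURRENT = 1` is taken from the limit reported above, not from z.ai's documentation.

```python
import asyncio

MAX_CONCURRENT = 1  # assumption: the one-connection Turbo limit reported above


async def fake_model_call(prompt: str) -> str:
    """Hypothetical stand-in for a real API request; it only sleeps briefly."""
    await asyncio.sleep(0.01)
    return f"reply to {prompt}"


async def run_all(prompts: list[str], limit: int = MAX_CONCURRENT) -> tuple[list[str], int]:
    """Run every prompt while never exceeding `limit` in-flight calls.

    Returns the replies in input order and the peak number of simultaneous calls.
    """
    gate = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(prompt: str) -> str:
        nonlocal in_flight, peak
        async with gate:  # waits here until a slot is free
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_model_call(prompt)
            finally:
                in_flight -= 1

    replies = await asyncio.gather(*(guarded(p) for p in prompts))
    return list(replies), peak


if __name__ == "__main__":
    replies, peak = asyncio.run(run_all(["a", "b", "c"]))
    print(replies, "peak concurrency:", peak)
```

Queued prompts simply wait their turn, so the trade-off is latency rather than rejected requests.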
I've been sufficiently happy as an early z.ai user (which does mean that my usage limits are a little different) that I'm going to stay subbed until they smack my fingers for abusing their system.
I'm here as my ChatGPT Plus plan is unusable with openclaw. Very happy so far with GLM5. My only complaint was slowness, and now it may be not having more than 1 concurrent turbo connection
I thought you can use codex plan?
You can. But it doesn't last long enough. My weekly allowance was consumed in less than 2 days
how big of a difference do you notice in speed to GLM (non-turbo) compared to 5.4?
Same, seems they throttled the limits when it went from 5.3 to 5.4. I might give GLM sub a try. Looks like they have a Lite sub, wonder how it fares with Openclaw
It's definitely way slower. Not something to have a quick chat with, but it works amazingly for async tasks.
glm5-turbo today is 🔥, so good! so fast!
Yeah, I switched the day before I got the email. GLM5-Turbo is fast and smart. It's totally revamped my ability to maintain my homelab, able to do long sets of operations all on its own.
also switched today!
GLM 5.1 is out https://xcancel.com/Zai_org/status/2037490078126084514
Hi. I'm using the Minimax M2.7-highspeed token plan ($40/month) for browser automation and it is very unstable. I mean, I made an MD file for the automation but it ignores some essential commands. I tried to use gpt-5.3-codex through OpenAI OAuth and it worked quite well, but the weekly limit in the Plus version ($20/month) was very low. I also tried other models like Kimi v2.5 and openrouter free models but they were even worse than MiniMax. How can I fix it? I've been trying to fix it for 2 weeks, and for the last week I've mainly been trying different models, but I haven't gotten anywhere. Is there any approach to make MiniMax work better? My md file is about 13KB and I think it's lower than the limit
P.S. Is GLM better than Minimax in browser automation?
Only way to find out is testing it yourself, get the $10 coding plan, which should have GLM 5.1 and 4.7
Hello GLMers. Damn this server does a great job of obfuscating and killing discussions. GLM5.1 is live and OK. GLM turbo is goated.
That is all...
use the instructions?
https://docs.z.ai/devpack/tool/openclaw#switching-to-glm-5-1-model
And I have another question, is the performance in the glm coding plan the same as using glm api keys? If so, I'm gonna use the Max Plan
And using GLM Coding plan for browser automation is against the policy?
I don't care about the speed but the quality
Thank you @abstract jewel , can you help me again?
im not using it yet so cant really tell for sure. quality should be identical, speed might be worse. here some more on the rules, maybe you can find something about browser automation there https://docs.z.ai/devpack/usage-policy
Hi, seems GLM-5.1 works for me, but the problem is that it says a lot during the browser automation and the speed is extremely low. Can I fix that kind of thing?
Thank you @random , can you help me again?
do you have any other models to try to see if they work faster ? to see if issue is related to the model or something else
GLM-5-Turbo doesn't work at all
And as I said, Kimi and Minimax don't work better than GLM, though they work faster
Its speed is about 5 times slower than Minimax.
But I think there is another way to deal with it
Thank you @random , can you help me again?
@abstract jewel ?
interesting, i dont have a lot of experience with browsers yet so cant really give good advice on this
anyone selling a legacy plan?
GLM-5-Turbo replies more than 100 messages in one task, does it consume a lot of prompts in coding plan and can I save the quota by reducing the messages?
And how can I reduce the messages?
@abstract jewel ?
from docs, it says that 5, 5.1 , and 5 turbo all use higher quota, and 4.7 can be used for cheaper/less usage. https://docs.z.ai/devpack/overview
Hi, I tried to test the browser automation task with both Minimax M2.7 and GLM-5-Turbo. Of course GLM is better than Minimax, but it burns 2% of the Max Plan's 5-hour limit, which is approximately 32 prompts, whereas Minimax uses only 72 tool calls (about 4.8 prompts). Why does GLM spend so many more prompts than Minimax on the same task? Is there any approach to reduce it?
P.S. GLM says a lot of messages during the task and how can I reduce the messages if it's good for saving prompts?
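For scale, the reported figures can be turned into a quick comparison. This Python sketch only restates the numbers in the message above; the implied 5-hour cap and the tool-calls-per-prompt rate are derived from those numbers and are assumptions, not published quota values.

```python
# Numbers as reported in the message above.
glm_prompts = 32          # GLM-5-Turbo: about 32 prompts for the task
glm_share_of_limit = 0.02  # ...which was 2% of the Max plan's 5-hour limit
minimax_tool_calls = 72   # MiniMax: 72 tool calls for the same task
minimax_prompts = 4.8     # ...billed as about 4.8 prompts

# Derived values (assumptions that follow from the reported figures only).
implied_five_hour_limit = glm_prompts / glm_share_of_limit
minimax_calls_per_prompt = minimax_tool_calls / minimax_prompts
prompt_ratio = glm_prompts / minimax_prompts
runs_per_window = 1 / glm_share_of_limit

print(f"implied 5-hour limit:       {implied_five_hour_limit:.0f} prompts")
print(f"MiniMax tool calls/prompt:  {minimax_calls_per_prompt:.0f}")
print(f"GLM vs MiniMax prompt cost: {prompt_ratio:.1f}x")
print(f"runs of this task per 5 h:  {runs_per_window:.0f}")
```

So the task costs about 6.7 times as many prompts on GLM-5-Turbo, yet at 2% per run the Max plan would still cover roughly 50 runs per 5-hour window.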
@abstract jewel ?
no idea bro, like i said in the other one, check #browser-automation to see if theres anything to optimize this. theres different ways to control the browser, maybe theres something thats more efficient. i only use search/fetch so cant really say
@frank fox I use glm5.1 as my "dumb" model, it's actually pretty smart, it's just cheap
ahh im testing the new gemma 4 rn on my mac mini to offload stuff so its not all running through claude via the openclaw to see if that prolongs my workaround xD
but also saves a lot on api limits and usage to have something locally triggering the crons and stuff
quantized gemma4 on my 5090 didn't perform as it was hyped to be
with the big sub on glm5.1 I don't really have a dire need for a local model, so I suppose that makes me more strict in what I expect
makes sense, im balling on a budget so im trying to parse out what i can do with mim until i can start getting some returns
if you are spending on anything, I highly recommend trying a month of glm-5-turbo or glm-5v-turbo (might not be available)... 5.1 isn't available on the cheaper tier, but I really don't notice that much of a difference between 5-turbo and 5.1, i just run 5.1 because I have it and it's technically supposed to be better
tried my hand at vibing some new skills and stuff for agents to be evolving and shit and then saw hermes drop recently and i was like ooof i did all that work and coulda just waited .-.
idk anything about hermes tbh lol, but I like making skills, i've started publishing em because why not
Yeah i got into this to make my work life easier and now my company is starting to use claude more so it works out, but im trying to branch out for side projects, some for me, some to maybe market. honestly not sure how most are making money, but that's likely because my code knowledge is limited. I do intend to take some courses to actually learn more, but for now its just scratching that tinkering itch in my head
I had my claw do some research and it looks like most money making requires you to have salesman skills
true, the part that im stuck on is like what projects to sell people on, like whats needed where. like obvi you can use it to like plow through website redesigns and cold email people and try to get offers but realistically people can make websites so easy with shopify and other platforms i dont see how theres a market for it
if anything the more likely market would be an ai that works to train other AI's for clients use case but that feels like it can get messy fast
Any maintainer in here? I just got selected to use glm-5v-turbo within my GLM Coding Plan. Any plans for upstream implementation of that endpoint to the coding plan, rather than pay per token api/open router?
nice, no luck here yet. But what do you mean? Doesn't OC already support glm-5v-turbo? I try to use it every so often and it just gives me unauthorized since I don't have access.
Through the pay per token api from z.ai or open router. Z.ai opened a google form a few weeks ago for coding plan users, I sent my use case and it seems they liked it (and I also have the max subs yearly, from back in January when it was 288, showing that I've been with them for a while). I can use it on claude code or open code, but not in openclaw, because the supported endpoints are for z.ai's/open router API usage, rather than z.ai's coding plan. I could play around and "fix it", but I've already modded some sht only to have it deleted with each new version, and even with backups, I end up wasting time/tokens. 5v-turbo will be available soon for the coding plan, they are beta testing it; not sure if I belong to a relevant cohort, but the model is already rolled out for some coding plan subscribers. So far, it's saving me plenty of money, tokenwise, for its cheap multimodality; -turbo was already pretty good, this just gives it modafinil.
It'd be really cool to be able to use it properly on the next update, again; last update I modded the dreaming section to have a night mode card paired with some cron jobs, got wiped on the update and even if I still have the backup, it's a fckin hassle.
But yeah, I've gotten lucky twice with z.ai: once with the "cheap" yearly plan before they hiked it to 360 and then 762 bucks. And now with having concurrent 5v-turbo access. I'm already using five concurrent glm-5-turbo subagents along a 5.1 lead, besides my 20 bucks a month cheap open ai subs... So far, it's been tight with self-made skills that I've kept iterating for proper orchestration with fallbacks and edge cases.
So yeah, 5v-turbo WILL be a thing and imo, it's the model of choice as a Sonnet 4.6/5.3 low/medium like agent.
but waaay cheaper.
Have you tried via open router or z.ai's pay per token api? According to my brain agent (which uses gpt5.4 on high), there IS upstream implementation for 5v-turbo, it checked the repo (I didn't, so there's a chance for it to be a hallucination); just not for the coding plan endpoints, which are different from z.ai's pay per token api ones.
Right now, I have a bunch of use cases for 5v-turbo, it'd make my life way easier (my pocket, to be precise). Sure, I can use other multimodal models, but they are way more expensive or not good enough for something at 5-turbo level.
I'm using the coding plan in openclaw, you just use the coding plan api in place of what would be an oauth for codex or claude
Also I thought I got a good deal at ~$650
I keep trying to tell people this; GLM-5 (and 5.1) is crazy good. I point it at all kinds of crazy problems in my homelab, and it just... fixes them. I wish it could fix my RL issues too. 🤣
how to set it up in openclaw via oauth or genie subscription?
it is imo
I got super lucky
you can also use the z.ai pay per token api endpoint and open router (not sure if they provide 5v-turbo, since 5-turbo isn't open weights)
it is slow and sometimes it goes a bit overboard, it's amazing at ux/ui and backend, tho
an eager fck
I just got my access to 5v-turbo. I'm hooking it up soon.
Anyone know what this z.ai ban nonsense is about? Is that including us openclaw users?
No, I don't think so. I think they're trying to restrict multi-user use, and they're catching some folks up in it accidentally. I also think that stuff like Tailscale's Aperture might be problematic, and that might be something that's catching folks in the 'shared use' net.
There's more info (and folks flaming them) on their Discord. If you join it, go to the FAQ, and (right now) the second from the newest is a big post about it, and their infra.
I haven't been hit, although when I did try to set up Aperture, I got rate-limited for any requests through it, which is probably a warning sign. Oh, and they've had some issues with roleplayers, who are using a coding plan for GLM-5.1 roleplay. 🤣 (It's a really good RP model, but I wouldn't... do that on a cloud model.)
I will say that OpenClaw is probably on their 'rate limit before coding harnesses' section. They really want to enable folks to do coding work with GLM-5.1.
The roleplayers subreddit was the last place I expected a fight about monthly coding plan subs to break out. tbh I didn't think they were targeted directly, I thought they were just the ones reacting the loudest to Z.ai's pivot, which was loosely worded towards "non-coding" use, meaning mostly agents like us, Hermes etc. They do love a bit of drama, after all
OpenClaw is explicitly listed as the one example of a "Universal Agent Tool", which will be given "secondary scheduling" behind actual coding tools.
I had fun this morning doing some maintenance on OpenClaw using GLM5.1 in OpenCode, armed with Superpowers and the Openclaw Admin Skill. I need to figure out how to integrate these harnesses a bit more. Better sharing of memory, workspaces, skills etc.
it's weird though because they explicitly said we could use it for openclaw, and they have instructions for it
How many of you are getting 429s?
GLM 5.1 is a great model tbf, but apparently it's unusable for openclaw around its peak hours. complete rate limit decline :/
Same. Tried it for the first time last night as open AI code is my primary and I hadn't tried 5.1 yet. I had only tried the turbo, and the first request that went in when I switched to it mid-conversation immediately got a 429.
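A common client-side response to 429s is to retry with exponential backoff and jitter rather than hammering the endpoint. The Python sketch below is generic: `RateLimited` and the `request` callable are hypothetical placeholders for however your own client signals an HTTP 429, and the delays are illustrative defaults, not values recommended by z.ai. It will not help if the plan is being deprioritized for hours at a time, only with short bursts of rejections.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error your client raises when the server answers HTTP 429."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call `request()` and retry on RateLimited with exponential backoff.

    The wait before retry n is a random value between 0 and
    min(max_delay, base_delay * 2**n) ("full jitter"), so many clients
    retrying at once do not all hit the server at the same moment.
    The last failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_request():
        # Simulated endpoint: rejects the first two calls, then succeeds.
        calls["count"] += 1
        if calls["count"] < 3:
            raise RateLimited()
        return "ok"

    print(call_with_backoff(flaky_request, base_delay=0.01))
```

If the server sends a `Retry-After` header, honoring that value is usually a better choice than a computed delay.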
Yeah, it seems like everybody's suddenly in a deep compute crunch. OpenAI shuts down Sora because they need the compute for other work. Anthropic considered, then backed off, removing Claude Code from its 'Pro' plan. z.ai is changing their subscription plans and removing the legacy stuff. It's... getting difficult.
I'm considering moving to either a second ChatGPT $20 as fallback or get Kimi