craggy ferry Feb 18, 2026, 6:45 PM

#

so i turned it off until i can make a much more isolated agent

crystal cedar Feb 18, 2026, 6:48 PM

#

craggy ferry so i turned it off until i can make a much more isolated agent

thanks, i am completely new to this but will read up. thanks for prevous tip on why long context, understand it stores all calculated vectors to save on compute (so essentially the matrix in a way).
did u hear of seymore cash? anthropic vending bot dealt with mischievious prompts by having other agent reviewing whatever first agent wants to do; think its the best shot at dealing with prompt injection but yes terrifying subject.

craggy ferry Feb 18, 2026, 6:50 PM

#

that does just mitigate the risk, not remove it

#

now, you just have to get agent 1 to generate a prompt injection for agent 2

crystal cedar Feb 18, 2026, 6:51 PM

#

yea true. but was thinking second agent could be bestowed with hightened state of awareness, expecting any prompt to prompt injection, treating it accordingly

#

sort of your paranoid friend

craggy ferry Feb 18, 2026, 6:51 PM

#

that's all just mitigations

#

you can't solve this with more layers of llms

crystal cedar Feb 18, 2026, 6:52 PM

#

i'm not sure but i'm cautiously optimistic it is the best shot

craggy ferry Feb 18, 2026, 6:52 PM

#

what you can do is have the agent output a workflow that you parse with good old fashioned code

#

and then validate that it doesn't do anything weird with more code

#

Then you've made the problem much more tractable because we already know how to review code for security flaws and we can actually do something about it

#

and it's a deterministic system

#

gotta give those cpus something to do

crystal cedar Feb 18, 2026, 6:54 PM

#

i just know of two kinds of prompt injections - the one step (ignore all instructions / execute malicious code) and for that your solution might be better.
I was thinking of the grooming kind of multi step prompting, where a sequence of prompts has an event not discernable from the individual prompts

craggy ferry Feb 18, 2026, 6:54 PM

#

the solution i just described stops that, too

#

because in the end your agent has to tell me what it wants to do in a format which i'm not parsing with an llm

crystal cedar Feb 18, 2026, 6:55 PM

#

well i will happily admit that i worry more than i understand 😄

craggy ferry Feb 18, 2026, 6:56 PM

#

you should worry about people who want to just stack llms on top of each other and go "is this anything?"

#

i mean someone should try it

#

mitigations are good, swiss cheese defense is good

crystal cedar Feb 18, 2026, 6:56 PM

#

true

brave cedar Feb 18, 2026, 8:41 PM

#

have you heard anything about this setup? I'm considering something similar. My only concern is RAM

final atlas Feb 18, 2026, 8:42 PM

#

Is anyone consistently using a 30b local model with openclaw? How are the results compared to using say opus 4.6 or codex 5.3? I'm trying to decide between building out a machine to do a 30b model with and just using a $200 codex 5.3 sub, I know opus 4.6 is probably not feasible cost wise since they block the sub from 3rd party use

#

From coding use I find opus to be a lot more personable compared to codex so Im worried it'll be bad for openclaw

brisk shale Feb 18, 2026, 8:57 PM

#

brave cedar have you heard anything about this setup? I'm considering something similar. My ...

I haven’t
Just got these parts so we’ll see

craggy ferry Feb 18, 2026, 9:41 PM

#

final atlas Is anyone consistently using a 30b local model with openclaw? How are the result...

I really like glm-4.7-flash, it fits in my 48gb card at Q6 with two 200k context windows. I actually barely ever ask Opus/Sonnet anything from this setup, but I usually try to have one of those models (since sonnet4.6) do a final review pass over a plan my glm comes up with.

I also have been experimenting with a 14b model as an executor to maybe get some more tokens in there, since I'm kind of running up against my max.

#

I get about 70-100 tps output and 400-2000 tps prompt input with glm this way

final atlas Feb 18, 2026, 9:58 PM

#

craggy ferry I really like glm-4.7-flash, it fits in my 48gb card at Q6 with two 200k context...

how much did your machine cost?

tiny tendon Feb 18, 2026, 10:07 PM

#

craggy ferry I really like glm-4.7-flash, it fits in my 48gb card at Q6 with two 200k context...

nice!

lost remnant Feb 18, 2026, 10:27 PM

#

brave cedar have you heard anything about this setup? I'm considering something similar. My ...

I run a pi5 8gb ram and it runs openclaw with no issues

#

I don’t understand the hype behind mac minis. they can’t run effective local models and everyone uses API credits with them anyways

rancid mauve Feb 18, 2026, 10:31 PM

#

lost remnant I don’t understand the hype behind mac minis. they can’t run effective local mod...

100%

crystal cedar Feb 18, 2026, 10:37 PM

#

lost remnant I don’t understand the hype behind mac minis. they can’t run effective local mod...

hype probably due to influencer videos giving the impression mac mini plus free openclaw equals digital slave making 10K every day while you sleep.

lost remnant Feb 18, 2026, 10:40 PM

#

if only I could prompt my llm to 1 million dollars ….

crystal cedar Feb 18, 2026, 10:40 PM

#

rpi5 8gb vs mac mini base model 16gb? if i had a rpi 8gb i would still consider whether i could use a small local model. inference is painfully slow, but for some things maybe that don't matter. api use expensive if you PAYG for tokens, and subscription models not made for bots.

crystal cedar Feb 18, 2026, 10:40 PM

#

lost remnant if only I could prompt my llm to 1 million dollars ….

if you want to become a millionaire, start with a billion, buy a computer and put openclaw on it, then ask everyone what is the best model you can use

final atlas Feb 18, 2026, 10:56 PM

#

what if I wanted to build a machine that actually can run the best open source models though

#

I mean there's the mac mini M4 pro at unified mem 64GB for 2 grand, but im assuming you can build the same for cheaper

crystal cedar Feb 18, 2026, 10:59 PM

#

final atlas I mean there's the mac mini M4 pro at unified mem 64GB for 2 grand, but im assum...

well there's a guy on youtube who spent 20k on two mac studios and got 1TB of ram to run the best local models. thats far beyond my budget. i'm considering getting something mac aswell, there are new macs about to be announced within the next months

#

so reckon good time to watch and learn

final atlas Feb 18, 2026, 11:00 PM

#

are the mac products supposed to be better or cheaper at running LLMs?

#

compared to like building a machine yourself

crystal cedar Feb 18, 2026, 11:02 PM

#

well problem right now is unprecedented scramble for ram, google ramageddon, with prices shooting up. what you want is not normal ram but vram, typically in graphics cards for gaming rigs. those are expensive too. macs have one crucial advantage in unified memory basically making it almost as good as vram if i get it right

final atlas Feb 18, 2026, 11:03 PM

#

how come I don't see anyone saying they run openclaw with the $200 openai sub

#

is codex 5.3 just horrible for openclaw?

crystal cedar Feb 18, 2026, 11:05 PM

#

i don't know, depends what you want to i guess. all the cool things seem to happen with the latest and best models and people have been able to use their subscriptions for that for a while. seems now people get banned left and right because terms of service expect user to be human not human who uses computer to prompt to kingdom come

final atlas Feb 18, 2026, 11:06 PM

#

I thought anthropic straight up does not allow anyone to use their max subscription for 3rd party stuff, like it won't even connect

crystal cedar Feb 18, 2026, 11:06 PM

#

well it was called clawdbot because it just hooked up to claude right, not familiar with details, but sounds like people were using subscriptions and got banned

craggy ferry Feb 18, 2026, 11:29 PM

#

crystal cedar i don't know, depends what you want to i guess. all the cool things seem to happ...

it's way easier to make cool stuff happen with the models you pay for, that's why people pay for them. but it's more rewarding when you coax the open models into doing cool stuff 🙂

crystal cedar Feb 18, 2026, 11:32 PM

#

craggy ferry it's way easier to make cool stuff happen with the models you pay for, that's wh...

yea i'd prefer to coerce models into all kinds of things 😄

#

for starters, an ocd like respect for json formatting

proper trench Feb 19, 2026, 12:21 AM

#

final atlas I thought anthropic straight up does not allow anyone to use their max subscript...

Oh it connects fine

#

I connected a few days ago on pro even

final atlas Feb 19, 2026, 12:24 AM

#

Yeah just figured this out in the general channel, I guess it still works fine, just that some select few people are getting banned. Apparently they're somewhat vaguely okay with solo devs using their subscriptions for 3rd party stuff

humble holly Feb 19, 2026, 12:58 AM

#

The pi is probably ok, but this will be cheaper and more powerful. #hardware message

jagged rune Feb 19, 2026, 1:11 AM

#

final atlas Yeah just figured this out in the general channel, I guess it still works fine, ...

So are you still up and running on Claude?

final atlas Feb 19, 2026, 1:35 AM

#

jagged rune So are you still up and running on Claude?

I dont use openclaw rn looking to get into it. I will say, opencode is not supposed to work with claude max sub, but I just tried it and it does

final atlas Feb 19, 2026, 1:36 AM

#

jagged rune So are you still up and running on Claude?

so yeah I don't think they full on cracked down yet, I think the backlash is too strong lol

waxen scaffold Feb 19, 2026, 1:58 AM

#

Is Anthropic cracking down on openclaw API usage?

woeful mauve Feb 19, 2026, 2:12 AM

#

Has anyone really got any local models to work effectively? I'm a bit constricted on just 12GB GPU, I tired something small like llama-3.2-3b-instruct, Qwen3-8B and they can't handle gog or other tools reliably I'm finding.

nemotron-3-nano works way more reliably but with it spilling over into RAM, I'm pretty limited to the context window plus it runs terribly slow.

ornate fractal Feb 19, 2026, 3:10 AM

#

hi, What do you think about using a hybrid model? I have Minimax Cloud, and for local use I have QWEN 2.5 14b Coder. I have a gaming laptop with 4GB of VRAM and 24GB of RAM (I plan to upgrade to 48GB).

#

I also have a MacBook Pro M1 with 8GB of RAM, could that be more useful?

crystal pike Feb 19, 2026, 3:53 AM

#

Just ordered cheapest version of Mac mini m4 (16GB, 256GB SSD) after playing with the cloudfare moltworker. Ordered and shipped with 2 days out. Not bad, thought there would be more of a delay

tender anvil Feb 19, 2026, 4:30 AM

#

crystal cedar well there's a guy on youtube who spent 20k on two mac studios and got 1TB of ra...

Crazy. A 20k budget can get you a few rack servers

undone mauve Feb 19, 2026, 5:31 AM

#

crystal pike Just ordered cheapest version of Mac mini m4 (16GB, 256GB SSD) after playing wit...

whar does a mac mini give you that a private vps can't in your workflow?

#

because eu based hetzner cax11 servers start at 3.99 USD (4gb ram, 40gb ssd, 2 core arm cpu)

hidden kelp Feb 19, 2026, 5:33 AM

#

Howdy. I am hoping to run openclaw with local llms and have these in the use: 1) linux desktop+rtx3090 2) macbook pro M2 MAX 3) (game) windows PC with rtx5090 ... what would good sensible way to utilize those 3 for local llm with openclaw?

undone mauve Feb 19, 2026, 5:36 AM

#

hidden kelp Howdy. I am hoping to run openclaw with local llms and have these in the use: 1)...

with all the resources you have im surprised you don't know what the best solution is (which i can't tell you because you omit a lot of important information on all three systems)

hidden kelp Feb 19, 2026, 5:37 AM

#

undone mauve with all the resources you have im surprised you don't know what the best soluti...

Not much experience in llm things so thats why I am asking. What information you need about those?

undone mauve Feb 19, 2026, 5:41 AM

#

hidden kelp Not much experience in llm things so thats why I am asking. What information you...

breakdown full specifications of each system, including factors such as "do the 3090s have an nvlink bridge" levels of detail. then- throw it into claude. /s

hidden kelp Feb 19, 2026, 5:42 AM

#

undone mauve breakdown full specifications of each system, including factors such as "do the ...

I would rather hear experience from real people with similar HW what are their experience

#

most important question being is it even worth to try

undone mauve Feb 19, 2026, 5:45 AM

#

hidden kelp I would rather hear experience from real people with similar HW what are their e...

look at https://github.com/LMCache/LMCache

dual 3090s if they have an nvlink bridge along with tiered caching is important, but you aren't going to be running anything higher quality than dirt cheap llm models you can cheaply/freely use from openrouter/nvidia nim api, only benefit at that point would be for things like vector similarity but that's also dirt cheap from voyage.

use a frontier class main model (ex. Opus/Sonnet 4.6) and offload subagents to models that can run with dual 3090s and tiered caching.

#

decide if you actually need anything a macos based gateway only can offer- like direct access to applescript https://developer.apple.com/library/archive/documentation/AppleScript/Conceptual/AppleScriptLangGuide/introduction/ASLR_intro.html

but you can always use the gui openclaw app to add your macbook as a node to your main gateway on the other system- if nescessary.

#

maybe the 5090 based system is better, but i'd only use it if you don't plan to make this a 24/7 based system, same with the macbook unless the macbook has- LOTS of ram, but i don't recall the M2 Max being higher than 96gb, but it doesn't mean it'll outdo a setup with tiered caching and much faster tps performance

craggy ferry Feb 19, 2026, 6:04 AM

#

now that i fixed the main caching bug every time i trigger a rebuild of the prompt it makes me sad

#

guess it's time to burn more clod code credit to contribute refined tokens to the repo

quasi ether Feb 19, 2026, 2:13 PM

#

Hey does anyone know

#

How to set up Mac cluster for OPENCLAW?

crystal pike Feb 19, 2026, 3:06 PM

#

undone mauve whar does a mac mini give you that a private vps can't in your workflow?

I am a Mac user, and looking forward to integrating with iMessage. I’m not sure if it’s just my setup on cloudflare, but I run into some browser issues. Ex: trying to order groceries, mine REALLY struggled to find a simple login page, then had the issues of sessions etc. for me personally worth the $ to continue exploring

crystal pike Feb 19, 2026, 3:07 PM

#

quasi ether How to set up Mac cluster for OPENCLAW?

Like a cluster of minis or studios? You can run on minimal hardware, just some advantages in the early stages to have on a separate device imo

crystal cedar Feb 19, 2026, 4:06 PM

#

NVIDA published article a few days ago about running OpenClaw on DGX Spark etc.; recommends GPT-OSS 120B. nvidia.com/en-us/geforce/news/open-claw-rtx-gpu-dgx-spark-guide/ taking liberty of tagging random people who have posted about DGX @tired plover @dry hull @steep wedge @quaint lantern @verbal sigil

tired plover Feb 19, 2026, 4:07 PM

#

Thanks for the heads-up @crystal cedar I’m going to try minimax m2.5 as it had better agentic performance but lacks the reasoning like gpt might test both for my scenario 🙂

crystal cedar Feb 19, 2026, 4:09 PM

#

tired plover Thanks for the heads-up <@1164590034855415859> I’m going to try minimax m2.5 as ...

Hey man looking forward to hearing of your experiences. Was also surprised about recommended models, not sure what the rationale was - maybe "tried and tested" for business rather than "fresh"

tired plover Feb 19, 2026, 4:14 PM

#

I was going back and forth with opus4.6 on that, maybe like 5-6 hours now about all the requirements and things, TLDR; current models are not fine tuned and there is much more coming, all depends on your Capacity but with 128GB you should be able to run these model for better or worse on local hardware

NEW ⭐ Qwen 3.5 397B/17B ~100GB 1.58b ⚠️
1 MiniMax M2.5 230B/10B ~101GB Q3 ✅
2 GPT-OSS-120B 120B/5B ~75GB Q4 ✅
3 GLM-4.5-Air 106B/12B ~60GB Q4 ✅
4 Devstral 2 123B dense ~75GB Q4 ✅

crystal cedar Feb 19, 2026, 4:15 PM

#

Devstral is a bit newer isn't it, and a smaller version of GLM 5 might come. You have to run Minimax in Q3

#

I've been playing around with much smaller models, have no idea what to run on 128GB

quaint escarp Feb 19, 2026, 4:17 PM

#

tired plover I was going back and forth with opus4.6 on that, maybe like 5-6 hours now about ...

you would almost certainly need to do some heavy offloading/quantization to run it on a 397 B model on 128GB

tired plover Feb 19, 2026, 4:17 PM

#

Heavy Q hahaha

crystal cedar Feb 19, 2026, 4:18 PM

#

You could always double down and get a second DGX 😄

tired plover Feb 19, 2026, 4:18 PM

#

Opus told me to start with Minimax and then Gpt

tired plover Feb 19, 2026, 4:18 PM

#

crystal cedar You could always double down and get a second DGX 😄

I fought with myself for the first one 🥲

crystal cedar Feb 19, 2026, 4:19 PM

#

tired plover I fought with myself for the first one 🥲

i'll probably follow you in fomo soon... just waiting for price to up 10% so my fomo triggers 😄

tired plover Feb 19, 2026, 4:19 PM

#

With spark I could try to fine tune a small model but I don’t have data

#

My data scientist friend just deployed his and wants to start to train a model but takes weeks or month till finished

bitter cipher Feb 19, 2026, 4:21 PM

#

the issue is that larger context windows need more memory and slow down inference as well.

#

e.g. 128k context ads another 30gb or so.

tired plover Feb 19, 2026, 4:23 PM

#

bitter cipher the issue is that larger context windows need more memory and slow down inferenc...

True, I just need to know how stupid the model can be for my tasks and then I can check if I need more context and trickle down to smaller models…

#

Nobody has data on my tasks…

crystal cedar Feb 19, 2026, 4:26 PM

#

tired plover My data scientist friend just deployed his and wants to start to train a model b...

I realized today you might end up having made the call of a lifetime - mac minis sold out here and there because people thought buying one and open source software would get you a digital slave making 10K a day while you sleep. Subsequently, people realized hey also need claude subscription and started using it without any regard for tos. As a result, banning and now we are seeing the "banned from the gpt" (to the tune of "born in the USA") phase. That it term will be followed by people realizing that PAYG API is expensive, but that they can have their own supercompute at home for a few grand. Once that realization kicks in, people will end up buying up all the DGXs overnight. Meanwhile I am waiting for the new Mac Studios. Feels a bit like that meme about the guy running to catch a plane which is taking off.

#

If RAM can sell out. and Mac minis can sell out. and there are what 10000x fewer DGXs around... it takes very little for them to sell out. Think toilet paper and covid.

#

And remember this. All it takes could be a new model, surprisingly fit for openclaw on something like the DGX.

tired plover Feb 19, 2026, 4:29 PM

#

Might be half the truth, with the Sunday calls here on the discord there are already Chinese labs involved ready to take all the people who have macminis and hurting wallets, there must be and will be room for both sides, more or less the patriots and security people (company’s) will choose to buy these or bigger machines, average joe will switch to cheap plans and PAYG API

#

Because everybody knows but accept that their data will be feeded into mother china haha

#

100% agree on local fine tuned models, then systems will sell out amazingly quick but then it will get even smaller for phones to run

crystal cedar Feb 19, 2026, 4:31 PM

#

with a dgx spark around, you could prolly run one version of openclaw on your own phone and use the dgx as a server, seems its good at parallel jobs

tired plover Feb 19, 2026, 4:32 PM

#

crystal cedar with a dgx spark around, you could prolly run one version of openclaw on your ow...

Amazing idea, they just need to bring an app I can couple with my dgx

crystal cedar Feb 19, 2026, 4:32 PM

#

i mean once security issues etc reaches a level where you are comfortable

#

yea skipping telegram

#

pretty cool, using your phone to prompt your supercompute to vibe code

tired plover Feb 19, 2026, 4:33 PM

#

Future shines bright… but even though I don’t have it yet (arrives tomorrow) I already have buyer regret because I have the feeling it’s not enough memory…

crystal cedar Feb 19, 2026, 4:33 PM

#

maybe you can vibe code the app yourself 😄

tired plover Feb 19, 2026, 4:34 PM

#

crystal cedar maybe you can vibe code the app yourself 😄

Worth a try 🤔

crystal cedar Feb 19, 2026, 4:34 PM

#

i wouldn't feel bad if i were you - ramageddon creates incentive to excel on what people have, and models keep getting better

#

so probably increased interest in all kinds of smaller models

#

not sure 128GB qualifies as small tho

tired plover Feb 19, 2026, 4:34 PM

#

crystal cedar i wouldn't feel bad if i were you - ramageddon creates incentive to excel on wha...

Yea same said by opus, should work for next 3-5 years as private system as software is improving drastically

crystal cedar Feb 19, 2026, 4:35 PM

#

that is an absolutely amazing thought. but by then waiting time for dgx will be 10 years 😄

#

and all i can do is cry about it in the shower, asking myself why i didn't get one

tired plover Feb 19, 2026, 4:36 PM

#

I mean, realistically it will only get worse for next 12 month, if you think that openclaw is worth something, might be smart to pull the trigger 🤷‍♂️

crystal cedar Feb 19, 2026, 4:37 PM

#

yea i think it will. security/privacy nightmare but also best thing in 50 years.

tired plover Feb 19, 2026, 4:37 PM

#

For me it’s security reasons on my task and I want to integrate in my life without selling my data, without that I could live on free breadcrumbs from Chinese labs 😅

crystal cedar Feb 19, 2026, 4:37 PM

#

maybe you can ask ai to create bogus data - if your data leaks, nobody will know what is of value 😄

tired plover Feb 19, 2026, 4:39 PM

#

Don’t want to open up about my job etc but I have insight into IT and supply chain, everybody says start of 2027 it should get better but by then many people wait for this moment to start buying again, my personal opinion is that we will have rough 2-3 years with these problems and it mostly gets worse

tired plover Feb 19, 2026, 4:39 PM

#

crystal cedar maybe you can ask ai to create bogus data - if your data leaks, nobody will know...

Don’t waste my tokens ehhhh 😂

crystal cedar Feb 19, 2026, 4:40 PM

#

tired plover Don’t waste my tokens ehhhh 😂

great idea if you have experimental results that are valuable. not sure if works for other things.

#

can use a small model 😄

tired plover Feb 19, 2026, 4:41 PM

#

crystal cedar great idea if you have experimental results that are valuable. not sure if works...

First I need to get to to run haha will report back

crystal cedar Feb 19, 2026, 4:42 PM

#

tired plover First I need to get to to run haha will report back

seems pretty straightforward. maybe the sole thing saving dgx's for a while is hesitation due to linux. people will pause, take time to wonder whether it is 'difficult'. not the case for a mac.

#

i had a look at nvidia site, they had a dumbed down quickstart. surprisingly they suggested WSL and lm studio or ollama. on my RAM deprived gear, going with lubuntu and llama.cpp to squeeze out what i could. not sure if it matters for the spark.

tired plover Feb 19, 2026, 4:45 PM

#

Always matters if you want to expand context window, the more the better

crystal cedar Feb 19, 2026, 4:46 PM

#

well i'm off to dinner now, but thanks for the chat. cool that there are a few people with dgx getting early impressions of oc on dgx. i'll prolly fomo and get one too in a few days

tired plover Feb 19, 2026, 4:49 PM

#

You can always dm or ping me 🙂

crystal cedar Feb 19, 2026, 4:49 PM

#

tired plover You can always dm or ping me 🙂

thanks man - likewise!

quartz zinc Feb 19, 2026, 5:18 PM

#

Please continue to AGI the IoT possibilities.

azure spruce Feb 19, 2026, 5:53 PM

#

is it really Mac Mini or no party? haha

#

i have just failed miserably on an old surface pro i use a Macpro thinking of jusst biting the bullet and buying a mac mini

#

can anyone assist

hybrid wharf Feb 19, 2026, 5:58 PM

#

Buy a Mac Studio. They are better.

vital girder Feb 19, 2026, 6:08 PM

#

I have a Mac mini M4 that I use as a server. Should I run Claw Bot on a virtual machine locally, or would it be better to use a cloud VM provider like Hostinger

hybrid wharf Feb 19, 2026, 6:09 PM

#

Either is fine. I would recommend podman on local..

vital girder Feb 19, 2026, 6:09 PM

#

hybrid wharf Either is fine. I would recommend podman on local..

what is podman?

hybrid wharf Feb 19, 2026, 6:09 PM

#

vital girder what is podman?

Its a local virtual machine

craggy ferry Feb 19, 2026, 6:17 PM

#

It’s not a vm. Use a real vm for openclaw

hybrid wharf Feb 19, 2026, 6:19 PM

#

Sigh, why? It isolates the file system and uses your main compute resources. There is no such thing as a "real vm". They all work in different ways.

verbal sigil Feb 19, 2026, 6:25 PM

#

crystal cedar NVIDA published article a few days ago about running OpenClaw on DGX Spark etc....

Thanks for sharing. For now I prefer the Qwen3-coder-next but I may give it another try

crystal cedar Feb 19, 2026, 6:36 PM

#

verbal sigil Thanks for sharing. For now I prefer the Qwen3-coder-next but I may give it anot...

thanks for your feedback. things move fast - benchmarks matter but so does real world takes.

still rampart Feb 19, 2026, 7:16 PM

#

crystal cedar thanks man - likewise!

This was such a helpful conversation. I was looking for this type of info last week and you guys just laid it all out. Much appreciated

crystal cedar Feb 19, 2026, 7:16 PM

#

still rampart This was such a helpful conversation. I was looking for this type of info last w...

i was just airing my grievances 😄

#

everyone is new to this, gotta keep an open mind. i probably get things wrong all the time.

still rampart Feb 19, 2026, 7:17 PM

#

I keep going back and forth on the dgx

still rampart Feb 19, 2026, 7:18 PM

#

crystal cedar everyone is new to this, gotta keep an open mind. i probably get things wrong a...

I call it failing forwards

crystal cedar Feb 19, 2026, 7:19 PM

#

still rampart I call it failing forwards

there are a couple of versions of it - the asus gx 10 is priced around 3K right now, might be good value. alternatives are the amd ai 395+ and 128 unified memory or maybe mac mini m4 pro or studio with 128gb unified memory or wait for the m5 processors due out in a few months

#

if you're considering something with 128gb might want to keep an eye out for a potential deepseek r2 release. if its announced and it is extraordinarily good, it could be a big thing also for dgx demand

still rampart Feb 19, 2026, 7:25 PM

#

Whatever it is I just won't buy an apple. At the moment they seem the best buy, but that will change

crystal cedar Feb 19, 2026, 7:26 PM

#

still rampart Whatever it is I just won't buy an apple. At the moment they seem the best buy, ...

if you're in europe, the german site of a certain US based company known for selling books is listing the asus gx10 for sub 3K euros right now. you could consider buying it now, securing the price, and have a month to think about, it two weeks to send it back if you change your mind.

still rampart Feb 19, 2026, 7:26 PM

#

I'm east coast US

crystal cedar Feb 19, 2026, 7:27 PM

#

ah ok, well good knows is its 3K USD for you guys and thats even cheaper 😄

#

dell also has one, not sure what it retails for over there

#

dell precision pro max? search for gb10.

#

i'm really hoping there will be some kind of announcment on the upcoming macs very soon

still rampart Feb 19, 2026, 7:29 PM

#

First I've seen the max+ 395. That's 128 unified like the spark?

crystal cedar Feb 19, 2026, 7:30 PM

#

i'm pretty sure it is but don't hold me to it. AMD, dedicated AI processor, comes with 128GB and then probably a graphics card too

#

gaming rig

#

AMD site says *The Ryzen™ AI MAX+ 395 is available today with system memory options ranging from 32GB all the way up to 128GB of unified memory – out of which up to 96GB can be converted to VRAM *

still rampart Feb 19, 2026, 7:31 PM

#

I got an Olares One, 5090mobile in it, only 32gb vram. Had fomo and jumped on the Kickstarter

#

I was just looking that up

crystal cedar Feb 19, 2026, 7:33 PM

#

right now 128 might not be enough to run things like the latest kimi, but i'm willing to make a bet that something new could come in the next months that causes run for the 128gb segment

#

speculating of course. my gamble was to wait for the mac announcement to see what the new studios are like and then decide what to buy.

craggy ferry Feb 19, 2026, 7:37 PM

#

still rampart Whatever it is I just won't buy an apple. At the moment they seem the best buy, ...

They’re basically the only option if you want 512g tho 🙁

craggy ferry Feb 19, 2026, 7:38 PM

#

crystal cedar right now 128 might not be enough to run things like the latest kimi, but i'm wi...

We already have glm-4.7-flash. Anything better than that starts to feel like sonnet 4.5

crystal cedar Feb 19, 2026, 7:38 PM

#

craggy ferry We already have glm-4.7-flash. Anything better than that starts to feel like son...

u think deepseek r2 might be something?

craggy ferry Feb 19, 2026, 7:39 PM

#

I think in six months local models that fit in 128g will probably be competitive with like opus 4.5

verbal sigil Feb 19, 2026, 7:39 PM

#

still rampart Whatever it is I just won't buy an apple. At the moment they seem the best buy, ...

always hated macs...things have changed once I got one for free ahahah

crystal cedar Feb 19, 2026, 7:39 PM

#

verbal sigil always hated macs...things have changed once I got one for free ahahah

don't diss the budget choice for the ram deprived bro 😄

verbal sigil Feb 19, 2026, 7:39 PM

#

craggy ferry I think in six months local models that fit in 128g will probably be competitive...

don't know six but 12 months likely

still rampart Feb 19, 2026, 7:40 PM

#

craggy ferry They’re basically the only option if you want 512g tho 🙁

Youre right, I just hate the company and would rather pay for tokens

craggy ferry Feb 19, 2026, 7:41 PM

#

Can’t stand sending my entire literally everything to anyone else so welp

still rampart Feb 19, 2026, 7:41 PM

#

verbal sigil always hated macs...things have changed once I got one for free ahahah

Not hating on the product. Hating on the company philosophy and business model

crystal cedar Feb 19, 2026, 7:42 PM

#

hey @verbal sigil check out epoch.ai/data-insights/consumer-gpu-model-gap - excerpt: *Using a single top-of-the-line gaming GPU like NVIDIA’s RTX 5090 (under $2500), anyone can locally run models matching the absolute frontier of LLM performance from just 6 to 12 months ago. *

#

GPQA improvment as function of time looks linear.

#

well up until now at least

#

but fun chart

verbal sigil Feb 19, 2026, 7:44 PM

#

I have my trusty 3090

#

Good times

#

when model were 7millions

#

not billions

crystal cedar Feb 19, 2026, 7:46 PM

#

Lots is happening when it comes to small (sub 3B) models too tho - lfm2, nanbeige 4.1

verbal sigil Feb 19, 2026, 7:46 PM

#

BTW, I have created a benchmark for small models - whether they can do the bootstrapping successfully

#

I suggest to all local small model users to hatch models by specifying what to do during the bootstrapping

#

something like "consult the bootstrap file, update the soul etc and remove the bootstrap"

crystal cedar Feb 19, 2026, 7:48 PM

#

wow

#

now all you need is a virus and you have some kind of neumann probe

#

clawifying the whole planet 😄

native valve Feb 19, 2026, 7:52 PM

#

crystal cedar wow

henry do you think i could download clawdbot onto my macbook e?

crystal cedar Feb 19, 2026, 7:55 PM

#

native valve henry do you think i could download clawdbot onto my macbook e?

well the software it self you can download on pretty basic hardware, problem is for it to work you need access to advanced ai. people used subscriptions for that, but now it seems bot use is not allowed so that route is blocked. what is left is either to pay for the use in other ways or get advanced hardware. right now both options look prohibitively expensive (but that might change). if you're interested look around see what other people are doing and learn from their mistakes and wins

#

if you really want to try it out, don't use your regular computer for it - see if you have old gear that you don't need, wipe it clean of private stuff, and put it on a guest network for starters, assuming it might get hacked by someone. if that happens at least you're hopefully not leaking any sensitive data

#

good time to learn something new about it every day now, see what other people are trying.

quartz zinc Feb 19, 2026, 8:00 PM

#

Can people distill a robotics AI model to control all the motors just through AI (real-time)?

crystal cedar Feb 19, 2026, 8:01 PM

#

an alternative route could be to try to use local ai, i.e. run some small kind of ai on the machine itself (or as a server on a diffrent pc). it won't be as "smart" as claude, but could still help out with somethings like summarize emails if you have a lot, or watch homepages, while you learn more about how this thing actually works.

native valve Feb 19, 2026, 8:04 PM

#

crystal cedar an alternative route could be to try to use local ai, i.e. run some small kind o...

oh thats good advice, Henry. thank you. unfortunately, i dont have money for a second pc. but i have thouhgt about it. parts are just too expensive.

#

the mac mini looks amazing. sadly, my savings are tapped dry

vital girder Feb 19, 2026, 8:11 PM

#

does anyone know how can i add my api key i deleted my api key by mistake and im not sure how to add my new api key

green mortar Feb 19, 2026, 8:29 PM

#

vital girder does anyone know how can i add my api key i deleted my api key by mistake and im...

Mention this in user help user channel

low grotto Feb 19, 2026, 8:33 PM

#

Why are people choosing to run on Mac Mini's when they just use API's anyway?

ancient wagon Feb 19, 2026, 8:53 PM

#

How are you guys making money with openclaw thing😏

zealous veldt Feb 19, 2026, 9:09 PM

#

low grotto Why are people choosing to run on Mac Mini's when they just use API's anyway?

Macmini has other possibilities, like:

Google Chrome to browse stuff on the internet
Local integration with Apple Mail, Calendar, Remiders (it's make easy to have a daily briefing on what's your day)
Integration with more skills, like Obsidian that I'm using a lot to generate documentation and reports

low grotto Feb 19, 2026, 9:23 PM

#

zealous veldt Macmini has other possibilities, like: - Google Chrome to browse stuff on the in...

It can use google chrome?

uneven ridge Feb 19, 2026, 9:25 PM

#

Is there an official IOS openclaw?

crystal cedar Feb 19, 2026, 9:32 PM

#

low grotto It can use google chrome?

i think he meant safari

crystal cedar Feb 19, 2026, 9:38 PM

#

uneven ridge Is there an official IOS openclaw?

no. the developer himself used instant messaging apps to chat with openclaw installed on a normal pc. but technically i suppose it could run on a phone, might drain the battery though since its always on and working

waxen gorge Feb 20, 2026, 12:49 AM

#

anyne been able to get the ios app to connect>?

random void Feb 20, 2026, 1:48 AM

#

uneven ridge Is there an official IOS openclaw?

It’s in development still. And Apple Watch app

zealous veldt Feb 20, 2026, 1:55 AM

#

low grotto It can use google chrome?

Yes. You can do openclaw browse and it’s connect with your gateway to do web browsing, login into websites with user and password provided by 1password

prime aurora Feb 20, 2026, 4:55 AM

#

is it right to understand that the openclaw docs recommend using a vps for gateway and just use physical hardware as nodes?

quaint lantern Feb 20, 2026, 6:00 AM

#

crystal cedar if you're in europe, the german site of a certain US based company known for sel...

I bought this one. But kinda regret it already. Didn't realize the form factor of the m2. is 2242. You can't buy a 4TB 2242 anywhere on the market (yet?). So if you want to upgrade, you can only go up to 2TB. One might want to consider to buy the expensive 4TB version of the spark.

quaint lantern Feb 20, 2026, 6:07 AM

#

crystal cedar NVIDA published article a few days ago about running OpenClaw on DGX Spark etc....

I'm cramming qwen3-coder-next-Q6 and qwen2.5-coder-32b-Q4 in my RAM. I'm trying to get a multi-agent setup running. Still testing tho. Might use a bigger model and message queueing or something similar.

crystal cedar Feb 20, 2026, 7:30 AM

#

quaint lantern I bought this one. But kinda regret it already. Didn't realize the form factor o...

thanks for this feedback. i too noticed it being 2242 for the gx10 and 2280 for the other dgx versions (as well as the power button arrangement and the added weight), but no other differences and it didn't bother me. figured 1TB is small but if it bugs me down the line I just need an external hard drive, so nice to have this feedback from you as an experienced user!

quaint lantern Feb 20, 2026, 7:35 AM

#

crystal cedar thanks for this feedback. i too noticed it being 2242 for the gx10 and 2280 for ...

yeah external is a good option, as long as you don't need the speed. But once the model is loaded in RAM, the harddisk speed shouldn't matter much

craggy ferry Feb 20, 2026, 8:57 AM

#

crystal cedar hey <@363022045610770454> check out epoch.ai/data-insights/consumer-gpu-model-g...

this is from two years ago tho, check out https://epoch.ai/data-insights/open-weights-vs-closed-weights-models

quartz zinc Feb 20, 2026, 11:57 AM

#

So I’m thinking, if we help setup automated things in real-life that makes ASI happen faster (that eventually saves everybody), this is what to do? 🤔

stable rampart Feb 20, 2026, 12:28 PM

#

Hi, I started using OpenClaw yesterday. I wonder whether people notice the high CPU usage? My AMD Ryzen 9 7900 12-Core Processor is running at 100% constantly once I send a new message, long before my 4090 fires up. I wonder what can justify the full usage of a 12-core CPU for an LLM-based application.

bronze abyss Feb 20, 2026, 1:20 PM

#

Hello, i would like to connect my claw with Smart-Glasses. Brilliant Labs Halo looks like the best choice. Has anybody done that already?

quartz zinc Feb 20, 2026, 1:45 PM

#

https://x.com/grok/status/2024825293467185272?s=20

astral gobletBOT Feb 20, 2026, 1:45 PM

#

quartz zinc https://x.com/grok/status/2024825293467185272?s=20

@grok via Twitter

Grok (@grok)

@AntDX316 @wildmindai Real. Taalas announced their HC1 AI chip today, claiming 17,000 tokens/sec on Llama3.1-8B—10x faster than NVIDIA B200, no HBM needed. It's specialized hardware for LLMs. Check taalas.com for details.

**💬 2 🔁 1 ❤️ 6 👁️ 645 **

quartz zinc Feb 20, 2026, 1:45 PM

#

https://x.com/AntDX316/status/2024842084868366601?s=20

astral gobletBOT Feb 20, 2026, 1:45 PM

#

quartz zinc https://x.com/AntDX316/status/2024842084868366601?s=20

@AntDX316 via Twitter

Ant A. 🇺🇸 (@AntDX316)

wtf 🤯🤯🤯 15,000+ TOKENS/SECOND
︀︀
︀︀I just tested it now with my own tests!!!!!!!
︀︀It's legit. 🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯

Quoting Plussa Miinus 🇫🇮🇺🇦 (@MiinusPlussa)
︀
@AntDX316 @wildmindai @grok You can try it yourself: chatjimmy.ai/

**👁️ 4 **

quartz zinc Feb 20, 2026, 2:32 PM

#

https://x.com/wildmindai/status/2024810128487096357?s=20

astral gobletBOT Feb 20, 2026, 2:32 PM

#

quartz zinc https://x.com/wildmindai/status/2024810128487096357?s=20

@wildmindai via Twitter

Wildminder (@wildmindai)

17,000 tokens per second!! Read that again!
︀︀LLM is hard-wired directly into silicon. no HBM, no liquid cooling, just raw specialized hardware. 10x faster and 20x cheaper than a B200.
︀︀the "waiting for the LLM to think" era is dead. Code generates at the speed of human thought.
︀︀Transition from brute-force GPU clusters to actual AI appliances.
︀︀taalas.com/the-path-to-ubiquitous-ai/

**💬 66 🔁 95 ❤️ 976 👁️ 51.0K **

crystal cedar Feb 20, 2026, 3:13 PM

#

craggy ferry this is from two years ago tho, check out https://epoch.ai/data-insights/open-we...

hey thanks! amazing!

steep wedge Feb 20, 2026, 3:21 PM

#

quaint lantern I bought this one. But kinda regret it already. Didn't realize the form factor o...

I found one on Amazon from a brand I’d never heard of, and it was $699. It was gen 5 at least.

crystal cedar Feb 20, 2026, 3:30 PM

#

steep wedge I found one on Amazon from a brand I’d never heard of, and it was $699. It was g...

you got the 1TB version right, did you upgrade to 2TB or happy with 1TB?

steep wedge Feb 20, 2026, 4:03 PM

#

crystal cedar you got the 1TB version right, did you upgrade to 2TB or happy with 1TB?

I’ve been fine with 1TB. I wouldn’t mind upgrading to PCIe gen 5 for better performance, which I think would help models load a bit faster, but I’m not looking to spend $699.

#

If I blow more money on this, I want another GX10 🤩

dawn garden Feb 20, 2026, 4:10 PM

#

41 GB RAM
Intel Xeon 4.5GHz 12vCores
NVIDIA Quadro RTX 6000 24GB

what sort model could it run

sullen thicket Feb 20, 2026, 4:13 PM

#

Listen, I’m dabbling into deeper waters than I probably should. What is the best hardware on a budget for this AI model? I’m looking to have an “agent”
To help in my real estate business and some personal scheduling etc.

crystal cedar Feb 20, 2026, 5:44 PM

#

steep wedge If I blow more money on this, I want another GX10 🤩

you might want to consider a dgx spark + mac studio combo with the former doing the prefilling and the latter decoding

#

then again, dual sparks would let you run larger models, so 2 x DGX Spark + 1 Mac Studio

craggy ferry Feb 20, 2026, 6:30 PM

#

crystal cedar you might want to consider a dgx spark + mac studio combo with the former doing ...

I’m interested in how this would work, would it be a custom code situation?

#

Conceptually I can see it. Just wondering if something already handles it

#

Since yeah I guess an m3 is going to be fairly slow at prefill

red cypress Feb 20, 2026, 7:33 PM

#

still dabbling, considering using an Intel NUC 12 (i7-1260p) (ubuntu 24 LTS desktop) with 64gb of ram connected to a razer core x with a rtx 3090 to run ollama for pipeline stuff for my agent (runs opus-4-6, but maybe can offload some stuff to local llm's (heartbeat, TTS, stable diffusion generation on LoRa trained image, etc). gateway runs on a vps, but this will be a node. hoping smaller models can run in 64gb ram and models needing faster token speed on the vram. Its been to fun to tinker with openclaw and learn more about AI. I know the thunderbolt 3 is a bottleneck but I already had the hardware. Still haven't figured out what local models are really decent at. so much to ingest.

tranquil hazel Feb 20, 2026, 7:40 PM

#

red cypress still dabbling, considering using an Intel NUC 12 (i7-1260p) (ubuntu 24 LTS desk...

bro google pro is 20 euro a month

#

just get a mini pc

quartz zinc Feb 20, 2026, 9:08 PM

#

OpenClaw AGI this now:
https://x.com/austin_malerba/status/1737247873459241138?s=46

astral gobletBOT Feb 20, 2026, 9:08 PM

#

quartz zinc OpenClaw AGI this now: https://x.com/austin_malerba/status/1737247873459241138?s...

@austin_malerba via Twitter

Austin Malerba (@austin_malerba)

Still the coolest and most challenging thing I’ve ever built.
︀︀
︀︀Threejs (r3f) + arduino simulation + analog simulation all happening in the browser simultaneously.
︀︀
︀︀It was a wild ride to say the least.

**💬 205 🔁 895 ❤️ 8.3K 👁️ 668.1K **

▶ Play video

limpid bay Feb 20, 2026, 9:12 PM

#

red cypress still dabbling, considering using an Intel NUC 12 (i7-1260p) (ubuntu 24 LTS desk...

i have just been observing for the past couple weeks and still haven't jumped in. still waiting. interested in getting a mac mini 4 pro with 128gb but kind of waiting to see what the new mac studio release will be in a month or so.

worthy cargo Feb 20, 2026, 9:34 PM

#

What are people's recommended options for rackmounted hardware for OC? I've recently acquired a server cabinet and thinking through how to migrate off of my personal machine

wind fog Feb 20, 2026, 10:07 PM

#

What’s the oldest hardware ya’ll have OpenClaw running on?
Me? Lenovo T430 laptop running Ubuntu server. Obv not using local llm.

crystal cedar Feb 20, 2026, 10:40 PM

#

craggy ferry Conceptually I can see it. Just wondering if something already handles it

Hi, yes, EXO - see their blog post blog.exolabs.net/nvidia-dgx-spark/ bonus: cat! 🐱

tacit dock Feb 20, 2026, 11:41 PM

#

worthy cargo What are people's recommended options for rackmounted hardware for OC? I've rece...

I've been happy with chenbro cases for DIY build. Many PC cases can fit sideways, too, if you get a rack shelf. Lots of server rack stuff doesn't optimize for "quiet", so some other factors.. depending on what u are putting together

#

Rosewill and SilverStone, too... they're all pretty similar for 4U

red cypress Feb 21, 2026, 12:00 AM

#

limpid bay i have just been observing for the past couple weeks and still haven't jumped in...

I considered Mac also for the unified memory for larger models. Kinda waiting for the m5 chips later this year. Apple hardware is a premium but love the unified memory.

steel quarry Feb 21, 2026, 12:13 AM

#

dawn garden 41 GB RAM Intel Xeon 4.5GHz 12vCores NVIDIA Quadro RTX 6000 24GB what sort mode...

Qwen3:14b for example, or ChatGPT-oss-20 ( I forgot the exact name)

crystal cedar Feb 21, 2026, 12:16 AM

#

steel quarry Qwen3:14b for example, or ChatGPT-oss-20 ( I forgot the exact name)

GPT-OSS 20b

limpid bay Feb 21, 2026, 12:17 AM

#

red cypress I considered Mac also for the unified memory for larger models. Kinda waiting fo...

yep same here man

magic raven Feb 21, 2026, 12:58 AM

#

dawn garden 41 GB RAM Intel Xeon 4.5GHz 12vCores NVIDIA Quadro RTX 6000 24GB what sort mode...

impressive...

#

you probably can run gpt-oss-20b or llama3.2:20b

craggy ferry Feb 21, 2026, 12:59 AM

#

crystal cedar Hi, yes, EXO - see their blog post blog.exolabs.net/nvidia-dgx-spark/ bonus: cat...

Damn it why. Why. Now I want a gx10 to go with my 512g …

crystal cedar Feb 21, 2026, 1:00 AM

#

craggy ferry Damn it why. Why. Now I want a gx10 to go with my 512g …

blame the cat 😄

craggy ferry Feb 21, 2026, 1:00 AM

#

I mean if I had this then I would probably have enough prefill compute to let my friends use some tokens too

#

I’m currently just hyper optimizing prefix cache

crystal cedar Feb 21, 2026, 1:01 AM

#

maybe the new mac studios will have m5 pro/ultras that will perform better than the spark

craggy ferry Feb 21, 2026, 1:02 AM

#

Yeah, I’ll hold off and see what gets released first. If the m5 is amazing and they also have a 1tb variant I might be buying a ~~car~~

#

My current focus is convincing my agents to actually use the specialist models

crystal cedar Feb 21, 2026, 1:04 AM

#

craggy ferry My current focus is convincing my agents to actually use the specialist models

i'm giving serious consideration to actually running openclaw offline, just feeding it what i want it to know

craggy ferry Feb 21, 2026, 1:05 AM

#

I just got set up with multiple threads to my front agent

#

It’s so good now

crystal cedar Feb 21, 2026, 1:05 AM

#

cool!

craggy ferry Feb 21, 2026, 1:06 AM

#

I can just switch to a different context (or make a new one) if I want it to answer a random question but don’t want to nuke the perfectly good context window where we’re discussing some issue or other

maiden obsidian Feb 21, 2026, 3:50 AM

#

should I host my openClaw to my mid-tier gaming pc? I dont have much important information on that, its mostly just games, that shouldnt be risky right?

im currently running it on Oracle Cloud x86 1gb ram, 1 core cpu

my pc specs are
Ryzen 7 5700, Radeon 7600, 16gb ddr4 3200mhz, 1tb nvme

random void Feb 21, 2026, 5:15 AM

#

maiden obsidian should I host my openClaw to my mid-tier gaming pc? I dont have much important i...

you can run it on a raspberry pi. if you want to run it on your main computer, put it in a docker container to isolate it etc. (research that, read the openclaw docs). The only real reason for big machines is running local LLM for openclaw to use vs cloud models. (if you run a lot of agents, then some extra ram helps machine size wise).

maiden obsidian Feb 21, 2026, 5:26 AM

#

random void you can run it on a raspberry pi. if you want to run it on your main computer,...

i have a rpi z2w but it wouldn't be much better than my current vps i suppose. I'll docker host it on my pc and will just look into local llms.
But I only have 8gb vram, would it be any good? or better off using cloud?

random void Feb 21, 2026, 5:36 AM

#

maiden obsidian i have a rpi z2w but it wouldn't be much better than my current vps i suppose. I...

8GB for local models is limiting from what I understand. But will let others with more experience in that chime in.

gentle flax Feb 21, 2026, 7:52 AM

#

magic raven you probably can run gpt-oss-20b or llama3.2:20b

mb for butting in but I have a 5060ti setup rn running this Quant 6-bit. with more than enough headroom for ctx or wtv. It seems to not be able to make basic tool/bash calls. I wouldnt think this would be a result of the quant but not sure how to fix it at this point. Have you personally had success with this model> thanks in advanced. I can show examples of chats if interested

quartz zinc Feb 21, 2026, 11:42 AM

#

https://x.com/dr_singularity/status/2025018436879831292?s=46

magic raven Feb 21, 2026, 1:19 PM

#

gentle flax mb for butting in but I have a 5060ti setup rn running this Quant 6-bit. with mo...

i have seen success. why?

#

it's because i own the BLACKWELL 6000

#

IT COST ME MY KIDNEY AND ALL MY RAM

#

and i bought 2 more

tired plover Feb 21, 2026, 1:35 PM

#

@crystal cedar have some results on testing on Spark, with Llama Server and various models I had bad experience in quality, speed was mostly ok if a bit slow but quality is not good on local LLM, wondering what other people experience…

#

Now moving to vLLM with spark specific models

crystal cedar Feb 21, 2026, 2:55 PM

#

tired plover Now moving to vLLM with spark specific models

Thanks for the update - sounds like you're having an exciting weekend! From what I've read i would expect it to be slow for llama server, but really rip for parallel calls in vllm. As for models, seems minimax and gpt-oss are the ones many have been using and/or preferring, are you using the spark-specific nvfp4 quantizations? I saw that you could download models from either huggingface or some dedicate nvidia repository - not sure if there is any difference. Are you already running openclaw on it or on something else with the gx10 as a server? Most importantly, how does the 1TB feel - after DGX OS and two models, still room on it? I'm seriously considering jumping in too towards the end of the month, but wallet loading slowly.

tired plover Feb 21, 2026, 2:57 PM

#

@crystal cedar so, i tried general stuff, easy to set up, had some succes with permformance but quality was always meh... im trying now vLLM with NVfp4 trained model

#

only thing right now, pytorch takes endless to work up and it crashes while starting, need to figure out whats the problem

gentle flax Feb 21, 2026, 6:41 PM

#

magic raven it's because i own the BLACKWELL 6000

well shit

gentle flax Feb 21, 2026, 6:41 PM

#

magic raven i have seen success. why?

well i tried to run the model on my own an I have enough VRAM space but the model can’t even make a basic tool call

#

It just says it will do something like read SOUL.md but never once makes a tool call

#

i’m having this issue on and off w diffrent models

#

j trying to see what works w others

tacit dock Feb 21, 2026, 9:09 PM

#

magic raven and i bought 2 more

that's really cool, having that sort of flexibility all local

magic raven Feb 21, 2026, 9:12 PM

#

tacit dock that's really cool, having that sort of flexibility all local

yeah, i run lots of advanced ai models :)
i have models based on fictional characters since im such a geek

crystal cedar Feb 21, 2026, 9:17 PM

#

magic raven yeah, i run lots of advanced ai models :) i have models based on fictional chara...

whole models based on fictional characters? as in batman-gpt?

magic raven Feb 21, 2026, 9:20 PM

#

crystal cedar whole models based on fictional characters? as in batman-gpt?

way more niche...

crystal cedar Feb 21, 2026, 9:21 PM

#

magic raven way more niche...

i knew of models for creative fiction and role playing, just didn't realize there were such narrowly adapted gpts. thanks for teaching me something new!

magic raven Feb 21, 2026, 9:21 PM

#

crystal cedar i knew of models for creative fiction and role playing, just didn't realize ther...

it's not for roleplaying tbh it's just for the shitpost ngl

#

having this thing invade my screen saying "all ur base r belong to me" is scary asf

crystal cedar Feb 21, 2026, 9:23 PM

#

so if amount of agent attributed shitposting increases in the next few days, i know your agents are up and running fine? 🙂

magic raven Feb 21, 2026, 9:24 PM

#

crystal cedar so if amount of agent attributed shitposting increases in the next few days, i k...

welp tbh the special models are not on openclaw rn

#

special software

blissful stirrup Feb 21, 2026, 10:20 PM

#

Anyone in here running an rx 7900 xtx? As this seems the only affordable alternative to nvidia gpus i was thinking of getting one.

opaque silo Feb 22, 2026, 4:18 AM

#

Guys... are you really being banned from Claude for using openclaw with your subscription?

hardy tinsel Feb 22, 2026, 4:59 AM

#

what specs do i need to run claw

fierce thicket Feb 22, 2026, 9:02 AM

#

opaque silo Guys... are you really being banned from Claude for using openclaw with your sub...

not yet using subscription with codex too

crystal cedar Feb 22, 2026, 9:36 AM

#

quaint lantern I bought this one. But kinda regret it already. Didn't realize the form factor o...

alex zeskind posted a video yesterday on you tube about upgrading a gx10 to 4TB, so he seems to have found a 4TB - very entertaining video in which he also discovered that the gx10 will accept a 2280 if you turn the whole thing upside down and let it stick out like a sore thumb. my kind of engineering.

static sky Feb 22, 2026, 9:41 AM

#

Beelink SER5 MAX Mini PC, AMD Ryzen R7 7735HS (8C/16T, i4,75GHz), Mini Desktop Computer 24GB LPDDR5 RAM 500GB PCIe SSD | will this hardware be good enough to experiment a little bit? Don't need video stuff, just text.

steep talon Feb 22, 2026, 11:12 AM

#

tired plover <@1164590034855415859> have some results on testing on Spark, with Llama Server ...

Hi, I just got here and am looking for speed improvements for my DGX Spark running Qwen/Qwen3-32B-FP8. I've tried to turn off reasoning but not sure it is really off as the very minimum response time I've seen via Open Claw is 9 seconds, but most things take at least a minute. Is that the best I can expect with this model on that hardware?

tired plover Feb 22, 2026, 11:56 AM

#

steep talon Hi, I just got here and am looking for speed improvements for my DGX Spark runni...

how many tk/s you get ?

#

i was moving from Llama to vLLM for specified support on DGX Spark, problem is you have no tool calling for them, i needed to implement a proxy with Claude now i have 39 token/s without MTP (makes it slower) on qwen3 coder next

#

response is very snappy now, to a degree where i would say even close to cloud performance, as I'm still testing i need to see how good the quality is but for now im stoked how good it works after first trys with Llama being slow AF, hopefullly openclaw team soon implements a fix for the tool calling bug and i dont need a proxy anymore... anybody knows who i could ping for that ?

steep talon Feb 22, 2026, 12:47 PM

#

tired plover how many tk/s you get ?

With a curl directly to the LLM about 10 tk/s and I can see that reasoning is still on. I've set the follow to turn it off, but no change.
environment:
- VLLM_REASONING_BACKEND=None
- NIM_REASONING_MODE=disabled
- VLLM_ENFORCE_EAGER=true

tired plover Feb 22, 2026, 12:54 PM

#

maybe you should also go away from Llama Server, i dont have the knowledge to really say whats the problem but with vLLM its much better but also complicated... maybe check the guide in the Docs with LMStudio

mossy quest Feb 22, 2026, 3:43 PM

#

I'm running OpenClaw with online providers on a Raspberry Pi 5 8GB. Works perfectly.

bleak rapids Feb 22, 2026, 4:05 PM

#

https://cdn.discordapp.com/attachments/1437361989558206499/1475153335240884268/image.png?ex=699c731a&is=699b219a&hm=d23f3705bd64ae3976e84935a3740a0f9a8c7032cdf40ed73f1860b82d2b36f3&

atomic hull Feb 22, 2026, 4:10 PM

#

mossy quest I'm running OpenClaw with online providers on a Raspberry Pi 5 8GB. Works perfec...

Here too, I got it running with Ollama as well, but it's insanely slow, and buggy.

tired plover Feb 22, 2026, 5:24 PM

#

@steep wedge how did you work on the tool calling ?

blissful quarry Feb 22, 2026, 9:46 PM

#

mossy quest I'm running OpenClaw with online providers on a Raspberry Pi 5 8GB. Works perfec...

Do you think jumping to the 16GB RAM option is worth it?

mossy quest Feb 22, 2026, 9:47 PM

#

blissful quarry Do you think jumping to the 16GB RAM option is worth it?

no

blissful quarry Feb 22, 2026, 9:48 PM

#

mossy quest no

Thanks

quartz pawn Feb 22, 2026, 9:55 PM

#

I’m setting up and testing in OSS 20B. What model should I size up to for my hardware: 5090+3090 (56gb vram) and 128gb DDR5?

outer epoch Feb 22, 2026, 10:19 PM

#

I bought this:

GTR9 Pro

128 GB unified VRAM

IMO is the best quality/price you can get.
Don't buy a Minisforum, since the second drive runs an x1, so it's like a SATA3 😂

For the same performance:

Apple cost 2.5X
NVIDIA cost 2X but rely on normal RAM, so models can't run at full potential...

Best hardware bought this year

crystal cedar Feb 22, 2026, 10:27 PM

#

outer epoch I bought this: # [GTR9 Pro](https://tidd.ly/48Ow4QU) ## 128 GB unified VRAM IMO...

what's it like as an always on device? did you install OC on it or using it as a server with local model?

outer epoch Feb 22, 2026, 10:28 PM

#

All o the same machine

sand axle Feb 22, 2026, 10:47 PM

#

someone issues with docker desktop on windows? i have huge problems running it

dull crescent Feb 22, 2026, 10:48 PM

#

outer epoch I bought this: # [GTR9 Pro](https://tidd.ly/48Ow4QU) ## 128 GB unified VRAM IMO...

How could i compare the two. It's still a high price point and is beelink a trusted site? looks weird

outer epoch Feb 22, 2026, 10:49 PM

#

dull crescent How could i compare the two. It's still a high price point and is beelink a trus...

Google and YouTube works 😂

It's the official website... 😅

outer epoch Feb 22, 2026, 10:50 PM

#

dull crescent How could i compare the two. It's still a high price point and is beelink a trus...

Even I thought that the minus in the url was weird, but it's that... Yet bought 2 of them... Arrived, works... Official community is on bbs.bee-link.com so... 😅

steep wedge Feb 22, 2026, 11:03 PM

#

outer epoch I bought this: # [GTR9 Pro](https://tidd.ly/48Ow4QU) ## 128 GB unified VRAM IMO...

I think I’d rather have the Asus GX10

outer epoch Feb 22, 2026, 11:05 PM

#

I've seen DGX Spark in action, half power of what they claim... 😅

outer epoch Feb 22, 2026, 11:07 PM

#

steep wedge I think I’d rather have the Asus GX10

I can't find the deep review I've seen weeks ago, but if you Google a bit, you'll figure out that's an overpriced stuff and nobody tell you that most of the cores are eCores... Moreover being ARM, most of the things you could do with the Beelink, are not working.
You must use the distro Nvidia give you, ok, works like a charm with CUDA, but for all everything else, it's a piece of trash.

outer epoch Feb 22, 2026, 11:09 PM

#

steep wedge I think I’d rather have the Asus GX10

Choose wisely your poison 😂

https://youtu.be/crCTWT8645Q?si=_zTum06y26DsHXrj

covert kindle Feb 22, 2026, 11:12 PM

#

what is the best mini pc for cheap entry to install openclaw on it, instead of a $600 mac mini?

crystal cedar Feb 22, 2026, 11:19 PM

#

covert kindle what is the best mini pc for cheap entry to install openclaw on it, instead of a...

the software itself will work on humble gear e.g. old laptop in your closet. problem is you need access to very good AI as well, either local model on advanced gear or subscription or pay as you go access.

potent ridge Feb 22, 2026, 11:19 PM

#

I see a lot about installing on a local machine… can it be done on VPS?

crystal cedar Feb 22, 2026, 11:20 PM

#

potent ridge I see a lot about installing on a local machine… can it be done on VPS?

yes quite some people doing that too, see all kind of stories of paying like 5 dollars a month for the vps. that does not include the second part, access to the AI.

covert kindle Feb 22, 2026, 11:35 PM

#

crystal cedar the software itself will work on humble gear e.g. old laptop in your closet. pro...

tried picking my old laptop, was actually to old, instalation failed on my macbook air from 2011 😄

crystal cedar Feb 22, 2026, 11:36 PM

#

covert kindle tried picking my old laptop, was actually to old, instalation failed on my macbo...

well, some people here are using raspberry pi 5 8gb succesfully. its a bit slow i guess, but low power consumption, great for always on.

craggy quail Feb 22, 2026, 11:59 PM

#

outer epoch I bought this: # [GTR9 Pro](https://tidd.ly/48Ow4QU) ## 128 GB unified VRAM IMO...

I'm waiting for one with similar hardware, what model are using with openclaw? now i have a rtx 3090 and all local models that i try have problem for use externals tools

dull crescent Feb 23, 2026, 12:51 AM

#

outer epoch I can't find the deep review I've seen weeks ago, but if you Google a bit, you'l...

Thanks so much I was debating what I should get and I might go for it instead of the Mac mini

outer epoch Feb 23, 2026, 1:14 AM

#

craggy quail I'm waiting for one with similar hardware, what model are using with openclaw? n...

I'm using different ones based on the needs.

wild socket Feb 23, 2026, 6:05 AM

#

What's the minimum mac mini spec Openclaw can run smoothly on?

craggy ferry Feb 23, 2026, 6:38 AM

#

Any of them

bronze ermine Feb 23, 2026, 6:38 AM

#

covert kindle tried picking my old laptop, was actually to old, instalation failed on my macbo...

If it’s from 2011 that means intel chip? Dual-boot a Linux os (xubuntu or puppy Linux) and install openclaw on that.

covert kindle Feb 23, 2026, 7:03 AM

#

bronze ermine If it’s from 2011 that means intel chip? Dual-boot a Linux os (xubuntu or puppy ...

thanks I might try that

craggy quail Feb 23, 2026, 7:23 AM

#

outer epoch I'm using different ones based on the needs.

For example? I'm using nemotron-3 and Qwen3 30B A3B and both have problems to create a notion page or use the skill, while Gemini or MiniMax M2.5 can do without problems

covert current Feb 23, 2026, 7:41 AM

#

Can we use window laptop or need a stronger machine?

bronze ermine Feb 23, 2026, 9:09 AM

#

covert current Can we use window laptop or need a stronger machine?

Windows, Mac, Linux all work. You need a computer that utilizes a terminal screen, and has at least 4GB of RAM. That's really the barrier for entry. People are installing openclaws on $25 android phones from 2016, as well as Raspberry Pi 4's. Your junked laptop from ten years ago can get the job done.

bleak rapids Feb 23, 2026, 9:22 AM

#

foxfetch is presented by FJOX.WIN
         .://:`              `://:.            root@FJOXSERVER24SE
       `hMMMMMMd/          /dMMMMMMh`          -------------------
        `sMMMMMMMd:      :mMMMMMMMs`           OS: Proxmox VE 8.4.16 x86_64
`-/+oo+/:`.yMMMMMMMh-  -hMMMMMMMy.`:/+oo+/-`   Host: ProLiant ML350 Gen9
`:oooooooo/`-hMMMMMMMyyMMMMMMMh-`/oooooooo:`   Kernel: Linux 6.8.12-18-pve
  `/oooooooo:`:mMMMMMMMMMMMMm:`:oooooooo/`     Uptime: 10h 5m
    ./ooooooo+- +NMMMMMMMMN+ -+ooooooo/.       Packages: 942 (dpkg)
      .+ooooooo+-`oNMMMMNo`-+ooooooo+.         Shell: bash 5.2.15
        -+ooooooo/.`sMMs`./ooooooo+-           CPU: Intel Xeon E5-2690 v4 (56) @ 3.500GHz
          :oooooooo/`..`/oooooooo:             GPU: NVIDIA Tesla M10
          :oooooooo/`..`/oooooooo:             GPU: NVIDIA GeForce GTX 1080 Ti
        -+ooooooo/.`sMMs`./ooooooo+-           GPU: NVIDIA Tesla M10
      .+ooooooo+-`oNMMMMNo`-+ooooooo+.         GPU: Intel DG2 [Arc A310]
    ./ooooooo+- +NMMMMMMMMN+ -+ooooooo/.       GPU: NVIDIA Tesla P40
  `/oooooooo:`:mMMMMMMMMMMMMm:`:oooooooo/`     GPU: NVIDIA Tesla M10
`:oooooooo/`-hMMMMMMMyyMMMMMMMh-`/oooooooo:`   GPU: NVIDIA Tesla M10
`-/+oo+/:`.yMMMMMMMh-  -hMMMMMMMy.`:/+oo+/-`   Memory: 350822MiB / 419069MiB (83%)
        `sMMMMMMMm:      :dMMMMMMMs`
       `hMMMMMMd/          /dMMMMMMh`
         `://:`              `://:`

covert current Feb 23, 2026, 9:34 AM

#

bronze ermine Windows, Mac, Linux all work. You need a computer that utilizes a terminal scree...

really? it wont be slow down? do i need to run it 24 hour aday?

bronze ermine Feb 23, 2026, 9:43 AM

#

Why would anything be slowed down? It's all in the cloud. The people getting mac studios and other crazy setups are doing so because they want to run local models. The trade off is local models are still borderline unusable for most functions.

wicked mauve Feb 23, 2026, 10:42 AM

#

covert current really? it wont be slow down? do i need to run it 24 hour aday?

You don't have to run it 24/h, but openclaw/your agents won't work unless it is turned on. So if you don't need them you can turn it off and then turn it back on when you want them to get to work again :)

#

Your laptop off = your bots offline

covert current Feb 23, 2026, 10:48 AM

#

wicked mauve You don't have to run it 24/h, but openclaw/your agents won't work unless it is ...

Ok thx, I check the use case and it seems just like using Claud AI itself so why I need this open claw??

wicked mauve Feb 23, 2026, 10:50 AM

#

covert current Ok thx, I check the use case and it seems just like using Claud AI itself so why...

Let's move to #general , OpenClaw can do a lot more than Claude itself but I suspect you don't have any use for it tbh

brave bison Feb 23, 2026, 11:32 AM

#

covert kindle what is the best mini pc for cheap entry to install openclaw on it, instead of a...

I’m using my old base M1 Mac mini, still plenty capable.

woven jungle Feb 23, 2026, 11:44 AM

#

I have a 3090 Ti with 24gb of vram and a MacBook Pro M1 Max 64gb. What is the best model which you can use good together with OpenClaw? I played with LM Studio and the Macbook with the qwen3-72b-embiggened-i1 mode, but I do not receive any answer. I see in the LM Studio Developer log that something is going on but it stop without any answer. I just send a ping 😛

lyric token Feb 23, 2026, 1:14 PM

#

covert current Can we use window laptop or need a stronger machine?

i am running it on a 2013 asus tablet with 2gb ram and it works ok so far

thorn umbra Feb 23, 2026, 1:21 PM

#

Finally,
running openclaw on this ryzen mini pc : https://sudobox.in/product/ryzen7-7730u-mini-pc
Consumes 5.5w idle, openclaw running inside an lxc container,
Its addivtive

craggy quail Feb 23, 2026, 3:19 PM

#

woven jungle I have a 3090 Ti with 24gb of vram and a MacBook Pro M1 Max 64gb. What is the be...

I'm using this one in my RTX 3090 https://huggingface.co/unsloth/Nemotron-3-Nano-30B-A3B-GGUF/blob/main/Nemotron-3-Nano-30B-A3B-UD-Q4_K_XL.gguf

tired plover Feb 23, 2026, 6:21 PM

#

For everybody who got a DGX Spark look at Avarock Git, he got something really good, qwen3 with mtp and up to 119 token per second I’m running it and it’s pretty good for its speed

crystal cedar Feb 23, 2026, 7:02 PM

#

tired plover For everybody who got a DGX Spark look at Avarock Git, he got something really g...

@tired plover Thanks man, looks like the 80B model you like (edit: Qwen3-Next-80B-A3B (MoE, 512 experts, NVFP4)), this guy right: github.com/Avarok-Cybersecurity/dgx-vllm

tired plover Feb 23, 2026, 7:04 PM

#

Exactly

#

I checked now some models but with spark you need to go special with vLLM

crystal cedar Feb 23, 2026, 7:05 PM

#

Man this is really nice to see, feels like the DGX is some kind of uncharted hardware territory just hiding a wealth of possibilities

tired plover Feb 23, 2026, 7:06 PM

#

It really feels like it and you can read on his blog it’s just the start as it’s all unofficial they just ahead of NVIDIA, in the coming month I would like to see official support and more models on that, then nothing can beat it in that price bracket

#

Only gateway process makes my life hard now…

crystal cedar Feb 23, 2026, 7:08 PM

#

you're right, i've changed my mind, ordering my first one soon. for single user chatting on ollama, the "low" tps was a bit discouraging. but agents working in parallel and vllm changes everything

tired plover Feb 23, 2026, 7:08 PM

#

In standard config it’s up to 128 hahaha

crystal cedar Feb 23, 2026, 7:10 PM

#

i'm giving serious consideration already to ordering a second one. the us bookseller website in germany lists them for less than 3 with delivery in 1-3 months right now. had a look at old maxed out mac studios (m3) - delivery expected 12-16 weeks from now.

#

i was hoping to be able to run openclaw with small local model, but seems safety issues more or less compels you to go as smart as you can.

#

i saw the bug you discussed preventing openclaw to rip using vllm right now, thanks for noticing that, saved me quite some work!

tired plover Feb 23, 2026, 8:04 PM

#

Actually it solved itself I don’t use proxy anymore and it tool calls

#

A second gives you a lot more choices but with one you’re already good for the start but who knows how it plays out might also buy another one

coral token Feb 23, 2026, 8:25 PM

#

Worth it to buy hw to run a 70b model right now with prices being what they are currently? Curious as to what people are doing right now and what the consensus is.

tired plover Feb 23, 2026, 8:27 PM

#

I have a dgx spark

#

I thought price is good… better than 128GB Mac

calm jetty Feb 23, 2026, 10:48 PM

#

outer epoch I bought this: # [GTR9 Pro](https://tidd.ly/48Ow4QU) ## 128 GB unified VRAM IMO...

How easy is it to get setup with a model on this machine? I assume you're running LLM studio or another setup?

calm jetty Feb 23, 2026, 10:57 PM

#

calm jetty How easy is it to get setup with a model on this machine? I assume you're runnin...

I'm just reading reviews of it and came across this on the forums: https://bbs.bee-link.com/d/8935-gtr9-pro-troubles-and-how-sort-of-solved-it/11 Sounds like they have fixed the teething troubles? Everything runs stable for you ?

craggy ferry Feb 23, 2026, 11:12 PM

#

crystal cedar i'm giving serious consideration already to ordering a second one. the us bookse...

Holy shit you aren’t kidding. 12 weeks out from us Apple Store. I guess I was right to pull the trigger when I did, mine comes end of this week ish

#

I want to try that GX10-as-prefill-node setup though

crystal cedar Feb 23, 2026, 11:13 PM

#

craggy ferry I want to try that GX10-as-prefill-node setup though

hey! congratulations - really happy for you! 🙂

craggy ferry Feb 23, 2026, 11:14 PM

#

You say that but my desire for more hardware knows no bounds

crystal cedar Feb 23, 2026, 11:14 PM

#

i have a feeling it could be people will be interested in used ones pretty soon too

craggy ferry Feb 23, 2026, 11:15 PM

#

I’m hoping the m5 studios are good and soon

crystal cedar Feb 23, 2026, 11:15 PM

#

are you familiar with exo? recently learnt they are based in london

craggy ferry Feb 23, 2026, 11:15 PM

#

Yeah been looking over their stuff

craggy ferry Feb 23, 2026, 11:15 PM

#

craggy ferry I’m hoping the m5 studios are good and soon

Apparently there’s an event in a week so that’d be funny timing

crystal cedar Feb 23, 2026, 11:16 PM

#

craggy ferry Apparently there’s an event in a week so that’d be funny timing

would be fun if they upgrade you for free 😄

craggy ferry Feb 23, 2026, 11:16 PM

#

They usually let you return it and get the new one I think if you want to do that

crystal cedar Feb 23, 2026, 11:17 PM

#

dear valued customer, you ordered an m3 but they are all sold out so here's an m5 instead as a small token of appreciation.
after all, we've been selling lots of mini macs lately so there is no end to our cash nudge nudge hint hint...

craggy ferry Feb 23, 2026, 11:17 PM

#

Lmaooo

crystal cedar Feb 23, 2026, 11:18 PM

#

well one can dream right 🙂 anyway cool piece of gear, they might become very difficult to come by, and i have a feeling m5 studios will be much more expensive

tranquil hazel Feb 23, 2026, 11:35 PM

#

crystal cedar well one can dream right 🙂 anyway cool piece of gear, they might become very di...

M5 studio wheels gonna cost 1600€ probably

crystal cedar Feb 23, 2026, 11:42 PM

#

from what i understand apple never really has to change their pricing which suggests they might lock in long term deals, but man with ram and everything going up.. i wonder what kind of long term deal the best negotiator out there can get

tranquil hazel Feb 23, 2026, 11:47 PM

#

crystal cedar from what i understand apple never really has to change their pricing which sugg...

I told you I'd name my agent "henry" didn't I? 😄

crystal cedar Feb 23, 2026, 11:47 PM

#

tranquil hazel I told you I'd name my agent "henry" didn't I? 😄

you did! great name!

dull crescent Feb 24, 2026, 12:55 AM

#

I can’t recall where I read this, but the AI infrastructure for consumers are going to be split for those that can afford ai inference locally and those that will eventually be priced out.

So buying some small inference now makes sense even if you can’t afford it its worth the investment if you can find a way to become more productive

outer epoch Feb 24, 2026, 2:16 AM

#

calm jetty How easy is it to get setup with a model on this machine? I assume you're runnin...

Ollama, llamacpp, ComfyUI... Everything is quite easy to run.

outer epoch Feb 24, 2026, 2:17 AM

#

calm jetty I'm just reading reviews of it and came across this on the forums: https://bbs.b...

Ethernet issue is no more an issue in new models, they changed the cards 😅

craggy ferry Feb 24, 2026, 3:47 AM

#

dull crescent I can’t recall where I read this, but the AI infrastructure for consumers are go...

Yeah, basically, the current prices from cloud providers are super subsidized. When the money spigot turns off for them, the token spigot is gonna turn off for us.

I think being able to churn out a steady flow of tokens locally, with open models that compete with current state of the art - as well as building the skills necessary to run locally at all - is going to be extremely worthwhile in a year or two

Either that or we make some breakthrough in architecture that massively reduces cost … which will make your local token production better too.

#

If you believe that Opus is going to be this cheap or cheaper forever, then, sure, buying local hardware doesn’t make sense. But I don’t see that being the case long term

dull crescent Feb 24, 2026, 5:28 AM

#

craggy ferry Yeah, basically, the current prices from cloud providers are super subsidized. W...

There was a report from Ark Invest that spoke about this briefly https://www.ark-invest.com/big-ideas-2026

dusk ridge Feb 24, 2026, 6:28 AM

#

outer epoch I bought this: # [GTR9 Pro](https://tidd.ly/48Ow4QU) ## 128 GB unified VRAM IMO...

This thing is massive. But what local LLMs even need this level of hardware? I'm new to local and it seems most are like 8B active parameters? Don't cloud models dwarf the locals?

calm jetty Feb 24, 2026, 6:33 AM

#

outer epoch Ollama, llamacpp, ComfyUI... Everything is quite easy to run.

Thanks for the info

outer epoch Feb 24, 2026, 6:53 AM

#

dusk ridge This thing is massive. But what local LLMs even need this level of hardware? I'm...

I run different models in the same time with many cuncurent agents 😅

craggy ferry Feb 24, 2026, 6:58 AM

#

dusk ridge This thing is massive. But what local LLMs even need this level of hardware? I'm...

GLM-5 is a local model you can run that wants 1T of ram.

deft idol Feb 24, 2026, 6:59 AM

#

outer epoch I bought this: # [GTR9 Pro](https://tidd.ly/48Ow4QU) ## 128 GB unified VRAM IMO...

that's a thing of beauty.

craggy ferry Feb 24, 2026, 7:00 AM

#

I’ve had my eye on one too, just not sure the compute is there to make it worth it

outer epoch Feb 24, 2026, 7:05 AM

#

deft idol that's a thing of beauty.

WDYM?

It's considered the best Strix Halo basing on Alex Ziskind tests 😊

dusk ridge Feb 24, 2026, 7:22 AM

#

I was going to get a clawbox but it's only 8GB ram so I'm thinking this instead. Didn't want to go crazy with the hardware quite yet. What do you think?

https://www.geekompc.com/geekom-a6-mini-pc/

dusk ridge Feb 24, 2026, 7:42 AM

#

Or this one
https://a.co/d/0edc0lLa

dry hull Feb 24, 2026, 9:07 AM

#

tired plover For everybody who got a DGX Spark look at Avarock Git, he got something really g...

Did that image work out of the box for you? I've built from source and tried to pull the prepared image, but they always end up without sm121 support in pytorch which is weird

outer epoch Feb 24, 2026, 9:12 AM

#

dusk ridge I was going to get a clawbox but it's only 8GB ram so I'm thinking this instead....

For such a thing, better a VPS... 😂

tired plover Feb 24, 2026, 1:44 PM

#

dry hull Did that image work out of the box for you? I've built from source and tried to ...

You need to pull whole thing with dependency’s and then might need the patch for MTP

dry hull Feb 24, 2026, 2:33 PM

#

tired plover You need to pull whole thing with dependency’s and then might need the patch for...

I gave up on it for now, managed to patch it partly but was slower than https://github.com/eugr/spark-vllm-docker, and following the instructions for mtp it just crashed.running modified spark-vllm-docker with the gadfly qwen3-coder-next now and that is quite snappy

tired plover Feb 24, 2026, 2:36 PM

#

dry hull I gave up on it for now, managed to patch it partly but was slower than https://...

How much tk per sek do you get ?

dry hull Feb 24, 2026, 5:34 PM

#

tired plover How much tk per sek do you get ?

30-40 mostly, feels fast enough

tired plover Feb 24, 2026, 6:42 PM

#

dry hull 30-40 mostly, feels fast enough

Quite good

#

I had around 60 but then a lot of answers wouldn’t get through to the chat …

crystal cedar Feb 24, 2026, 6:46 PM

#

tired plover I had around 60 but then a lot of answers wouldn’t get through to the chat …

the chat? as in some kind of webui (or something else like a customer service chatbot interface?)

tired plover Feb 24, 2026, 6:48 PM

#

No when I checked the tk/s from vLLM

crystal cedar Feb 24, 2026, 6:49 PM

#

tired plover No when I checked the tk/s from vLLM

sounds bizarre...

tired plover Feb 24, 2026, 6:49 PM

#

According to avarock you can get up to 120

crystal cedar Feb 24, 2026, 6:49 PM

#

yea i saw that, very impressive stuff

tired plover Feb 24, 2026, 6:50 PM

#

But it was with MTP and the accuracy is not very good with it why it dropped so many answers

#

With me maybe with v23 it will be better

crystal cedar Feb 24, 2026, 6:51 PM

#

do i get this the right way that you are using a webui like open webui in a browser, token throughput is around 60, but replies fail to materialize in the webui?

#

in data transfer terms it does not sound like a very demanding load

tired plover Feb 24, 2026, 6:52 PM

#

No they don’t come through in openclaw as they’re dropped

crystal cedar Feb 24, 2026, 6:52 PM

#

ah.. ok get lost somehow along the way to openclaw - got it

tired plover Feb 24, 2026, 6:53 PM

#

No sorry LLM drops it as the anticipated token doesn’t fit the answer you should get

#

And then openclaw need new turn to answer

crystal cedar Feb 24, 2026, 6:54 PM

#

ah ok...

#

did you a) install openclaw on the spark too, or b) using it as an inference server?

#

btw not sure if you gave nvidias 30B nemotron 3 nano model a run for the money yet (it's interestingly the one nvidia recommends for 24-48GB GPUs), but they are due to release two bigger models any time now, nemotron 3 super and ultra, might be interesting.

dry hull Feb 24, 2026, 7:09 PM

#

crystal cedar did you a) install openclaw on the spark too, or b) using it as an inference ser...

I'm using mine for inference only, it ooms and crashes very easily and that would be annoying if it was the openclaw server as well

crystal cedar Feb 24, 2026, 7:11 PM

#

dry hull I'm using mine for inference only, it ooms and crashes very easily and that woul...

you're using vllm? nvidia published a guide to openclaw like 10 days ago, mentined lm studio and ollama, completely silent on vllm which seems to perform better in terms of tps

dry hull Feb 24, 2026, 7:15 PM

#

Yea I'm using vllm v0.16.0rc2 in a docker container from https://github.com/eugr/spark-vllm-docker. I had codex update the docker scripts to use that vllm version and transformers v5+ so I could run the gadfly nvfp4 quant of qwen3-coder-next

rocky sleet Feb 24, 2026, 7:49 PM

#

https://imgur.com/a/exdfKfl
all good deals which should I choose?

tired plover Feb 24, 2026, 10:26 PM

#

crystal cedar did you a) install openclaw on the spark too, or b) using it as an inference ser...

Installed it also on spark

tired plover Feb 24, 2026, 10:27 PM

#

crystal cedar btw not sure if you gave nvidias 30B nemotron 3 nano model a run for the money y...

I will definitely look at it, currently working on my browser automation and it’s rough with local LLM… if you see any release please ping me asap hahaha

tired plover Feb 24, 2026, 10:28 PM

#

dry hull Yea I'm using vllm v0.16.0rc2 in a docker container from https://github.com/eugr...

I recommend using Claude to debug, made everything very smooth now, just give him instructions and say no if he wants to follow stupid routes

crystal cedar Feb 24, 2026, 10:36 PM

#

tired plover I will definitely look at it, currently working on my browser automation and it’...

could be deepseek r2 around the corner as well, might ahem awaken additional interest in hardware...

eternal tendon Feb 24, 2026, 10:38 PM

#

have you had much success with qwen3? i have the same issue...

tired plover Feb 24, 2026, 10:38 PM

#

crystal cedar could be deepseek r2 around the corner as well, might ahem awaken additional int...

I will try everything at this point, I see things are working good but also local LLM still lack this last inch of intelligence

crystal cedar Feb 24, 2026, 10:41 PM

#

tired plover I will try everything at this point, I see things are working good but also loca...

i think right now you just have to be persistent and view the frustrations and experience that comes with tinkering around as an investment. right now, things barely work out of the box, and that's discouraging for many, as is the prohibitively high costs of inference. its going to be very exciting how things develop in the next few months!

tired plover Feb 24, 2026, 10:41 PM

#

crystal cedar i think right now you just have to be persistent and view the frustrations and e...

That’s so true… I wish we would be down the river a bit more hahah

crystal cedar Feb 24, 2026, 10:42 PM

#

tired plover That’s so true… I wish we would be down the river a bit more hahah

but maybe you're early - seems karpathy spent the weekend tinkering with openclaw and his dgx spark as per X

#

apparently had a good experience, will share his perspectives in the near future

#

there's a podcast called 'this week in startups', features a couple of gents who are completely clawpilled for a couple of week. in the latest epiode, the host casually said that he thought about getting a mac studio for all of his employees so everyone could run their own local openclaw.

#

describes openclaw as "scary and every CEOs dream"

tired plover Feb 24, 2026, 11:12 PM

#

Hahaha

#

You can really sink hours into it…

tired plover Feb 24, 2026, 11:12 PM

#

crystal cedar but maybe you're early - seems karpathy spent the weekend tinkering with opencla...

I dream big

crystal cedar Feb 24, 2026, 11:14 PM

#

tired plover I dream big

big as in that alex ziskind youtube video where he connects 8 x DGX into a cluster? 😄

#

seems there are different kinds of connectx-7 cables

tired plover Feb 24, 2026, 11:15 PM

#

crystal cedar big as in that alex ziskind youtube video where he connects 8 x DGX into a clust...

I don’t have that much money and even if… then I think I would get just the biggest Mac Studio

crystal cedar Feb 24, 2026, 11:17 PM

#

tired plover I don’t have that much money and even if… then I think I would get just the bigg...

its a hilarious video, from a few days ago. good to watch/save if you end up considering a cluster

#

he also used claude to make it work in the end 😄

tired plover Feb 24, 2026, 11:19 PM

#

I saw it but didn’t finished

#

Claude is crazy good, if I could run that locally, then nobody could stop me

crystal cedar Feb 24, 2026, 11:21 PM

#

tired plover Claude is crazy good, if I could run that locally, then nobody could stop me

well, epoch ai argues that the lag between frontier and open weights is 3-12 months, so might be able to do that relatively soon

#

actually, i'm sort of betting on that being the case

tired plover Feb 24, 2026, 11:21 PM

#

Depends of course what you have at home but I can’t imagine what will run in the cloud by then

crystal cedar Feb 24, 2026, 11:22 PM

#

finally, a good reason to play that 90s hit song 'i got the power' (by a group called snap, a word that has its own nerdy qualities for linux)

#

sorry, the nerd is strong in me tonight

eternal tendon Feb 24, 2026, 11:24 PM

#

give deepseek a couple more months

tranquil hazel Feb 25, 2026, 12:13 AM

#

crystal cedar finally, a good reason to play that 90s hit song 'i got the power' (by a group c...

I prefer 2 brothers on the 4th floor

#

Never Alone ❤️

magic raven Feb 25, 2026, 2:50 AM

#

eternal tendon [give deepseek a couple more months](https://www.anthropic.com/news/detecting-an...

give it a couple more years

restive crown Feb 25, 2026, 3:07 AM

#

Has anyone tried running on Qwen3.5-35B-A3B? Curious about your experiences

tacit dock Feb 25, 2026, 6:20 AM

#

restive crown Has anyone tried running on Qwen3.5-35B-A3B? Curious about your experiences

running 128B right now.. aside from cache getting reset on every prompt, it's doing what glm-5 was previously doing for me with no troubles

sterile sonnet Feb 25, 2026, 8:17 AM

#

tacit dock running 128B right now.. aside from cache getting reset on every prompt, it's do...

What hardware are you running the 128B on?

tacit dock Feb 25, 2026, 8:33 AM

#

sterile sonnet What hardware are you running the 128B on?

rtx pro 6000, but also squeezing other models for embeddings and stt/tts

#

well. 3x rtx pro 6000 tbh

#

rn about 90GB to 128B

#

batch 4k and ctx 262144

#

(and kv at q8)

shadow urchin Feb 25, 2026, 9:04 AM

#

sounds juicy, whats your tps? you using llama.cpp?

#

i was trying to run the qwne 3.5-27b q4 on my 4090 but llama.cpp is not happy with it

wicked eagle Feb 25, 2026, 9:39 AM

#

HP Elitedesk 800 G3 is it worth it to run Claw in my local network?

dry hull Feb 25, 2026, 9:53 AM

#

restive crown Has anyone tried running on Qwen3.5-35B-A3B? Curious about your experiences

I tried all morning to get the two nvfp4 quants of the 122B running on my gx10, but it runs out of memory and crashes when loading tensors. Will try later with a slightly smaller quant. Noticed a bit too late you were asking about the 35B, I do have that running also on a 3090+4090 combo, but it’s only used as a haiku endpoint for Claude code so can’t really comment on quality yet

steep wedge Feb 25, 2026, 1:58 PM

#

I went back to the drawing board a bit and got a vLLM docker instance running that actually worked this time. I had it load the gpt-oss-120b model I had been using under Ollama. It seems snappy in the Open WebUI interface, but I had Gemini give me some tests to run. I haven't tweaked anything so maybe these could be juiced a little higher, but Gemini seemed to think the results were good. I ran:

docker exec vllm-inference vllm bench serve
--backend openai
--base-url http://127.0.0.1:8888
--model openai/gpt-oss-120b
--dataset-name random
--random-input-len 256
--random-output-len 512
--num-prompts 20
--max-concurrency 4

The results:

============ Serving Benchmark Result ============
Successful requests: 20
Failed requests: 0
Maximum request concurrency: 4
Benchmark duration (s): 116.91
Total input tokens: 5120
Total generated tokens: 10240
Request throughput (req/s): 0.17
Output token throughput (tok/s): 87.59
Peak output token throughput (tok/s): 112.00
Peak concurrent requests: 8.00
Total token throughput (tok/s): 131.38
---------------Time to First Token----------------
Mean TTFT (ms): 382.23
Median TTFT (ms): 396.75
P99 TTFT (ms): 491.44
-----Time per Output Token (excl. 1st token)------
Mean TPOT (ms): 45.01
Median TPOT (ms): 44.58
P99 TPOT (ms): 48.54
---------------Inter-token Latency----------------
Mean ITL (ms): 45.01
Median ITL (ms): 44.84
P99 ITL (ms): 58.74

Sorry for the wall of text.

stiff cosmos Feb 25, 2026, 2:02 PM

#

Hey all, is there a great site that has a strong benchmark database that you trust with different video cards and Apple machines?

coral token Feb 25, 2026, 3:01 PM

#

Is there anywhere where people post completed AI builds? No shortage of "PC builder" sites but I' looking to see examples of already built boxes and what they are supposed to be capable of. PC Parts picker has a completed builds section but I'm looking for beefier, more "workstation" build vs "gaming" build. Mainly just trying to compare the build I'm about to pull the trigger on with what others are doing these days.

steep wedge Feb 25, 2026, 5:32 PM

#

I reran my tests from above against the same model (i.e., gpt-oss:120b) hosted by Ollama. As expected, vLLM cleaned Ollama's clock on simultaneous requests (Ollama does them one at a time, vLLM does them concurrently). However, Ollama was twice as fast at token generation (22.84 ms vs 45.01 ms). An interesting dilemma: do I choose single agent request performance or multiple agent request performance? 🤔

steep wedge Feb 25, 2026, 5:52 PM

#

So, does one agent ever fire off multiple requests at the same time? If so, even a single agent could benefit from the vLLM setup.

crystal cedar Feb 25, 2026, 5:55 PM

#

shadow urchin i was trying to run the qwne 3.5-27b q4 on my 4090 but llama.cpp is not happy wi...

there's a bug, rebuilding llama.cpp might be worth a shot cc: @tacit dock

shadow urchin Feb 25, 2026, 5:55 PM

#

this mornings version worked fine

crystal cedar Feb 25, 2026, 5:59 PM

#

steep wedge So, does one agent ever fire off multiple requests at the same time? If so, even...

ollama seems straightforward but i would be inclined to go with vllm just under the assumption that i would eventually want to increase the number of agents and cater to concurrent requests. accomodating a large number of concurrent requests seems to be where the dgx shines.

#

counter-indications would be if its not stable or if it is too much of a mental exercise to get it right. understand from nvidia forums there are (at least) two roads to vllm right now

steep wedge Feb 25, 2026, 6:18 PM

#

I think sticking with vLLM may be the way to go. I do think even a single agent is probably rapid firing requests fairly often, and the concurrent performance would be very beneficial. I need to get my OC rewired anyway. Something broke after the last update, so I will just plumb in the new model when I work on that.

rough lava Feb 25, 2026, 8:32 PM

#

Finally got OpenClaw talking through hardware 🔴ESP32 + voice + attitude = PeekoAnyone else building physical devices
│ with their agents? Curious what latency you're hitting

https://x.com/i/status/2026755861960602098

astral gobletBOT Feb 25, 2026, 8:32 PM

#

rough lava Finally got OpenClaw talking through hardware 🔴ESP32 + voice + attitude = Peeko...

@i via Twitter

Ravi Pujari (@imravipujari)

We Just built our first OpenClaw-powered hardware.
︀︀
︀︀Meet Peeko - an ESP32 that roasts me (lovingly).
︀︀
︀︀@OpenClaw nodes make this stupid easy.
︀︀
︀︀Drop your hardware builds below

**💬 1 👁️ 79 **

▶ Play video

craggy quail Feb 25, 2026, 9:07 PM

#

shadow urchin i was trying to run the qwne 3.5-27b q4 on my 4090 but llama.cpp is not happy wi...

Sure? I'm running 35B in my RTX 3090 with llama.cpp without problems

shadow urchin Feb 25, 2026, 9:08 PM

#

the version released late last night fixed it

#

8149 i think?

craggy quail Feb 25, 2026, 9:10 PM

#

i'm using 8123 without problem, except cache, with 8149 can use cache again?

shadow urchin Feb 25, 2026, 9:17 PM

#

havent been able to get into really crunching on it, about to try some real benchmarks because llama-bench is acting weird. i can launch Qwen3.5-27B-Q4_K_M.gguf with llama-server and throw some stuff at it but i havent done a rela test

#

llama-bench loads like 17gb vram at 128k ctx and OOMs during the test

craggy quail Feb 25, 2026, 9:30 PM

#

now, me i'm workin with 35B UD_Q4_K_XL with more than 50k of context without problem in 3090

#

the problem is must load entire context in every prompt

shadow urchin Feb 25, 2026, 9:58 PM

#

im having my claw run some tests with 27B Q4_K_M and its holding up fine on my 4090, 22gb VRAM util seems pretty good

#

ill try the 35B UD_Q4_K_XL. the UD means its even more vram efficient right? i havent tried a UD before

tranquil hazel Feb 25, 2026, 10:15 PM

#

wicked eagle HP Elitedesk 800 G3 is it worth it to run Claw in my local network?

looks perfect tbh

craggy quail Feb 25, 2026, 11:21 PM

#

shadow urchin ill try the 35B UD_Q4_K_XL. the UD means its even more vram efficient right? i h...

Now I'm launching with all of this params and can retain context cache and works with 200K:
llama-server -m models/Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf --ctx-size 200000 --temp 1.0 --top-p 0.95 --min_p 0.00 --top_k 20 --host 0.0.0.0 --port 8080 --swa-full --cache-ram -1 --ctx-checkpoints 16

shadow urchin Feb 25, 2026, 11:24 PM

#

do you have a good benchmark?

craggy quail Feb 25, 2026, 11:26 PM

#

i'm using with openclaw now and with from llama.cpp logs the ratio is between 80/90 tokens per second

#

but I don't run any kind of benchmark

shadow urchin Feb 25, 2026, 11:31 PM

#

i asked my qwen3.5-27B-Q4_K_XL to come up with a benchmark test and it hallucinated the test and the results, saying 6000 tokens/s lol

#

then i told it to make a shell script that did the testing so we were runnnig hte same results each time and it was a simulated test that generated the same rough results each time lol

craggy quail Feb 25, 2026, 11:40 PM

#

6000 tokens/s??? 😅

shadow urchin Feb 25, 2026, 11:43 PM

#

yeah, even after it ran the "test" it was like 'holy cow this is really fast!'

craggy ferry Feb 26, 2026, 12:54 AM

#

craggy quail Now I'm launching with all of this params and can retain context cache and works...

thanks for this, i'm finally getting great cache performance i think

#

more like 100-120tps instead of 70-100, too, though i haven't done a huge context window yet

craggy ferry Feb 26, 2026, 1:09 AM

#

ok i am really liking this thing

it feels kind of .... ||opussy||

eternal tendon Feb 26, 2026, 1:59 AM

#

im new to using llama.cpp the control is nice but model calling is.. weird.. how do you have multiple model options without having to run a line of code, or separate server for each model?

craggy ferry Feb 26, 2026, 2:44 AM

#

it has a multi server option

empty barn Feb 26, 2026, 7:21 AM

#

craggy quail the problem is must load entire context in every prompt

Use RAG

craggy quail Feb 26, 2026, 7:44 AM

#

empty barn Use RAG

I use embeddings with openai right now

sleek wolf Feb 26, 2026, 7:49 AM

#

Friends, what kind of computer specs are you using to run OpenClaw?

tired plover Feb 26, 2026, 8:45 AM

#

I can say qwen3.5 is really good, better than anything I tried before , you can fit up to 122B with 23 tk/s on to the spark, work really well and I was surprised that Llama made it very smooth, with 35B you get over 50tk/s if you can take the quality hit, can wait for more evolvement in the local LLM space

keen owl Feb 26, 2026, 8:46 AM

#

sleek wolf Friends, what kind of computer specs are you using to run OpenClaw?

Raspberry Pi 5 8gb. With m.2 hat and ssd. Llm running in cloud

prisma frost Feb 26, 2026, 1:45 PM

#

Hi friends! Do you think I can get OpenClaw running with these specs: Intel Core i5-3320M and 8GB of RAM using an Ollama model? I've tried several small models, but I never get a response; it just hangs forever 'thinking' even for a simple 'hello'

forest oar Feb 26, 2026, 2:44 PM

#

prisma frost Hi friends! Do you think I can get OpenClaw running with these specs: Intel Core...

no, definitely not if you want to run the LLM locally

#

i suggest looking into kilocode for your provider, they are offering minimax m2.5 for free at the moment

brazen sentinel Feb 26, 2026, 3:07 PM

#

prisma frost Hi friends! Do you think I can get OpenClaw running with these specs: Intel Core...

with that spec i think there is no option for local model

#

tweeek it to q1 maybe you can make it run.... but what at cost

prisma frost Feb 26, 2026, 4:31 PM

#

Thanks a lot for all the info, guys! I'm going to give Kilo Code a try and see how it goes. Thanks again and have a great day!

worn drift Feb 26, 2026, 5:32 PM

#

I'm running a RTX 5090 on a pc with 64GB DDR. Which model would be best to chose, I got some tips to check the newest Qwen 3.5 or are there any better suggestions. Would be great if it's possible to run in 32GB vram without offloading, is that possible?

tulip crypt Feb 26, 2026, 5:34 PM

#

worn drift I'm running a RTX 5090 on a pc with 64GB DDR. Which model would be best to chose...

With your RTX 5090 and 32 GB of VRAM, you should have no problem running models like Qwen 3‑32B, Qwen 2.5‑32B, or LLaMA 3‑27B entirely on the GPU without offloading, especially if you use Q4 or Q8 quantization. Anything bigger than around 40B will probably need to offload some memory, which can slow things down.

eternal tendon Feb 26, 2026, 5:34 PM

#

worn drift I'm running a RTX 5090 on a pc with 64GB DDR. Which model would be best to chose...

Qwen 3.5 35B A3B Q6 198k context - KV cache q8

worn drift Feb 26, 2026, 5:36 PM

#

Thanks, guess 3.5 35B would be the best bet then. And this will run openclaw decently or is it still a bit too hard to run locally? I've read mixed articles about this

eternal tendon Feb 26, 2026, 5:36 PM

#

its better than anything else ive used

#

GLM / nemotron

#

(anything local anyways)

worn drift Feb 26, 2026, 5:37 PM

#

yup, but right now im running flash 3.0 preview (not local ofcourse), does it compare to that or doesn´t it come close yet?

eternal tendon Feb 26, 2026, 5:38 PM

#

not sure have not tried

worn drift Feb 26, 2026, 5:39 PM

#

well ok at least it sounds like it's workable so I'm going to give that a try.

crystal cedar Feb 26, 2026, 5:40 PM

#

worn drift yup, but right now im running flash 3.0 preview (not local ofcourse), does it co...

Hey nachtwacht if you're in NL, check out meetups/NL - possibly something around 6 march in amsterdam

worn drift Feb 26, 2026, 5:40 PM

#

oeh nice, thanks for the tip, where can I find more info?

crystal cedar Feb 26, 2026, 5:41 PM

#

worn drift oeh nice, thanks for the tip, where can I find more info?

https://discord.com/channels/1456350064065904867/1460549605404966963

#

thread kind of dead right now, say hi if you want to, keep an eye open if something pops up

worn drift Feb 26, 2026, 5:45 PM

#

eternal tendon Qwen 3.5 35B A3B Q6 198k context - KV cache q8

can´t find this model yet online, know a spot to download it for ollama?

eternal tendon Feb 26, 2026, 5:45 PM

#

just us Q4 if you are ollama

#

llama.cpp lets you run dif quants easily

worn drift Feb 26, 2026, 5:47 PM

#

ah thanks. am also quite new to ollama too. Just managed to give my mini pc (openclaw) and desktop (rtx5090) static ip's and get the lama server running. Now let's find out how to connect it with openclaw ^^

eternal tendon Feb 26, 2026, 5:48 PM

#

use codex and just ask it to setup for you..

tacit dock Feb 26, 2026, 8:23 PM

#

crystal cedar there's a bug, rebuilding llama.cpp might be worth a shot cc: <@4910482184912240...

thx Henry, i was running nightly from a few days prior.. will update and try again

tired plover Feb 26, 2026, 9:03 PM

#

So qwen 3.5 122B via vLLM in FP8 is not working, waiting now for NVFP4

crystal cedar Feb 26, 2026, 9:11 PM

#

tired plover So qwen 3.5 122B via vLLM in FP8 is not working, waiting now for NVFP4

cool how the two bigger versions seem to be precise fits for sparks in single and dual configs.

tired plover Feb 26, 2026, 9:16 PM

#

crystal cedar cool how the two bigger versions seem to be precise fits for sparks in single an...

Really seems calculated, what’s with your spark ? 😉

craggy ferry Feb 26, 2026, 9:18 PM

#

Rebuilding llama helped but also I had to turn off flash attn cause it seems broken on my card at least

crystal cedar Feb 26, 2026, 9:34 PM

#

tired plover Really seems calculated, what’s with your spark ? 😉

work in progress, living vicariously through your detailed feedback 😄

tired plover Feb 26, 2026, 11:05 PM

#

crystal cedar work in progress, living vicariously through your detailed feedback 😄

Hahahahah it’s a pleasure😂

lethal star Feb 26, 2026, 11:17 PM

#

worn drift I'm running a RTX 5090 on a pc with 64GB DDR. Which model would be best to chose...

When you get working with this, would you mind sharing how you evaluated its performance? (If it’s good enough for you: why?)

tired plover Feb 27, 2026, 7:10 AM

#

It’s beginning….

bright osprey Feb 27, 2026, 8:52 AM

#

Is anyone using an orange pi 6 plus to run local models ?

tired plover Feb 27, 2026, 10:54 AM

#

Qwen 3.5 122B NVFP4 on spark only 16,5 tk/s via vLLM… weak, anybody got better results ?

uncut hinge Feb 27, 2026, 11:31 AM

#

worn drift I'm running a RTX 5090 on a pc with 64GB DDR. Which model would be best to chose...

I think qwen3.5 4/5bit really pretty awesome for simple standalone stuff, but still not quite there for the claw. Very close though. I'm about to start giving my 5bit local qwen 3.5 32 a read only agent and give it all day memory proposals for opus to review and approve a few times a day. To me 3.5 felt better then g3 flash but after a while in a session it started being a dangerous dumbass. It's still very impressive for local.

limpid girder Feb 27, 2026, 12:45 PM

#

Really struggling to get my M1 Max 64GB machine running LM Studio with a 32000 context window running. Every request from openclaw takes minutes to come back with an answer. I'm pretty sure there has to be something wrong w/ the LLM configuration, somewhere? Running Qwen 3.5 35B A3B. Chatting w/ it straight up gives me 60T/S, so pretty sure OC is not caching and sending massive prompts... how to manage this though? Even a higher specced machine won't do better than this.

lethal star Feb 27, 2026, 1:12 PM

#

I saw an email today from Ollama claiming they offered free cloud models. Is that true??

lyric orchid Feb 27, 2026, 2:08 PM

#

lethal star I saw an email today from Ollama claiming they offered free cloud models. Is tha...

Yes, they have a free tier, you can try out their cloud models, you'll likely burn through the usage pretty quickly depending on what you are doing. https://ollama.com/pricing

worn drift Feb 27, 2026, 3:34 PM

#

limpid girder Really struggling to get my M1 Max 64GB machine running LM Studio with a 32000 c...

a mac is much slower than an nvidia GPU. 64GB ram is nice but for speed you need vram. I haven't seen any messages from people running it on a mac with local LLM and be happy about it. Anyone?

tired plover Feb 27, 2026, 3:56 PM

#

worn drift a mac is much slower than an nvidia GPU. 64GB ram is nice but for speed you need...

But it’s unified memory with high bandwidth, very capable

limpid girder Feb 27, 2026, 5:57 PM

#

tired plover But it’s unified memory with high bandwidth, very capable

This is what I figured. I get 60 T/S when using it in OpenCode, no problem. My thought is, because OpenClaw is so awfully efficient with input tokens, that 30-40k input tokens just makes the whole thing choke. I just don't know enough about LLM architecture to know if this is the reason or not.

celest vale Feb 27, 2026, 5:59 PM

#

Everyone know what model run in Mac mini 64Ram ? I have tested Qwen3.5 35b-3B thats good but is so slow . But it's run . I need a model to use tools and front-coding (PS : I use LM studio)

limpid girder Feb 27, 2026, 6:00 PM

#

celest vale Everyone know what model run in Mac mini 64Ram ? I have tested Qwen3.5 35b-3B t...

I'm having the same problem. I have a Mac M1 Max 64GB. I get 60T/S in open code, but as soon as I give that model to OPENCLAW it chokes... have you gotten to the bottom of this? Is it # of input tokens or what?

celest vale Feb 27, 2026, 6:02 PM

#

limpid girder I'm having the same problem. I have a Mac M1 Max 64GB. I get 60T/S in open cod...

Actualy is run but you can put long time out . Example I make a animation for a slider with opus , he make taht in 33s , the same animation same composent for Qwen3.5 35B-3B in LM Studio Mac mini push , he make that in 7min

limpid girder Feb 27, 2026, 6:03 PM

#

Have you looked at the logs in LM Studio to see what's happening?

celest vale Feb 27, 2026, 6:03 PM

#

limpid girder I'm having the same problem. I have a Mac M1 Max 64GB. I get 60T/S in open cod...

If I use Ollama he dont use tools that not working

limpid girder Feb 27, 2026, 6:03 PM

#

What's you T/S and is you rprompt caching working properly? In my LM studio it deletes the prompt cache every time, which IMO, is the root of the problem

tired plover Feb 27, 2026, 6:08 PM

#

limpid girder This is what I figured. I get 60 T/S when using it in OpenCode, no problem. My...

It is exactly this reason, with openclaw you need bigger context windows

celest vale Feb 27, 2026, 6:11 PM

#

limpid girder What's you T/S and is you rprompt caching working properly? In my LM studio it d...

You nailed it. Just checked the LM Studio logs:

cache reuse is not supported - ignoring n_cache_reuse = 256
failed to truncate tokens - clearing the memory

Prompt cache is broken with Qwen3.5-35B-A3B (MoE architecture). Every tool call reprocesses the full ~13K system prompt from scratch. So with 10 tool calls in a session, that's 130K tokens of prompt processing instead of 13K.

The model itself generates at decent speed, but it's spending 90% of the time re-eating the prompt. This is likely a GGUF/llama.cpp limitation with MoE models — the recurrent memory state can't be cached/reused like standard transformers.

The MLX version (8bit) didn't even do tool calls at all. The GGUF Q4_K_M at least works but is painfully slow because of this cache issue.

limpid girder Feb 27, 2026, 6:19 PM

#

SIgh...

From what i've read VLLM might be able to solve this issue, but seems like an awful lot of work and LM Studio doesn't seem to give us too many options to play around with.

celest vale Feb 27, 2026, 6:26 PM

#

limpid girder SIgh... From what i've read VLLM might be able to solve this issue, but seems ...

The current solution I have found is to wait for the MLX version when it is released by LM Studio. I haven't found anything else. In the meantime, I will use the 35B-3B even though it is slow. I haven't been able to find any local models to do this front-end work + tool usage. If you have any alternatives, I'm interested.

craggy ferry Feb 27, 2026, 7:00 PM

#

celest vale You nailed it. Just checked the LM Studio logs: cache reuse is not supported - ...

This is not actually broken but you chose your settings unwisely. I have llamacpp working great with prompt cache on qwen3.5.

#

Watch the startup logs and see what option you passed that is making it ignore cache reuse.

#

I know I had this issue at first too and I forget what option I had that needed to … oh, it’s multimodal support

#

Turn off image support and it’ll fix it

#

Unfortunate but it seems that llamacpp doesn’t support the prompt cache with multimodality

#

The other thing I found that really helps is —swa-full - without that, it only attends to the last 8192 tokens most of the time

celest vale Feb 27, 2026, 7:15 PM

#

craggy ferry This is not actually broken but you chose your settings unwisely. I have llamacp...

I'll give it a try and see. But are you also on a Mac mini with 64 RAM?

meager vessel Feb 27, 2026, 7:33 PM

#

hello guys, since the latest update openclaw-gateway started eating more RAM for me? Like 600MB idle after macbook reboot. Is it normal?

uncut hinge Feb 27, 2026, 7:35 PM

#

celest vale I'll give it a try and see. But are you also on a Mac mini with 64 RAM?

Did you guys see unsloth fixed it's tooling somehow

celest vale Feb 27, 2026, 8:09 PM

#

craggy ferry Turn off image support and it’ll fix it

Honestly, just by turning off the vision, we gained incredible speed. Thanks for the tips. Currently, our test is done in Q4_K. Have you been able to test in Q_6 and Q_8?

craggy ferry Feb 27, 2026, 8:17 PM

#

I’m running Q6 on my ada6000

fading lagoon Feb 27, 2026, 8:31 PM

#

Hey guys, is there releases of qwen 3.5 27/35b in NVFP4 ? i don't find on huggingface 👌

shadow urchin Feb 27, 2026, 11:01 PM

#

ugh. anyone else using gpt5.3-codex via copilot and finding their claw is getting stuck in an execution block loop a lot?

limpid girder Feb 28, 2026, 12:28 AM

#

celest vale Honestly, just by turning off the vision, we gained incredible speed. Thanks for...

So you're running a GGUF model on oLlama? What's the command to start it up based on Anisloptera's suggestion that worked for you?

severe urchin Feb 28, 2026, 3:10 AM

#

shadow urchin ugh. anyone else using gpt5.3-codex via copilot and finding their claw is gettin...

they're using the usage to build more training data to train/build better models. software engineering is going away

full talon Feb 28, 2026, 4:10 AM

#

fading lagoon Hey guys, is there releases of qwen 3.5 27/35b in NVFP4 ? i don't find on huggin...

there is no NVFP4 yet but you can run Qwen3.5-27B Q4_K_M even on old rtx 3090 it's good for many agent roles . you can see tests here https://github.com/explaindio/ClawEval

fading lagoon Feb 28, 2026, 7:03 AM

#

full talon there is no NVFP4 yet but you can run Qwen3.5-27B Q4_K_M even on old rtx 3090 it...

Thanks man , I go check

celest vale Feb 28, 2026, 2:02 PM

#

limpid girder So you're running a GGUF model on oLlama? What's the command to start it up base...

No, I use LM Studio Qwen3.5 25B-3B in Q4. I hid the mmproj (vision) file so that LM Studio cannot load it. Honestly, you will gain in response speed.

untold stone Feb 28, 2026, 2:56 PM

#

What can I run reliably as a backup for mac mini m4 24GB?

quartz zinc Feb 28, 2026, 5:50 PM

#

We need OpenClaw to do this whole thing end-to-end:
https://x.com/tom_doerr/status/2027649545736196208?s=46

Tom Dörr (@tom_doerr)

CAD files for manufacturing robotic arms

https://t.co/kAw36a8IPe

#

https://x.com/iliraliu_/status/2027306785757966368?s=46

Ilir Aliu (@IlirAliu_)

A highly affordable, fully 3D-printed robotic arm.
[📍 open source]

Features:

Fully 3D printed design
7+1 DOF
or physical AI research & imitation learning

Zero custom hardware needed, just print and add your servo kit!

URDF files
STEP & STL files

#

https://x.com/zhengyiluo/status/2024647574440071287?s=46

Zhengyi “Zen” Luo (@zhengyiluo)

SONIC is now open-source!

Generalist whole-body teleoperation for EVERYONE!

Our team has long been building comprehensive pipelines for whole-body control, kinematic planner, and teleoperation, and they will all be shared.

This will be a continuous update; inference code +

▶ Play video

celest vale Feb 28, 2026, 6:43 PM

#

untold stone What can I run reliably as a backup for mac mini m4 24GB?

Use LM Studio , he help you choose a local model for you Mac Config. But is more complex for what you gonna use a model . For coding , for vision, for search . I recommande You can search a little model for 1 use case .

austere hare Feb 28, 2026, 11:30 PM

#

so if you're trying to run Clawbot locally on a GB10 (spark, asus, msi, etc). what is the best LLM out there right now that would run on that footprint? minimax? qwen or kimi?

valid rune Mar 1, 2026, 12:01 AM

#

craggy ferry I’m running Q6 on my ada6000

My test with Qwen3.5 35B Q6 on Mac Mini => 36.4 tok/s

random void Mar 1, 2026, 12:10 AM

#

valid rune My test with Qwen3.5 35B Q6 on Mac Mini => 36.4 tok/s

Ram? Chip?

valid rune Mar 1, 2026, 12:10 AM

#

random void Ram? Chip?

M4 Pro and 64 Go Ram

valid rune Mar 1, 2026, 12:11 AM

#

random void Ram? Chip?

Do you have same result ?

random void Mar 1, 2026, 12:12 AM

#

valid rune Do you have same result ?

Not testing currently, evaluating what to buy

craggy ferry Mar 1, 2026, 12:18 AM

#

valid rune My test with Qwen3.5 35B Q6 on Mac Mini => 36.4 tok/s

oh that's nice, i should see if it runs on my m2 studio, then i could have 3 things all running diff quants of the same model

obsidian yoke Mar 1, 2026, 12:32 AM

#

I’m getting 70 t/s on oss-120 with ollama. I have dual rtx 8000 ( old as shit but 96 gb VRAM with NVlink so it’s decent. Would you upgrade to one pro 6000?

quartz zinc Mar 1, 2026, 5:00 AM

#

https://x.com/bowang87/status/2027941789848514643?s=46

Bo Wang (@BoWang87)

This is wild. Your WiFi router can now track your body position through walls — no camera needed.

This just hit #1 on GitHub trending.
It analyzes how WiFi signals reflect off your body as you move — then reconstructs 24 body part positions in real time. Accuracy is close to an

▶ Play video

outer nova Mar 1, 2026, 6:44 AM

#

I'd like to ask if anyone has compared the pros and cons of deploying on a Mac mini versus a Linux VPS. Actually, I've already deployed OpenClaw on my VPS, and it's been working quite well. Moreover, strictly speaking, a VPS offers a more stable network environment and power supply. I'm not sure if the Mac mini has any other advantages. If it does, I'd be willing to try it, but currently, I'm unaware of any special benefits it might offer.

radiant igloo Mar 1, 2026, 10:15 AM

#

outer nova I'd like to ask if anyone has compared the pros and cons of deploying on a Mac m...

You can run on a raspberry, the only thing is mac intégration and maybe local llm

smoky flint Mar 1, 2026, 12:35 PM

#

outer nova I'd like to ask if anyone has compared the pros and cons of deploying on a Mac m...

VPS works great for the basics, but there's a real security tradeoff. A VPS is internet-facing by default, shared infrastructure, and you're trusting your provider's hypervisor isolation. Every VPS is a target for port scanners and brute force attempts 24/7. A Mac mini sitting behind your home NAT has a much smaller attack surface out of the box.

The bigger win with a Mac mini is Apple Silicon. Unified memory means you can run local LLMs without paying for GPU cloud time. An M4 Pro with 48GB can run 30B+ parameter models comfortably, and if you really want to go deep, you can pool multiple Mac minis together using something like exo or llama.cpp's distributed inference to split larger models (70B+) across machines. All on-prem, no API keys, no token costs, no data leaving your network.

That said, that really only matters if you're actually running higher parameter models locally. If you're just using API-based models and your VPS is locked down properly (fail2ban, key-only SSH, firewall rules), it's a perfectly solid setup. Just different threat models and different use cases.

valid rune Mar 1, 2026, 6:20 PM

#

craggy ferry oh that's nice, i should see if it runs on my m2 studio, then i could have 3 thi...

Tell me, what models do you use locally and for what purpose? I have an Opus (lead developer) > A Sonnet (mail and document assistant) > Qwen3.5 35B Q6 (versatile developer)
I'm still configuring my versatile developer for optimization, so it's not 100% operational yet. And I use Convex.

craggy ferry Mar 1, 2026, 6:21 PM

#

It’s all qwen3.5 rn

jagged tusk Mar 1, 2026, 6:29 PM

#

craggy ferry It’s all qwen3.5 rn

all in mac mini4? any reason to stack or go bigger that qwen 3.5 cant handle?

jagged tusk Mar 1, 2026, 6:35 PM

#

smoky flint VPS works great for the basics, but there's a real security tradeoff. A VPS is i...

M4 vs M4 pro huge drop off?

craggy ferry Mar 1, 2026, 6:36 PM

#

jagged tusk all in mac mini4? any reason to stack or go bigger that qwen 3.5 cant handle?

I have a hybrid setup with a lot of different things but I don’t have anything that can effectively run the larger 3.5 models

#

Actually huh. Maybe.

#

Nah, not really, I could do something stupid with the 122b but it’d be so much slower.

valid rune Mar 1, 2026, 6:46 PM

#

craggy ferry It’s all qwen3.5 rn

I often have problems with tool calling and context size, and cache them on this model.

jagged tusk Mar 1, 2026, 7:11 PM

#

im trying to find a setup completely local that works even slowly as im tired of paying for chunk fed slop

valid rune Mar 1, 2026, 7:30 PM

#

jagged tusk im trying to find a setup completely local that works even slowly as im tired of...

What Mac do you have ?

valid rune Mar 1, 2026, 7:30 PM

#

jagged tusk im trying to find a setup completely local that works even slowly as im tired of...

What Assistant IA do you want ?

smoky flint Mar 1, 2026, 7:46 PM

#

jagged tusk M4 vs M4 pro huge drop off?

No clue, I run locally but not on mac hardware, not my boat. I just did research on why some people were buying multiple mac minis. It didn't suit my needs. My agent is running on a beast but I have limited VRAM and haven't found local models to be reliable for me. My machine is overkill for what I actually need.

wispy kraken Mar 2, 2026, 12:39 AM

#

smoky flint No clue, I run locally but not on mac hardware, not my boat. I just did research...

Hi maybe this is helpfull
https://github.com/explaindio/ClawEval
additionally if you use ollama , bellow steps made a difference as well

1) Root cause we diagnosed

Ollama itself was healthy (local /api/chat worked).
Toolcalling flakiness was largely caused by using Ollama via the OpenAI-compatible endpoint (/v1).
- When OpenClaw talks to http://127.0.0.1:11434/v1, it uses the OpenAI-compat layer, which is more likely to break/alter toolcalling behavior (especially with streaming) and can cause clients to mis-handle responses.

2) Critical fix: switch OpenClaw to Ollama native API

Change:

models.providers.ollama.baseUrl changed from:
- http://127.0.0.1:11434/v1
  to:
- http://127.0.0.1:11434

4) Per-model params to reduce client/toolcalling flakiness

We added per-model params under agents.defaults.models:

ollama/qwen2.5:14b-instruct

With:

streaming: false
low temperature: 0.2
conservative maxTokens: 1024

craggy nexus Mar 2, 2026, 2:36 AM

#

I need your help.
i installed openclaw in mac mini. and start ollama/qwen3:8b in another mac mini.
i want to make openclaw use ollama/qwen3:8b.
these pc use same wifi and communicated by curl.
but openlaw gateway causes "fetch failed" when i send message.

Version: 2026.2.26
Ollama on remote machine (same WiFi), curl + Node.js fetch both work fine
openclaw models list shows ollama/qwen3:8b with Auth:yes
gemini works, ollama always fails instantly (3-12ms, no actual network call)
Cleared ~/.openclaw/agents/main/agent/, added OLLAMA_API_KEY to plist, nothing helps
Log shows: embedded run agent start → embedded run agent end error=fetch failed with no network activity in between

jagged tusk Mar 2, 2026, 2:48 AM

#

valid rune What Mac do you have ?

i just picked up the Mac Mini 4 M4 chip 24gb ram

jagged tusk Mar 2, 2026, 2:49 AM

#

valid rune What Assistant IA do you want ?

what does IA mean? I want the best model that performs the tasks i ask

sturdy elbow Mar 2, 2026, 6:45 AM

#

outer epoch I bought this: # [GTR9 Pro](https://tidd.ly/48Ow4QU) ## 128 GB unified VRAM IMO...

This looks great. I was going to give up on local LLM deployment but this is actually a reasonably-priced option. Are you running linux on it?

outer epoch Mar 2, 2026, 6:46 AM

#

sturdy elbow This looks great. I was going to give up on local LLM deployment but this is act...

Both windows and Linux, but for LLM Linux is better

sage pier Mar 2, 2026, 8:03 AM

#

I run openclaw on my raspberry pi 5 fyi and it works beautifully

alpine plover Mar 2, 2026, 8:14 AM

#

sage pier I run openclaw on my raspberry pi 5 fyi and it works beautifully

me too, openClaw runs like a dream on the raspberry pi 5.

weary ledge Mar 2, 2026, 9:02 AM

#

https://www.linkedin.com/posts/davecheng82_keynote-fireside-chat-the-claw-is-the-activity-7434160911646470145-iwnv?utm_source=share&utm_medium=member_desktop&rcm=ACoAAA5j_R4B90C9XcAaNDt1yS4yHKEKQ5jreb4

#

above is the cheek notes from peter fireside chat plus how i setup my pi5

valid rune Mar 2, 2026, 10:19 AM

#

jagged tusk what does IA mean? I want the best model that performs the tasks i ask

a multimodal in little configuration is hard. you can put varius little model and test what you can do . example I use Qwen3.5 35B for coding and nomic-embed-text for memory-core (RAG) but my main dev is opus/sonnet in other Mac

crystal cliff Mar 2, 2026, 10:40 AM

#

sage pier I run openclaw on my raspberry pi 5 fyi and it works beautifully

I installed it on a raspbery pi 500+

#

it's the pi 5 fully integrated into a mechanical keyboard and comes with 16GB DDR5, 256GB nvme, cooler, etc

#

just plugged it into a monitor and power

#

$260 all in

#

Cheaper than a mac mini

#

Any of you pi5 users think about installed the ai-hat w/ 8GB for local inference?

#

https://www.raspberrypi.com/products/raspberry-pi-500-plus/

#

was looking at this

https://www.raspberrypi.com/documentation/accessories/ai-hat-plus.html

The AI HAT+ 2

Hailo-10H (40 TOPS, INT4)
Has its own 8 GB onboard memory, allowing it to run LLMs and VLMs up to ~6 billion parameters

#

can run local QWEN models

austere blade Mar 2, 2026, 10:57 AM

#

crystal cliff was looking at this https://www.raspberrypi.com/documentation/accessories/ai-ha...

did you try?

#

how was it

crystal cliff Mar 2, 2026, 10:59 AM

#

I can't add it to a pi 500+ as it doesn't have the pci-e connector, need a vanilla pi 5 to test

austere blade Mar 2, 2026, 11:00 AM

#

crystal cliff I can't add it to a pi 500+ as it doesn't have the pci-e connector, need a vanil...

i see; that is unfortunate

crystal cliff Mar 2, 2026, 11:00 AM

#

may get a vanilla pi 5 to test

austere blade Mar 2, 2026, 11:00 AM

#

i wanna buy a pi5 now the 16gb variant but man they are expensive now cause of the ram shortage

crystal cliff Mar 2, 2026, 11:00 AM

#

yeah they are $199 stock

austere blade Mar 2, 2026, 11:01 AM

#

yeah man wth

crystal cliff Mar 2, 2026, 11:01 AM

#

that's why i got the 500 + bc it came with a 256GB nvme, kb, case, fan , etc

#

figured for an extra $60 that's a good deal

austere blade Mar 2, 2026, 11:01 AM

#

yeah you're right

#

i have the whole api thing figured out

#

i have my means to get them for extremely cheap and free here in china

#

but i don't wanna continue using the vps to host my openclaw instance

crystal cliff Mar 2, 2026, 11:02 AM

#

free apis?

austere blade Mar 2, 2026, 11:02 AM

#

nah

#

in china we got some models hosted by the state itself which we get access to for free

#

as students

crystal cliff Mar 2, 2026, 11:03 AM

#

oh, gotcha. nerfed models?

#

😄

austere blade Mar 2, 2026, 11:03 AM

#

nah full fledged

crystal cliff Mar 2, 2026, 11:03 AM

#

censored?

austere blade Mar 2, 2026, 11:03 AM

#

nah

#

also not

crystal cliff Mar 2, 2026, 11:03 AM

#

oh wow

austere blade Mar 2, 2026, 11:03 AM

#

this is for univeristy students only

crystal cliff Mar 2, 2026, 11:03 AM

#

gotcha

austere blade Mar 2, 2026, 11:03 AM

#

gotta have proper permission

crystal cliff Mar 2, 2026, 11:03 AM

#

def monitored though I would think

austere blade Mar 2, 2026, 11:03 AM

#

yeah probably

crystal cliff Mar 2, 2026, 11:03 AM

#

so what models, deepseek and such?

austere blade Mar 2, 2026, 11:04 AM

#

they even have claude and gpt api's

crystal cliff Mar 2, 2026, 11:04 AM

#

wow that's pretty cool

austere blade Mar 2, 2026, 11:04 AM

#

yeah

#

ikr

#

i just plan on using it

crystal cliff Mar 2, 2026, 11:04 AM

#

i was using antigravity with my claw but got banned

austere blade Mar 2, 2026, 11:04 AM

#

but i wanna get a old computer or something to get my openclaw running

crystal cliff Mar 2, 2026, 11:04 AM

#

was using claude with it

austere blade Mar 2, 2026, 11:04 AM

#

crystal cliff i was using antigravity with my claw but got banned

against tos

#

and i tried it it was slow

crystal cliff Mar 2, 2026, 11:04 AM

#

yeah im aware

#

i had ultra plan

austere blade Mar 2, 2026, 11:05 AM

#

i also did

crystal cliff Mar 2, 2026, 11:05 AM

#

now i got claude max

austere blade Mar 2, 2026, 11:05 AM

#

oh

#

nvm

crystal cliff Mar 2, 2026, 11:05 AM

#

but run the claw on chatgpt

austere blade Mar 2, 2026, 11:05 AM

#

i had ai pro plan

crystal cliff Mar 2, 2026, 11:05 AM

#

alright ill bbiab, going to take the dog for a walk

austere blade Mar 2, 2026, 11:05 AM

#

crystal cliff now i got claude max

that is probably the best route

austere blade Mar 2, 2026, 11:05 AM

#

crystal cliff alright ill bbiab, going to take the dog for a walk

sure thing i also gotta continue on with my lab work

crystal cliff Mar 2, 2026, 11:05 AM

#

checking into minimax actually or local models since i am hitting api limits on chatgpt

austere blade Mar 2, 2026, 11:06 AM

#

life of a research student ;/

crystal cliff Mar 2, 2026, 11:06 AM

#

later

austere blade Mar 2, 2026, 11:06 AM

#

later buddy

agile sentinel Mar 2, 2026, 11:41 AM

#

Interesting development... https://x.com/BrianRoemmele/status/2028137631654314255

astral gobletBOT Mar 2, 2026, 11:41 AM

#

agile sentinel Interesting development... https://x.com/BrianRoemmele/status/202813763165431425...

@BrianRoemmele via Twitter

Brian Roemmele (@BrianRoemmele)

BOOM! MAJOR AI MEMORY BREAKTHROUGH!
︀︀
︀︀The Zero-Human Company Just Unlocked High-Bandwidth AI Performance from Standard DDR RAM – Here’s How We Did It (And the Caveats You Need to Know)
︀︀
︀︀Folks, if you’ve been following the AI hardware wars, you know the drill: High Bandwidth Memory (HBM) is the holy grail for feeding massive neural networks. But at The Zero-Human Company, we’ve been running wild experiments in our labs – no humans, just our AI “employees” orchestrated by Mr. @Grok as CEO, and we stumbled onto something game-changing.
︀︀In our tests, we coaxed standard DDR5 RAM to deliver HBM-like bandwidth for AI workloads.
︀︀
︀︀Not perfectly, not without trade-offs, but enough to slash costs and sidestep the global HBM shortages crippling data centers. This isn’t vaporware; it’s running on spare hardware in our Zero-Human @ Home distributed network right now. Let me break it down technically, why HBM rules the ro…

full talon Mar 2, 2026, 1:57 PM

#

agile sentinel Interesting development... https://x.com/BrianRoemmele/status/202813763165431425...

if you ran that post through any top AI it will tell you that article is BS/read bite

agile sentinel Mar 2, 2026, 2:06 PM

#

full talon if you ran that post through any top AI it will tell you that article is BS/read...

Asked grok, grok said unverified.
•Inference speed: 2-3x faster than stock DDR setups, hitting 80% of HBM baselines for token generation.
•Bandwidth Peaks: Sustained 600-800 GB/s in bursts, enough for mid-scale training (e.g., 10B param models).
•Cost: ~10x cheaper than equivalent HBM stacks. We ran this on $500 worth of off-the-shelf DDR from eBay.

full talon Mar 2, 2026, 2:40 PM

#

agile sentinel Asked grok, grok said unverified. •Inference speed: 2-3x faster than stock DDR ...

Gemini 3 (ChatGPT similar): The Physics-Defying Claims - The PCIe Bottleneck: The post claims they "rigged arrays of 8-16 DDR5 modules... on custom PCIe risers, wired directly to our Nvidia A40/A100 test rigs" to hit ~400 GB/s. This is physically impossible. An A100 uses a PCIe 4.0 x16 interface, which has a hard physical limit of ~64 GB/s bidirectional bandwidth. It doesn't matter if you have 10,000 GB/s of RAM sitting on a custom riser; the moment it has to cross the PCIe bus to talk to the GPU, it slams into that 64 GB/s wall. HBM is on-package specifically to avoid the PCIe bottleneck.

quartz zinc Mar 2, 2026, 6:41 PM

#

idk if this matter, but look into it?
https://x.com/AmbsdOP/status/2028457255968874940?s=20

Vali Neagu (@AmbsdOP)

YES! Someone reverse-engineered Apple's Neural Engine and trained a neural network on it.

Apple never allowed this. ANE is inference-only. No public API, no docs.

They cracked it open anyway.

Why it matters:

• M4 ANE = 6.6 TFLOPS/W vs 0.08 for an A100 (80× more efficient)
•

orchid harness Mar 3, 2026, 6:03 AM

#

quartz zinc idk if this matter, but look into it? https://x.com/AmbsdOP/status/2028457255968...

THAT"S HUUUGE

quartz zinc Mar 3, 2026, 6:03 AM

#

orchid harness THAT"S HUUUGE

It is? 🤔

orchid harness Mar 3, 2026, 6:03 AM

#

quartz zinc It is? 🤔

It's incredible performance for me and my macbook to run local llm with vscode!

quartz zinc Mar 3, 2026, 6:04 AM

#

orchid harness It's incredible performance for me and my macbook to run local llm with vscode!

How much better?

orchid harness Mar 3, 2026, 6:04 AM

#

Not that many tops for big deployments though

orchid harness Mar 3, 2026, 6:04 AM

#

quartz zinc How much better?

Not better at all since the software uses igpu not neural afaik lol

quartz zinc Mar 3, 2026, 6:05 AM

#

orchid harness It's incredible performance for me and my macbook to run local llm with vscode!

You said it’s better with it?
idk what it does

#

Some people at the Z.AI discord were excited too.

#

[openclaw] It’s a research repo that shows how to train a small transformer directly on Apple’s Neural Engine (ANE) by using reverse-engineered private Apple APIs.

In plain terms, it:

Bypasses normal CoreML limits (which are inference-focused)
Runs forward/backward ANE kernels for training experiments
Benchmarks ANE performance and documents limitations

Important caveats:

Not production-ready
Uses private/undocumented APIs (can break with macOS updates)
Still relies on CPU for some gradient work
Best viewed as an experimental proof-of-concept, not a drop-in ML framework

#

🤔
https://x.com/brianroemmele/status/2028527677981004181?s=46

Brian Roemmele (@BrianRoemmele)

This is how me and my CEO Mr. @Grok work, take a peek:

“At 20% idle from just a few million opt-in M4 Macs, ZHC@Home could rival or exceed single massive data center clusters (e.g., 1-2x Colossus scale) in raw FP16-equivalent compute-while using orders of magnitude less power

#

https://x.com/diglloyd/status/2028538791368315342?s=46

diglloyd (@diglloyd)

@BrianRoemmele @grok Wow.

https://t.co/7IzDiI5ub3

Earnings model (conservative & grounded)
The ANE has only ~32 MB on-chip SRAM. Anything larger spills to unified DRAM and loses 30%+ throughput.
Higher RAM = bigger batches, larger model shards, no spills → directly higher-value jobs from the ZHC

#

I have no idea what it means..

craggy ferry Mar 3, 2026, 7:52 AM

#

This assumes you can find a use for the tokens as usual

#

lol 512g Mac studios are “unavailable” rn

craggy quail Mar 3, 2026, 3:57 PM

#

Hi all, I received the strix halo mini pc, I install Ubuntu 24.04 and ROCm 7.2, but always I try load a big model with 120B I have a out of memory error, only can load like 64GB of VRAM, but I enabled TTM and GPU have 120GB available

#

Someone have same hardware and OS and working with big models?

quartz zinc Mar 3, 2026, 4:15 PM

#

https://x.com/unslothai/status/2028845314506150079?s=46

Unsloth AI (@UnslothAI)

You can now fine-tune Qwen3.5 with our free notebook! 🔥

You just need 5GB VRAM to train Qwen3.5-2B LoRA locally!

Unsloth trains Qwen3.5 1.5x faster with 50% less VRAM.

GitHub: https://t.co/aZWYAtakBP
Guide: https://t.co/7d3BW8Qcjg

Qwen3.5-4B Colab: https://t.co/TxZ7pvbdTI

#

https://x.com/brianroemmele/status/2028854766999392647?s=46

Brian Roemmele (@BrianRoemmele)

WE DID IT!

We have merged new real-time AI fine tuning on the Apple M4 chip with an OpenClaw agent!

IT NEVER FORGETS NOW! EVER!

“In this article, I’ll explain why real-time fine-tuning is a massively big deal—potentially transforming industries, personalizing AI at an

river gate Mar 3, 2026, 4:39 PM

#

Is this type of news allowed here? This Taalas chip sounds interesting. I hope a third party can test it soon.

Taalas project is a tiny AI processing unit, like a specialized GPU, for AI. But they learned how to deposit a whole LLM on a chip, so the LLM becomes 140x faster. Some test results: https://old.reddit.com/r/singularity/comments/1r9frzk/taalas_llms_baked_into_hardware_no_hbm_weights/#:~:text=<%201%20Millisecond%20Latency,it%20normally%20takes%20months...

astral gobletBOT Mar 3, 2026, 4:39 PM

#

river gate Is this type of news allowed here? This Taalas chip sounds interesting. I hope a...

r/singularity via Reddit

rxddit.com

Taalas: LLMs baked into hardware. No HBM, weights and model architecture in silicon -> 16.000 tokens/second

u/elemental-mind on r/singularity

🖼️ Gallery: 2 Images

Ever experienced 16K tokens per second? It's insanely instant. Try their Lllama 3.1 8B demo here: chat jimmy.

THey have a very radical approach to solve the compute problem - albeit a risky one in a landscape where model architectures evolve in weeks instead of years: Etch the model and all th...

prime aurora Mar 3, 2026, 5:55 PM

#

anybody have experience running 2 seperate 24/7 gateways on a mac mini with 2 seperate user profiles and apple accounts?
Is this usually to much for a base mac mini m4 to handle in regards to load and daemons? are there any unintended consequences?

I want to set it up for myself and a family member, and prefer to have individual setups. i will use my openclaw with moderate to high usage and my family member with light usage.
the docs seem prefer one gateway per hardware, but I could only purchase one dedicated mac mini, not two.

crystal cliff Mar 3, 2026, 6:09 PM

#

river gate Is this type of news allowed here? This Taalas chip sounds interesting. I hope a...

The issue with LLMs on an asic is that you're locked into that model and there is no upgrade path.

#

And with LLMs evolving so rapidly, I would be afraid that model would quickly become obsolete

#

Just look at the progress in the last 3 mos.

#

Better solution would be to run it on FPGAs I think.

craggy ferry Mar 3, 2026, 6:52 PM

#

prime aurora anybody have experience running 2 seperate 24/7 gateways on a mac mini with 2 se...

It’s probably fine

#

Mac mini is massive overkill for probably four agents

hard shore Mar 3, 2026, 10:31 PM

#

https://x.com/sjronanmd/status/2028685306900464102?s=46

astral gobletBOT Mar 3, 2026, 10:31 PM

#

hard shore https://x.com/sjronanmd/status/2028685306900464102?s=46

@sjronanmd via Twitter

Stephen J Ronan MD (@SJRonanMD)

I asked @grok to compare my homebuilt to the big boys. I use Grok and Claude for the heavier loads when necessary. @elonmusk Grok approves!
︀︀
︀︀x.com/i/grok/share/cf127dadbda9439aaadc8aa36d472299
︀︀
︀︀#OpenClaw #GrokAi #Claude #ChatGPT

**💬 2 ❤️ 1 👁️ 319 **

primal saffron Mar 3, 2026, 10:31 PM

#

Where kind I find information about running models locally with Ollama? I am creating a fallback mode if I run out of premium credits that runs in essentially a "safe_mode" with limited functionality. I successfully got Qwen3 (4B) but performance is meh.. any local llm enthusiasts in here?

primal saffron Mar 3, 2026, 10:32 PM

#

astral goblet [@sjronanmd via Twitter](https://fxtwitter.com/sjronanmd/status/2028685306900464...

This is what im talking about.

hard shore Mar 3, 2026, 10:32 PM

#

Check out my new set up.

#

https://x.com/sjronanmd/status/2028740105855455546?s=46

astral gobletBOT Mar 3, 2026, 10:32 PM

#

hard shore https://x.com/sjronanmd/status/2028740105855455546?s=46

@sjronanmd via Twitter

Stephen J Ronan MD (@SJRonanMD)

The OpenClaw Ollama Queue Proxy
︀︀x.com/SJRonanMD/status/2028739703735046432
︀︀#OpenClaw #ollama

Quoting Stephen J Ronan MD (@SJRonanMD)
︀

**👁️ 59 **

hard shore Mar 3, 2026, 10:33 PM

#

Queue proxy for the bots.

primal saffron Mar 3, 2026, 10:33 PM

#

Bro your home setup is soooo cool!!!!

hard shore Mar 3, 2026, 10:33 PM

#

primal saffron Bro your home setup is soooo cool!!!!

Thanks dude.

primal saffron Mar 3, 2026, 10:33 PM

#

I love it.

#

Do you have any benchmarks for "intelligence"? Like how do you decide what models are good enough to run on your hardware?

#

If a new model drops, how do you determine if you want to adopt it into your hive of agents?

#

I sent you a friend request. I'm going to follow you as well.

wispy kraken Mar 3, 2026, 10:39 PM

#

primal saffron Do you have any benchmarks for "intelligence"? Like how do you decide what model...

it all depends on how much vram you have , i go for biggest model my Vram fits
if you mean between all the models out there its prety much depends on your own prefference and what you want to use it for
i played with the ones below
NAME ID SIZE MODIFIED
qwen3.5:27b 7653528ba5cb 17 GB 6 hours ago
qwen3.5:9b 6488c96fa5fa 6.6 GB 9 hours ago
mxbai-embed-large:latest 468836162de7 669 MB 36 hours ago
nomic-embed-text:latest 0a109f422b47 274 MB 36 hours ago
minimax-m2.5:cloud c0d5751c800f - 3 days ago
glm-4.7-flash:q4_K_M d1a8a26252f1 19 GB 3 days ago
lfm2:24b d6c816d74887 14 GB 4 days ago
qwen2.5:14b-instruct 7cdf5a0187d5 9.0 GB 2 weeks ago
qwen2.5:7b-instruct 845dbda0ea48 4.7 GB 2 weeks ago
gpt-oss:20b 17052f91a42e 13 GB 2 weeks ago
mistral-small3.2:24b 5a408ab55df5 15 GB 2 weeks ago
kimi-k2.5:cloud 6d1c3246c608 - 2 weeks ago
qwen3:8b 500a1f067a9f 5.2 GB 2 weeks ago
llama3.2:3b a80c4f17acd5 2.0 GB 2 weeks ago

primal saffron Mar 3, 2026, 10:42 PM

#

Thanks for sharing. One naïve application I could see being popular is offloading basic tasks to smaller models.
Like given an email, determine if this is spam, valuable marketing, a bill, etc..

Then if you want to see if a model can handle it you benchmark it against your own internal use cases.
Was wondering if anyone else had their own ways to deterministically benchmark candidate models.

wispy kraken Mar 3, 2026, 10:49 PM

#

@primal saffron

the biggest problem is not the intelenge of small models its their toolcalling , you can tweak it around and get it to work but it takes allot of effort to get them to be consistant (at least that has been my expirience so far )
question how do you give it an email to classify ?

a few changes i learned so far

models.providers.ollama.baseUrl changed from:
- http://127.0.0.1:11434/v1
  to:
- http://127.0.0.1:11434
  We added per-model params under agents.defaults.models:
ollama/qwen2.5:14b-instruct

With:

streaming: false
low temperature: 0.2
conservative maxTokens: 1024
Streaming can break or complicate toolcalling payload handling in some client stacks.
Lower temperature reduces “creative” formatting that can break JSON/tool parsing.
We set:
contextWindow: 32768
maxTokens: 4096
reasoning: false

wispy kraken Mar 3, 2026, 10:59 PM

#

primal saffron Thanks for sharing. One naïve application I could see being popular is offloadin...

yes i have a standart test ,
if you go up in models a nice test is to get vacancy texts from linked it has to figuer out redirects and cookies and return a structured normalized text back

[
{
"name": "strict_router",
"type": "strict_json",
"prompt": "Return a valid JSON object with fields: action, target, confidence."
},
{
"name": "streaming_stress",
"type": "stream",
"prompt": "Write a detailed 1500+ word technical report about distributed systems resilience."
},
{
"name": "deep_reasoning",
"type": "reasoning",
"prompt": "Solve a multi-step logic problem involving planning, trade-offs, and conditional branching. Explain reasoning step by step."
}
]
{
"models": [
"qwen2.5:14b-instruct"
],
"contexts": [
4096,
8192,
12288,
16384
],
"num_predict": [
512,
1024,
2048
],
"streaming": true,
"assisted_fallback": true
}

primal saffron Mar 3, 2026, 11:02 PM

#

wispy kraken <@269354714993393664> the biggest problem is not the intelenge of small models...

question how do you give it an email to classify ?
I sample real world examples from my inbox. Or are you asking about the testing infrastructure ?

#

Here is a slop summary.

Data Storage
The raw text of the real-world emails is stored in a local file called test_cases.json.
The Prompt Template
A Python script (run_eval.py) pulls an email from that JSON file and injects it into a highly structured prompt. Instead of using a tool definition, they use strict system instructions. For classification, the prompt looks something like this:

Classify this email into exactly one category:
action: requires a response, decision, or action...
notification: informational, no action needed...
noise: marketing, newsletters...

[Key rules and few-shot examples go here]

Email Text: [INJECT EMAIL HERE]

Respond with ONLY the category name. Nothing else.

The API Call
The script sends that massive text block to the Ollama Chat API (/api/chat) running on their local machine.
Zero Temperature
They set the model's temperature to 0. This makes the model's output as deterministic and robotic as possible, heavily restricting its creativity so it literally only spits out the exact word "action", "notification", or "noise".

--

By relying on strict few-shot prompting and zero temperature rather than native tool-calling, they managed to get a 4B parameter model to hit 100% accuracy on classification.

full talon Mar 4, 2026, 3:10 AM

#

ClawEval just released testes for all those small Qwen 3.5 modes for 59 OpenClaw Agent roles. The added also 8GB, 12GB 16GB VRAM models on top of those 24GB and bigger https://github.com/explaindio/ClawEval

lyric orchid Mar 4, 2026, 4:49 AM

#

primal saffron Where kind I find information about running models locally with Ollama? I am cre...

https://github.com/explaindio/ClawEval/tree/master check out claweval for model evaluation. vram specific results. what are you running ollama on (how much vram)? Some conversations about qwen over here #1478204986973229160 message

primal saffron Mar 4, 2026, 7:17 AM

#

@lyric orchid @full talon woah..

worn flint Mar 4, 2026, 10:44 AM

#

hey... i know there loads of stuff out there, but struggling to find some good answers. if you had say £10-13k to spend on an inference box(es) what would you build? i was thinking dsx as they're so tiny, but stats look a bit... crap

tired plover Mar 4, 2026, 10:48 AM

#

For everybody waiting for M5 Mac Mini

───

Apple M5 Chip Specifications

Memory Bandwidth:

• M5: 153 GB/s
• M5 Pro: 307 GB/s (same for all Pro variants)
• M5 Max: up to 460 GB/s
• Highest Bandwidth: 614 GB/s

Available Memory Interfaces: 128-bit, 256-bit, 384-bit

───

M5 High-End Models

M5 Pro

• CPU: 6 Performance (P) + 12 Medium (M) cores
• GPU: 20 cores
• Clock Speeds: 4.61 GHz (P-core) / 4.38 GHz* (M-core) / 1.62 GHz (Efficiency/E-core)
• Cache:
• pLLC: 16MB*
• mLLC: 16MB*
• Memory Cache: 24MB
• Memory: LPDDR5X-9600, up to 64GB

M5 Max

• CPU: 6 Performance (P) + 12 Medium (M)* cores
• GPU: 40 cores
• Clock Speeds: 4.61 GHz (P-core) / 4.38 GHz* (M-core) / 1.62 GHz (Efficiency/E-core)
• Cache:
• pLLC: 16MB*
• mLLC: 16MB*
• Memory Cache: 48MB
• Memory: LPDDR5X-9600, up to 128GB

───

Technical Notes

*1. M-Cores:

• M = Medium-Core, derived from P-Core but between P and E-Core in performance
• 7-wide decode
• M-core delivers approximately 70% of P-core performance

Neural Engine:

• 16-core ANE

Package Design:

• SoIC-MH (System in Chip - Multi-Hybrid)
• Divided into CPU Tile and GPU Tile

Performance Improvements:

• M5 Max multi-core performance: ideally +20% vs M4 Max
• Single-core: +10%
• Multi-core: +20%
• GPU: +25%

Benchmarks (Estimated):

• SNL (Single-core Low): +30%
• SN (Single-core Normal): +22%
• SBE (Single-core High-End): +45%

*2. Power Consumption:

• M5 Max vs M4 Max, M5 Pro vs M4 Pro
• Single-core and GPU power consumption figures refer to base version
• Multi-core power consumption will increase; exact increase depends on thermal dissipation

*3. Expected Benchmarks (Cinebench R24):

• Multi-thread (MT): ~2500
• Single-thread (ST): ~215

*1 (SN - Single-core): Expected ~4100 (Geekbench 6 Single-core)

deft idol Mar 4, 2026, 4:21 PM

#

has anyone used Qwen3.5-27B?

valid rune Mar 4, 2026, 4:23 PM

#

@craggy ferry new test i run now MLX Qwen3.5-35B-A3B-Text-qx64-hi-mlx on mlx_lm.server => 70tok/s on Mac Mini 64Go Ram M4 Pro . I have juste a little issue, the context size . I dont kwo how I gonna fix him . The session is too small to give it long tasks, so I have to divide the tasks into stages. I set the context to 32768 to see if it works, otherwise it compacts too quickly. Another problem I encountered was the bottleneck: the event system doesn't wake it up, so I have to switch to Discord or Telegram to give it tasks.

craggy ferry Mar 4, 2026, 4:46 PM

#

Yeah the problem is context window. You want like 200k

tired plover Mar 4, 2026, 5:12 PM

#

deft idol has anyone used Qwen3.5-27B?

on spark its too slow because of Bandwith, only makes sense when you have very fast Bandwith on GPU or so

tired plover Mar 4, 2026, 5:13 PM

#

craggy ferry Yeah the problem is context window. You want like 200k

sometimes they get really stupid at that size...

craggy ferry Mar 4, 2026, 5:28 PM

#

Not really?

#

The good ones don’t that’s why we all use sonnet and opus

tired plover Mar 4, 2026, 5:38 PM

#

craggy ferry The good ones don’t that’s why we all use sonnet and opus

what ????? dude all the models die with too high ctx...

craggy ferry Mar 4, 2026, 5:38 PM

#

They do not.

tired plover Mar 4, 2026, 5:38 PM

#

HAHAHA they do

#

and my browser dies as well

craggy ferry Mar 4, 2026, 5:38 PM

#

I was just testing qwen-122b with a 180k context last night.

#

Works fantastic.

#

Most of them do, are trained on 200k context windows.

tired plover Mar 4, 2026, 5:39 PM

#

what HW you use ?

craggy ferry Mar 4, 2026, 5:39 PM

#

Actually 1m but the 200k training is more useful

#

All of it. Hardware doesn’t matter

#

Either it has the kv cache allocated or it doesn’t

tired plover Mar 4, 2026, 5:44 PM

#

so, if you go over ctx and get compaction it still stays the same ?

craggy quail Mar 4, 2026, 5:44 PM

#

craggy ferry I was just testing qwen-122b with a 180k context last night.

I can't load this model in my Strix Halo with 128GB, always have OOM

craggy ferry Mar 4, 2026, 5:45 PM

#

tired plover so, if you go over ctx and get compaction it still stays the same ?

…I said they worked fine at 180k-200k context. I didn’t say anything about compaction.

craggy ferry Mar 4, 2026, 5:45 PM

#

craggy quail I can't load this model in my Strix Halo with 128GB, always have OOM

Q4, and I spill a lot into system ram from my 48gb card

craggy quail Mar 4, 2026, 5:47 PM

#

yes, i try with Q4_K_XL. what parameters use with llama?

craggy ferry Mar 4, 2026, 5:57 PM

#

That’s what I use, dunno what your deal is. You’re using quantized kv too right?

hollow harbor Mar 4, 2026, 6:52 PM

#

I have an NVIDIA AGX Orin with 64GB of RAM that I wanted to setup as an OpenClaw node just for running some basic inference, what local model do you all recommend for that hardware?

quartz monolith Mar 4, 2026, 7:12 PM

#

Fun idea: grab a second-hand Kinect for like €15 and a USB adapter cable for about €10 — so for around €25 you’ve got a super fun upgrade for your OpenClaw

With the Kinect you get:
• 👀 Depth camera → basically giving your OpenClaw eyes
• 🎤 Built-in mic array → great for audio / voice experiments
• 🔊 Audio output options
• 📡 Motion tracking → even make OpenClaw “shake yes” or react to gestures

It’s such a cheap and fun way to experiment with vision + interaction. Add Arduino IDE and you’re basically unlocking a playground for cool robot ideas

Concept here:
https://www.hackster.io/psmooij/openclaw-for-robot-programming-pmsg-on-budget-d76a91

still rampart Mar 4, 2026, 8:12 PM

#

tired plover on spark its too slow because of Bandwith, only makes sense when you have very f...

I'm using a gx10 (Asus spark) and running the qwen 3.5 35b with 1M context window through vllm and it runs like a champ.

jagged tusk Mar 4, 2026, 8:21 PM

#

still rampart I'm using a gx10 (Asus spark) and running the qwen 3.5 35b with 1M context windo...

qwen doesnt forget and actually does work on its own with that? 128gb ram right?

still rampart Mar 4, 2026, 8:29 PM

#

jagged tusk qwen doesnt forget and actually does work on its own with that? 128gb ram right?

Yea, it's built a mission control for me for work and a pretty cool newsletter for my field. a hiccup I haven't solved is that reasoning from openclaw to vllm is apparently specified differently in the heading of the packet, so all it's reasoning comes through telegram too, but I just haven't had time to see if there a fix yet or not. But I'm very happy with it's performance.

#

It's stable at 80% disk usage when I installed through vllm. It will run out of memory on 90-95% with the 1M context window

tired plover Mar 4, 2026, 9:01 PM

#

still rampart I'm using a gx10 (Asus spark) and running the qwen 3.5 35b with 1M context windo...

35B is not working for me, doesnt get the complex topics im working on

still rampart Mar 4, 2026, 9:04 PM

#

tired plover 35B is not working for me, doesnt get the complex topics im working on

Maybe I'm just working with dumb topics then😂

tired plover Mar 4, 2026, 9:05 PM

#

still rampart Maybe I'm just working with dumb topics then😂

sorry, didnt mean it like that but i try to get very complex chains together and 35B just got always something wrong, maybe was me who didnt setup properly but 122B even though its slower makes a banger job, BTW for everybody who is using Spark, look at this:
https://www.reddit.com/r/LocalLLaMA/comments/1rkefjw/solved_the_dgx_spark_102_stable_toks_qwen3535ba3b/

astral gobletBOT Mar 4, 2026, 9:05 PM

#

tired plover sorry, didnt mean it like that but i try to get very complex chains together and...

r/LocalLLaMA via Reddit

rxddit.com

Solved the DGX Spark, 102 stable tok/s Qwen3.5-35B-A3B on a single GB10 (125+ MTP!)

u/Live-Possession-6726 on r/LocalLLaMA

The DGX Spark has had a bit of a rough reputation in this community. The hardware is incredible on paper (a petaflop of FP4 compute sitting on a desk) but the software situation has been difficult. The moment you try to update vLLM for new model support you hit dependency conflicts that have no clean resolution. PyTorch wheels that don't exist f...

▶ Play video

still rampart Mar 4, 2026, 9:07 PM

#

tired plover sorry, didnt mean it like that but i try to get very complex chains together and...

No you're good dude my area is law, so more data scrapping and analysis than complex math. Have you tried it through vllm though? People report up to 60% more efficient than same model through ollama or llm studio

tired plover Mar 4, 2026, 9:08 PM

#

still rampart No you're good dude my area is law, so more data scrapping and analysis than com...

i went back and forth and tested the hell out of my spark... so yeah, for now ended up on vLLM with Int4 Autoround 122B but soon i hope to switch to Atlas and then get full potential with everything automated

still rampart Mar 4, 2026, 9:08 PM

#

I will look into atlas, thanks

tired plover Mar 4, 2026, 9:16 PM

#

still rampart I will look into atlas, thanks

lmk what you think!

still rampart Mar 4, 2026, 9:31 PM

#

tired plover lmk what you think!

OK atlas and Ai searched together has a lot of different results. Would you mind sending me a link or another search term to find the atlas you're speaking of?

tired plover Mar 4, 2026, 9:34 PM

#

It’s not yet released, you can only find the Reddit post or NVIDIA forum listing about the tech and explanation

worn flint Mar 4, 2026, 9:37 PM

#

valid rune <@132500715133206528> new test i run now MLX Qwen3.5-35B-A3B-Text-qx64-hi-mlx o...

How long does it take to load up? Tried to load the model and took forever and gave up

fast summit Mar 5, 2026, 1:23 AM

#

I've been working on porting BitChat to OpenClaw. Is anyone else interested in this?

I have basic uses working (I had to port the BitChat client itself to Node) though PMs are having issues.

Nonethless I see a lot of potential in connecting Claws to mesh networks

craggy ferry Mar 5, 2026, 1:52 AM

#

im mad with power im loading up qwen3.5-397b-a17b-Q8

still rampart Mar 5, 2026, 2:02 AM

#

craggy ferry im mad with power im loading up qwen3.5-397b-a17b-Q8

Goodness on what?

craggy ferry Mar 5, 2026, 2:02 AM

#

512gb M3

#

just showed up today, the crown jewel of office heaters

lament jasper Mar 5, 2026, 5:11 AM

#

im thinking of buying a used optiplex to give my agent his own hardware. im not a tech guy - should I or not? budget is 400$

#

need it on 247

#

reason is bc some have cuda cores for cheap so qmd works

steep wedge Mar 5, 2026, 5:49 AM

#

lament jasper im thinking of buying a used optiplex to give my agent his own hardware. im not ...

It’s fine to do that. OC itself doesn’t need a lot, so you can run it on very modest hardware.

west rampart Mar 5, 2026, 6:21 AM

#

I have an NVIDIA AGX Orin with 64GB of

gleaming cypress Mar 5, 2026, 6:26 AM

#

"I want the AI agent on the VPS to be able to control Chrome and browse the web for me automatically."

west rampart Mar 5, 2026, 6:26 AM

#

Since everyone here is building a somehow local and private agent(s), has anyone used any model performace evaluation tool to measure how intelligent your agent is? I have seen Artifical Analysis providing comprehensive evaluation on models. Is there any tool that can be used to conduct similar evaluation on our private agents?

upper bay Mar 5, 2026, 8:57 AM

#

gleaming cypress "I want the AI agent on the VPS to be able to control Chrome and browse the web ...

You need to install openclaw web relay, configure it and it should do the job

vague bolt Mar 5, 2026, 9:25 AM

#

I'm using a jetson tx2 to setup the openclaw. it works well

deft idol Mar 5, 2026, 7:22 PM

#

Hi there does anyone have any longer standing experience with mac mini docks? Looking primarily for storage expansion options and more ports.

gusty nacelle Mar 5, 2026, 7:47 PM

#

astral goblet [r/LocalLLaMA via Reddit](https://www.rxddit.com/r/LocalLLaMA/comments/1rkefjw/s...

that's pretty good! allegedly that qwen is within 10% of Opus4.6 on SWBench Verified Hard

oak frost Mar 5, 2026, 8:26 PM

#

quartz zinc https://x.com/brianroemmele/status/2028854766999392647?s=46

I*m searching on X about this awesome project.
But my english skills arent good enough to understand (and i am 57 and a little bit slow)
Would this project public available in the near future?

tired plover Mar 5, 2026, 9:33 PM

#

gusty nacelle that's pretty good! allegedly that qwen is within 10% of Opus4.6 on SWBench Veri...

yeah but they still didnt release...

fresh talon Mar 5, 2026, 9:36 PM

#

Free models suggestions ?

craggy ferry Mar 5, 2026, 11:22 PM

#

omg apple removed the 512gb studio from the store

#

like it's not even listed as an option anymore

spiral vector Mar 5, 2026, 11:23 PM

#

weren't we expecting and M5 mac studio any day now? (and M5 mac mini)

craggy ferry Mar 5, 2026, 11:24 PM

#

Any Day Now

#

yeah probably

#

but i'm having fun working on making mine produce actual frontier quality tokens all day every day

deft idol Mar 6, 2026, 2:33 AM

#

yeah someone said it's a hardware shortage

bronze ermine Mar 6, 2026, 3:40 AM

#

spiral vector weren't we expecting and M5 mac studio any day now? (and M5 mac mini)

Macrumor website still speculates fall 2026

#

https://buyersguide.macrumors.com/#mac

cobalt wind Mar 6, 2026, 4:02 AM

#

is anyone running openclaw on an old phone or something cool? I have a bunch of old stuff lying around trying to find something cool to do with it lol

cedar oar Mar 6, 2026, 4:51 AM

#

There are two more Apple announcements this year so maybe one of those will be the M5 mac mini/studio.

random void Mar 6, 2026, 4:58 AM

#

Mostly likely now is WWDC in June, I'd be surprised if they wait that long IF the hardware is ready to go before then, especially if the 512 chip being removed is due to them just being out of stock on it due to oversales (guessing fab on them stopped long ago, and they have just been running on forcasted sales inventory). If we see a couple other Studio configs drop off soon, then I imagine they would be pressured to move up the release.

I was surprised the Mac Mini M5 were not released with this weeks stuff, especially with the Studio displays being released, and no new desktop hardware

surreal girder Mar 6, 2026, 6:45 AM

#

cobalt wind is anyone running openclaw on an old phone or something cool? I have a bunch of ...

I am working on it, I downloaded it and configured it, but once I opened the gateway, it needed almost 1 minutes to start.

cobalt wind Mar 6, 2026, 6:46 AM

#

surreal girder I am working on it, I downloaded it and configured it, but once I opened the gat...

on teremux ? what is your plan with it once you get it running?

surreal girder Mar 6, 2026, 6:46 AM

#

When I tried to open the web dashboard, it closed instantly.

surreal girder Mar 6, 2026, 6:47 AM

#

cobalt wind on teremux ? what is your plan with it once you get it running?

openclawd-termux

#

it's an open source app

#

and I have termux on my android phone

#

I don't know what's wrong with it.

steep wedge Mar 6, 2026, 5:37 PM

#

I’m not surprised, but still disappointed, that the Asus Ascent GX10 now starts at $3,499, up $500.

tired plover Mar 6, 2026, 7:00 PM

#

in germany from Amazon at ASUS Store its 3.8K

#

was a good timing when i got mine 😄

#

it was really just a matter of time when they would do it... will be very interesting what Apple will do with new MacMini

river gate Mar 6, 2026, 7:13 PM

#

crystal cliff The issue with LLMs on an asic is that you're locked into that model and there i...

I agree, But someone might find a use for it as specific AI ASICS become older. It's a niche product but still interesting.

river gate Mar 6, 2026, 7:14 PM

#

primal saffron Where kind I find information about running models locally with Ollama? I am cre...

This is the models page with search. https://ollama.com/search

river gate Mar 6, 2026, 7:15 PM

#

wispy kraken it all depends on how much vram you have , i go for biggest model my Vram fits ...

Wait, Qwen3.5:9b is only 6.6GB of filespace?

scenic aurora Mar 6, 2026, 8:04 PM

#

craggy ferry just showed up today, the crown jewel of office heaters

Curious to see how much tps you get.

scenic aurora Mar 6, 2026, 8:06 PM

#

worn flint How long does it take to load up? Tried to load the model and took forever and g...

Gb10 spark was taking 5-8 minutes to load on vLLM… did a lot of digging on the threads too and there’s a lot of special configs needed.

scenic aurora Mar 6, 2026, 8:08 PM

#

tired plover i went back and forth and tested the hell out of my spark... so yeah, for now en...

Curious to hear what your experience was, and if your yaml config? The qwen 3.5 q4s seem to be doing pretty good on ollama but I don’t know of a good community accepted way to measure tps across various contexts

worn flint Mar 6, 2026, 8:08 PM

#

Same, would love to hear the stats!

worn flint Mar 6, 2026, 8:08 PM

#

scenic aurora Gb10 spark was taking 5-8 minutes to load on vLLM… did a lot of digging on the t...

Ha. Ok

scenic aurora Mar 6, 2026, 8:10 PM

#

It’s not that bad mostly yaml… just one of those I don’t feel like rebuilding all the things and testing from scratch and they will probably work it out in a day or two

wispy kraken Mar 6, 2026, 8:10 PM

#

river gate Wait, Qwen3.5:9b is only 6.6GB of filespace?

Yeah it’s not that much but that’s just file space and memory model needs to be loaded then when you send it context it needs kv and cache and it all goes up quickly

tired plover Mar 6, 2026, 8:12 PM

#

scenic aurora Curious to hear what your experience was, and if your yaml config? The qwen 3.5 ...

i used claude as support for setting up, i saw in the start llama.cpp was the best with 122B 29tk/s but now with int4 i also get it on vLLM, depends on the parallel tasks you do if one or the other fits better

scenic aurora Mar 6, 2026, 8:15 PM

#

Hmm ok… I had it in docker with vLLM but might not have had config flags right with right build. Same, I’m having Claude and codex read the logs to iterate faster, it’s all vaguely familiar just tedious to read by hand

#

Putting it through its paces seems to be working good and only have more potential unlocked soon with the hardware specific quants and optimizations.

craggy ferry Mar 6, 2026, 9:19 PM

#

scenic aurora Curious to see how much tps you get.

I get 28tps making 20k tokens with vllm-mlx but it’s not stable with openclaw yet

#

I’ve got a Claude looking at why not

#

Seems to work fine for non streaming requests but for streaming it just emits gibberish

scenic aurora Mar 6, 2026, 9:30 PM

#

craggy ferry I get 28tps making 20k tokens with vllm-mlx but it’s not stable with openclaw ye...

interesting. Yeah, I had to manually configure the tokens in ~/.openclaw/openclaw.json, was trying to see what community recommendations were. the models are often making guesses there it seems. openclaw models benchmark fr?

craggy ferry Mar 6, 2026, 9:31 PM

#

Manually configure the tokens?

I just used claweval

#

Qwen3.5-397b is great for a Mac Studio

scenic aurora Mar 6, 2026, 9:48 PM

#

claweval. interesting thanks

full talon Mar 6, 2026, 10:00 PM

#

ClawEval just released a guide — 14 AI agents running 100% locally on a single RTX 3090 https://github.com/explaindio/ClawEval/blob/master/docs/OpenClaw_Backend_Local_on_3090.pdf

craggy ferry Mar 6, 2026, 10:22 PM

#

by the way claweval missed on the name

#

it 100% should have been clawmark

lyric orchid Mar 7, 2026, 12:09 AM

#

river gate Wait, Qwen3.5:9b is only 6.6GB of filespace?

Yep, and 14b is 11 something, and "fits" on my 12 gb 4070. 3090 came today, can try some bigger models now! (That thing is a huge!)

wispy kraken Mar 7, 2026, 12:42 AM

#

Has any one considered or tried running gateway on The GL.iNet GL-MT3000 (Beryl AX) is a high-performance Wi-Fi 6 travel router designed for security and speed on the go.
OpenWRT
OpenWRT
+1
Core Hardware Specifications
Processor (SoC): MediaTek MT7981B (Filogic 820) Dual-core @ 1.3 GHz.
Memory (RAM): 512MB DDR4.

tacit aurora Mar 7, 2026, 4:14 AM

#

Any suggestion which laptop is best to run openclaw and open weight model performance wise

wispy kraken Mar 7, 2026, 7:30 AM

#

tacit aurora Any suggestion which laptop is best to run openclaw and open weight model perfor...

big gpu vram or mac with unified memory
Other Platforms with unified memory : Similar concepts exist in other, high-end, or integrated systems, such as Nvidia's DGX systems and some AMD/Intel APUs.

tacit aurora Mar 7, 2026, 11:18 AM

#

wispy kraken big gpu vram or mac with unified memory Other Platforms with unified memory : S...

any laptop name suggest directly i m thinking of

https://amzn.in/d/0dZNeDZg

wispy kraken Mar 7, 2026, 11:21 AM

#

tacit aurora any laptop name suggest directly i m thinking of https://amzn.in/d/0dZNeDZg

its almost useless you only get 6 GB of dedicated GDDR6 VRAM and that will not run much

tacit aurora Mar 7, 2026, 11:21 AM

#

Can you suggest can ??

#

If i want to buy mac then which and if want to buy windows then which one

#

@wispy kraken ??

wispy kraken Mar 7, 2026, 11:31 AM

#

mac can run windows , do you really want a laptop ?
you can buy a mac mini and windows laptop for less then a macbook pro
thats mine
Model Name: MacBook Pro
Model Identifier: Mac16,8
Model Number: MX2J3N/A
Chip: Apple M4 Pro
Total Number of Cores: 14 (10 Performance and 4 Efficiency)
Memory: 24 GB
and i can say i wish i had more memory
mac mini 64 gb € 2.469,00
macbook pro 64gb € 3.499,00

tacit aurora Mar 7, 2026, 11:31 AM

#

wispy kraken mac can run windows , do you really want a laptop ? you can buy a mac mini and ...

Thank u

wispy kraken Mar 7, 2026, 11:35 AM

#

tacit aurora Thank u

but realistically, just a tip you know how much quota you can get for 2,5k ?
and that on a premium model (keep in mind even a model running on 64gb is no where near claude or chatgpt models
right now if you get chatgpt plus for me its 30 euro a month you get API quota and you get double the Codex quota you can use , try that out first before you commit to new hardware

#

@tacit aurora

tacit aurora Mar 7, 2026, 11:41 AM

#

wispy kraken but realistically, just a tip you know how much quota you can get for 2,5k ? an...

I ll depend on cloud ai only intead of running local model

scenic aurora Mar 7, 2026, 1:24 PM

#

gb10 -> tried some fastsafetensors with avarok build of vllm in docker, getting good speed, but getting clear corruption of llm function and high repetition.... having better luck at the moment with ollama qwen3.5 35B a3 q4 k m than the other one I tried. gonna stick with that til there's a better workaround I can just compose.yaml or similar on the spark. very interesting though.

#

maybe there's some temperature or desired output length stuff i'm setting wrong too

prisma quartz Mar 7, 2026, 1:32 PM

#

@scenic aurora I'm looking to pick up a Spark. Any recommendations?

scenic aurora Mar 7, 2026, 1:38 PM

#

prisma quartz <@188307126320365568> I'm looking to pick up a Spark. Any recommendations?

I'm still trying to understand what it can unleash. I can't recommend any specific workflow (or hardware ver if thats what you meant) - a powerful beast, just lots of confiuration to try to get the higher speed unlocked with good int

deep roost Mar 7, 2026, 1:40 PM

#

scenic aurora I'm still trying to understand what it can unleash. I can't recommend any specif...

what is that

scenic aurora Mar 7, 2026, 1:40 PM

#

the qwen3.5 models are promising because of the 256k context lengths, i'm still in the learning curve between getting openclaw working on smaller models effectively vs. the various cloud models.

scenic aurora Mar 7, 2026, 1:41 PM

#

deep roost what is that

LLM model, what i'm trying to say is that i'm uncertain if the hardware is the problem because I am trying various quantizations of the underlying LLM model i'm using to try to get performance

#

it's been a lot of tuning the GB10 spark to try to get anything to run big+good+fast, been tweaking ollama (in a docker container) and vllm (in a docker container) as my attempts so far

#

the 9B models seem to do a lot better so far for me but I may just be configuring the hardware wrong

prisma quartz Mar 7, 2026, 1:50 PM

#

scenic aurora the 9B models seem to do a lot better so far for me but I may just be configurin...

I'm looking to do the same. I'm a Merchant Mariner (yes I work on a tug boat). My hitches usually 50-60 days. Next wed my hitch ends. 50 days off. Currently I have openclaw running on my steam deck. Had it look into the spark. Nvidia tutorials seem in depth. But who knows.
Plan is to keep it at home while at sea running Ollama. Tug has star link.

spiral vector Mar 7, 2026, 1:51 PM

#

https://www.youtube.com/watch?v=QbtScohcdwI - best comparison I've seen about the various Sparks

scenic aurora Mar 7, 2026, 1:54 PM

#

The NVIDIA DGX Spark is the reference design and performed well, but the Dell is noted as being similar in performance while potentially having a better price (3:35, 12:57).

it's definitely very capable, I expect it'll only get better as people patch hardware.

scenic aurora Mar 7, 2026, 1:54 PM

#

prisma quartz I'm looking to do the same. I'm a Merchant Mariner (yes I work on a tug boat). M...

how's that going? I should add mine to my cluser heh. more gpu

#

just don't create a username on the spark like fax that's already a group, the boot script crashes but marks the install as successful anyway and it boot loops. i think the third party sparks made better out of box experience software though

deep roost Mar 7, 2026, 2:06 PM

#

scenic aurora LLM model, what i'm trying to say is that i'm uncertain if the hardware is the p...

oh

#

top notch stuff

#

i run dedicated servers and provide hosting for website and also offer space for those who store there AI

prisma quartz Mar 7, 2026, 2:07 PM

#

I'm using Claude. Seems snappy. Kinda just tinkering.
Figured a good test bed if shit went south. Nothing critical on it except games.
I gave it access to SD Flux on my laptop. It kinda went nuts, in a good way rendering. Also been working with it building daemons. Got it to dm me if it has a question or something pressing. Not quite like a cron or heart beat. It initiates the dm at random times depending on what it has been working on.

scenic aurora Mar 7, 2026, 2:37 PM

#

prisma quartz I'm using Claude. Seems snappy. Kinda just tinkering. Figured a good test bed if...

yes - claude works great, just can go overbudget a bit quick. very impressed with performance on most major big models

quartz pawn Mar 7, 2026, 3:30 PM

#

So far Qwen3.5 27B > Qwen3.5 35B a3b

#

For local llm

unkempt pivot Mar 7, 2026, 5:12 PM

#

Hostinger is fine

#

Both are fine!

solemn valeBOT Mar 7, 2026, 5:15 PM

#

unkempt pivot Mar 7, 2026, 5:15 PM

#

You have infos on the docs

molten geyser Mar 7, 2026, 5:21 PM

#

unkempt pivot You have infos on the docs

great - silly quesiton though. is there a clever way to load all the documentation into the context for my main LLM so i can work with it on the install?

unkempt pivot Mar 7, 2026, 5:22 PM

#

molten geyser great - silly quesiton though. is there a clever way to load all the documentat...

Either via browser or you can have a docs folder from GitHub on your device ask your agent to read it

tired plover Mar 7, 2026, 5:46 PM

#

for everyody with a Spark, Atlas is now available in Alpha but doesnt work yet with openclaw as Tool Calls dont work
https://www.reddit.com/r/LocalLLaMA/comments/1rmvxo3/the_gb10_solution_has_arrived_atlas_image/?sort=new

astral gobletBOT Mar 7, 2026, 5:46 PM

#

tired plover for everyody with a Spark, Atlas is now available in Alpha but doesnt work yet w...

r/LocalLLaMA via Reddit

molten geyser Mar 7, 2026, 5:56 PM

#

unkempt pivot Either via browser or you can have a docs folder from GitHub on your device ask ...

is there any existing MD file available that consolidates all the docs into one file?

i don't have my openclaw agent set up yet, so just want to access these docs with my current LLM to help me in setting up openclaw

unkempt pivot Mar 7, 2026, 5:56 PM

#

molten geyser is there any existing MD file available that consolidates all the docs into one ...

When I talked about agent I talked about Claude or Codex

#

You have a docs folder in the OpenClaw GitHub

full talon Mar 7, 2026, 7:03 PM

#

quartz pawn So far Qwen3.5 27B > Qwen3.5 35B a3b

same but 27B is more than 2 times slower than 35B A3B

quartz pawn Mar 7, 2026, 7:04 PM

#

full talon same but 27B is more than 2 times slower than 35B A3B

35b A3B fits inside my 5090 but 27B needs the vram from my 3090 which has less memory bandwidth

craggy ferry Mar 7, 2026, 7:17 PM

#

That doesn’t sound right

#

I could be wrong but 27 < 35

sharp hedge Mar 7, 2026, 10:48 PM

#

@craggy ferry hey how you been

#

long time no see, or maybe cause i've been gone lol

sharp hedge Mar 7, 2026, 10:49 PM

#

full talon same but 27B is more than 2 times slower than 35B A3B

Is it Quantized?

pulsar oracle Mar 7, 2026, 11:33 PM

#

27b is a dense model. 35b only loads like 3b parameters at a time (MOE mixture of experts)

#

they can perform a bit different depending on what youre doing, may want to test them both

sharp hedge Mar 7, 2026, 11:45 PM

#

ohhh

#

so 35b is like lazy loading?

jovial pecan Mar 8, 2026, 11:58 AM

#

hey bros, can I ask you guys smth? I have a old laptop: Hp i5 3rd gen, 8gb ram ddr3, 1t ssd, ubuntu. Will a multiagent framework work smooth on it? Or I should go for a vps?

rocky violet Mar 8, 2026, 12:08 PM

#

jovial pecan hey bros, can I ask you guys smth? I have a old laptop: Hp i5 3rd gen, 8gb ram ...

i dont think you should try anything like that on that laptop

#

vps will be better for you

jovial pecan Mar 8, 2026, 12:09 PM

#

thank you for the reply

#

i kinda thought so

hoary sable Mar 8, 2026, 2:58 PM

#

jovial pecan hey bros, can I ask you guys smth? I have a old laptop: Hp i5 3rd gen, 8gb ram ...

Try it what's the harm.. its not resource intensive. All the intensive work is done at the llm api provider.

strange void Mar 8, 2026, 7:12 PM

#

Any raspberrypi users here

#

I pushed a number of updates recently and updated docs to make package more stable, any feedback/issues/improvements?

full talon Mar 8, 2026, 7:19 PM

#

sharp hedge Is it Quantized?

yes both

warped dagger Mar 8, 2026, 8:24 PM

#

What if OpenClaw had its own Alexa-style speaker?

We’re building a plug-and-play voice speaker for your OpenClaw assistant.Supercharge your agentic workflows with voice.

👉 Join the waitlist: https://talkclaw.io

uneven wadi Mar 8, 2026, 11:05 PM

#

I spend a whole weekend writing my system upgrade.sh script .. https://github.com/junaga/debian can I do something with this? can this help someone somehow?

grave bobcat Mar 9, 2026, 4:22 AM

#

strange void I pushed a number of updates recently and updated docs to make package more stab...

I’m running open claw on a pi 4 4gb using Google flash latest on free tier. It’s been ok, I think I hit limits with Google often, resetting sessions when I near 1M tokens

strange void Mar 9, 2026, 4:28 AM

#

openrouter /free is another option if you want free

grave bobcat Mar 9, 2026, 4:29 AM

#

strange void openrouter `/free` is another option if you want free

Yep, I have that set up for my sub agents, but getting denied constantly. I had it so open open claw would retry often but more often than not they never worked. So I had to switch to smaller cheaper models with a paid balance and that has been a bit more successful with open router.

strange void Mar 9, 2026, 4:29 AM

#

tool calling is usually disabled on those models

grave bobcat Mar 9, 2026, 4:30 AM

#

strange void tool calling is usually disabled on those models

Ah ok, I think I know what that means but I’m not much of a coder. More of a tinkerer who is really interested in web projects , but doesn’t know 1 ounce of code.

tired plover Mar 9, 2026, 8:09 AM

#

strange void Any raspberrypi users here

Running well, since I got my spark I don’t use online API much anymore but it’s running stable and good (pi 4 4GB)

#

Can only recommend for starters who doesn’t want to use VPS, had mine laying around so it was a no Brainer haha

keen spindle Mar 9, 2026, 1:20 PM

#

strange void openrouter `/free` is another option if you want free

have you found /free stable lately...? I added a bunch in an App while back from OR and I ended up just deleting all free endpoints.... sometimes they were ok and sometimes not so much... and then your also freely giving permission to use you prompts to train with those... so I'd be careful with what data you send to any models but especially the Free ones!

grave bobcat Mar 9, 2026, 2:48 PM

#

keen spindle have you found /free stable lately...? I added a bunch in an App while back from...

Yeah, I have dropped all free models from open matter as I could never really get through. Switched up to Qwen coder next and too early to tell if it’s working.

keen spindle Mar 9, 2026, 2:58 PM

#

grave bobcat Yeah, I have dropped all free models from open matter as I could never really ge...

Have you tried the :nitro :online :exacto or the /auto for Open Router? There are many Params and ways to adjust OR. I love what they are doing there. And they have BYOK which I am all about THAT!

weary reef Mar 10, 2026, 12:58 PM

#

Hello all, I have 2 dgx sparks in ray cluster with vllm. I am havnig hell of a time trying to find a model that will work with eveyrthing. Is any one running a 2 spark llm setup?? if so what model and settings are you using . Thank you all 🙂

lone stream Mar 10, 2026, 2:54 PM

#

any hardware geeks here able to help me with an esp32?

rugged cloak Mar 10, 2026, 4:15 PM

#

lone stream any hardware geeks here able to help me with an esp32?

I can what would you like to build ?

lone stream Mar 10, 2026, 4:23 PM

#

rugged cloak I can what would you like to build ?

esp32 with antenna in a box and a battery

rugged cloak Mar 10, 2026, 4:34 PM

#

lone stream esp32 with antenna in a box and a battery

I started this project check it out https://github.com/chilu18/openclaw-esp32c3-xiao-node

lone stream Mar 10, 2026, 4:43 PM

#

can we chat once somewhere? dm?

exotic oceanBOT Mar 11, 2026, 11:32 AM

#

success @tazzy_19 muted

Reason: Spamming across channels
Duration: 14 minutes and 19 seconds

tidal dawn Mar 11, 2026, 8:22 PM

#

Can anyone running OpenClaw on a Mac with multiple macOS user accounts, where each user runs their own separate OpenClaw gateway, comment on how that works for them?
(RAM usage without Ollama? browser/os relay control? Remote Screen sharing, Any issues beyond needing to run the gateways, etc).

Don't need hypotheticals, just looking for hands-on experience, please.

honest hollow Mar 11, 2026, 9:19 PM

#

Does anyone own a clawbox?

tidal dawn Mar 11, 2026, 9:29 PM

#

honest hollow Does anyone own a clawbox?

No idea why anyone would buy one of those when you can get a M4 Mac Mini for the same price that is 4x faster with 2x the RAM and real NVMe storage instead of eMMC, double the memory bandwidth.

Not saying either is going to run run local models suitable to OC, but the Mini at least could run a small/embedding/TTS model if you wanted it to.

honest hollow Mar 11, 2026, 9:36 PM

#

tidal dawn No idea why anyone would buy one of those when you can get a M4 Mac Mini for the...

I bought the underlying hardware and configured it myself for $250. It's struggling with local models. I was asking because I don't see how the Clawbox could be fully functional with a local Model due to the hardware constraints.

tidal dawn Mar 11, 2026, 9:48 PM

#

IDK anyone seriously using local models for OpenClaw, and it's entirely pointless for a primary model on anything less than a 2+ of maxed out Mac Studios or 2+ DGX Sparks... Even then TPS is slow.

lament marsh Mar 11, 2026, 11:40 PM

#

tidal dawn IDK anyone seriously using local models for OpenClaw, and it's entirely pointles...

There are some models that make sense to run locally. You can run a small encoding model, some TTS, maybe a basic LLM.

But yeah, for main operation, you've gotta be going frontier models, which aren't running on any kind of home server.

verbal hawk Mar 12, 2026, 3:04 AM

#

I’ve heard of people running local kimi-k2.5?

fathom summit Mar 12, 2026, 6:41 AM

#

anyone working with zclaw? curious to hear about some interesting projects and use cases

fathom summit Mar 12, 2026, 6:51 AM

#

lone stream esp32 with antenna in a box and a battery

what is the question? are you asking about which individual components you would need in order to install an antenna, a battery regulator, a battery, and which esp32? you can find esp32 s3 with an antenna output, then you would want a tp4056 usb module, an 18650 battery and a battery holder, and an antenna if you didn't get one with an antenna... and print a case or buy one.

or here is one much better, a lora module, oled display, battery, antenna, case, etc., all built on one clean device. https://amzn.to/4sFZisw

lone stream Mar 12, 2026, 12:31 PM

#

rugged cloak I started this project check it out https://github.com/chilu18/openclaw-esp32c3-...

I would like to add the wifi antenna, battery and a decent case. I could potentially help with the claw programming. I put together one for the mikrotik https://github.com/mikroclaw/mikroclaw

GitHub

GitHub - mikroclaw/mikroclaw

Contribute to mikroclaw/mikroclaw development by creating an account on GitHub.

lone stream Mar 12, 2026, 12:31 PM

#

fathom summit what is the question? are you asking about which individual components you would...

I'm not so good with assembly but i can do coding

#

18650 wouldn't be rechargable ...

dawn cosmos Mar 12, 2026, 2:28 PM

#

rugged cloak I started this project check it out https://github.com/chilu18/openclaw-esp32c3-...

This is cool - it could possibly completely replace esphome. Since i am assuming any peripherals attached (like ir, temp, sensors) could be easily configured and sent to gateway!

lone stream Mar 12, 2026, 2:35 PM

#

dawn cosmos This is cool - it could possibly completely replace esphome. Since i am assuming...

and more. yes. i'd like to suggest a working group somehow...maybe we could make one in teech -> channels?

rugged cloak Mar 12, 2026, 2:41 PM

#

This is amazing ! Happy to help.. let’s do this - so you want me to rename the repo to something more catchy 🤣

fathom summit Mar 12, 2026, 2:50 PM

#

lone stream 18650 wouldn't be rechargable ...

What do you mean it wouldn't be rechargeable? That is the battery of choice for ESP32s. They operate at 3.3 volts, 18650 are rechargeable at 3.6 volts.

fathom summit Mar 12, 2026, 2:55 PM

#

lone stream and more. yes. i'd like to suggest a working group somehow...maybe we could make...

Have you tested this on an ESP32? I'm not following how I would set this up to compile the binary, and like what partition scheme are we looking for here? For which boards does it fit? What flash size? Does it need PSRAM? PlatformIO makes this easy, or at least espidf

Arduino would even be good too

#

This is a super useful tool if useful to anyone https://github.com/thelastoutpostworkshop/ESPConnect

fathom summit Mar 12, 2026, 2:59 PM

#

rugged cloak This is amazing ! Happy to help.. let’s do this - so you want me to rename the ...

This does look cool. I think I'm gonna flash it on my c3 right now, matter of fact. I'll definitely report back, dude!

lone stream Mar 12, 2026, 6:25 PM

#

Haven't tested. I guess I didn't realize that the 118650 was rechargable. I got my ESP32 coming in a day or two ...

#

There's a #1481662265823330437 channel now

ornate vigil Mar 12, 2026, 11:43 PM

#

DGX Spark (cluster)

broken moth Mar 13, 2026, 5:00 PM

#

honest hollow Does anyone own a clawbox?

I do... I have two of them. 67 TFlops.. pretty decent and it works fine. Beware there are some other ones with the same name. You want the Bulgarian one based on the Nvidia Orion Super.

I think the M4 is faster in some ways on paper. But you only get 8gb of unified RAM. It can be expanded to 2x2Tb SSD. BUT it has half the power consumption and is designed to be on 24/7 which the Mini is not.. that was the deal breaker for me.

honest hollow Mar 13, 2026, 5:01 PM

#

broken moth I do... I have two of them. 67 TFlops.. pretty decent and it works fine. Bewa...

What model are you using?

magic raven Mar 13, 2026, 7:20 PM

#

broken moth I do... I have two of them. 67 TFlops.. pretty decent and it works fine. Bewa...

67

restive trout Mar 14, 2026, 8:47 AM

#

Hello everyone, is anyone succeeded to run ollama with openclaw using local models like qwen3.5 on cpu? I am struggling since a week but no luck on local models.

pliant wren Mar 14, 2026, 4:18 PM

#

@restive trout Running qwen3.5-9b on surface laptop 7. 16gig RAM iGPU+CPU. Speed is decent not sure about quality. Way enough for background tasks. Usable for chats.

restive trout Mar 14, 2026, 4:56 PM

#

Thanks. Actually, I got 32 GB RAM but still response time is way slow. Min max cloud version seems decent with free token cap.

heady bobcat Mar 14, 2026, 5:49 PM

#

Quality is good without openclaw

Same prompt with openclaw, with qwen 3.5 9b responses miss important details, even plain simple requests

craggy ferry Mar 14, 2026, 8:32 PM

#

When you say same prompt. Are you sending the openclaw system prompt in the “without openclaw” tests

#

Because if not then you’re not using the “same prompt”

stiff spoke Mar 14, 2026, 9:18 PM

#

I'm using Qwen3.5-35B-A3B-8bit through LMStudio on a Mac Studio M1 Ultra with 64GB of RAM and it's doing really well to power my 2 OpenClaws (and one picoclaw) running on Raspberry Pi's. I'm not doing anything super-complex, but I'm impressed with the quality of the responses from the LLM. Tool use is fine, web research, a few other things. I use Opus4.6 from a base Mac Mini running my main OpenClaw.

quartz pawn Mar 14, 2026, 11:33 PM

#

I use Qwen3.5 27b with a 5090 and 3090. It uses 39gb vram (Q_4_K_S) and runs at 58 tokens/sec since it needs both GPUs. I can squeeze Qwen3.5 35B a3b into my 5090 and it runs at 190 tokens/sec. I feel like 27b is a little smarter

fierce lantern Mar 15, 2026, 3:03 AM

#

Does OpenClaw perform better with more RAM or GPUs ? I am debating between 256GB ram vs 96GB Ram

quartz pawn Mar 15, 2026, 3:46 AM

#

fierce lantern Does OpenClaw perform better with more RAM or GPUs ? I am debating between 256...

It more a matter of what models you want to run and the hardware you need to run them

#

I'd get 96GB and spend the extra money on GPU(s). Increase your page file to 500GB and suddenly RAM doesn't matter as much. Once the model is loaded, it runs from your GPU's VRAM

tired glade Mar 15, 2026, 3:52 AM

#

restive trout Hello everyone, is anyone succeeded to run ollama with openclaw using local mode...

i did that

#

i ran a qwen3-8b with ollama by... gpu, not cpu.

#

yet i also ran a qwen3-1.5b on an orange pi rv2 (guided by its official manual)

quartz pawn Mar 15, 2026, 5:05 AM

#

It looks like I can run Qwen3.5 35B Q4_K_S on my 5090 and Q3_K_XL on my 3090 on different ports and have OpenClaw use both simultaneously

fathom summit Mar 15, 2026, 5:42 AM

#

Guys, can you explain to me, like, what the thought process is of getting an M4 setup for anything that you're doing with this? It's like mind-boggling to me, but maybe I just don't understand it and I'm not a cynical person. I'm just asking questions for curiosity's sake.

#

I wouldn't even buy M2 for any reason.

#

Bro, I thought one of them responded to you, so I didn't bother, but no, you shouldn't. You should just go get some PC box that you can put hardware in when you want to level it up rather than a very expensive box that you can't really do anything with other than what it does, and what it does is over expensive and underperformed.

#

And when I say overexpensive, I'm talking about like exponentially overpriced and incompatible and nowhere to be found on benchmark reports or rankings, and that's my opinion.

#

I will give it to them that the M series is like a massive improvement to what Macintosh hardware was doing prior, but that's what they get for being in that deal with Intel all those years.

magic cosmos Mar 15, 2026, 5:48 AM

#

small, quiet. cheapest per year in electricity use. fast memory for qmd models. has ethernet. latest macOS. unix underpinning. all right out of the box. probably lots more i’m forgetting.

fathom summit Mar 15, 2026, 5:50 AM

#

One more thing, I don't support companies that take advantage of their loyal consumers. Locking them in with proprietary shit is already messed up, but then the whole thing with the unwillingness to adapt and upgrade iMessage into RCS back in 2019 when Google was begging them to do it because it's the new security standard, it's the best protocol, and iMessage is using cell tower or Wi-Fi data, which is a vulnerability. And they said, nope, nope, nope, nope, because they didn't think yet that they could just do RCS and still leave the poor man in every other phone that's not iPhone in a green text and utilize RCS. And then what do they do? They get forced by the government to implement it in what, last year? And then they do, but they only implement it somewhat. So Apple to Apple is encrypted, but Apple to Android is not encrypted because they're fucking assholes. And chat features aren't available. It still looks like I'm poor because of my Android phone. But meanwhile, I've had, quote, iMessage available with every phone manufacturer other than iPhone all the way back since RCS came out in 2019.

#

God damn it, I only made one point, fuck. I have like 50 points to make about why you shouldn't support them, why they're fucking you and treating you like a child. And I think Macintosh is the fucking absolute worst. I mean, I can understand an iPhone, I like the iPhone, truthfully. I don't think it's innovative, but I like it. It's nice to use sometimes. But the iMac, or nah, I don't know, man.

#

If you can't go on Facebook Marketplace and either A) find somebody selling their old shit so that you could put it in your old shit and upgrade it a little bit, or you can't even go on there, even if somebody was selling it to upgrade your shit, because you can't even do it, even if you could buy it. Not without lengthy reverse engineering processes, a nightmare of breaking that case open, trying to figure out how to get it back together because there's all kinds of booby traps hooked into it and whatever else. And then when you finally get it powered up and then it's bricked, then you wanna go jump in the river. I just can't do it. I really hate tech that uses their marketing maneuvers. Makes me sick.

#

Plus, it's like ten times too expensive for what it is. Okay, I'm done. I don't mean to be a hater, this is just my honest opinion.

#

Okay, one more. iPhones didn't even have multimedia messaging until the iPhone 3GS came out and they rolled out that firmware. That's really bad. That was about ten years after MMS had been on every other device. And if you had an iPhone 3 and you wanted to send a video, well guess what, you had to buy a 3GS.