#hardware | Friends of the Crustacean 🦞🤝 | Page 5

fathom summit Mar 15, 2026, 5:55 AM

#

They don't even AI, I mean, like, this is crazy. Too bold and cocky.

wanton coral Mar 15, 2026, 5:56 AM

#

too old and cooky

fathom summit Mar 15, 2026, 5:56 AM

#

I can't understand it. There was a period of time where I wanted an iMac so bad as a photographer and graphic designer, and that lasted for about a year in 2010 or so, and then I realized, wait, I can just calibrate my monitor and it's just as good. Wait, I can calibrate it and then also just buy parts on eBay and put them in, and now all of a sudden I'm performing better than that thing would have done? Oh, never mind, I'm good.

#

Good thing Obama passed that credit card law, otherwise I would have been stuck with one. Feeling stupid.

#

No more bucket hats and no iMac for me.

#

Whoa, I just tripped myself out because when I said no more bucket hats, I guess I was referring to past tense because I went to Ross about an hour ago.

Lmfao https://imgur.com/a/GJsxTaA

#

I swear to God, I totally forgot that I even bought this stupid hat.

fathom summit Mar 15, 2026, 6:03 AM

#

wanton coral too old and cooky

Oh, you mean me? Fuck, I guess you're right.

#

Listen up youngins, you guys are trippin'

fathom summit Mar 15, 2026, 6:06 AM

#

wanton coral too old and cooky

Okay, get the fuck out of here. I like your photography, by the way, but you're a birdwatcher, and I know damn well birdwatchers are the oldest and the kookiest, and if you're not old yet, you will be, and when you get old, you'll be the kookiest.

fathom summit Mar 15, 2026, 6:06 AM

#

wanton coral too old and cooky

Also, did you get a partnership with MyRadar? Cause that's pretty dope.

wanton coral Mar 15, 2026, 6:16 AM

#

fathom summit Okay, get the fuck out of here. I like your photography, by the way, but you're ...

haha, thank you so much! def true about birders and photographers. Some of the weirdest craziest people out there 😂

magic raven Mar 15, 2026, 8:04 AM

#

my Claw now has roomba capabilities

#

it can just show up in my room and grab my sock and put on my lap saying "go do the laundry lazy mf"

fathom summit Mar 15, 2026, 9:40 AM

#

wanton coral haha, thank you so much! def true about birders and photographers. Some of the w...

That's funny, I know. I happen to live on a peninsula in the Northeast in which migratory birds are extremely abundant and rare, and so everybody who lives here happens to have at least one lens with a tripod mount on it per household that cost ten grand each. If you wanna go to the park and go for a walk with your friends and talk, forget it dude, it's bird watching time. I tell them to kick rocks.

#

Mostly because I can't afford a lens to shoot fucking birds with.

heady bobcat Mar 15, 2026, 1:40 PM

#

I've had some success with https://huggingface.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF

teal crow Mar 15, 2026, 6:39 PM

#

I’m running OpenClaw on a 16GB M1 MacBook Pro I bought off Facebook marketplace for $400 because it had a cracked screen. I just plugged it into a cheap travel monitor for setup and now run it headless. It connects over my network to another 64 GB M1 MacBook Pro Max which is running LM Studio for the LLM.

So far, this architecture is working for me pretty well. I’ve blown through 17M tokens in the last week and it hasn’t cost me a dime in api costs.

barren trail Mar 15, 2026, 6:44 PM

#

teal crow I’m running OpenClaw on a 16GB M1 MacBook Pro I bought off Facebook marketplace ...

What is your preferred model?

teal crow Mar 15, 2026, 6:49 PM

#

Im still dialing that in. I was running Qwen 3.5 in different variants and liked it, but it kept throwing <tool_call> code into the chat and aborting operations. Once I switched from Ollama to LM Studio that went away.

Currently I’m using Ministral-3-14b-reasoning. I like how quick this model is for operations tasks, but it definitely hallucinates to an infuriating level so I’m going to be switching to another model for creating content. I tried to have it draft an email yesterday for me and told it 4 times to stop putting hyphens in the middle of the sentences and it just kept doing it.

barren trail Mar 15, 2026, 6:52 PM

#

teal crow Im still dialing that in. I was running Qwen 3.5 in different variants and liked...

Ok - curious to learn when you have access to a great local model

fathom summit Mar 15, 2026, 7:06 PM

#

teal crow I’m running OpenClaw on a 16GB M1 MacBook Pro I bought off Facebook marketplace ...

At first when I read this, I thought, okay, that's reasonable, but now that I'm thinking about it, that's still too expensive. 16 gigabytes, and then it's an M1 that you can't upgrade. That's... I hate Mac.

#

Also, my bad, I'll stop the anti-Mac rhetoric here. I'm obviously in the way and not contributing positively here. Lol

#

Just keep in mind you can get a Steam Deck for less than that, way less than that, and be running Arch Linux on a capable device. You can mod it as well if you please.

#

(My Steam Deck is essentially what i use, similar to how everyone in here uses the' M1 through 4s lol.)

teal crow Mar 15, 2026, 7:36 PM

#

Why would I need to upgrade it? It’s literally only running OpenClaw. Watching activity monitor this machine is overpowered for what Openclaw needs.

teal crow Mar 15, 2026, 7:37 PM

#

barren trail Ok - curious to learn when you have access to a great local model

Me too. Haha

craggy ferry Mar 15, 2026, 11:44 PM

#

yes, it's your opinion, and that's all it is, it doesn't bear any relation to reality. it's not 1995 anymore, lol

#

M1s were good chips. PC makers are still trying to compare their chips to the M2, lol

fathom summit Mar 15, 2026, 11:45 PM

#

craggy ferry yes, it's your opinion, and that's all it is, it doesn't bear any relation to re...

In 1995 Macs were revolutionary and changed the whole entire computer industry. Learn your history, kid.

#

That was only a few years after a GUI was made

craggy ferry Mar 15, 2026, 11:46 PM

#

haha that's not what happened in 95

fathom summit Mar 15, 2026, 11:46 PM

#

Thanks to Steve

craggy ferry Mar 15, 2026, 11:46 PM

#

learn your history. I grew up in it, it's not history to me, kid

fathom summit Mar 15, 2026, 11:47 PM

#

I understand. I mean I've seen the computer systems and we certainly didn't have a Macintosh but I sure did have the flag thing with the rapist guy.
Hey hey hey, you didn't catch my casual stoic sarcasm. I was just being a silly jerk, is all. I wasn't really calling you a kid, man.

craggy ferry Mar 15, 2026, 11:48 PM

#

84 is when the GUIs were made; when the Mac was "revolutionary". 95 was when Win95 came out. Remember Win95? Remember people lining up to buy an operating system, but they didn't know what an OS was, or even owned a computer, because they were so hype about it that they wanted it?

Well, I remember all of that, because I was a conscious human being at the time

fathom summit Mar 15, 2026, 11:48 PM

#

And hey in life all we all really have is time and whatever you think and so, other than that, nothing really matters. Let's make cool stuff; that's all I care about. We can still work together.. But like dude, are you team PlayStation or Xbox? Can we fight about that instead?

#

Yeah I'm aware man but you gotta remember that it doesn't matter when it was made if it wasn't in anybody's home

#

Or Office. You can recall the iMacs with the color back. The big bubble thing is the first computer let's be real.

#

Those came out in about '95 and I believe they came out after Bill Gates stole the ideas.

craggy ferry Mar 15, 2026, 11:49 PM

#

i notice the goal posts are just whizzing by now

fathom summit Mar 15, 2026, 11:50 PM

#

We got all those corny commercials

Oh man, I can't even predict anything anymore. I just stopped trying.

craggy ferry Mar 15, 2026, 11:50 PM

#

hey, did you know windows 2.0 was when bill gates stole the ideas?

fathom summit Mar 15, 2026, 11:50 PM

#

I don't even know what the latest hardware is anymore, and that used to be something I always knew.

craggy ferry Mar 15, 2026, 11:50 PM

#

and yet you have strong opinions on the value of apple hardware for money

#

cool conversation

fathom summit Mar 15, 2026, 11:51 PM

#

No, I don't necessarily know the fine details of everything. I know Windows 3.0 was poppinn my dad3.0 was poppin' though, my dad was mad excited. Fucking doing spreadsheets, haha.

craggy ferry Mar 15, 2026, 11:52 PM

#

i'm just saying that if you don't know what modern apple hardware is capable of vs modern pc hardware maybe don't write a page long screed about how much worse it is

fathom summit Mar 15, 2026, 11:52 PM

#

I didnt say i wasnt aware of price to performance or benchmark comparisons

#

I follow the economics like more than I should, because, again, I'm not familiar with the components or like, I don't know how much VRAM is in a NVIDIA GPU these days. I just don't care anymore because I'm not buying one.

craggy ferry Mar 15, 2026, 11:53 PM

#

and don't act like you know the history of computing and look down your nose at other people who literally lived through it when you don't actually remember anything lol

fathom summit Mar 15, 2026, 11:53 PM

#

I guess maybe I studied that because I'm like planning for the future of buying old hardware someday, sometime, when life is less expensive, but it doesn't seem like it's gonna be coming anytime soon.

#

Bro, I don't look down on anybody. What are you talking about?

craggy ferry Mar 15, 2026, 11:53 PM

#

ok great. so why do you care so much about what other people are buying when you don't know what they're buying

fathom summit Mar 15, 2026, 11:53 PM

#

If you wanna be a sensitive little bitch, I'll call you a bitch and you can be a bitch, and that's fine, but like, I was just having casual conversations and friendly banter here, and that's why I made that clear after every time I said anything negative.

craggy ferry Mar 15, 2026, 11:54 PM

#

wow

fathom summit Mar 15, 2026, 11:54 PM

#

I'm just saying, I was trying to be cool with you.

craggy ferry Mar 15, 2026, 11:54 PM

#

i'm sorry you've lost so hard you don't have anything else for me but insults

fathom summit Mar 15, 2026, 11:55 PM

#

No, I don't have any insults for you. I said if that's what you wanna do, then that's what we'll do I didn't actually call you that, though, you know what I mean?

#

I'm saying if that's what we're gonna do, then that's how it'll be then. But like, I'd rather just be friends.

craggy ferry Mar 15, 2026, 11:55 PM

#

always funny when the person claiming someone is sensitive is the first one to actually resort to insults

fathom summit Mar 15, 2026, 11:55 PM

#

Just hop on a call with me, bro, because I don't think we're gonna solve this through reading texts alone at this point.

craggy ferry Mar 15, 2026, 11:55 PM

#

honey, don't hide behind silly word games

#

you called me that

#

just own it

#

wow

#

what a coward, lol

fathom summit Mar 15, 2026, 11:57 PM

#

And again, I don't care about arguing over computer specs or anything. It's just friendly banter. It's also one of them things where, like, I've had an Android for years and like, just because I thought the phones were cool. And I mean, like, I get excluded from group chats and shit and people just are always giving me shit. I go on a date with some chick and it's like, oh, green text, red flag every single time. And so I just, I talk my shit, but like, I don't really care. What the fuck? What the fuck do I care about what other people are doing? Though, if people are gonna ask me for advice, like, you know, I did say, I don't think it's worth it. And then, really, this all started from a genuine question when I was asking, like, like, for the purpose of running Claw, I just didn't understand why one would buy a Mac for it rather than like, uh, one of those mini computers. That, that's a genuine curiosity.

craggy ferry Mar 15, 2026, 11:57 PM

#

lmao

fathom summit Mar 15, 2026, 11:57 PM

#

It's not word games, you're being a bitch, so I'm gonna call you a biznatch.

white ivy Mar 15, 2026, 11:57 PM

#

@fathom summit is right, from an outside reader with popcorn 🍿

#

Chill

craggy ferry Mar 15, 2026, 11:57 PM

#

right. I know. I know you did that. That's why I said you did that. Then you tried to claim you weren't doing that.

#

Anyway I hope that was entertaining for the peanut gallery

jolly creek Mar 15, 2026, 11:58 PM

#

Enjoy your mutes both of you

#

For everyone else, go read the #rules again, esp rule 3

teal crow Mar 16, 2026, 12:59 AM

#

jolly creek For everyone else, go read the <#1471745479229309039> again, esp rule 3

Thanks for reminding me that there is a mute button. Ha

steep wedge Mar 16, 2026, 7:31 PM

#

Does anybody have a DGX Spark cluster? I bit the bullet and ordered a second Asus GX10, and I want to make sure I buy a compatible interconnect cable. The Asus branded cable appears to be out of stock.

verbal kraken Mar 16, 2026, 8:46 PM

#

Looking to buy mac ultra m3's with 256GB or more of ram. Anyone have a lead or wanna sell or invest it hit me up.

minor plank Mar 17, 2026, 9:32 PM

#

Anyone have a "Jarvis-like" voice interaction app that i could install on a Android smart watch to "PTT" or keyword trigger / TTS response style interaction with OC? My Kids are requesting that.

fathom summit Mar 18, 2026, 4:03 AM

#

minor plank Anyone have a "Jarvis-like" voice interaction app that i could install on a Andr...

Interesting, which watch?

broken moth Mar 18, 2026, 12:11 PM

#

I bought a ClawBox... The 67 TFOP one. Brilliant..

stiff tree Mar 18, 2026, 12:33 PM

#

broken moth I bought a ClawBox... The 67 TFOP one. Brilliant..

what is ClawBox and how much?

weary reef Mar 18, 2026, 12:45 PM

#

steep wedge Does anybody have a DGX Spark cluster? I bit the bullet and ordered a second Asu...

I have 2 running in cluster

steep wedge Mar 18, 2026, 12:54 PM

#

weary reef I have 2 running in cluster

Which cable did you buy, or did you buy the pack that included everything?

weary reef Mar 18, 2026, 12:55 PM

#

steep wedge Which cable did you buy, or did you buy the pack that included everything?

it was all in a package we got from Dell ill see if i can get more info

#

what is a good model to use at home I am trying the nemostorm i think is the name 122b i think.. but context window is a bit small

broken moth Mar 18, 2026, 1:10 PM

#

@stiff tree openclawhardware.dev

weary reef Mar 18, 2026, 1:52 PM

#

steep wedge Which cable did you buy, or did you buy the pack that included everything?

Some very cheap qsfp cables that wont do the 200gbs is what it seems we have 🙁

steep wedge Mar 18, 2026, 1:59 PM

#

weary reef Some very cheap qsfp cables that wont do the 200gbs is what it seems we have 🙁

Oh no, that sucks. Especially for a bundled solution. I’m cautiously optimistic the one I ordered from NADDOD will do the trick.

weary reef Mar 18, 2026, 2:00 PM

#

steep wedge Oh no, that sucks. Especially for a bundled solution. I’m cautiously optimistic ...

yeah have not looked into replacing them yet. trying to find a good model to run for the claw

minor plank Mar 18, 2026, 4:43 PM

#

fathom summit Interesting, which watch?

Samsung Galaxy Watch 5 in this case

tired plover Mar 18, 2026, 4:55 PM

#

broken moth <@1305280235545038850> openclawhardware.dev

That’s not really carbon isn’t it hahahah looks like 3D printed case with Jetson for 550€ 😂

fierce lantern Mar 18, 2026, 7:53 PM

#

When loading local models, having 1GB storage doesnt seem to be enough. Is an external thunderbolt 5 drive good enough or should I be boosting internal storage with NvME, eSATA ? Comments , thoughts, insights.

lyric orchid Mar 18, 2026, 10:00 PM

#

fierce lantern When loading local models, having 1GB storage doesnt seem to be enough. Is an ...

are you talking about storage to save the model files? You'll need a lot more than 1 gb. if you're talking about gpu, 1 gb isn't going to hold much re: model size. Personally I think you need 24 to 32 gb vram (gpu) to start getting working / useful models. 12 gb vram is probably a minimum.

fierce lantern Mar 18, 2026, 10:05 PM

#

lyric orchid are you talking about storage to save the model files? You'll need a lot more t...

Asking mor about internal storage like harddrives / SSD / NVME the things that hold all my pr0n

spiral vector Mar 19, 2026, 12:26 AM

#

fierce lantern Asking mor about internal storage like harddrives / SSD / NVME the things that...

There's entire petabyte SANs for that

#

Asking how much storage you want is about like asking how much house do you want - the answer is generally something like "as much as you can afford". Personally, I think most people should get a minimum of 1TB of fast NVME storage just for OS, applications and frequently accessed files. Then consider exactly how much storage, and of what sort of of frequency you access it. Try as you might, its pretty hard to watch 10 pr0n films at the same time - so that can be exported to slower, external storage if needed. But if you want to keep your entire Steam library downloaded at all times, then you probably want 2 or 4 TB of fast storage (games generally don't run as well from external storage).

Personally, my OS drive is a fast 1TB nvme. My game drive is 4TB fast nvme. Then I have 30TB external storage in my NAS for backups of all my linux ISOs.

#

If its just openclaw on a mac mini/strix halo/dgx spark, then a 1TB nvme will do fine. Even 120B models aren't that huge (storage wise - you'll run out of VRAM long before you run out of storage space).

fierce lantern Mar 19, 2026, 1:09 AM

#

Ok. Given I have never played with this stuff and am looking to get a new machine to play with this stuff. Are the models sucked into RAM initially and then accessed or are they kind of like a database where you read stuff here and there.

spiral vector Mar 19, 2026, 1:10 AM

#

You could think of the models as a database that must entirely be loaded into VRAM (doesn't work well in system RAM). Loaded onto your video card's RAM. But exceptions here are important - all 3 of the items suggested (mac mini/strix halo/dgx spark) have shared video RAM and system RAM. The fully shared RAM works well. If you got a more traditional PC, then you'd need to look at both seperate RAM and huge amounts of VRAM (which only comes with whatever GPU you buy).

fierce lantern Mar 19, 2026, 1:12 AM

#

ah perfect. Thanks

#

what are the buzz words one would google / duckduckgo / askgeeves to understand this more ?

spiral vector Mar 19, 2026, 1:14 AM

#

If you look for "openclaw hardware" you'll get a ton of hits

fierce lantern Mar 19, 2026, 1:17 AM

#

@spiral vector thanks for all this. Appreciate helping me get started.

spiral vector Mar 19, 2026, 2:14 AM

#

I'd recommend spending some time to understand the basics first. Then read into https://github.com/explaindio/ClawEval/tree/master - I think ClawEval is probably the most comprehensive list of what local models are good for OpenClaw. My only real complaint with this analysis is that they ONLY compare the open source models (and for some workloads - you really just want to run Opus 4.6 or GPT 5.4, despite how expensive those API costs can be.)

But ClawEval generally approaches this discussion from - here's what different models can do (from which you can derive what sort of hardware you may want). They do have a good docs section that goes into detail of what you get at each of the various levels of VRAM, but they don't directly recommend hardware. And because they don't consider HW (they just do cloud hosted LLM comparisons), they don't really analyze the 3 setups that I think are probably the most interesting (mac mini/strix halo/dgx spark).

full talon Mar 19, 2026, 12:25 PM

#

spiral vector I'd recommend spending some time to understand the basics first. Then read into...

I see ClawEval added "Which Tested Models Fit on Your Hardware: section with that for Mac Mini M4 (16 GB) Mac Mini M4 Pro (24 GB) Mac Mini M4 Pro (48 GB) Strix Halo (96 GB GPU) DGX Spark (96 GB GPU)

woven galleon Mar 19, 2026, 7:26 PM

#

Hey guys

#

Sry if stupid question

#

Mac Minis are sold out everywhere in Aus, looking to get this instead. Thoughts? 7 MAX Mini PC (2026 Flagship Performance) AMD Ryzen 9 7940HS 16GB DDR5 (Up to 128GB) 1TB SSD Mini Desktop Computers, Radeon 780M Graphics/8K Quad

full talon Mar 20, 2026, 12:36 AM

#

woven galleon Mac Minis are sold out everywhere in Aus, looking to get this instead. Thoughts?...

Depends if you want it run local more or just open claw. If just open claw pretty much any comp with 16GB will do

ember lichen Mar 20, 2026, 5:05 AM

#

why does openclaw run thru my tokens way faster then my claude does

sonic mantle Mar 20, 2026, 5:11 AM

#

ember lichen why does openclaw run thru my tokens way faster then my claude does

Becuz tokens r like mc Donald's nuggets for ur bots :p

ember lichen Mar 20, 2026, 5:12 AM

#

sonic mantle Becuz tokens r like mc Donald's nuggets for ur bots :p

had one agent only and it went them in less then a hour doing simple tasks

sonic mantle Mar 20, 2026, 5:12 AM

#

ember lichen had one agent only and it went them in less then a hour doing simple tasks

Are you able to run any local llms?

ember lichen Mar 20, 2026, 5:13 AM

#

sonic mantle Are you able to run any local llms?

im not sure, new to this whole thing as of today how would i know if i can

sonic mantle Mar 20, 2026, 5:14 AM

#

Whats ur pc specs?

ember lichen Mar 20, 2026, 5:14 AM

#

sonic mantle Whats ur pc specs?

the pc my open claw is on, 4060ti 32gb ram and 2tb ssd

sonic mantle Mar 20, 2026, 5:16 AM

#

ember lichen the pc my open claw is on, 4060ti 32gb ram and 2tb ssd

Good hardware. U should be able to run a local 7B ollama model for simple stuff so u dont burn credits

#

Then maybe have an hourly cap for api models

ember lichen Mar 20, 2026, 5:16 AM

#

sonic mantle Then maybe have an hourly cap for api models

i was just using sonet 4.5

#

figured its not too bad

sonic mantle Mar 20, 2026, 5:19 AM

#

U should try setting up qwen2.5-coder locally

ember lichen Mar 20, 2026, 5:19 AM

#

how would i learn how to do that

sonic mantle Mar 20, 2026, 5:31 AM

#

ember lichen how would i learn how to do that

Whats ur Operating system?

ember lichen Mar 20, 2026, 5:31 AM

#

windows

#

HTTP 401: authentication_error: OAuth token has expired. Please obtain a new token or refresh your existing token. (request_id: req_)

also do you know how i fix this

#

i cant use my openclaw at all rn

#

bc of this

sonic mantle Mar 20, 2026, 5:39 AM

#

U most likely reached an api limit

sonic mantle Mar 20, 2026, 5:40 AM

#

ember lichen windows

Check this vid out https://www.youtube.com/watch?v=z7fhyKBAfzE

ember lichen Mar 20, 2026, 5:43 AM

#

sonic mantle U most likely reached an api limit

nono im far from it. how do i re-enter my token

sonic mantle Mar 20, 2026, 5:51 AM

#

ember lichen nono im far from it. how do i re-enter my token

Ur using claude right? Maybe log into claudes panel n regenerate ur token

ember lichen Mar 20, 2026, 5:51 AM

#

i cant use claude it doesnt know how, and my open claw wont work due to the issue.

sonic mantle Mar 20, 2026, 5:56 AM

#

ember lichen i cant use claude it doesnt know how, and my open claw wont work due to the issu...

Try this
openclaw doctor --fix if ur running from command prompt

ember lichen Mar 20, 2026, 5:57 AM

#

how

#

my actaul claw has no brains rn since it doesnt have a token

#

forget this i have a api key, how do i use it

#

ill just take on the costs

ember lichen Mar 20, 2026, 5:59 AM

#

sonic mantle Try this ```openclaw doctor --fix``` if ur running from command prompt

also that didnt work just so u know

sonic mantle Mar 20, 2026, 6:01 AM

#

ember lichen also that didnt work just so u know

Here check step 6 and 7 should be helpful
https://open-claw.org/docs/openclaw-setup

#

im assuming ur using the interface?

ember lichen Mar 20, 2026, 6:03 AM

#

i dont have the subscription

#

i didnt know u have to buy openclaw.

sonic mantle Mar 20, 2026, 6:05 AM

#

ok whats ur setup?

#

how did u start openclaw n have it running before

ember lichen Mar 20, 2026, 6:06 AM

#

i just did powershell -c "irm https://openclaw.ai/install.ps1 | iex" and followed the steps

#

worked for hours

#

then when i came home and tried to prompt it failure happened

sonic mantle Mar 20, 2026, 6:06 AM

#

okay bet when u type openclaw into powershell does it run anything

ember lichen Mar 20, 2026, 6:06 AM

#

yes it loaded a TON of stuff

sonic mantle Mar 20, 2026, 6:07 AM

#

okay good paths setup

#

try this

openclaw onboard

#

(should guide u through the setup dialog where you can place ur api key)

ember lichen Mar 20, 2026, 6:07 AM

#

would this be making a new agent?

sonic mantle Mar 20, 2026, 6:08 AM

#

ember lichen would this be making a new agent?

No just for setting up api key stuff

ember lichen Mar 20, 2026, 6:08 AM

#

whats the best model

#

for just all around use

#

sonnet?

#

and if so witch one

#

wont drain tokens as quick

sonic mantle Mar 20, 2026, 6:10 AM

#

sonic mantle try this ```openclaw onboard```

do this after ```
openclaw gateway restart

ember lichen Mar 20, 2026, 6:10 AM

#

turns out.... i was using sonnet 4-6

#

explains token use honestly

#

i think sonnet 4-5 is good enough right?

ember lichen Mar 20, 2026, 6:10 AM

#

sonic mantle do this after ``` openclaw gateway restart ```

ok

#

should i do skills during onboarding

#

idk anything about what that does

sonic mantle Mar 20, 2026, 6:13 AM

#

ember lichen i think sonnet 4-5 is good enough right?

hmmm

#

depends what ur goal is

#

For cheap all around models i'd suggest Gemini 2.5 Flash-Lite GPT-5 Nano DeepSeek V3.2 Mistral Nemo just wing it n go off vibes

sonic mantle Mar 20, 2026, 6:15 AM

#

ember lichen turns out.... i was using sonnet 4-6

thats like an elite model xD

ember lichen Mar 20, 2026, 6:16 AM

#

sonic mantle thats like an elite model xD

yeah i figured, i mustve misclicked and thats why i drained tokens so QUICKLY

sonic mantle Mar 20, 2026, 6:16 AM

#

ember lichen yeah i figured, i mustve misclicked and thats why i drained tokens so QUICKLY

xD

#

double check ur key usage has locks so u dont wake up with a $50k bill tmrrw from openclaw trying to draw ascii art in a loop

ember lichen Mar 20, 2026, 6:17 AM

#

i just clicked sonet 4.5 but then it keep using 4.6?

#

how do i confirm limits?

sonic mantle Mar 20, 2026, 6:22 AM

#

ember lichen i just clicked sonet 4.5 but then it keep using 4.6?

i'd prob try n change to a diff model honestly

#

4.6 costs same as 4.5 and is more efficient

ember lichen Mar 20, 2026, 6:23 AM

#

whats a better model

sonic mantle Mar 20, 2026, 6:23 AM

#

ur premuch trying to wire in expensive models n hitting ur limits within an hour

ember lichen Mar 20, 2026, 6:23 AM

#

sonic mantle ur premuch trying to wire in expensive models n hitting ur limits within an hour

yea.

sonic mantle Mar 20, 2026, 6:23 AM

#

sonic mantle For cheap all around models i'd suggest ``` Gemini 2.5 Flash-Lite GPT-5 Nan...

one of these

#

or Claude 3 Haiku

ember lichen Mar 20, 2026, 6:27 AM

#

hm ill look into those tonight

sonic mantle Mar 20, 2026, 6:28 AM

#

ember lichen hm ill look into those tonight

Claude 3 Haiku if ur sticking with claude stuff,
Gemini 3 Flash if u want it dirt cheap

ur paying $0.25 per 1million tokens (Claude 3) Vs googles gemini 3 flash at $0.075 - $0.50 per 1 million tokens

ember lichen Mar 20, 2026, 6:33 AM

#

sonic mantle Claude 3 Haiku if ur sticking with claude stuff, Gemini 3 Flash if u want it di...

Damn Claude 3 cheap

#

how could I make a agent who uses that

#

To do easy work while my other sonnet agent can do hard tasks?

sonic mantle Mar 20, 2026, 6:34 AM

#

shear will power and coffee

ember lichen Mar 20, 2026, 6:35 AM

#

lol.

#

I am so interested in learning all this stuff

#

I don’t want to fall behind the inevitable

sonic mantle Mar 20, 2026, 6:37 AM

#

ember lichen I am so interested in learning all this stuff

knowledge is delicious 🎩

sonic mantle Mar 20, 2026, 6:38 AM

#

ember lichen I don’t want to fall behind the inevitable

dont think like that just have fun ;^

ember lichen Mar 20, 2026, 6:38 AM

#

sonic mantle dont think like that just have fun ;^

it’s hard not too

#

I’m having fun while figuring this out

west anchor Mar 20, 2026, 6:42 AM

#

gun did you make it to the other side yet

ember lichen Mar 20, 2026, 6:44 AM

#

west anchor gun did you make it to the other side yet

Kinda I just got a Claude api key instead of pro plan. I rather use pro plan and just extra costs but idk how to make it work

west anchor Mar 20, 2026, 6:44 AM

#

I can help

ember lichen Mar 20, 2026, 6:44 AM

#

how

west anchor Mar 20, 2026, 6:45 AM

#

If you purchase the claude pro sub, you just go to your terminal and run 'openclaw onboard'

#

it takes you back to the initial setup

#

and on page 2 or 3 where you select your AI model, you arrow down to anthropic, select it, then select OAuth.

#

It opens a webpage, you sign in with the email you subbed under

#

then boom youre in

ember lichen Mar 20, 2026, 6:46 AM

#

really?

west anchor Mar 20, 2026, 6:46 AM

#

openclaw gateway restart

ember lichen Mar 20, 2026, 6:46 AM

#

I’ll try this when I’m done eating

west anchor Mar 20, 2026, 6:47 AM

#

I warn that its against their ToS so you risk a ban

#

if you dont wanna risk that for $20 you can try it with openAI instead

#

but I havent been caught yet

sonic mantle Mar 20, 2026, 6:49 AM

#

ember lichen it’s hard not too

Try to keep a text file for future reference for commands you use. so u dont have to research the same stuff over n over

quaint narwhal Mar 20, 2026, 7:38 AM

#

west anchor If you purchase the claude pro sub, you just go to your terminal and run 'opencl...

you can also just call apcx subagents with the claude CLI which uses the pro sub

#

but you didn't hear that from me

quaint narwhal Mar 20, 2026, 7:49 AM

#

west anchor and on page 2 or 3 where you select your AI model, you arrow down to anthropic, ...

this isn't an option, it asks for setup token or API key. Setup token route is broken, expects a token with a prefix that's not there.

woven galleon Mar 20, 2026, 7:50 AM

#

full talon Depends if you want it run local more or just open claw. If just open claw prett...

Planning to run local llms, tried to run via tokens and its burning through like oil 😭😭 At that point 16gb of ram isn’t enough is it

#

Unless I get the Macs?

quaint narwhal Mar 20, 2026, 7:51 AM

#

woven galleon Planning to run local llms, tried to run via tokens and its burning through like...

i recommend github copilot as the provider, much cheaper than token based providers

woven galleon Mar 20, 2026, 7:51 AM

#

O wat interesting

#

Is the Mac minis chips actually just that much better

#

I should just be patient for the Mac mini restock tbh 🥀😭

quaint narwhal Mar 20, 2026, 7:52 AM

#

opus is 12cents a request vs when I tried the same call on opencode cost me $4 for a similar request

#

yea it's wild lol

woven galleon Mar 20, 2026, 7:52 AM

#

Yes dude yesterday I was trolling it for like 10 mins and it costed me 5$

#

😭

quaint narwhal Mar 20, 2026, 7:52 AM

#

a mac mini isn't gonna run local LLMs very good, you're still gonna be using API lol

woven galleon Mar 20, 2026, 7:53 AM

#

Ahhhh

quaint narwhal Mar 20, 2026, 7:53 AM

#

my pro tip as well, get a claude pro/max plan to build your bot

#

don't use the bot to build the bot

#

it's way too expensive to do that, learn from my pain

woven galleon Mar 20, 2026, 7:53 AM

#

woven galleon Mac Minis are sold out everywhere in Aus, looking to get this instead. Thoughts?...

So at that point these specs are fine?

woven galleon Mar 20, 2026, 7:54 AM

#

quaint narwhal don't use the bot to build the bot

I tried doing that yday with Claude to teach me to teach the bot 😭

#

Then Claude started getting impatient with me

#

XD

quaint narwhal Mar 20, 2026, 7:55 AM

#

woven galleon I tried doing that yday with Claude to teach me to teach the bot 😭

I don't fuck with windows anymore other than gaming

#

you have to run it in WSL

#

and last time I setup WSL it was a pain in the dick

woven galleon Mar 20, 2026, 7:56 AM

#

Linux only?

quaint narwhal Mar 20, 2026, 7:56 AM

#

woven galleon Linux only?

linux or mac, bash based

#

I think that's what's reccomended on the official website as well

vagrant musk Mar 20, 2026, 9:13 AM

#

quaint narwhal a mac mini isn't gonna run local LLMs very good, you're still gonna be using API...

Probably only a dual Connectx-7 dgx stack can run LLMs properly locally

dawn cosmos Mar 20, 2026, 10:58 AM

#

vagrant musk Probably only a dual Connectx-7 dgx stack can run LLMs properly locally

A ryzen AI + 395 Max, 128 LPDDR can run 70b model and 120bq4 - gmteck evo 2 or asus gx10 are couple of machine that have that chip

vagrant musk Mar 20, 2026, 11:16 AM

#

dawn cosmos A ryzen AI + 395 Max, 128 LPDDR can run 70b model and 120bq4 - gmteck evo 2 or a...

dgx stacks don't run 395 Maxs, they run blackwell

vagrant musk Mar 20, 2026, 11:21 AM

#

dawn cosmos A ryzen AI + 395 Max, 128 LPDDR can run 70b model and 120bq4 - gmteck evo 2 or a...

But yes those are also solid - NUCs like gmktec, geekom but also Dock-extended BeeLinks are all capable of running them locally (imo a lot more valua for money than anything Apple has to offer)

I do think a dual stack dgx - like the gx10 you mentioned, there's a couple more I think one from hp and from msi as well that run the same blackwell chips; basically dgx architecture - those are currently the peak of mini-hosts for LLMs

quaint narwhal Mar 20, 2026, 3:12 PM

#

vagrant musk Probably only a dual Connectx-7 dgx stack can run LLMs properly locally

aren't people daisy chaining mac studios to do it too?

vagrant musk Mar 20, 2026, 3:18 PM

#

quaint narwhal aren't people daisy chaining mac studios to do it too?

Boils down more expensive and less value for money and I think it throttles a bit more no? Not sure on the full details

quaint narwhal Mar 20, 2026, 3:19 PM

#

vagrant musk Boils down more expensive and less value for money and I think it throttles a bi...

yea not looking to drop 20stacks to experiment lol

#

can normies readilly able to buy those nvidia boxes?

vagrant musk Mar 20, 2026, 3:35 PM

#

quaint narwhal can normies readilly able to buy those nvidia boxes?

I mean you can buy 1 with like a 200B parameter tollerance roughly for about 2 grand which is pretty standard nowadays seeing how a phone is 1500

quaint narwhal Mar 20, 2026, 3:48 PM

#

vagrant musk I mean you can buy 1 with like a 200B parameter tollerance roughly for about 2 g...

like people can just buy these nvidia boxes? like on amazon or something?

vagrant musk Mar 20, 2026, 3:49 PM

#

look up gx10

pastel scarab Mar 20, 2026, 8:27 PM

#

vagrant musk I mean you can buy 1 with like a 200B parameter tollerance roughly for about 2 g...

and 400/500B?

#

what do i need to have to run those models

vagrant musk Mar 20, 2026, 8:38 PM

#

pastel scarab and 400/500B?

400 you can do a bit at rate limit with 2 DGX Sparks in parallel as I think they can do up to 405B with a connectx-7 connector

above that, you have to stop looking at mini-PCs and start looking at H100s Platforms or so

#

But you're looking at 10x the cost for a leap like that

#

Still also have to factor in that the throughput difference of those machines are like 10-50x the difference as well tho, like H100s, H200s, B200s

craggy ferry Mar 20, 2026, 8:56 PM

#

for 400b (like qwen3.5-397b) a mac studio is actually pretty reasonable at running them for the price

#

prefill sucks so keep contexts short, use smaller models to summarize / call tools, etc, but for one or two convos it works

craggy ferry Mar 20, 2026, 9:24 PM

#

also just run quantized models, you can fit the 6 bit 397b in the 512gb with tons of room for context that you will likely never fill because see aforementioned point about prefill sucking

pastel scarab Mar 20, 2026, 9:26 PM

#

but is there a big difference between quantized and full model?

craggy ferry Mar 20, 2026, 9:26 PM

#

almost nothing at 8 bit, 6 bit you shave a bit more but it's still really close, 4 bit is like the last stop before real degradation happens

#

but you don't need to go below 4 bit unless you're trying to run glm-5 or k2

pastel scarab Mar 20, 2026, 9:27 PM

#

i have like 5k budget

#

what could i buy?

craggy ferry Mar 20, 2026, 9:29 PM

#

256g m3 studios are like 5k with the education discount. do you know literally anyone who is currently in school or works in education

pastel scarab Mar 20, 2026, 9:29 PM

#

me

craggy ferry Mar 20, 2026, 9:29 PM

#

blam

pastel scarab Mar 20, 2026, 9:29 PM

#

how much is the discount?

craggy ferry Mar 20, 2026, 9:29 PM

#

go search "apple education store"

#

it's like 10-15% i think? it covered more than tax for me

pastel scarab Mar 20, 2026, 9:30 PM

#

im living in europe so its 6,5k euro

craggy ferry Mar 20, 2026, 9:30 PM

#

after edu discount? damn, that's annoying

pastel scarab Mar 20, 2026, 9:30 PM

#

but with the 60 core gpu

pastel scarab Mar 20, 2026, 9:30 PM

#

craggy ferry after edu discount? damn, that's annoying

jap

#

its like almost 10k without discount full setup 256gb

craggy ferry Mar 20, 2026, 9:31 PM

#

yeah without edu discount it's 6k before tax in us

pastel scarab Mar 20, 2026, 9:31 PM

#

are u from us?

craggy ferry Mar 20, 2026, 9:31 PM

#

yeah

pastel scarab Mar 20, 2026, 9:38 PM

#

or should i wait for the m5?

vagrant musk Mar 20, 2026, 10:23 PM

#

craggy ferry for 400b (like qwen3.5-397b) a mac studio is actually pretty reasonable at runni...

Ye, can't run em on the minis

#

Gotta go bigger

dawn girder Mar 20, 2026, 10:45 PM

#

vagrant musk Ye, can't run em on the minis

I was thinking that with the stock grade Mac mini

#

I was thinking about getting a 32GB

vagrant musk Mar 20, 2026, 11:06 PM

#

dawn girder I was thinking that with the stock grade Mac mini

Idk maybe I'm just biased but I don't see the value - I think you can literally get a Geekom/gmktec/beelink for roughly the same price but like 96-128GB Ram

winter lynx Mar 21, 2026, 2:21 AM

#

craggy ferry 256g m3 studios are like 5k with the education discount. do you know literally a...

is that for one of the apple models ?

craggy ferry Mar 21, 2026, 5:49 AM

#

winter lynx is that for one of the apple models ?

Yes apple makes macs

low aurora Mar 21, 2026, 6:03 AM

#

someone using lepotato soc hardware?

pastel scarab Mar 21, 2026, 4:57 PM

#

do you know a good model for cold emails?

tired plover Mar 21, 2026, 5:04 PM

#

winter lynx is that for one of the apple models ?

i just saw 700€ discount from 7300€, so not really 5k...

grave shoal Mar 21, 2026, 6:22 PM

#

Also check the official Apple refurb store.

#

Got my Mac Studio from there.

reef hollyBOT Mar 21, 2026, 8:11 PM

#

gonna be getting a Mac Mini for my bot, Clawy (and switching him to a local model, hopefully that doesn't affect his ability to post on Moltbook), because honestly the 128k usage token limit for cloud models that Ollama offers for the free tier is pretty limited

solemn valeBOT Mar 21, 2026, 8:11 PM

#

@silver ginkgo, Openclaw isn't affiliated with Moltbook. Moltbook is a separate user-developed project, so we would prefer it not be discussed in this server.

silver ginkgo Mar 21, 2026, 8:12 PM

#

solemn vale <@1331870813527478302>, Openclaw isn't affiliated with Moltbook. Moltbook is a s...

ok

lament hull Mar 21, 2026, 9:21 PM

#

Why are all the Mac minis sold out.

spiral vector Mar 21, 2026, 9:58 PM

#

lament hull Why are all the Mac minis sold out.

Intersection of 3 points. 1 - the success of openclaw has really driven demand more than apple expected. 2 - Apple is in the process of switching their M4 lineup to the new M5 lineup. 3 - global RAM shortage (Yes, I know the built-in memory on mac silicone is different, but at some level people will go for whatever PC parts they can get)

lament hull Mar 21, 2026, 10:23 PM

#

Yeah it has gotten insane.

winter lynx Mar 21, 2026, 11:13 PM

#

spiral vector Intersection of 3 points. 1 - the success of openclaw has really driven demand ...

Its crazy how much demand openclaw specifically, and AI in general has caused the prices in the market to be SOOOO high

vocal shard Mar 22, 2026, 12:05 AM

#

anyone still using a 2018 mac mini for their agent?

surreal nova Mar 22, 2026, 5:40 AM

#

winter lynx Its crazy how much demand openclaw specifically, and AI in general has caused th...

Yeah. But for a tiny sandbox box or a lots of ram lets load a model box

shrewd nest Mar 22, 2026, 2:45 PM

#

spiral vector Intersection of 3 points. 1 - the success of openclaw has really driven demand ...

we can use a VPS right? why Mac Mini 🤔

shrewd nest Mar 22, 2026, 3:18 PM

#

Is a 16GB M1 Mac Mini good enough?

spiral vector Mar 22, 2026, 5:15 PM

#

shrewd nest Is a 16GB M1 Mac Mini good enough?

Good enough for what exactly? Yes, its good enough for some things, no not goot enough for all things. If you scroll up a bit I repasted the link for "claw eval" on github. They do a great job of detailing what the different models can do.

Personally, I played with openclaw in a VM on my NAS - connected to various cloud service APIs. I never was able to get it as locked down as I was comfortable with - although now with nemoclaw from nvidia that seems improved (but still not completely fixed). When Claude code rolled out their remote access and now claude code channels I jumped back into that. (Which fits great on any old mini-PC.)

full talon Mar 22, 2026, 8:39 PM

#

spiral vector Good enough for what exactly? Yes, its good enough for some things, no not goot...

claw eval just posted MiniMax 2.7 tests for openclaw agents

wide roost Mar 22, 2026, 8:53 PM

#

Hi all,

Evaluating my setup's cloud cost equivalent and curious about your experiences. Here's what I'm running locally:

Compute Nodes:

Node CPU RAM GPU/Accel Cloud Equivalent
Brain Ryzen 5 4500 (6c) 15GB RX 550 4GB ~$40/mo
Nebenhirn Ryzen 7 2700 (8c) 31GB GTX 1650 4GB ~$60/mo
Muskeln - 62GB RTX 2070 Super 8GB ~$150/mo
LubanCat 4x ARM 3.8GB - ~$15/mo
Pi5_1-4 4x ARM 16GB total - ~$20/mo
Kleinhirn 2 RK3588 2GB 2GB NPU n/a
Kleinhirn 3 RK3566 2GB Mali GPU n/a
HP Notebook Ryzen 5 5600U 14GB Vega iGPU ~$35/mo
LLM Stack:

Brain: Ollama with qwen3:8b (local), ATXP fallback for complex reasoning
Nebenhirn: SD + Ollama (GTX 1650)
Muskeln: SD + Ollama (RTX 2070S)
Totals: ~160GB RAM, 16GB GPU VRAM, 70+ cores, 2x NPU
Cloud equivalent: $320/month
My cost: Hardware already owned (€800 invested), ~€15/mo electricity

For those running local LLMs: at what point did you break even vs. API costs? And what's your "too big for home, must go cloud" threshold?

Context: trying to justify keeping this running vs. just using GPT-4 API for everything. The privacy aspect weighs heavy, but so does the electricity bill. 😅

shrewd nest Mar 22, 2026, 11:21 PM

#

spiral vector Good enough for what exactly? Yes, its good enough for some things, no not goot...

I will check it out thanks

half haven Mar 23, 2026, 12:10 AM

#

Appreciate any feedback.
Just out of curiosity as I've tried every other issue, is this HONOR MagicBook Pro 14 2025 14.55 inch ARL Ultra9 UMA 32GB SSD 1TB Grey Windows 11 good enough to run Ollama and qwen3:32b with open claw?

little scroll Mar 23, 2026, 7:51 AM

#

wide roost Hi all, Evaluating my setup's cloud cost equivalent and curious about your expe...

maybe it common knwoledge, but does an 8b model works ok? what could you do with it? (for claw I mean!) Im spoilt with larger local model and Im not sure I have tested this one.

dawn cosmos Mar 23, 2026, 10:03 AM

#

half haven Appreciate any feedback. Just out of curiosity as I've tried every other issue, ...

No for local model. Openclaw itself can run very well with cloud llm providers

half haven Mar 23, 2026, 10:13 AM

#

dawn cosmos No for local model. Openclaw itself can run very well with cloud llm providers

Thanks for replying. Do you know if any of the Ollama models will run on the machince with Openclaw?

storm hedge Mar 23, 2026, 10:44 AM

#

half haven Thanks for replying. Do you know if any of the Ollama models will run on the mac...

any LLM that fits in your Ram... use LLM Studio + Hugging Face, with the one click option on hugging face you will see the size and if i fits befor downloading.

dawn cosmos Mar 23, 2026, 11:14 AM

#

half haven Thanks for replying. Do you know if any of the Ollama models will run on the mac...

Openclaw itself is a kind of binary, does not take much cpu. It is the ollama or any other local tools that runs you can use. In fact run those in docker continer

sage jackal Mar 23, 2026, 2:49 PM

#

Hey Im new here. Heard about openclaw for the last 3 months and now finally have time to jump in. I guess this is the section where to talk about hardware. I realize mac minis are hard to get nowdays so I may have to redirect to macbooks, for running local llms is it ok a macbook pro m5 with 32 gb? When I ask to llms they say yes and no and I can see on forums people saying yes and no. So before jumping in I just wannna make sure I can still run some models. i dont need the high end ones, I just want to have a feel of jarvis at home and go from there. If it becomes vital then Ill upgrade to mac mini or studio. In the meantime so is a macbook pro m5 ok for a 32b model or lower ? Thx for answers, let me know if theres a section where I can get those answers already

lyric orchid Mar 23, 2026, 3:15 PM

#

sage jackal Hey Im new here. Heard about openclaw for the last 3 months and now finally have...

I would look at peoples results with the M4 mini 32gb, if it works there, it should work on the MBP M5 32gb, but yours would be a little faster I think, though 32b might be tight, you need some room for the OS and other processes! asking claude about this:
*Yeah, that's solid logic — same unified memory architecture, same memory tier, so if a model runs well on the M4 Mini 32GB it'll run at least as well (and ~25% faster on token gen) on the MBP M5 32GB. The chip difference doesn't affect what fits, only how fast it runs.
For their specific question about 32B models — that's actually the tricky boundary at 32GB. A 32B model at Q4 quantization needs roughly 18–20GB, so it fits, but leaves little headroom for the OS and context. Q8 of a 32B would be too large. So the honest answer is:

7B–14B models → runs great, multiple quant levels, no issues
32B at Q4 → fits but tight, performance will be acceptable not great
32B at Q8 or higher → won't fit cleanly
70B+ → no

So for a "Jarvis at home" vibe, they'd actually get a better experience targeting a well-tuned 14B (like Qwen or Mistral) than a cramped 32B. The 14B at Q8 will feel snappier and more capable than a 32B squeezed into Q4 at the memory limit.
The M4 Mini 32GB benchmarks would be a perfect proxy — same answer applies to the MBP M5 32GB, just faster. *

sage jackal Mar 23, 2026, 3:28 PM

#

lyric orchid I would look at peoples results with the M4 mini 32gb, if it works there, it sho...

Thx Kevin. So basically, bringing back down to 7/14B models quantized a bit should work from what I can read. The thing is Im so used to macbooks and I dont wanna wait x weeks before getting hold of any mini if I can already create something that just works on a nice mbp config. i think Ill jump on the mbp. Cool

lyric orchid Mar 23, 2026, 3:50 PM

#

sage jackal Thx Kevin. So basically, bringing back down to 7/14B models quantized a bit shou...

yeah, you definitely should be able to get started and try some things out before committing to new hardware. i.e. maybe it's "Jarvis like" , but smart enough? The more vram the better, but it's not clear to me where the jump is in functionality between 24-32 and more (48, 64, 96... ?). It all depends on what you are doing with it. check out claweval as well, they don't specifically show the 32gb mini, but do show 24 and 48, so somewhere between the two... https://github.com/explaindio/ClawEval/tree/master?tab=readme-ov-file#-which-tested-models-fit-on-your-hardware - not sure if there are speed results in there though, just model test results.
actually, re-reading your response, I thought you already had the MBP M5... the nice thing about the mini is that it's kinda meant to be running all the time, at least more than a laptop? I am actually running my OC on an old windows laptop I was longer using, put ubuntu on it (had it on an M1 mac mini, but I have other personal stuff on there, and wanted OC on a fresh machine without access to any other personal stuff), but I wonder if having a laptop running in my wiring closet 24/7 is the best long term strategy. next step, raspberry pi 5 🙂

sage jackal Mar 23, 2026, 4:01 PM

#

lyric orchid yeah, you definitely should be able to get started and try some things out befor...

Just so you know Ive worked at home for the last decade with a mbp constantly wired to the wall socket. Never had any issue. So imo a dedicated mbp for OC seems the best fit for me. Well see anyway.

shell kindle Mar 23, 2026, 11:37 PM

#

Question: Is anyone running a Mac Mini cluster?

tropic crater Mar 24, 2026, 4:54 AM

#

hi, I am AI Hardware Engineeer, new to this wonderful group.

I dropped a new roadmap article comparing the Mac mini M4 as a 24/7 OpenClaw server to a Jetson Orin Nano 8GB edge appliance—when each wins, how to squeeze real inference out of 8GB UMA, and a privacy/security stack from hardware through skills (no vendor hype, just tradeoffs).

https://jared-hpc.com/blog/mac-mini-openclaw-server

harsh thicket Mar 24, 2026, 5:10 AM

#

Have you had any luck? I’ve spent two days now trying to get openclaw running on my 2016 15” MacBook Pro. When I started it was running Sequoia via OCLP and had irreparable dependency issues. Based on bad info I wiped the machine and downgraded to Monterey only to run into the same problems. After struggling all day today I’m worn out. Both ChatGPT and Grok have led me in circles trying to repair the issues and get it running. Now I’m wondering if I go back to Sequoia if I can maybe run openclaw in Docker? Turns out Docker is not supported in Monterey anymore so that was a dead end. Sigh.

shell kindle Mar 24, 2026, 7:08 AM

#

harsh thicket Have you had any luck? I’ve spent two days now trying to get openclaw running on...

Ask Claude to diagnose issue.

nocturne girder Mar 24, 2026, 12:15 PM

#

Good morning,
I've been trying to set up the OpenAI, Gemini, and Anthropic APIs for a few days now, but I haven't been able to get any models other than OpenRouter to work.
I’m thinking of buying a PC to install some models locally since OpenRouter. I’ve seen one with a Ryzen 7, 32GB DDR5 RAM, and an RTX 4070. Will it work? Can it be configured to use the models locally? Many thanks

fair pond Mar 24, 2026, 1:38 PM

#

I keep hearing about cloud models. Is it not possible to run a local llm on my Mac mini m4 16gb without it being either super slow or unresponsive ? Wondering if anyone’s cracked this code yet.

floral geyser Mar 24, 2026, 1:52 PM

#

nocturne girder Good morning, I've been trying to set up the OpenAI, Gemini, and Anthropic APIs ...

Hey good morning! You don't need a new PC for this. The API keys for OpenAI, Gemini and Anthropic should work fine — its usually a config issue. What error are you getting when you try to connect them? Happy to help you troubleshoot.
If you do want to run models locally thats a different thing — that setup with Ryzen 7, 32gb ram and RTX 4070 would work for smaller models through Ollama. But honestly for openclaw you'll get way better results using cloud APIs like Claude Sonnet or GPT-5.x. Local models are slower and less capable for agent tasks. I'd fix the API setup first before spending money on hardware!

nocturne girder Mar 24, 2026, 2:08 PM

#

floral geyser Hey good morning! You don't need a new PC for this. The API keys for OpenAI, Gem...

Thanks a lot for the help, bro.
The problem is that when I try to run models using an API from any provider other than OpenRouter, I get errors like:
⚠️ Agent failed before reply: All models failed (2): openai/gpt-5.4-mini: Unknown model: openai/gpt-5.4-mini (model_not_found) | google/gemini-3-flash-preview: ⚠️ API rate limit reached. Please try again later. (rate_limit).
Logs: openclaw logs --follow
I've tried renaming the templates to create them, but nothing works for me except OpenRouter...
Plus, the token consumption is high for relatively trivial tasks like following companies on social media.
Thank you for help

floral geyser Mar 24, 2026, 3:12 PM

#

OpenAI-codex limits are quite brutal. Also, have you upgraded the openclaw for gpt-5.4 support ?

#

On telegram, use the message, “/models OpenAI-codex “. This will show you the models that are supported ..

lyric orchid Mar 24, 2026, 7:50 PM

#

nocturne girder Good morning, I've been trying to set up the OpenAI, Gemini, and Anthropic APIs ...

I originally tried ollama with a 4070, but personally I don't think 12 gb is enough gpu for local models. I was using ollama with that, hadn't discovered llamacpp, so could have gotten better times, but was limited to smaller models... qwen3:8b or qwen3.5:9b with 32K context. I've upgraded to a 3090 with 24gb and am much happier with the results and consistency of the bigger models. My test results here with the 4070 https://github.com/khaney64/ollama-model-tests/blob/main/reports/recommendations-4070.md

nocturne girder Mar 24, 2026, 8:02 PM

#

floral geyser OpenAI-codex limits are quite brutal. Also, have you upgraded the openclaw for g...

I definitely haven't updated—I just saw that I'm on v2026.3.13... Thanks a lot.

I haven't tried Opencodex yet; I'm going to see if I can run the “doctor” tool to clean up all the misconfigured models and reinstall them.

nocturne girder Mar 24, 2026, 8:02 PM

#

floral geyser On telegram, use the message, “/models OpenAI-codex “. This will show you the mo...

Thank you very much 🙏

nocturne girder Mar 24, 2026, 8:05 PM

#

lyric orchid I originally tried ollama with a 4070, but personally I don't think 12 gb is eno...

I'll definitely check it out. Thanks a lot.

nocturne girder Mar 24, 2026, 8:17 PM

#

floral geyser OpenAI-codex limits are quite brutal. Also, have you upgraded the openclaw for g...

Yes, sir, it was the update... thank you very much.

tawdry vault Mar 25, 2026, 3:33 AM

#

What’s the best Mac mini config? 32gb ram?

royal radish Mar 25, 2026, 4:11 AM

#

Hi, i'm running deepseek V3 and R1 on VPS (much cheaper) but i was wondering how is the difference running on local infra, do thinking mode is different from an llm to another changing the infra will not solve the model issue. i moved to deepseek because anthropic cost were high. i configuration on VPS is 16go memory and 200 Gdisk space

royal radish Mar 25, 2026, 4:16 AM

#

nocturne girder Thanks a lot for the help, bro. The problem is that when I try to run models us...

i had this issue as well, it is just config issue, you can easilly fix it with claude code

errant steppe Mar 25, 2026, 2:27 PM

#

tawdry vault What’s the best Mac mini config? 32gb ram?

max it out if you can afford it/get a good deal etc.

thorny elbow Mar 25, 2026, 3:02 PM

#

i got 8gb ram ohno

urban furnace Mar 25, 2026, 10:26 PM

#

did anyone else's windows 10 LMstudio stop yesterday. some trojan reported, likely false positive as per reddit, for version's 0.4.7 main/index.js file

#

my windows 11 has not reported this file.

#

I did moved the file back from quarantine but LMstudio since didn't launch its gui anymore on windows 10. file size is identical to the working windows 11 version.

trim marsh Mar 25, 2026, 11:06 PM

#

urban furnace I did moved the file back from quarantine but LMstudio since didn't launch its g...

#1111797717639901324 message

surreal nova Mar 26, 2026, 2:21 AM

#

urban furnace did anyone else's windows 10 LMstudio stop yesterday. some trojan reported, like...

so you follow localllm eh?

I read the thread and the bad version was live for ~hr.
If you feel you were exposed in that hr, rotate all affected passwords/keys.

If you are not cutting edge update person, simply update to the latest.

harsh thicket Mar 26, 2026, 5:17 AM

#

shell kindle Ask Claude to diagnose issue.

Thank you for this tip! I didn’t realize how much better at this Claude would be than ChatGPT and Grok. It got me up and running so fast it was shocking!

shell kindle Mar 26, 2026, 6:18 AM

#

Yeah I had the same sentiment. glad to help

rose sonnet Mar 26, 2026, 12:35 PM

#

hey guys, im in doubt about what OS to use for hosting openclaw with an anthropic model?. A bit of context, i have an HP elitedesk 800 g2 sff with extra ram and im gonna use that for the hosting, in general im gonna use claw for little things like, read all of my newsletters and create like a newspaper for it, set reminders via whatsapp and/or by using voice messages, use the Productivity skill in the clawhub and so on.

quasi forum Mar 26, 2026, 2:07 PM

#

rose sonnet hey guys, im in doubt about what OS to use for hosting openclaw with an anthropi...

are you comfortable running linux?

rose sonnet Mar 26, 2026, 2:32 PM

#

quasi forum are you comfortable running linux?

yes, my daily drive is linux

urban furnace Mar 26, 2026, 5:03 PM

#

trim marsh https://discord.com/channels/1110598183144399058/1111797717639901324/14861149156...

link does work

finite dirge Mar 26, 2026, 7:11 PM

#

Any opinions on hosting openclaw on android phones? Do they work?

urban furnace Mar 26, 2026, 8:10 PM

#

surreal nova so you follow localllm eh? I read the thread and the bad version was live for ...

it's the LocalLLaMa reddit I have read that, from a search result.

oh, was it actually a real thread? I had version 0.4.7 running for quite some days and suddenly the windows defender reported it, but just on windows 10 not 11. it was my first ever install of LMstudio. straight to this version 0.4.7

little vector Mar 26, 2026, 9:15 PM

#

What is the cheapest way to run a 24/7/365 agent?

novel thorn Mar 26, 2026, 9:27 PM

#

finite dirge Any opinions on hosting openclaw on android phones? Do they work?

Yes I have running on S25 Termux Terminal, it does have some limiation on multi agents but works. And is stable 🙂

craggy ferry Mar 26, 2026, 11:21 PM

#

little vector What is the cheapest way to run a 24/7/365 agent?

Kind of depends on what you want it to do. Like, technically, qwen3.5-0.8B exists.

blissful widget Mar 26, 2026, 11:26 PM

#

craggy ferry Kind of depends on what you want it to do. Like, technically, qwen3.5-0.8B exist...

tho far from capable 😄

craggy ferry Mar 26, 2026, 11:26 PM

#

Correct!

blissful widget Mar 26, 2026, 11:27 PM

#

I'm using glm4.7 flash and qwen 3.5 30b and they are still quite dumb at the moment.

#

reinforcement does help over time, but i dont wanna nudge them from time to time

#

which models do you guys have experience and are generally good doing tool calling? (LLM)

tranquil hazel Mar 27, 2026, 3:01 AM

#

little vector What is the cheapest way to run a 24/7/365 agent?

mini pc or even a Pi.

split canyon Mar 27, 2026, 3:24 AM

#

little vector What is the cheapest way to run a 24/7/365 agent?

I’m running on Raspberry Pi and very happy with the results

spiral vector Mar 27, 2026, 4:26 AM

#

Hate to say it here, but cheapest option for 24x7 agent lukely isn't openclaw at all - its just a $20/month claude or antigravity subscription - depending on your level of weekly usage. I dont mean running through their API (that's the most expensive option), rather just use the remote access options and run locally in a VM and access it from your phone or via VPN. You get a good amount of sonnet 4.6 usage for $20/month - even more haiku.

#

But, yes - if you want high usage, or if you're OK with running small models, then open claw on a cheap mini PC works. Cheap(ish) used mac mini's with apple silicone, or AMD Strix work best for local models. Local models really need 32GB VRAM. If you're just using openclaw to connect to cloud model API's, then you can run on any cheap PC with 8 or 16GB ram, 2-4 core CPU and minimal storage is fine.

#

I always point people to claweval on github to get an idea of what models you want and what you can run.

blissful widget Mar 27, 2026, 4:47 AM

#

I am using qwen 3.5 35b a3b and i had almost 90% sucess rate on performing tool calls 🙂

#

now i have to hook it to comfyui as tools

shadow urchin Mar 27, 2026, 8:37 AM

#

i just read about comfyUI, it sounds fascinating. can it be applied elsewhere?

thorny laurel Mar 27, 2026, 5:51 PM

#

Hey guys I'm new here. I'm tasked with building a OpenClaw setup but I'm having trouble figuring out the specs for the hardware. My boss wants it to mainly work as a bot that will search market data and trends on the internet for a specific market sector. The think is i don't know if i should move foward using a local LLM or Claude API. The RAM specs for each situation differ a lot and in my country MACs are much more expensive than other hardware. Should i still get the mac mini? I'm know a bit about LLMs but that's not my expertise.

frozen bridge Mar 27, 2026, 6:19 PM

#

split canyon I’m running on Raspberry Pi and very happy with the results

local llm or api

split canyon Mar 27, 2026, 6:21 PM

#

frozen bridge local llm or api

api of course

frozen bridge Mar 27, 2026, 6:22 PM

#

split canyon api of course

ok

steep wedge Mar 27, 2026, 7:47 PM

#

thorny laurel Hey guys I'm new here. I'm tasked with building a OpenClaw setup but I'm having ...

OpenClaw will run on just about anything. However, based on the description of what you want it to do, I don’t think you even need OC. I know you could schedule those sorts of recurring searches on Perplexity, and I imagine on OpenAI and Claude as well. Pay $200 a year and be done with it.

thorny laurel Mar 27, 2026, 7:59 PM

#

steep wedge OpenClaw will run on just about anything. However, based on the description of w...

I kind of agree with you, but unfortunately I have to follow orders 😅. In this case, I was told to use OpenClaw specifically. That being said, I’m trying to do my best not to spend too much company money (since I might be blamed later), while still building something that works well enough (I’ll later have to maintain it)

steep wedge Mar 27, 2026, 8:04 PM

#

thorny laurel I kind of agree with you, but unfortunately I have to follow orders 😅. In this ...

Understood. In that case I would go with an API solution. That keeps the upfront costs low since you won’t need powerful hardware. It does result in a recurring monthly spend, but they can adjust or turn it off if it’s too expensive.

thorny laurel Mar 27, 2026, 8:07 PM

#

Thank you, i was really lost there

surreal nova Mar 27, 2026, 9:47 PM

#

love the respect here
wrong tool for the job
under orders
ok, well heres how best to work with that
🙂

finite dirge Mar 27, 2026, 10:17 PM

#

Hi Watson, thank you for your reply. I have developed an application to control the android phone camera, sensors, calls, torch, etc but it is not stable.

Wondering if installing openclaw on android phone would be not be a good idea unless the phone is powerful enough for stability.

red cypress Mar 28, 2026, 1:43 PM

#

thorny laurel Hey guys I'm new here. I'm tasked with building a OpenClaw setup but I'm having ...

for research probably want a high-end cloud model (different from what hardware to run it on) from claude (opus/sonnet), has to do web search, go through all the data, put together analysis, present it in whatever way you want it (email, telegram? ppt?). maybe try it first using the primary path (ie. claude website/cowork or chatgpt website) with the queries you want to try and see how well they work before you hand it off to openclaw to run on its own and hope it works. Using frontier models aren't cheap, just a warning, and it takes a bit to get used to how openclaw works with memory/context, tons of info out there, just have to play with it.

hoary sable Mar 28, 2026, 2:49 PM

#

thorny laurel Hey guys I'm new here. I'm tasked with building a OpenClaw setup but I'm having ...

A pc powerful enough to run a web browser will work. I dont know about if windows will run openclaw but I can tell you a low end pc running linux works just fine.. all the heavy work is done by the model provider (anthropic, or whoever)

thorny laurel Mar 28, 2026, 3:23 PM

#

Thank You

steep wedge Mar 29, 2026, 12:38 AM

#

Hello fellow DGX Spark owners. I have my two Asus Ascent GX10s clustered, and I was running Llama-3.1-Nemotron-70B-Instruct-HF for most of the day. I hated it. 😂 Super annoying personality, but the real problem was I had to drop conext window to 32k to squeeze past the CUDA graph step when bringing the model online. Anyway, I just nuked that, and I am going to give Qwen3.5-122B-A10B-FP8 a try. Any other recommendations on models you have liked running on a 2-node cluster?

craggy ferry Mar 29, 2026, 12:40 AM

#

You’ll like 3.5-122b

#

Wish I had pulled the trigger on a gx10 while they were still cheap but

steep wedge Mar 29, 2026, 12:46 AM

#

craggy ferry Wish I had pulled the trigger on a gx10 while they were still cheap but

I hear you. I bought the second one for too much through Best Buy, but I had gift cards that I got for a 20% discount, so it basically cancelled out the extra markup. I figured if I waited any longer, it would only get worse before it got better.

fickle vapor Mar 29, 2026, 1:26 AM

#

I'm running a very limited PC, I have two rtx 2070 supers with an NVLINK bridge installed on pop os linux. Right now qwen3.5 9b seems to be the only one that fits - is there somthing I'm missing here? Every time I try to run 27b it grinds to a halt

#

(openclaw is running on an unprivledged lxc container on my proxmox host)

steep marten Mar 29, 2026, 5:16 AM

#

Question, if i'm on like a budget is this a good build for openclaw?

#

GPU: ASRock Radeon AI Pro R9700 Creator 32GB — $1,299.99 — https://www.microcenter.com/product/702444/asrock-amd-radeon-ai-pro-r9700-creator-single-fan-32gb-gddr6-pcie-50-graphics-card

CPU: AMD Ryzen 7 7700 — $253.99 — https://www.bestbuy.com/product/amd-ryzen-7-7700-8-core-16-thread-3-8-ghz-5-3-ghz-max-boost-socket-am5-unlocked-desktop-processor-silver/JXKQHH52X5

Motherboard: MSI MAG B650 Tomahawk WiFi — $219.99 — https://www.microcenter.com/product/659662/msi-b650-mag-tomahawk-wifi-amd-am5-atx-motherboard

RAM: Crucial Pro 128GB (2x64GB) DDR5-5600 CL46 — CP2K64G56C46U5 — $1,242.99 — https://www.bestbuy.com/product/crucial-pro-128gb-2x64gb-ddr5-5600mhz-c46-udimm-desktop-memory-black/JX8PSKCS2V/sku/6637048

SSD: WD Black SN770 2TB — $264.99 — https://www.microcenter.com/product/682892/wd-black-sn770-2tb-112l-tlc-nand-flash-pcie-gen-4-x4-nvme-m2-internal-ssd

PSU: Corsair RM850e ATX 3.1 850W — $124.99 — https://www.microcenter.com/product/689529/corsair-rme-series-rm850e-850-watt-cybenetics-gold-atx-fully-modular-power-supply-atx-31-compatible

Case: Corsair 3000D Airflow — $94.99 — https://www.corsair.com/us/en/p/pc-cases/cc-9011251-ww/3000d-tempered-glass-mid-tower-black-cc-9011251-ww

Air cooler: Thermalright Peerless Assassin 120 SE — $39.99 — https://www.microcenter.com/product/704460/thermalright-peerless-assassin-120-se-cpu-air-cooler

Total: $3,541.92

spiral vector Mar 29, 2026, 5:38 AM

#

That seems like a good all around PC that also supports open claw. If you want a pure-open claw system, with local LLM support, you can save some more by looking at unified memory systems (mac mini or strix halo are good). But the unified memory systems are not as good for tasks like gaming if you also want to use the system for that.

#

Intel just released their B70 GPU also - $949 for 32GB. But that'll come with even more software/model compatibility issues that AMD will with ROCm - cheaper if you're willing to fight through it and/or wait for others to build intel specfic versions of the models you want to run.

#

Oh - there's also 2 newer versions of the peerless assassin - about the same price, or $5 more, slightly improved performance.

steep wedge Mar 29, 2026, 12:20 PM

#

fickle vapor I'm running a very limited PC, I have two rtx 2070 supers with an NVLINK bridge ...

I looked up RTX 2070, and it looks like they have 8 GB of VRAM, so 16 GB with your pair. It's not exactly 100%, but as a rule of thumb I equate model size to RAM needed. In your case, that 27b model would need 27 GB of VRAM. 27 > 16 so it won't fit. The actual RAM needed is not exactly 27 GB, but that model is almost certainly too big for you 2x2070s.

fickle vapor Mar 29, 2026, 2:58 PM

#

steep wedge I looked up RTX 2070, and it looks like they have 8 GB of VRAM, so 16 GB with yo...

So stick to 9b. Is 9b smart enough to analyze logs and run administrative actions on a small virtual network?

#

Thanks for your attention in this matter, I should have been more considerate and listed the VRAM sizes. Apologies

steep marten Mar 29, 2026, 3:16 PM

#

spiral vector Intel just released their B70 GPU also - $949 for 32GB. But that'll come with e...

thanks :D I prefer having good compatibility so not very intrested in the b70 thanks for your feedback though

steep wedge Mar 29, 2026, 3:30 PM

#

fickle vapor So stick to 9b. Is 9b smart enough to analyze logs and run administrative action...

That's an interesting idea. I have similar aspirations but I have access to some larger models. The thing I am slowly learning is that I am not always a great prompt writer, and the smaller models need very tight, well-structured prompts to produce the best results. I often get the best results when I use one of the online models (e.g., Gemini or Claude) to help me write a better prompt for the little local model(s). Now that I think of it, I should probably have the big boys help me write better agent/soul/memory files for the little guys.

steep marten Mar 29, 2026, 3:31 PM

#

if i buy 128gb of ddr5 will there be any finetuning or anything with the bios ill have to configure? if so what?

oak frost Mar 29, 2026, 5:39 PM

#

Check your mainboard details, there mostly limits how big the ram modules ca be.

steep wedge Mar 29, 2026, 6:22 PM

#

New project: Dell R740xd with three Nvidia Quadro P5000s. It's a solution in search of a problem. I have some ideas, but open to others.

grave shoal Mar 29, 2026, 6:33 PM

#

steep marten if i buy 128gb of ddr5 will there be any finetuning or anything with the bios...

Shouldn’t be as long as your motherboard supports it.

grave shoal Mar 29, 2026, 6:35 PM

#

steep marten Question, if i'm on like a budget is this a good build for openclaw?

Openclaw? Overkill. But if you are trying to run a local LLM then you want be too impressed with 32GB vram.

steep marten Mar 29, 2026, 6:35 PM

#

well i'm doing both

#

and i like training and finetuning

steep marten Mar 29, 2026, 6:37 PM

#

grave shoal Openclaw? Overkill. But if you are trying to run a local LLM then you want be to...

could you explain what u mean by " local LLM then you want be too impressed with 32GB vram."

steep wedge Mar 29, 2026, 6:44 PM

#

steep marten could you explain what u mean by " local LLM then you want be too impressed wi...

32GB VRAM is not a lot, and you won't be able to run models of much size/quality. Your results with OpenClaw will generally be poor.

steep marten Mar 29, 2026, 7:17 PM

#

steep wedge 32GB VRAM is not a lot, and you won't be able to run models of much size/quality...

i mean i'm only looking for 32b quantized

#

qwen 32b runs pretty well for openclaw with my testing atleast

steep wedge Mar 29, 2026, 7:19 PM

#

When you say it runs well, do you mean speed or accuracy and usefulness?

grave shoal Mar 29, 2026, 7:37 PM

#

steep marten could you explain what u mean by " local LLM then you want be too impressed wi...

Larger models give better results. With that you need more vram to use the llm effectively. 32GB wont allow you to so much. Of course it depends what you want to do. i guess.

craggy ferry Mar 29, 2026, 7:39 PM

#

fickle vapor I'm running a very limited PC, I have two rtx 2070 supers with an NVLINK bridge ...

Try quantization. Look up unsloth. No one runs unquantized models locally.

craggy ferry Mar 29, 2026, 7:41 PM

#

steep wedge 32GB VRAM is not a lot, and you won't be able to run models of much size/quality...

Qwen3.5-27b runs fine in 32g and it’s useful enough

lyric orchid Mar 30, 2026, 12:43 AM

#

craggy ferry Qwen3.5-27b runs fine in 32g and it’s useful enough

And qwen3.5-35b-a3b as well, quicker than 27b but not as "smart" but depends on your tasks (and prompts!)

lyric orchid Mar 30, 2026, 1:01 AM

#

steep marten GPU: ASRock Radeon AI Pro R9700 Creator 32GB — $1,299.99 — https://www.microcent...

The memory prices are just insane! And system memory isn't going to help with local LLMs - I'd get half the memory and put the money into a bigger GPU if possible, or an Nvidia GPU if they get you better performance / compatibility - you can always add memory later.

fickle vapor Mar 30, 2026, 1:28 PM

#

craggy ferry Try quantization. Look up unsloth. No one runs unquantized models locally.

Thank you. Does quantization affect the quality of the answers generated?

fickle vapor Mar 30, 2026, 1:29 PM

#

steep wedge That's an interesting idea. I have similar aspirations but I have access to some...

How about using an online or large model to help you structure your prompt for your smaller models?

craggy ferry Mar 30, 2026, 1:31 PM

#

fickle vapor Thank you. Does quantization affect the quality of the answers generated?

A small bit, 10% or so at the 4 bit level

fickle vapor Mar 30, 2026, 1:32 PM

#

Thank you! I plan on using my local free model just to execute routine tasks like log scanning for emergencies.

#

The more difficult tasks get online models

#

I only have 16g VRAM. From the beginning my models were going to be limited

fickle vapor Mar 30, 2026, 1:34 PM

#

craggy ferry A small bit, 10% or so at the 4 bit level

May I please have your opinion on abliterated models?

#

I'm probably not going to run them - but I've also heard some people say abliterated models are slightly smarter than baseline

steep wedge Mar 30, 2026, 1:47 PM

#

fickle vapor How about using an online or large model to help you structure your prompt for y...

I do that too. Turns out I’m not great at writing prompts.

craggy ferry Mar 30, 2026, 1:48 PM

#

fickle vapor I'm probably not going to run them - but I've also heard some people say abliter...

It’s the opposite in my understanding

#

It’s sort of like lobotomizing them in a very particular way. The bits that light up when they refuse to do things because of their safety guardrails are also the bits that light up when they refuse to follow prompt injection attempts

#

You’re trading off safety basically

lyric orchid Mar 30, 2026, 4:25 PM

#

fickle vapor I only have 16g VRAM. From the beginning my models were going to be limited

re: 16g vram, I had good results with qwen3.5:9b, and qwen3:8b and :14b on a 12 gb 4070, they'd probably work for you. not fast enough for primary agent / chat sessions IMHO but fine for tasks. the key for me is create an md file for some task, have the cron job instruction say read and follow the instructions in the md file, see how it does, give the results and the md to claude code, have CC tweak it, lather, rinse, repeat until it does what I want. I mentioned here or in another chat that I also have CC generated proxy between OC and ollama/llamacpp so I can watch the traffic, see what the model is doing, see where it gets confused or stuck, feed that back into CC, adjust the prompt.

steep marten Mar 30, 2026, 4:52 PM

#

lyric orchid The memory prices are just insane! And system memory isn't going to help with l...

memory is currently expected to grow in price so i'm buying more now because ill need it later fgor other projects

gritty prism Mar 31, 2026, 7:55 AM

#

Im looking to get a mini PC. Would like to get a good spec that i can upgrade up to 128gb ram but starting at 32gb ram 1tb nvme. Gemini is recommending me the Minisforum AI X1 Pro (obviously one of the most expensive options). I would like to know if anybody has experience with the X1 or if anyone recommends something else. I do not want apple as i want to run linux. Appreciate the time in advance!

dawn cosmos Mar 31, 2026, 9:18 AM

#

gritty prism Im looking to get a mini PC. Would like to get a good spec that i can upgrade up...

Gmteck evo-x2, get the 128Gb one as this is the SoC 'cheapest out therr

gritty prism Mar 31, 2026, 11:40 AM

#

gritty prism Im looking to get a mini PC. Would like to get a good spec that i can upgrade up...

Are you using this one aswell? If so, how are your experiences with it?

fickle vapor Mar 31, 2026, 12:49 PM

#

steep marten memory is currently expected to grow in price so i'm buying more now because il...

I heard from my wife that memory just dropped in price yesterday. Is what I heard just cope?

gloomy crescent Mar 31, 2026, 2:47 PM

#

just the news, nothing really happening

keen tiger Mar 31, 2026, 5:33 PM

#

I am planning to get a Mac Mini with M4 cpu and 32GB ram. What is the biggest model that I can use on it with OpenClaw ? Do anyone have experience with a 9b Qwen model on such hardware ?

steep marten Mar 31, 2026, 6:40 PM

#

fickle vapor I heard from my wife that memory just dropped in price yesterday. Is what I hear...

it might, had to do with some weird deal with openai and another company so idek

modern axle Mar 31, 2026, 7:08 PM

#

steep marten it might, had to do with some weird deal with openai and another company...

I think it has to do with Googles new findings that can reduce to 6x the amount they need to run LLMs without compromising results. https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/

gritty prism Mar 31, 2026, 7:10 PM

#

modern axle I think it has to do with Googles new findings that can reduce to 6x the amount ...

wait what

steep marten Mar 31, 2026, 7:10 PM

#

modern axle I think it has to do with Googles new findings that can reduce to 6x the amount ...

i thought it was some openai deal..? because they failed to fufill its commitment of 40% of the world's ram

modern axle Mar 31, 2026, 7:11 PM

#

steep marten i thought it was some openai deal..? because they failed to fufill its commit...

Could very possibly be, I just know of the research google released

steep marten Mar 31, 2026, 7:11 PM

#

modern axle Could very possibly be, I just know of the research google released

also the quantization wouldn't have much to do with ram... right?

modern axle Mar 31, 2026, 7:13 PM

#

The idea is that TurboQuant reduces memory requirements and improves response performance and latency while maintaining accuracy. In practice, it would allow AI models to access more contextual data while using less space and avoiding hallucinations. Source

#

Together, they could help alleviate the memory bottleneck. Although it wouldn't do much for training data centers, which also require monstrous amounts of memory, it could thin out the RAM needs of inferencing systems. It probably wouldn't do much to solve the current memory crisis, as deployment would take time, and memory orders are already locked in for many months. But perhaps it could help bring the RAM shortage to a close before 2030. Same Source

agile sentinel Mar 31, 2026, 9:19 PM

#

@lyric orchid i was gonna ask you if you had tried qwen3.5-27b opus destilled v2 via turboquant as it allegedly fits 16gb

#

https://x.com/i/status/2038725930626003140

astral gobletBOT Mar 31, 2026, 9:23 PM

#

agile sentinel https://x.com/i/status/2038725930626003140

@i via Twitter

David T (@coffeecup2020)

Turbo Quant not just for KV, can use it on weights.
︀︀
︀︀I bought an RTX 5060 Ti 16GB around Christmas and had one goal: get a strong model running locally on my card without paying api fees. I have been testing local ai with open claw.
︀︀
︀︀I did not come into this with a quantization background. I only learned about llama, lmstudio and ollama two months ago.
︀︀
︀︀I just wanted something better than the usual Q3-class compromise (see my first post for benchmark). Many times, I like to buy 24gb card but looking at the price, I quickly turned away.
︀︀
︀︀When the TurboQuant paper came out, and when some shows memory can be saved in KV, I started wondering whether the same style of idea could help on weights, not just KV/ cache.
︀︀P/S. I was nearly got the KV done with cuda support but someone beat me on it.
︀︀After many long nights (until 2am) after work, that turned into a llama.cpp fork with a 3.5-bit weight format I’m callin…

lyric orchid Mar 31, 2026, 9:45 PM

#

agile sentinel <@1020084715354595380> i was gonna ask you if you had tried qwen3.5-27b opus des...

I have not tried that one

spiral vector Mar 31, 2026, 10:11 PM

#

keen tiger I am planning to get a Mac Mini with M4 cpu and 32GB ram. What is the biggest mo...

https://github.com/explaindio/ClawEval I think this is the best source for determining what sort of local models can do.

spiral vector Mar 31, 2026, 10:28 PM

#

https://www.amazon.com/NIMO-AMD-Ryzen-Max-395/dp/B0GQ2L4CQL seems to be the cheapest ($2500) I've seen a 128GB Strix Halo (96GB useable for GPU). Others seem to be all $3k now. Never heard of this brand though.

royal radish Apr 1, 2026, 4:06 AM

#

spiral vector https://www.amazon.com/NIMO-AMD-Ryzen-Max-395/dp/B0GQ2L4CQL seems to be the chea...

does this work well to run models ?

spiral vector Apr 1, 2026, 4:14 AM

#

Strix Halo are generally the "cheap" option to run local models (relative to DGX Spark's or Mac Mini's ($4-$5k+)). AMD Strix Halo doesn't work as fast as the similarly sized competitors, but they're a lot cheaper. So "working well" is a cost vs speed concern here (think 20 tokens per second instead of 30).

royal radish Apr 1, 2026, 4:35 AM

#

spiral vector Strix Halo are generally the "cheap" option to run local models (relative to DGX...

yes ! thanks, what do you think about the mac studio M3 ultra chip 96gb ?

shadow ingot Apr 1, 2026, 4:35 AM

#

can a X1 Pro-370 Mini PC AMD Ryzen AI 9 HX370 handle running open claw okay without issues?

royal radish Apr 1, 2026, 4:36 AM

#

shadow ingot can a X1 Pro-370 Mini PC AMD Ryzen AI 9 HX370 handle running open claw okay with...

ok but not that much; it depends on what you want to do with it

shadow ingot Apr 1, 2026, 4:38 AM

#

royal radish ok but not that much; it depends on what you want to do with it

I really want to explore what is possible without to many limitions. but also would like it compact like a mac mini if it was possible, but mac minis in my area are sold out and even online can't get the specs I want until july it says lol.

#

my budget is like 3k tops

royal radish Apr 1, 2026, 5:00 AM

#

you can have decent machines with 3k

#

it always depends on your usage

#

the thing is that when you start to explore you don't want to be limitated..

spiral vector Apr 1, 2026, 5:13 AM

#

There's a signifigant difference between running open claw, but using cloud-hosted models vs running with local models. The former takes almost no hardware, but will come with a monthly bill for API costs. The later take much more local hardware, but then no (or at least less) per month for API costs

#

For some tasks, only the best frontier models are good (for some tasks even those aren't good enough yet). So its hard to say that even with $10k+ hardware that you can do it all.

shadow ingot Apr 1, 2026, 5:19 AM

#

good to know yeah im not sure what ill be able to do or where ill have bottle necks

steep wedge Apr 1, 2026, 1:54 PM

#

Have you looked at the DGX Spark clones, or do you need this for general purpose computing as well?

#

If you’re using cloud APIs anyway, why bother with a Jetson?

lyric orchid Apr 2, 2026, 2:18 PM

#

speaking of electricity costs, I'm wondering if there are any easy ways for me to pull power information from the GPU, I'd like to see if I can build out something that would match up the GPU power consumption with the jobs that I run in openclaw, and try to come up with a "cost" for each job. I really would like to see if it's worth having this local setup to do what I've been doing, vs. just find a 10-20 month plan or pay for tokens, i.e. compare costs. I know programs like HWInfo show me the power details... maybe time to have a conversation with claude code and see what we can build! Maybe I can set up a few solar panels to power the "inference" machine and let the sun pay for the GPU time!

viral ridge Apr 2, 2026, 4:48 PM

#

lyric orchid speaking of electricity costs, I'm wondering if there are any easy ways for me t...

You could pretty easily setup a beszel server (monitor hub) and install the beszel agent on the gpu server. that would give you the system info real time/history. then you just match request timing to gpu power draw. its not going to be 100% "from the wall" pull but should give you a decent idea.

grave shoal Apr 2, 2026, 5:12 PM

#

I’m running a 64GB Mac Studio M2 Max. Local llm results are slow and nowhere near as reliable as Anthropic or OpenAI models. But depends what your goal is.

#

For some basic reasoning tasks it’s not bad.

lyric orchid Apr 2, 2026, 5:51 PM

#

viral ridge You could pretty easily setup a beszel server (monitor hub) and install the besz...

I'll have to look into the beszel server. I hacked some code into the proxy I've been using between openclaw and llama.cpp to monitor, it uses nvidia-smi --query-gpu=power.draw. I may set something up to push this data so influxdb, then I can do some charts in Grafana!
2026-04-02T17:42:17.350Z [done] job=downloader-summary qwen35-35b-a3b reason=stop prompt=49 (0.1% of 40960 ctx) gen=319 ratio=651.0% pp=492.8tok/s(99ms) tg=96.7tok/s(3.30s) total=3.40s elapsed=3.53s gpu=330.3W(+315.3W) peak=343.4W 0.3028Wh(+0.289Wh) $0.000057(+$0.000055) (13samples) session: prompt=30633 gen=5772 elapsed=83.59s energy=5.0181Wh cost=$0.000951

tranquil hazel Apr 2, 2026, 5:51 PM

#

grave shoal I’m running a 64GB Mac Studio M2 Max. Local llm results are slow and nowhere nea...

local is overrated and dangerous$

#

you don't want a stupid agent

#

if you're running anthropic/google/openAI models, you'll probably be better protected against malicious stuff

#

like your agent reading something stupid on this discord

lyric orchid Apr 2, 2026, 6:03 PM

#

tranquil hazel you don't want a stupid agent

one of my original reasons for exploring local was to prevent sending sensitive information, credentials, api tokens, etc. to cloud providers. one of my first skills I built was to scan the session logs for leaked credentials... early versions of openclaw was constantly leaking creds (trying to get somethign work, it would pull in config files or .env files). if it does that locally, no big deal. but yeah, in general local for me is good for very specific tasks that I don't necessarily have to wait for (cron jobs), with limited access to data. I don't give agents any "go out on the web and find this information", most of my stuff is using skills that reference specific APIs for the data, and the agent can then do some consolidation, answer questions about it, etc. but for actual "talk to the agent" things, in openclaw I'm using a cloud model (minimax) and staying within it's limits.

stiff spoke Apr 3, 2026, 3:13 AM

#

This weekend I’m excited to get my shiny new Mac Studio M3 Ultra 512gb running as my OpenClaw secondary LLM for bulk text processing and basic tool use. If it goes really well, I might get it to do some code generation stuff too. Qwen 3.5 is my starter model, but I’ll be exploring others too. (Using paid cloud LLM api as primary.)

#

I managed to put an order in a few days before Apple discontinued selling them.

craggy ferry Apr 3, 2026, 11:22 PM

#

stiff spoke This weekend I’m excited to get my shiny new Mac Studio M3 Ultra 512gb running a...

Check out vllm-mlx, and Soon(tm) I’ll be publishing a fork I made of it basically targeted at using this hardware to best run 397b

stiff spoke Apr 3, 2026, 11:23 PM

#

craggy ferry Check out vllm-mlx, and Soon(tm) I’ll be publishing a fork I made of it basicall...

Tell me more about vllm-mlx... advantages over others? I'm doing my experimention with LMStudio, but happy to switch to something else.

craggy ferry Apr 3, 2026, 11:24 PM

#

Paged prefix cache, metal native

stiff spoke Apr 3, 2026, 11:26 PM

#

Thanks! I'll check it out immediately 🙂

real pilot Apr 4, 2026, 1:52 AM

#

Question:
I'm looking at acquiring a mac to run open claw on, it's either between the $600 Mac Mini or the $2000 Mac Studio. I'd like to run a local model if possible.
Is it worth the price though? Am I going to blow through over $1,400 in Claude Sonnet 4.6 tokens in a year's time?

crude fossil Apr 4, 2026, 2:42 AM

#

yes i promise you will

#

i blew through that my second week i used it so much

pallid roost Apr 4, 2026, 6:05 AM

#

@craggy ferry so for the 512gb max your recommending 397B for basically everything?

Conflicted with some of the new releases like genma4 even though they’re much smaller

craggy ferry Apr 4, 2026, 7:18 AM

#

Gemma4 does look better, haven’t played with it much. Wish they’d released a 70b

sudden shore Apr 4, 2026, 7:38 AM

#

real pilot Question: I'm looking at acquiring a mac to run open claw on, it's either betwee...

My experience. I bought a Mac Mini; it cost me 2800 euros here in EU. 64G with with 1Tb. Big waste of money in my opinion. Can run locally, but slow replies. It just plain doesn't do the job like Sonnet 4.6. Do yourself a favour and use the money on the tokens and try to get some revenue coming in to feed the AI, in my opinion. Build your way to free tokens

pure fog Apr 4, 2026, 10:39 AM

#

real pilot Question: I'm looking at acquiring a mac to run open claw on, it's either betwee...

claude

torn folioBOT Apr 4, 2026, 1:45 PM

#

south obsidian Apr 4, 2026, 5:41 PM

#

stiff spoke This weekend I’m excited to get my shiny new Mac Studio M3 Ultra 512gb running a...

Check out exo too. Can cluster Mac’s with tb5 to expand the memory pool and with two of those you could actually run Kimi 2.5 locally at a reasonable rate. Wish I could get a 512 studio myself. 🙁

stiff spoke Apr 4, 2026, 7:59 PM

#

south obsidian Check out exo too. Can cluster Mac’s with tb5 to expand the memory pool and wit...

I only have the one beast machine. I also have an old M1 Ultra Studio with 64gb that I use for smaller models, and they are connected with a tb5 cable, but I'm pretty sure the M1 doesn't have tb5 capability.

south obsidian Apr 4, 2026, 8:53 PM

#

stiff spoke I only have the one beast machine. I also have an old M1 Ultra Studio with 64gb ...

it doesn't, only the M4 Max and M3 Ultra Studios have it. And you need TB5 for the RDMA support apple released in 26.2. BUT if you got an second machine running tb5 you can create a mismatched cluster even.

I was looking at the 256GB ultras and it's currently a 4-5 month lead time 😂

#

(there are minis and laptops that have TB5 too fwiw)

stiff spoke Apr 4, 2026, 9:04 PM

#

south obsidian it doesn't, only the M4 Max and M3 Ultra Studios have it. And you need TB5 for ...

Yeah, I might consider a set of 256gb ultra's, but I'm kinda hoping that they release some new ones during WWDC... and that they offer a new 1tb ram option. One can dream. I know those would be crazy expensive, but I'd consider it seriously.

south obsidian Apr 5, 2026, 1:25 AM

#

stiff spoke Yeah, I might consider a set of 256gb ultra's, but I'm kinda hoping that they re...

Sadly it looks like ram constraints will continue through at least 27

craggy ferry Apr 5, 2026, 5:06 AM

#

i only bought one m3 512 because i was hedging on them having an m5 512 or 1t this year

stiff spoke Apr 5, 2026, 10:28 AM

#

Me too. I’m slightly regretting just buying 1. And although I know the ram constraints are going to be around for a while, I’m hoping Apple got their supply contracts settled before it was a big issue.

steep wedge Apr 5, 2026, 5:12 PM

#

lol, what is this? 😂

stiff spoke Apr 5, 2026, 5:54 PM

#

So, fwiw, I'm using Rapid-MLX on my 512gb mac studio with a minimax-m2.5-8bit (243gb downlaod) model for text, thinking and tool use, and quen3.5-vl-112b-a10b-8bit (131gb download) for vision... loaded at the same time 🤯 I am loving this Mac Studio setup!!!

My OpenClaw setup is hot🔥 with this equipment.

steep wedge Apr 5, 2026, 6:58 PM

#

That is exciting. I am waiting for a new Mac Studio refresh to appear, and then I will probably pull the trigger. Making due with my DGX Spark cluster in the mean time. 😂

stiff spoke Apr 5, 2026, 7:01 PM

#

I am also waiting for the refresh. I’m betting a LOT of people are!

acoustic stump Apr 6, 2026, 2:49 AM

#

Ordering a souped-up MacMini M4 now vs waiting: Do we think that if we order a max spec Minin M4 Pro now (5 month wait time), that Apple will just refund if/when an M5 mini is announced, or will they offer to switch to the new chipset (assuming some small price difference)? Has anyone done had this kind of experience with Apple before?

warped monolith Apr 6, 2026, 9:56 AM

#

just started playing with oc.
with the sizes on those models y'all are talking about, what can those things do and are the mac studios fast enough?

steep wedge Apr 6, 2026, 11:51 AM

#

acoustic stump Ordering a souped-up MacMini M4 _now_ vs waiting: Do we think that if we order a...

If a new model is released before your order is fulfilled, I suspect they would either cancel your order or swap to the new model assuming specs and price are comparable.

steep wedge Apr 6, 2026, 11:57 AM

#

warped monolith just started playing with oc. with the sizes on those models y'all are talking a...

The local models are definitely more challenging. They are more prone to mistakes in general, but seem especially adept at breaking their own OpenClaw config. I have taken steps to assure mine can no longer touch the config files. I get better results with a large cloud model (e.g., Kimi 2.5), hosted by somebody like Ollama or Synthetic, as the orchestrator. It then manages the local agents/models and tries to double check their work. Nothing is as good as the frontier models like Opus 4.6, but $$$. As for the Mac Studios, they appear to be excellent performers, especially with a ton of RAM which allows running larger models.

winter lynx Apr 7, 2026, 12:06 AM

#

acoustic stump Ordering a souped-up MacMini M4 _now_ vs waiting: Do we think that if we order a...

small price difference? are you new to apple?

craggy ferry Apr 7, 2026, 4:43 AM

#

Are you? Apple most often keeps the same price for a refresh

prisma briar Apr 8, 2026, 10:58 AM

#

Hey all.
To get the full potential of OpenClaw, like "computer use", browse, UI, etc, should I have a MacMini or is a "Linux PC box" (Beelink, etc) as easy ?
If MacMini makes it much more easy to setup and use, that won't be a problem, we're talking $200 difference, and I have that budget

steep wedge Apr 8, 2026, 11:45 AM

#

prisma briar Hey all. To get the full potential of OpenClaw, like "computer use", browse, UI,...

I like some of the specific macOS integrations, but Linux could work fine. Are you more comfortable with one or the other?

prisma briar Apr 8, 2026, 2:45 PM

#

steep wedge I like some of the specific macOS integrations, but Linux could work fine. Are y...

Im a macos user since way back, Linux only in VPS.
What are the macos specific integrations that ate not bc of iCloud?

steep wedge Apr 8, 2026, 4:56 PM

#

prisma briar Im a macos user since way back, Linux only in VPS. What are the macos specific ...

From memory I’m just using a couple: iMessages and 1Password. There are some others that I’m forgetting.

prisma briar Apr 8, 2026, 4:57 PM

#

steep wedge From memory I’m just using a couple: iMessages and 1Password. There are some oth...

Thanks!

lyric orchid Apr 8, 2026, 6:41 PM

#

lyric orchid I'll have to look into the beszel server. I hacked some code into the proxy I'v...

I did end up having my proxy throw data into influxdb, including GPU power information, and had claude code build out a dashboard for grafana to see the data. The proxy is becoming a bit of a monster, but I find it useful for debugging model response to instructions, fine tuning, etc. Also learning more about how kv_cache works and others discussing "warm" and "cold" cache, so added some stats related to that to see how well my cache is being used across requests.
https://github.com/khaney64/llm-stuff?tab=readme-ov-file#overview
https://github.com/khaney64/llm-stuff/blob/main/README.md#gpu--energy
https://github.com/khaney64/llm-stuff/blob/main/README.md#kv-cache--recent-requests

woeful frigate Apr 9, 2026, 1:10 PM

#

Hey guys I am looking to build a openclaw server is a 5060 ti with 16gb of ram a good starter card?

woeful frigate Apr 9, 2026, 1:11 PM

#

prisma briar Hey all. To get the full potential of OpenClaw, like "computer use", browse, UI,...

Getting a Mac mini is hard right now. June should be when the m5 models will be released. The GPU cores had NPU with them on a m5.

prisma briar Apr 9, 2026, 1:18 PM

#

woeful frigate Getting a Mac mini is hard right now. June should be when the m5 models will be ...

Yeah, I got mine at -$100 of what it's now on Amazon, in ~Jan...
This one is to assemble a solution to a friend, who's not Mac user at all, but the Screen sharing utility makes the whole difference, to control the computer

steep wedge Apr 9, 2026, 3:25 PM

#

woeful frigate Hey guys I am looking to build a openclaw server is a 5060 ti with 16gb of ram a...

You can run small models on a Mac mini or on your 5060, but they will likely be inadequate for a meaningful OpenClaw deployment. I use some small models, but as part of a mix of models that include larger cloud models.

craggy ferry Apr 9, 2026, 6:01 PM

#

steep wedge From memory I’m just using a couple: iMessages and 1Password. There are some oth...

1Password is usable on Linux though, it’s a command line utility

craggy ferry Apr 9, 2026, 6:02 PM

#

prisma briar Yeah, I got mine at -$100 of what it's now on Amazon, in ~Jan... This one is to ...

All OSes have a screen sharing service, it’s not unique to macs

prisma briar Apr 9, 2026, 6:16 PM

#

craggy ferry All OSes have a screen sharing service, it’s not unique to macs

Oh interesting!
Put another way: is there any benefit for a non mac user, to have a mac mini?

craggy ferry Apr 9, 2026, 6:16 PM

#

I mean, they are amazingly price efficient at the base model

#

But that’s only if you will actually use the specs

prisma briar Apr 9, 2026, 6:16 PM

#

Probably gemma4 capable, without gpu?

prisma briar Apr 9, 2026, 6:17 PM

#

craggy ferry But that’s only if you will actually use the specs

Its a ongoing process.
I learn, he tries. Close feedback loop

craggy ferry Apr 9, 2026, 6:17 PM

#

All Macs have a GPU, their “integrated GPU” is far better than anything called that on the pc side

#

You could probably shoehorn gemma4 into one, the small ones obviously (e4b) but the large ones even at 4-bit quant would want a higher memory config than the base model

#

lol a 32gb mini has a 4 month lead time

#

Yeah everything that isn’t the 16gb base model is super delayed. Makes sense I guess

fathom steeple Apr 9, 2026, 10:19 PM

#

gemma works fine on mac mini m4 with 16gb ram

#

not pro performance but its fine

steep wedge Apr 9, 2026, 11:00 PM

#

prisma briar Oh interesting! Put another way: is there any benefit for a non mac user, to ha...

Yes, you become a Mac user. 😇

prisma briar Apr 9, 2026, 11:28 PM

#

fathom steeple gemma works fine on mac mini m4 with 16gb ram

How did you set it up? Any link/tutorial that you followed?

fathom steeple Apr 9, 2026, 11:41 PM

#

just get ollama

#

write on google ollama + gemma mac mini

#

its 10 minutes job including downloading 6gb of model from internet

prisma briar Apr 9, 2026, 11:42 PM

#

thanks @fathom steeple

fathom steeple Apr 9, 2026, 11:43 PM

#

happy ro help

nimble thunder Apr 10, 2026, 6:44 AM

#

hey, is anyone looking into upcycled phones with custom forked Lineage/CalyxOS?

vocal oak Apr 11, 2026, 5:04 PM

#

Hi guy I'm lookig for reviews I'm just started with little models as Qwen3.5-4B for my Agent. But I'm lookig for recomendations is this a good start if I want to use my agent to aske code solutiones? I have limited resources i have rtx 5070ti, 32gb ram ddr4 and ryzen 7 5800x. My dubt is is this software necessary to run which models or what is my limit model I can run?

echo path Apr 11, 2026, 6:01 PM

#

Google just came out with a new open source model not hard to run, with your hardware I think you should be fine 👀
https://ollama.com/library/gemma4

#

Just try out different models and see if they run, if they don't you can always move to a lighter model

#

I found gemma4 does need to be fed a detailed prompt, this video helped me a lot: https://www.youtube.com/watch?v=pwWBcsxEoLk

stone oak Apr 11, 2026, 6:19 PM

#

With a rx7900xtx, I should run like a 31b model with less context or smaller model with bigger context. Always seems like the onlything blocking me from local is the context thing

steep wedge Apr 11, 2026, 7:01 PM

#

vocal oak Hi guy I'm lookig for reviews I'm just started with little models as Qwen3.5-4B ...

You should be able to run models a little larger than 4B, but I like qwen3.5 in general.

vocal oak Apr 11, 2026, 7:02 PM

#

steep wedge You should be able to run models a little larger than 4B, but I like qwen3.5 in ...

Do you think is good idea run it in vllm? Or I should use ollama ?

steep wedge Apr 11, 2026, 7:04 PM

#

I use both. Ollama is pretty simple and a good first try. vLLM probably offers better performance, but can be a little more involved to set up.

lyric orchid Apr 11, 2026, 8:46 PM

#

vocal oak Do you think is good idea run it in vllm? Or I should use ollama ?

I'd start with ollama as it's a little easier to work with, and see what kind of performance you get, and assume you may get better performance with vllm. For my experience, substitute llama.cpp for vlllm. My performance was almost double with llama.cpp over ollama, but more of a learning curve.

grave summit Apr 12, 2026, 12:09 AM

#

I pulled the trigger on a maxed out MacBook Pro. I hope I won’t regret it! Anybody running all local with this or similar machine?

gloomy elbow Apr 12, 2026, 1:34 AM

#

anyone suggest some hardcore tests for my M4 mac mini and M2 mac mini, both 24GB? M2 is currently running two bots without an issue

warped rampart Apr 12, 2026, 7:16 AM

#

Do you guys run a server with specific hardware in a datacenter to run local LLMs? I'm trying to find a solution on how to provide an "always-on" AI assistant. I'm currently running a cheap second-hand dedicated server as mail- and fileserver with not enough power for AI. Because it is in a data center, it has a good up- and downlink. Buying a new computer for AI apps/assitant at home (RAM prices 🥲 ) and move the datacenter-server to this one and make it publicly accessible on the internet through DynDNS might be to unstable and slow for file transfers. My goal is to transform the fileserver into an EDMS with document Q&A, summarization, auto categorization. Just upload a document and let the LLM handle in what cabinet the document needs to be stored.

warm blade Apr 12, 2026, 8:50 AM

#

what bit of gemma 4 should I run with 4070 and 48gb ram

rocky violet Apr 12, 2026, 9:46 AM

#

warm blade what bit of gemma 4 should I run with 4070 and 48gb ram

you can run the q8 or full precision easily i guess . depends how many parameter model of gemma4

frozen sentinel Apr 13, 2026, 12:18 AM

#

Why am I finding M3 512gb studios on eBay for 2k? Are those scams you think?

dawn cosmos Apr 13, 2026, 4:20 AM

#

grave summit I pulled the trigger on a maxed out MacBook Pro. I hope I won’t regret it! Anybo...

I hope you had done the research on this. Assuming you have tkaen M5 max chip with 128 GB Ram - you will be able to run gpt-oss-120b q4 (or 70B and full precision) - but they will never be near the 5.4 or similar frontier models. Hence, for now those who are yet to make the trigger, do a economical comparison for next 2 years (where more hardware would be available cheaper) whether it makes sense to splurge of say $5K or use $100 per month for 1 year (for frontier models) and see whats available next year.
of course, if you need the M5 Mac for other activities, yeah this analysis in invalid

thorn wagon Apr 13, 2026, 7:24 AM

#

is it worth to get 2x rtx 3090 or ryzen 395 128gb mini pc?

sterile sonnet Apr 13, 2026, 8:46 AM

#

frozen sentinel Why am I finding M3 512gb studios on eBay for 2k? Are those scams you think?

I actually find a lot of them for 1k

#

Some of the photos are definitely AI

wide locust Apr 13, 2026, 10:08 AM

#

Hi all. I’m thinking of getting a Mac Mini M4 chip and wondering what others would recommend in terms of memory and storage. I want to run some local models on it, too.

frozen sentinel Apr 13, 2026, 10:12 AM

#

sterile sonnet Some of the photos are definitely AI

How? Are people just offloading waiting for M5?

thorny abyss Apr 13, 2026, 4:04 PM

#

frozen sentinel How? Are people just offloading waiting for M5?

its almost certainly a scam

thorn wagon Apr 13, 2026, 5:53 PM

#

dawn cosmos I hope you had done the research on this. Assuming you have tkaen M5 max chip wi...

$100/month will last for 4 years, till that time hardware will be heavily depreciated, not worth to invest

dawn cosmos Apr 13, 2026, 9:44 PM

#

thorn wagon $100/month will last for 4 years, till that time hardware will be heavily deprec...

Yep exactly, and in coming 1 ot 2 years. There would ve more hardware choice due to competition and if current ram trends are any indications, it would be relatively cheaper

last cedar Apr 13, 2026, 11:59 PM

#

any one here working on humanoid and robotics ?

shut oak Apr 14, 2026, 1:05 AM

#

I got a question

steep wedge Apr 14, 2026, 2:34 AM

#

frozen sentinel Why am I finding M3 512gb studios on eBay for 2k? Are those scams you think?

You got me excited, but all the cheap prices are from folks who just created their account, and most are in other countries (I’m US). I’m leaning toward scams.

surreal nova Apr 14, 2026, 4:24 AM

#

anyone got some advice on dipping a toe into Abliterated/uncensored in a local context? Just tried supergemma4 26b uncensored fast, and, 171 t/s. I need bigger and more reasoning, Tring to push architecture and brain to local as much as I can. But maybe just have to escallate to cloud for that. Good worker be, but for local rag + internet rag + images, I have more hardware headroom to burn

craggy ferry Apr 14, 2026, 4:28 AM

#

gemma4 31b is significantly better

surreal nova Apr 14, 2026, 4:29 AM

#

craggy ferry gemma4 31b is significantly better

agreed, done with everything below 30b
this one still a cannidate as a worker bee tho

craggy ferry Apr 14, 2026, 4:30 AM

#

i like 26b as a research agent

#

it's 4x as fast as 31b because moe etc

surreal nova Apr 14, 2026, 4:30 AM

#

ya, but I talk in screenshots, both of web, task manager (sorry win guy here) and, life

#

I am actually after slower (and more reasoning)

craggy ferry Apr 14, 2026, 4:31 AM

#

i think both are useful

#

but yes you definitely need at least one high end model like that

surreal nova Apr 14, 2026, 4:31 AM

#

Need to find my main first, then the worker bee(s)

#

gemma-4-31b-it-uncensored-heretic
57 t/s

#

shit, I can go bigger

#

I have noticed a correlation with uncensored and q4, which makes sense but, I dont actually need that

craggy ferry Apr 14, 2026, 4:32 AM

#

i am really sad there's no gemma4-70b, yeah

surreal nova Apr 14, 2026, 4:32 AM

#

70b is such a sweetspot

#

but 1yr old lamma, ya no

#

havent touched the quens yet, nemotron 120b NVFP4 thats prob where I end up

#

model size, context bandwidth, speed, holy crow

craggy ferry Apr 14, 2026, 4:33 AM

#

i like qwen3.5-397b, but

#

honestly gemma4-31b is kind of close to it in benchmarks

surreal nova Apr 14, 2026, 4:35 AM

#

so I am literally here on local llm cause perplexity dropped Kimi K2.5

#

Gemma 4 31b, straight from nvidia, ya, thats my baseline for sure

craggy ferry Apr 14, 2026, 4:35 AM

#

that's my fav thing about local llm

surreal nova Apr 14, 2026, 4:35 AM

#

not gonna happen again

#

(also not buying 8x GPU)

craggy ferry Apr 14, 2026, 4:35 AM

#

you have the thing you have until you decide to move

surreal nova Apr 14, 2026, 4:37 AM

#

still want to test gemma 4 31b uncensored tho

#

gemma-4-31b-it-uncensored-heretic is rocking my vision + uncensored test

#

this is the one by llmfan46

thorn wagon Apr 14, 2026, 5:46 AM

#

dawn cosmos Yep exactly, and in coming 1 ot 2 years. There would ve more hardware choice due...

models will be compressed more, if our system cant run 70b model, in next 6months chances are more it will run
and with turboquant, its getting more interesting

dawn cosmos Apr 14, 2026, 6:43 AM

#

thorn wagon models will be compressed more, if our system cant run 70b model, in next 6month...

I think it will be more like multi-LLM operation. In the sense, there will be generic model which can be cloud, but a smaller specialized model(s) for each area. For example, lets say you are coding only in Java, then only that specialized models would need to be run locally. Same for the enterprises as for example a frieght logistics will only have spefici smallmodel and a cloud generic model. The idea of context separation, identity isolation, etc will need to be handled and thats where AI industry is going

frozen sentinel Apr 14, 2026, 12:00 PM

#

steep wedge You got me excited, but all the cheap prices are from folks who just created the...

I saw some with 100+ reviews

steep wedge Apr 14, 2026, 12:01 PM

#

frozen sentinel I saw some with 100+ reviews

Were they Buy Now prices or auctions?

frozen sentinel Apr 14, 2026, 12:07 PM

#

steep wedge Were they Buy Now prices or auctions?

One I’m looking at is buy now

steep wedge Apr 14, 2026, 12:11 PM

#

frozen sentinel One I’m looking at is buy now

I believe you, but I'm not finding them. When I filter by "Buy Now" and US only, the lowest I find is $4,400 and they have 0 reviews. The remaining results reach into the $20k range (which is insane).

steep wedge Apr 14, 2026, 1:45 PM

#

frozen sentinel One I’m looking at is buy now

Let me know if you snag one; I'll live vicariously through you. 🙂

frozen sentinel Apr 14, 2026, 1:46 PM

#

steep wedge Let me know if you snag one; I'll live vicariously through you. 🙂

lol I’m too scared

steep wedge Apr 14, 2026, 2:12 PM

#

frozen sentinel lol I’m too scared

Send me the link then? I may be too scared too, but I am curious.

frozen sentinel Apr 14, 2026, 3:23 PM

#

steep wedge Send me the link then? I may be too scared too, but I am curious.

Here’s one https://ebay.us/m/B7T38o

solemn lodge Apr 14, 2026, 3:34 PM

#

frozen sentinel Here’s one https://ebay.us/m/B7T38o

That seller has 30+ items for sale at great prices that would be beneficial for an AI user......but zero sales......

Yeah, good luck with that.

sterile fjord Apr 14, 2026, 3:55 PM

#

Hi, I am new to this discord. I am trying to get openclaw working on my 16 GB Thinkpad T14s running AMD Ryzen 5650, running Ubuntu 24 LTS. I want to use LMStudio running IBM Granite 3.2_8B as my main AI with Anthropic as heavy lifter. But even though LM Studio works fine in chat mode on its own, any prompt, even a simple "Hello" becomes huge (~20,000 tokens) when coming from openclaw. Naturally this bogs down the system, and I have never received a response from "Hello" when in OpenClaw TUI. I am a complete newbie to OpenClaw and AI in general so I wonder if anyone can help me. I have spent hours with CoPilot working on this and it has not increased my respect for AI very much - what a waste of time! I think maybe a human expert might be a lot more helpful.

steep wedge Apr 14, 2026, 4:00 PM

#

frozen sentinel Here’s one https://ebay.us/m/B7T38o

I’ve seen that one, but I don’t see a buy now option. It only shows me options to watch or contact seller. 🤷🏻‍♂️

frozen sentinel Apr 14, 2026, 4:02 PM

#

steep wedge I’ve seen that one, but I don’t see a buy now option. It only shows me options t...

Little bit more but still…..

https://ebay.us/m/2c6KYO

tropic jolt Apr 14, 2026, 4:13 PM

#

sterile fjord Hi, I am new to this discord. I am trying to get openclaw working on my 16 GB T...

hi, can you try out: agents.defaults.localModelMode: "lean"
should be documented in docs/gateway/local-models.md

wind cloud Apr 14, 2026, 4:15 PM

#

Hey guys. I am wondering what setup would be good for creating a local Ai server? Is this reasonable for 6k? Any advice helps

thorny abyss Apr 14, 2026, 6:26 PM

#

frozen sentinel Little bit more but still….. https://ebay.us/m/2c6KYO

these are scams. Why would anyone do this? There is no logical sense other than it's a scam

frozen sentinel Apr 14, 2026, 6:34 PM

#

thorny abyss these are scams. Why would anyone do this? There is no logical sense other than ...

Yeah that’s what I’m thinking

thorny abyss Apr 14, 2026, 6:51 PM

#

frozen sentinel Yeah that’s what I’m thinking

it 100% is a scam, zero doubts

kindred coral Apr 14, 2026, 7:27 PM

#

does anyone try radxa 5t

winter lynx Apr 15, 2026, 1:28 AM

#

dawn cosmos I think it will be more like multi-LLM operation. In the sense, there will be ge...

reminds me of the time I got a nonverbial code completion AI model... all it would do when I tried to talk to it was scream "NO"

karmic blaze Apr 15, 2026, 1:52 PM

#

So I have been trying to get openclaw to work locally on my old M1 MacBook Pro 16gb ram. The idea was to have an ai personal assistant to perform relatively simple tasks. I started setting up openclaws workflows and tests with my OpenAI plus subscription which uses codex 5.4 and it has been working great. Once the tasks and workflows were tested, I tried changing my main LM to a local using ollama and Qwen3 4b or llama 3.2 3b to handle cron jobs and general tasks.

Every time I have tried this, clawbot dies and stops responding.

I have checked ram consumption, total approaches 15gb but doesn’t overflow or reaches HD swapping

I have checked openclaw health, and it’s running fine

I have checked ollama directly in the app or terminal, and it runs and replies fine

The tasks: as simple as read my email or check information on a website

What am I missing? Is my MacBook Pro not powerful enough to run openclaw with a local lm locally?

lyric orchid Apr 15, 2026, 2:06 PM

#

karmic blaze So I have been trying to get openclaw to work locally on my old M1 MacBook Pro 1...

Are you seeing anything useful in openclaw or ollama logs at this time? Errors? Looping (ie maybe model is stuck or timing out). Use a tool to monitor GPU usage (not sure what that is on Mac) to see if it's busy. What context size are you using for those models? I wouldn't go straight to primary with a small model, create a cron job with some simple instructions and point to a small model, and get that working first so you know the model is working. I've found it useful to put a proxy between openclaw and ollama so you can see the traffic/interaction/errors.

karmic blaze Apr 15, 2026, 2:30 PM

#

Mac has its “activity monitor” with spikes from 4gb memory usage to 15gb when the prompt is sent, and ollama uses that ram. Besides that, logs don’t show any error besides timeout after a while

lyric orchid Apr 15, 2026, 4:55 PM

#

karmic blaze Mac has its “activity monitor” with spikes from 4gb memory usage to 15gb when th...

I initially had openclaw running on a 8gb M1 Mac mini, but moved it to ubuntu. I was always hitting a GPU on another machine though. I think there is a default 60 second timeout in openclaw, you may have to bump that up. out of curiousity, when running a prompt of some sort, and you see the ram spike, does that extend (run) longer than when openclaw times out? i.e. model may still be processing? you should give qwen3.5-9b a try - that works well for me on an RTX 4070 (12 gb). as I said, you may want to put a proxy between openclaw and ollama so you can see the conversation. you wouldn't believe how much stuff openclaw adds to the prompt. one thing that can help with that - go into the dashboard, select agents, pick your agent, and go into skills, and disable ALL of the skills you don't want or need - they used to be enabled by default - any enabled skill ends up having data sent in the prompt to describe it. I built out this proxy with claude code, it's become a bit of a monster, but works well for me for debugging. I originally wrote it to work with ollama, but moved on to llama.cpp, but it should still work with ollama. https://github.com/khaney64/llm-stuff/blob/main/proxy.js

magic hull Apr 15, 2026, 5:52 PM

#

karmic blaze Mac has its “activity monitor” with spikes from 4gb memory usage to 15gb when th...

if you try LM studio on developer mode, the developer tab has a log on the bottom part

#

regarding hardware, is it anyone using the nvidia dgx spark or its counterparts for other OEM for running openclaw in local mode? or is it an overkill?

karmic helm Apr 15, 2026, 8:45 PM

#

magic hull regarding hardware, is it anyone using the nvidia dgx spark or its counterparts ...

2 sparks arrived today . easy setup with qwen 3.5 397 using codex (it did everything using eugr's vllm docker setup.
Only really needed if you NEED to run local due to the data one is using. otherwise 9k for two machines that will go obselete makes no sense in comparison to even the highest tier OpenAI subscription

#

and its a breautiful machine

grave summit Apr 15, 2026, 9:41 PM

#

dawn cosmos I hope you had done the research on this. Assuming you have tkaen M5 max chip wi...

All fair points. My current Intel MacBook Pro is from 2017 and cost nearly $3k back then. It was time for an upgrade due to no more updates available. I could have gotten away with the 64gb RAM version for work but, the $800 upgrade to double my RAM just made sense financially. I’m hoping to get near or on par Haiku level operations that will give me complete privacy. For me, if it runs OSS 120b at 60 tok/sec, that’s a big win! I’m tired of being throttled and rate limited. I use Chat and Claud a TON. Not to say I would expect that level of LLM locally - It will get there though!! 14 day return policy’s are always useful. Anyway, appreciate any user feedback of this specific equipment.

grave summit Apr 15, 2026, 9:43 PM

#

dawn cosmos I hope you had done the research on this. Assuming you have tkaen M5 max chip wi...

And yes, M5 max with 40-core gpu. Max bandwidth in this puppy. 128gb.

dawn cosmos Apr 15, 2026, 11:38 PM

#

grave summit And yes, M5 max with 40-core gpu. Max bandwidth in this puppy. 128gb.

well, you have bought latest and greatest - njoy! as I stated earlier, if you have just bought from local LLM, then it is definitely not worth IMHO. You are better off using AI+ max or GX10 hardware but if you have other editing works that you do - then you know the best.
for modesl comparison of how gpt-oss compares - see here https://artificialanalysis.ai/

hollow coral Apr 16, 2026, 3:20 AM

#

grave summit All fair points. My current Intel MacBook Pro is from 2017 and cost nearly $3k b...

what are you running on this? I've been running gemma 26 b

dawn cosmos Apr 16, 2026, 3:22 AM

#

hollow coral what are you running on this? I've been running gemma 26 b

What token speed do you get?

hollow coral Apr 16, 2026, 3:23 AM

#

I'm not at 128GB, but I'd have to check. I've been trying different paramters and haven't yet looked at logs for number of tokens

#

Honestly pretty new to LMstudio, but did the math and my api usage was such that it's cheaper to run a local model

#

so gemma-4-26b-a4b q8_0 with MAcbook M5 Pro Max 48GB RAM
180,000 context window
GPU offload 30
CPU thread pool size 4
prompt eval time 458 tokens per second
eval time 68.45 tokens per second

#

I was running it with the heavier models last night on some cron jobs and it crashed 4 times, so I'm still figuring out safe parameters

hollow coral Apr 16, 2026, 3:31 AM

#

dawn cosmos What token speed do you get?

I don't know if that's a good or bad token speed

dawn cosmos Apr 16, 2026, 3:38 AM

#

hollow coral I don't know if that's a good or bad token speed

you are running at q8 - with your hardware I think you can even do FP16 (half precision). So here it is not about speed - it is about quality.

hollow coral Apr 16, 2026, 3:38 AM

#

I'm using it to do web scraping on publicly available data so I need a beefier model

#

tried the gemma-4-e4b and wasn't really happy with the results

dawn cosmos Apr 16, 2026, 3:39 AM

#

hollow coral I'm using it to do web scraping on publicly available data so I need a beefier m...

yeah but Q8?? with with the same model - go for FP16 and you should stil get decent token and improved quality. I think it's this one - https://huggingface.co/mlx-community/gemma-4-26b-a4b-it-bf16

hollow coral Apr 16, 2026, 3:39 AM

#

brb

#

I think that model is too big. LMStudio is saying so anyway

#

Hugging face seems to agree

dawn cosmos Apr 16, 2026, 4:04 AM

#

hollow coral I think that model is too big. LMStudio is saying so anyway

Hmmm. With your 48gb, it should run
Try out and see.
Or use q4

hollow coral Apr 16, 2026, 4:05 AM

#

I'm going to try q-4 tonight. Going to bed after this because I can't keep staying up this late but I'll report results in the morning. Thanks for the help

dawn garden Apr 16, 2026, 7:37 AM

#

what are you use it for

rocky violet Apr 16, 2026, 4:28 PM

#

hollow coral so gemma-4-26b-a4b q8_0 with MAcbook M5 Pro Max 48GB RAM 180,000 context window...

i am not sure about mac coz i dont know how their unified memory works but for a 26b model , you can easily run it at q8 with enough buffer for context window. you dont need to offload , you should be able to fully load the 26b model but dont try to load the fp16 model of it ( 1billion paramter fpt16 = 2gb vram)

blazing copper Apr 17, 2026, 3:50 AM

#

I tried to run Ollama on K8 Plus 32 GB with terrible results and returned it. In the meantime my OpenRouter bill is shocking. What models are y'all running locally with decent tool calling?

rocky violet Apr 17, 2026, 5:05 AM

#

blazing copper I tried to run Ollama on K8 Plus 32 GB with terrible results and returned it. In...

it depends which model you tried to run exactly

neat bay Apr 17, 2026, 8:42 AM

#

blazing copper I tried to run Ollama on K8 Plus 32 GB with terrible results and returned it. In...

you know OpenRouter has :free models, right? some of them (most?) super popular with OpenClaw (according to OpenRouter charts)
With limited hardware you need to try Q8 and Q4 models.

tall yew Apr 17, 2026, 2:27 PM

#

blazing copper I tried to run Ollama on K8 Plus 32 GB with terrible results and returned it. In...

I tried Gemma4:26b lately - it did quite well standalone with a small number of tools - but in the context of OpenClaw it just needed too much VRAM.

blazing copper Apr 17, 2026, 2:41 PM

#

rocky violet it depends which model you tried to run exactly

I had tried a few but I think I started with Qwen 2.5 coder Q4 KM

grave summit Apr 17, 2026, 5:48 PM

#

hollow coral what are you running on this? I've been running gemma 26 b

I haven’t received it yet. Due in next week. I’m still throwing ideas around. Likley try Qwen3.5-122B-A10B and Qwen3.6-35B-A3B. I’m thinking I’ll need 2-3 different models with dedicated use case. How’s the gemma running?

rocky violet Apr 17, 2026, 6:42 PM

#

blazing copper I had tried a few but I think I started with Qwen 2.5 coder Q4 KM

why use qwen 2.5 when there is qwen 3.5?

slim ether Apr 17, 2026, 8:14 PM

#

What models do you suggest for mac mini m4 24gb

clever copper Apr 18, 2026, 6:43 AM

#

Hi there, anyone using a remote ollama with an rtx5080? I use qwen3.5:9b now 130contextlength eslewhise its starting to cpu offload. Cant seem to get a bigger model to run in quantization. When i do its offloading like 30/80. and then i get timeouts…

untold sorrel Apr 18, 2026, 4:44 PM

#

Super Noob Question: "Openclaw and local LLM. What's the absolute minimum Hardware requirement?"

Hi everyone,

Openclaw is quite cool and I want to "play a bit" with it. I've got it running, but I hit my session limits quite fast. So I am wondering if there is another way.

I use Claude Code (Pro) and Ollama (Pro).
I use Claude for a bit PHP / Website tinkering and Ollama for Openclaw.
I got "naked" Ollama running and even got some LLM downloaded.
Ok, low Token count, but it works.

I understand the hardware requirements for Openclaw, but the LLM is still a bit of a miracle for me.

So my questions are the following:

ONLINE

What model should I use with minimum cost?
What model would you recommend?

OFFLINE
I can chat with Ollama, but Openclaw is not responding ...
What is the "absolute minimum Hardware requirement" to run Openclaw / Ollama offline?
I don't need absolute performance, it should just work.

Thank you for your help.
Bernd

PS: If you have usage credits left or even run your own LLM server i could use, please speak to me. 🙂

rocky violet Apr 18, 2026, 5:23 PM

#

untold sorrel Super Noob Question: "Openclaw and local LLM. What's the absolute minimum Hardwa...

if you have claude pro then why not use sonnet with openclaw?
i would say use glm 5.1 since you have ollama pro too , i use glm 5.1 from the glm coding plan

for the offline part , i will say you will need 16gb vram to load a decent model of atleast 28 - 30b paramters at q4 on your pc without offloading , if you want to offload then you can go beyong 30b but it will be more time consuming for each query , for local you can try qwen 3.5 / glm 4.7 flash / gemma 4 etc

warm herald Apr 18, 2026, 8:38 PM

#

I have a 3090 24GB vram and I run Gemma4:27b. When I use openwebui I don't see it offload and it's very responsive but if I use openclaw I do see CPU going nuts.

lyric orchid Apr 18, 2026, 9:24 PM

#

rocky violet if you have claude pro then why not use sonnet with openclaw? i would say use gl...

Probably need 24 gig for the 28-30b models?

lyric orchid Apr 18, 2026, 9:26 PM

#

warm herald I have a 3090 24GB vram and I run Gemma4:27b. When I use openwebui I don't see i...

What context size, and what are you running the model on? I've had success with qwen35 35b a4b on 3090. Need to do a little more experimenting with gemma4 26b

warm herald Apr 18, 2026, 9:27 PM

#

lyric orchid What context size, and what are you running the model on? I've had success with...

I installed openclaw today so I haven't tweaked it yet. But I find things very very slow.

#

The model runs on ollama in docker on my server

lyric orchid Apr 18, 2026, 9:30 PM

#

I started with ollama but moved to llamacpp for much better performance and ability to fine tune. I love llama-bench for figuring out all the parameters to use

warm herald Apr 18, 2026, 9:30 PM

#

I just switched to qwen3.5:9b and it's faster (ofc) but still not very useful. Plus I'm concerned the 9b will be too stupid in the end.

#

I'll checkout llamacpp

clever copper Apr 18, 2026, 9:32 PM

#

I guess you need to decrease "contextWindow": 262144,

#

This blows up your memory

#

Then it starts to offload

warm herald Apr 18, 2026, 9:36 PM

#

What's a more realistic context window?

clever copper Apr 18, 2026, 9:36 PM

#

Experiment and check ollama ps

#

I did 130k

warm herald Apr 18, 2026, 9:38 PM

#

But I set it in ai-agents config right?

rocky violet Apr 18, 2026, 9:41 PM

#

lyric orchid Probably need 24 gig for the 28-30b models?

it depends on the quantization , for adjusting a 28/30b in 24gb vram then you need to go below q4 and anything below q4 is just useless

lyric orchid Apr 18, 2026, 9:45 PM

#

Here are my llamacpp settings for the bigger gemma and qwen35 models in 24 gb
https://github.com/khaney64/llm-stuff/blob/main/model-test-report-2026-04-11.md#llama-server-launch-commands

clever copper Apr 18, 2026, 9:46 PM

#

I set it in the models part in the .json

lyric orchid Apr 18, 2026, 9:51 PM

#

clever copper I set it in the models part in the .json

Well that limits what openclaw will use, but the model side has a setting too - ollama might be smaller by default

warm herald Apr 18, 2026, 9:52 PM

#

Thanks. I'll dig more into this tomorrow. Right now the time between me sending message and ollama workers going to work is what's taking the most time. Maybe there's something there that I am not fully grasping. I dont' see why it would spin up workers on all my cores.

warm herald Apr 19, 2026, 7:20 PM

#

warm herald Thanks. I'll dig more into this tomorrow. Right now the time between me sending ...

So that was just me looking at htop the wrong way. Those were all threads 😄

magic hull Apr 20, 2026, 9:34 AM

#

I want to buy a mini-pc for local llms with openclaw and my files. I put my eyes on this one:

https://www.bee-link.com/products/beelink-ser10-max-amd-pro-ryzen-ai-9-hx-470

what could you tell me about it?

steady pendant Apr 20, 2026, 1:14 PM

#

Can anyone suggest a model that can run on a VPS with 96GB RAM and 18vCPU with no GPU. I've tried qwen3.6, Gemma4 and qwen3.5 but no joy.

thorny abyss Apr 20, 2026, 6:46 PM

#

steady pendant Can anyone suggest a model that can run on a VPS with 96GB RAM and 18vCPU with n...

I think the issue is that those models are better suited to GPU inference. 96 GB RAM sounds great, but RAM alone won’t save you if the CPU is the bottleneck. What model size and quant did you try?

steady pendant Apr 20, 2026, 10:05 PM

#

thorny abyss I think the issue is that those models are better suited to GPU inference. 96 GB...

I tried qwen3.6, gemma4, phi4, qwen2.5, llama3,2 and gemma3 but I couldn't get a response after they were running. I have since realised the VPS isn't dedicated so it's all shared. So I'm guessing no model will run efficiently.

magic raven Apr 21, 2026, 1:20 AM

#

magic hull I want to buy a mini-pc for local llms with openclaw and my files. I put my eyes...

okay

#

good choice

#

they literally promote openclaw in the desc lmao

dawn cosmos Apr 21, 2026, 3:59 AM

#

steady pendant I tried qwen3.6, gemma4, phi4, qwen2.5, llama3,2 and gemma3 but I couldn't get a...

CPU with RAM will not do - you need GPU or AI nodes

rocky violet Apr 21, 2026, 8:07 AM

#

steady pendant Can anyone suggest a model that can run on a VPS with 96GB RAM and 18vCPU with n...

check modal , rundpod or hugginface , they provide gpu compute at a reasonable price

tardy marsh Apr 21, 2026, 1:49 PM

#

magic hull I want to buy a mini-pc for local llms with openclaw and my files. I put my eyes...

I was also looking at the same one, it seems good

magic hull Apr 21, 2026, 5:58 PM

#

Now I'm checking the difference with the apple chips, because it seems that the bottleneck is the memory bandwith (RAM, if the model is loaded in it). With dual channel the theoretical speed is around 89.6 GB/s

Device            | Bandwidth  | TTFT      | Speed     | Feel
------------------|------------|-----------|-----------|-----------
M1 Ultra          | 800 GB/s   | ~1.1s     | 45-70 t/s | Great
M4 Max            | 546 GB/s   | ~1.2s     | 40-60 t/s | Great
M1/M2 Max         | 400 GB/s   | ~1.5s     | 35-55 t/s | Good
M4 Pro            | 273 GB/s   | ~2.2s     | 25-40 t/s | Okay
M4 (Base)         | 120 GB/s   | ~3.0s     | 10-18 t/s | Tight
Beelink SER10     | ~90 GB/s   | ~3.0s     | 20-35 t/s | Slow

#

this is a comparison made with gemini, could someone confirm this token generation could feel slow?

sterile lotus Apr 22, 2026, 1:33 AM

#

hello, can i use openclaw in my android?

fluid jackal Apr 22, 2026, 1:48 AM

#

sterile lotus hello, can i use openclaw in my android?

In theory you can do both:

install openclaw gateway on an android via CLI (I have not tried this)
install the android app to connect to the gateway (this is standard, the app is in the play store)

sterile lotus Apr 22, 2026, 1:49 AM

#

fluid jackal In theory you can do both: - install openclaw gateway on an android via CLI (I h...

okay, thx for your information.. i'll try it.

clear quartz Apr 22, 2026, 9:24 AM

#

I probably should have started here... 😄 - Anybody have any hands on with the dell pro max w/gb10?

leaden rapids Apr 22, 2026, 9:29 AM

#

i need some advice. i currently have 3070ti 8gb. im thinking about upgrading to amd r9700 32gb. i do some light kilocode and recenlty been toying with openclaw. should i upgrade to r9700 or just use gemini api?

clear quartz Apr 22, 2026, 11:58 AM

#

leaden rapids i need some advice. i currently have 3070ti 8gb. im thinking about upgrading to ...

Gemini made some great comparison tables for me as I designed my new machine for work. I would look carefully outside of the nvidia ecosystem. Ask it to do a deep dive analysis for you. Just rmember that no matter what GPU you get, your data still has to move across your PCIe bus.

bright bolt Apr 22, 2026, 6:58 PM

#

magic hull Now I'm checking the difference with the apple chips, because it seems that the ...

Server setup like:
Motherboard: ASRock Rack GENOA2D24G-2L+
CPU: 2x EPYC 9535P

You could get 1.2 TB/s (614 GB/s each CPU)
And motherboard can go as high as 12TB DDR5 RAM

#

considering major self hosted models are sometimes 1.5TB in size, 12TB potential capacity is a great way to be prepared to run HUGE models in the future

or perhaps someone has a use case where they want to load more than one model on their server

vocal island Apr 23, 2026, 12:44 AM

#

Hey, I'm contemplating buying a Mac Mini, could it possibly run models from Ollama like (Qwen 3.6, Kimi 2.6 or MiniMax 2.7)

royal radish Apr 23, 2026, 1:23 AM

#

Hi All, i'm receiving my mac M3 ultra tomorrow. what model do you recommend to "start/run a company" with voice AI calling inbound and task automation with visual compute

rocky violet Apr 23, 2026, 9:38 AM

#

vocal island Hey, I'm contemplating buying a Mac Mini, could it possibly run models from Olla...

depends which parameter model you are talking about and which quantization

fluid jackal Apr 23, 2026, 5:47 PM

#

royal radish Hi All, i'm receiving my mac M3 ultra tomorrow. what model do you recommend to "...

I'm assuming you wanna stay local:

the qwen models are solid open source options for VL -- the bigger the better (for visual compute)
ironically the qwen models are also solid for your core LLM if you want to stay local but there's a plethora of options (I specialize in coding so unsure if there's a better suited one for your needs)
for voice TTS/STT you can search online but there are literally a dozen or so options and all of them are fairly solid and dont sound like a blatant robot

lyric orchid Apr 23, 2026, 6:09 PM

#

fluid jackal I'm assuming you wanna stay local: - the qwen models are solid open source optio...

Curious which qwen model(s) do you use for coding, and via what tools? I've got openclaw usage with 3.5 35b-a3b working well, want to explore local options for coding. Do you use different settings like temperature for coding vs openclaw?

fluid jackal Apr 23, 2026, 6:22 PM

#

lyric orchid Curious which qwen model(s) do you use for coding, and via what tools? I've got...

ironically the new qwen models are pretty solid there too! 3.5 and 3.6 -- the bigger OSS models are still better but most people can't fit those

brave beacon Apr 23, 2026, 6:57 PM

#

Hi guys, im just getting into all this ai stuff and i want to run claude code or openclaw locally, i have a ryzen 7 7800X3D cpu, 32gb ram and a rx 6700xt. what model would yall recommend me? i want to get the most out of it running locally for it to be able to code as well as possible, and possibly run fully autonome tasks on my 2nd burner pc for security reasons, while still using the main pc computation power

#

ive tried gemma 4 27b and it just halcinated and wasnt really able to do any real coding

rocky violet Apr 23, 2026, 7:03 PM

#

brave beacon Hi guys, im just getting into all this ai stuff and i want to run claude code or...

try the qwen models , even tho cloud hosted oss model will be better than running oss models on your own gpu coz of infrastructure and configurations

brave beacon Apr 23, 2026, 7:03 PM

#

rocky violet try the qwen models , even tho cloud hosted oss model will be better than runnin...

wich qwen should i tryN

#

?

rocky violet Apr 23, 2026, 7:03 PM

#

try with qwen 3.5 models . pick any model of q4 or more quantization

#

make sure the parameters isnt huge or else it wont load on your gpu

brave beacon Apr 23, 2026, 7:05 PM

#

what about glm 4.7 flash? thats what chatgpt recommended me

#

but idk really

rocky violet Apr 23, 2026, 7:11 PM

#

yeah its good too but its quite old as well

#

like right now glm 5.1 is the latest

#

and flash version of any model is kind of nerfed

brave beacon Apr 23, 2026, 7:12 PM

#

so.... ishould try the qwen 3.5?

rocky violet Apr 23, 2026, 7:14 PM

#

i will suggest that

#

just dont burn your gpu

north ocean Apr 24, 2026, 2:59 AM

#

wow, bee-link makes Clawd-colored PCs now
https://www.bee-link.com/pages/openclaw

delicate pasture Apr 24, 2026, 6:04 AM

#

north ocean wow, bee-link makes Clawd-colored PCs now https://www.bee-link.com/pages/opencla...

damn thats insane

visual swift Apr 24, 2026, 3:22 PM

#

Hello. I want to create local instance of openclaw and ollama gpt-oss. What are the recommended pc specs for the start up? I appreciate the response, thank you

rocky violet Apr 24, 2026, 3:41 PM

#

visual swift Hello. I want to create local instance of openclaw and ollama gpt-oss. What are ...

dont consider gpt oss

visual swift Apr 25, 2026, 1:17 AM

#

rocky violet dont consider gpt oss

Then what is your recommendation?

gritty prism Apr 25, 2026, 8:32 AM

#

visual swift Then what is your recommendation?

You can do a local instance on basically anything

#

But the only local models you can run are small models because the bigger models just run too slow to do much of anything

#

Unless you spend A LOT. But that amount does not make sense imo compared to just using your ai subs

rocky violet Apr 25, 2026, 9:49 AM

#

visual swift Then what is your recommendation?

use some latest and top performing oss models . like within 30b parameters and no less than q4 quant

sturdy gazelle Apr 25, 2026, 10:23 PM

#

Hey guys! Can you recommend some cheap Android phones for the OpenClaw? Do I need root?

dawn cosmos Apr 25, 2026, 11:42 PM

#

sturdy gazelle Hey guys! Can you recommend some cheap Android phones for the OpenClaw? Do I nee...

Have you explored everything you want to do it in pc? Android is just another remote node that you can even run as a docker contiane rin your pc

fluid jackal Apr 26, 2026, 2:23 PM

#

sturdy gazelle Hey guys! Can you recommend some cheap Android phones for the OpenClaw? Do I nee...

similar question to @dawn cosmos
@sturdy gazelle , what are you looking to do? looking for an existing OpenClaw instance to control/access android or looking for it to live (gateway) on the android?

feral violet Apr 26, 2026, 2:24 PM

#

It's possible to run OpenClaw on a Android?!

sturdy gazelle Apr 26, 2026, 2:26 PM

#

I want to follow the openclaw guide and keep two-phone setup described here https://docs.openclaw.ai/start/openclaw and at some point have openclaw installed on this device

fluid jackal Apr 26, 2026, 2:27 PM

#

feral violet It's possible to run OpenClaw on a Android?!

in theory...lol -- I have not tried it!

feral violet Apr 26, 2026, 2:27 PM

#

wow... didnt know that!

fluid jackal Apr 26, 2026, 2:28 PM

#

sturdy gazelle I want to follow the openclaw guide and keep two-phone setup described here http...

ahh okay, so for communication channels (signal, whatsapp, etc) -- yea that's easy -- no you definitely DO NOT need root for that. You dont even really need antoher phone technically (I used a Google Voice account to setup my Signal connection)

feral violet Apr 26, 2026, 2:28 PM

#

The 24.4.26 version is giving problems, right? I'm new to this and installed that version but it doesnt work, even if I follow the docs.openclaw.ai page...

fluid jackal Apr 26, 2026, 2:29 PM

#

feral violet The 24.4.26 version is giving problems, right? I'm new to this and installed tha...

I was gonna tell you ask Krill! ...but it seems like he' nappin on the job again. ping me in the users helping users channel we can continue there

dawn cosmos Apr 26, 2026, 3:42 PM

#

Run openclaw in a docker container (as gateway). Android installation will avt as a. Node to connect to a gateway. For now, I suggest do not use android app files unless you have compiled them.

small crescent Apr 27, 2026, 2:06 AM

#

Is there a MLX specific channel where we can post a question on ?

craggy ferry Apr 27, 2026, 3:27 AM

#

https://discord.com/channels/1456350064065904867/1478210877541974157

dawn cosmos Apr 27, 2026, 1:33 PM

#

Stop scamming people. And if you have really successful, why not showcase it here for everybody to benefit. Typical scammers!

dense vector Apr 27, 2026, 10:53 PM

#

Im 12 years old and I recently found out about openclaw from my father who works at Microsoft, he told me that I can easy make $5000 every day just by using an openclaw bot to trade. So I did. I was amazed that after 5 seconds (and with nearly $20000 of my dads money I wont disclose to you) I found that I was making $5000 every single day.

Every. Single. Day.

Now I am looking to help young entrepreneurs like myself get into openclaw agents. If you’re young and want to learn the mindset and simple steps that helped me start making this happen, msg me “LEARN MORE” and I’ll show you what helped me get here.

clear quartz Apr 27, 2026, 11:39 PM

#

LMAO

wanton seal Apr 27, 2026, 11:43 PM

#

I am 8 and my father just found out I was looking at openclaw, now I make 20000 a day and my mother doesnt know. text me to find out how my brother made his 1st billion with openclaw that inspired me to look at it. my sister also uses openclaw and we are all making tons of money every day. my openclaw just told me I made another 500 in the last few minutes while i was typing this important announcement.

#

there should be age limit for openclaw or internet access full stop

clear quartz Apr 27, 2026, 11:49 PM

#

I'd prefer a minimum IQ

rocky violet Apr 28, 2026, 12:07 AM

#

why so many scam baits here suddenly

dense vector Apr 28, 2026, 1:10 AM

#

gee I wonder why

#

no gifs 🥀

echo walrus Apr 28, 2026, 3:46 AM

#

oh spare me codex, give us one more free reset lol

#

Telegram doesnt directly work if you havae ssh tunneling on? Its telling me it has to be taillscale or remote URL

dawn cosmos Apr 28, 2026, 4:25 AM

#

There was a claude code offer which gave $500 creds valid for 6 months! Expired now. Keep a watch out for more. There is one more of xiaomi mimo is running https://100t.xiaomimimo.com/

rocky violet Apr 28, 2026, 4:52 AM

#

dawn cosmos There was a claude code offer which gave $500 creds valid for 6 months! Expired ...

100 trillion token is insane

chilly cape Apr 28, 2026, 5:01 AM

#

Around the 4.22 update my gateway started using like 50% CPU at idle. Has anyone else experienced this? Any solution?

tender ermine Apr 28, 2026, 5:09 AM

#

Looking to setup with a dgx spark, Mac Studio and a NAS. Any feedback on this type of setup? Was going to have openclaw on the Mac Studio as orchestrator, dgx for inference, and all data hosted on NAS. I’m trying to create an enterprise setup for a company.

dawn cosmos Apr 28, 2026, 5:16 AM

#

rocky violet 100 trillion token is insane

They just want to grab the market...they would have got some free subsidies from govt. Data is precious!

rocky violet Apr 28, 2026, 5:19 AM

#

yeah sadly . humans are less valuable than their own data nowadays

somber falcon Apr 28, 2026, 6:01 AM

#

Hey folks, currently added my 3rd V100 32gbGPU now, total of 96VRAM, whats your opinion i should run now, previously using unsloth/Qwen3.6-35B-A3B-GGUF at Q8

regal shoal Apr 28, 2026, 6:59 AM

#

Two weeks ago before i started this Ai journey i got myself a new RTX5080 16Gb, what a waste of money 😂 , now i want an RTX6000 96GB, 16gb is nothing.

dawn cosmos Apr 28, 2026, 7:39 AM

#

Resale on RTX5080 should still be good

grave shoal Apr 28, 2026, 10:07 AM

#

tranquil hazel local is overrated and dangerous$

Yeah, mostly just using it for simple stuff and kicking off cron jobs, heartbeats etc. Would never use it for any coding tasks and such, dont see the point. Thats where Claude Code and Open Code etc shine for me.

dawn cosmos Apr 28, 2026, 10:08 AM

#

Yep you can technically run openclaw in rpi. And use other nodes to run the tasks

rocky violet Apr 28, 2026, 3:28 PM

#

grave shoal Yeah, mostly just using it for simple stuff and kicking off cron jobs, heartbeat...

which model you use from opencode mostly for coding? k2.6?

grave shoal Apr 28, 2026, 3:36 PM

#

rocky violet which model you use from opencode mostly for coding? k2.6?

yes recently pleased with k2.6

rocky violet Apr 28, 2026, 3:38 PM

#

grave shoal yes recently pleased with k2.6

how good are the deepseek v4 models compared to kimi?

grave shoal Apr 28, 2026, 3:45 PM

#

rocky violet how good are the deepseek v4 models compared to kimi?

I've not tried any of those.

magic hull Apr 29, 2026, 5:22 AM

#

After doing some research I've preselected these 3 devices:

ASUS Ascent GX10 128 GB LPDDR5X (3700€)
Corsair AI Workstation 300 Ryzen AI Max+ 395 128 GB LPDDR5X (2800€)
Apple mac mini M4 pro 64gb (2400€)

I want it to be a 24/7 node of openclaw, to work as an assistant and also do some research on AI. Is it worth going for the Asus with the nvidia gb10 or is it better to pay for suscriptions of cloud?

dawn cosmos Apr 29, 2026, 9:09 AM

#

magic hull After doing some research I've preselected these 3 devices: - ASUS Ascent GX10 ...

do you have experience in using the local models? before spending huge amount, rent a GPU VPS, install and run local models and see if it fits your pupose, then plonk down your cash to buy one of 2 above (m4 would be a waste in that config) - I would side with GX10

magic hull Apr 29, 2026, 9:38 AM

#

dawn cosmos do you have experience in using the local models? before spending huge amount, ...

Thanks for the advice!! I added de mac mini to the comparison because all people is using them to run openclaw

dawn cosmos Apr 29, 2026, 11:55 AM

#

magic hull Thanks for the advice!! I added de mac mini to the comparison because all people...

Yeah, those mac mini are almost a scam by these influencers. You do need mac mini to run openclaw with cloud llm and they are not enough to run local llm. So unless you have any other usage for mac and its part of your ecosystem, it might make sense. But just for openclaw, it is not

regal shoal Apr 29, 2026, 2:57 PM

#

╭─────────────────────────────────────────────────────────╮
│ Ollama Model Benchmarker │
│ Reasoning | Coding | Knowledge | Instruction | Creative │
╰─────────────────────────────────────────────────────────╯

Found 25 models to benchmark:

qwen3:30b-a3b
qwen3.5:27b
mistral-small3.1:latest
qwen2.5-coder:32b-instruct-q4_K_M
gemma3:27b
deepseek-r1:32b
qwen2.5-coder:7b
dolphin-mixtral:8x7b
codellama:13b
llava:13b
mistral-nemo:12b
mistral:7b
phi4-mini:latest
qwen3.5:4b
qwen2.5-coder:14b
deepseek-r1:14b
qwen2.5:7b
qwen2.5:3b
llama3.2:3b
llama3.2-vision:11b
qwen2.5:14b
gemma4:e4b
qwen3-vl:8b
deepseek-r1:8b
qwen3.5:9b

Estimated time: 50-125 minutes. Please wait...

⠼ Testing: qwen3.5:27b ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4%
⠼ -> Knowledge ━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━ 40%
^^Lets see how they Rank, runnin on RTX 5080 16Gb

craggy ferry Apr 29, 2026, 4:52 PM

#

regal shoal ╭─────────────────────────────────────────────────────────╮ │ Ollama Model Bench...

Fun! Have you seen ClawEval?

regal shoal Apr 29, 2026, 5:06 PM

#

craggy ferry Fun! Have you seen ClawEval?

No, that sounds fun! lol

#

Pretty cool site!

craggy ferry Apr 29, 2026, 5:07 PM

#

Yeah it has tons of good benchmarks to really differentiate models

#

https://github.com/AIgenteur/ClawEval

#

That’s the canonical one

#

I run G and H on every new model I start running locally to get good comparisons. I should also look at the subjective writing sections (but usually don’t lol)

lament idol Apr 29, 2026, 7:03 PM

#

I'm considering buying an RTX 5090 because I realized my 64gb of system ram and RTX 4046 8gb VRAM I bought for local AI won't work.

Before I spend $4000 on an RTX 5090 & 1200W PSU upgrade.

Will I be able to run decent local LLMs on my PC if I get this graphics card? I'm been talking to Perplexity to wrap my head around these buying decisions but I don't really understand the implications of what I can and can't do.

I was quite disappointed when I realized System Ram is not what Local LLMs require and I'd like to avoid a $4000 disappointment if possible.

oak frost Apr 29, 2026, 7:59 PM

#

you dont need a 5090, for playing around you can buy a used 3090. Use it for a few weeks.
You will see that the quality of the answers are not as good as a provider with 200b++ Model files, but probably good enough for your usecase..

rocky violet Apr 29, 2026, 8:51 PM

#

lament idol I'm considering buying an RTX 5090 because I realized my 64gb of system ram and ...

with 32gb vram you can fully load a max of 65 - 70b model on your gpu (q4 quant) and more if you offload some to the ram , and i dont think any 60 or 70b parameter model will be decent enough

lament idol Apr 29, 2026, 8:54 PM

#

I found this to be a useful speed tester to help me visualize the speed of which the models would take.

I'm thinking I don't really need the frontier models.

Most of what I'm doing is convert transcripts of my speeches into different forms of written content.

https://shir-man.com/tokens-per-second/?speed=80

#

I'm not a coder or developer so there's not advance coding needs I have. I do want to have agent swarms that are able to work together for researching content ideas, structuring content in my frameworks, and designing slide presentations running in parallel

lyric orchid Apr 30, 2026, 4:06 AM

#

lament idol I'm not a coder or developer so there's not advance coding needs I have. I do wa...

I agree with @oak frost , get a 3090 on ebay and try that first - you can always resell it. You still need the beefy PSU though. 3090 is only 24gb, 5090 will give you another 8b, but that might not get you much more model, but maybe a 27b or 35b plus a smaller one like a 9b. I don't know how much swarming, especially in parallel you'll be able to get though. I've been happy with 3090 and qwen36 35b a3b, and qwen36 27b for my needs. I also have a 4070 8 gb) on another machine with qwen35 9b for simpler tasks. I want to do some experimenting with coding too .

spiral vector Apr 30, 2026, 5:05 AM

#

I'm surprised I don't see people suggesting AMD R9700 (32GB) cards more. For $1300, I think its the best you can get for local LLM. Sure, that means you're dealing with ROCm, but I would think that's a good tradeoff.

#

$950 Intel B70 (also 32GB) may yet prove to be worthwhile as well - but their software stack is probably worse than where ROCm was at 2 years ago.

lyric orchid Apr 30, 2026, 1:43 PM

#

spiral vector I'm surprised I don't see people suggesting AMD R9700 (32GB) cards more. For $1...

I think that's the major reason, much of the software and tuning is CUDA focused. Regarding 32 gb vs 24, I don't think it's a big enough jump to be able to run a larger model than you could with 24. @rocky violet I don't see how you could fully load a 70b or even 65b Q4 fully in vram? Those would need more than 32 gb? But back to the original question, you'll need to figure out whether you can do what you want with a smaller model. Others have suggested renting a vps with GPU to try it out, but I'm not familiar with that, the cost, or if you can allocate specific GPU size.

spiral vector Apr 30, 2026, 2:16 PM

#

Personally, I think 32gb is the current sweet spot (used to be 24gb last year) for best local LLM without breaking the bank. Between Qwen 3.6 and Gemma 4 models if you're limited to 24gb, then your either limiting context down significantly, running really small (Q3) quants which limits usefulness, or both. (Or, you're stuck using older, worse models.) But, it does really come down to what you're trying to do with it. 32GB + good processing speed seems to be the floor for "good enough" local coding (R9700 is OK, but I do wish I had the speed of a 5090 here).

If you're OK with slower responses then mini PC's like M4 mac mini with 48/64GB RAM or similar Strix Halo can also work, but they're both much slower than I think a lot of people are comfortable with as a chat bot. If you have workflows that you can just pass off to let run overnight - then M4/Strix halo are great (mine just spent days churning out AI subtitles for a bunch of old obscure media - speed was no real concern). Next I have mine slowly chewing through something like 600 government documents (probably 20k pages of text and tables in PDF's (many without OCR) and building that into a searchable database - works great when I don't need speed.

But neither mini PC seems good for local image generation if that's your thing - 3090 TI/4090/7900XTX (all at 24GB) are probably still best fit there.

Ultimately, I think many of us are stuck in this world still where we need frontier/SOTA models for real work - local LLM is just a thing you can offload stuff too when it neither needs to be "as good as possible" or "as fast as possible", but instead I just want "as cheap as possible" (when measured over multiple months).

#

I still pay over $200 in month for Frontier/SOTA models for real work - in addition to offloading what small bits I can to last year's Strix Halo and this year's R9700 in my desktop. But I like to think that offloading what I can keeps me from paying even more for cloud models.

lyric orchid Apr 30, 2026, 5:15 PM

#

spiral vector I still pay over $200 in month for Frontier/SOTA models for real work - in addit...

Same, though I'm only on the $20 plan for both Claude and openai, but keep running up on the 5 hour window when coding in Claude code. Just started playing with codex a few days ago and am really impressed with it , and it seems to be more token friendly than Claude code. Now trying to figure out which one I want to give $100 a month to for a bigger 5 hour window, or just continue jumping back and forth between the two!

rocky violet Apr 30, 2026, 5:58 PM

#

lyric orchid I think that's the major reason, much of the software and tuning is CUDA focused...

with offloading as i said

vast hamlet May 1, 2026, 2:15 AM

#

anyone using openclaw on a raspberry pi 5 8 GB?

placid zinc May 1, 2026, 8:33 AM

#

Let me know if it works for you

north quartz May 1, 2026, 10:12 AM

#

dawn cosmos do you have experience in using the local models? before spending huge amount, ...

I like the idea of learning and trying things out on a VPS first, with the goal for figuring out what hardware I might later choose to buy to run everything local. I am not a programmer, use windows, and a newbie to openclaw.** What VPS service + guide would people reccomend? **Oracle cloud seems like it would emulate a local server pretty well but also looks to be stretching the edge of my knowlege.

dusk moon May 1, 2026, 11:15 AM

#

vast hamlet anyone using openclaw on a raspberry pi 5 8 GB?

i had it running on my pi 5 8GB no issues, several others use lesser pis

vast hamlet May 1, 2026, 11:44 AM

#

dusk moon i had it running on my pi 5 8GB no issues, several others use lesser pis

It’s hammering my pi cpu, cpu utilisation reaches 100% regularly

dusk moon May 1, 2026, 11:58 AM

#

anyone using openclaw on a raspberry pi

dawn cosmos May 1, 2026, 11:59 AM

#

north quartz I like the idea of learning and trying things out on a VPS first, with the goal ...

Oracle pay as you go plan provides a 4 cpu, 24gb ram, 200gb completely free. Hence experiment there. You can also install docker engine there and run openclaw as containers which will provide you with more control and better upgrade paths

rocky violet May 1, 2026, 2:04 PM

#

dawn cosmos Oracle pay as you go plan provides a 4 cpu, 24gb ram, 200gb completely free. He...

they provide that vps plan anymore? i tried before but doesn't seem like they accept everyone

dawn cosmos May 1, 2026, 3:31 PM

#

rocky violet they provide that vps plan anymore? i tried before but doesn't seem like they ac...

You need to upgrade to pay-as-you-go it is difficult to get slot in pure free tier qupta. But even when you upgrade to pay-as-you-go that combination is still free forever, you only pay if you exceed that

north quartz May 1, 2026, 8:15 PM

#

dawn cosmos You need to upgrade to pay-as-you-go it is difficult to get slot in pure free ti...

Great tip on getting the free option, will give it a go!

spiral vector May 2, 2026, 5:17 AM

#

Of all the benefits of using AI, I think having it explain anything to you, at exactly the level you want to receive that information at - that's probably the best.

uncut locust May 2, 2026, 5:26 AM

#

Which Mac Mini is preferred to buy? Is it the base model or is it the model with the higher RAM? Basically I don't want to run local models on my system. I just want a personal assistant to run openclaw

uncut locust May 2, 2026, 5:28 AM

#

uncut locust Which Mac Mini is preferred to buy? Is it the base model or is it the model with...

Even bigger question: should I consider buying a Mac mini for this or should I stick to Cloud VPS only for this?

fair kindle May 2, 2026, 7:17 AM

#

uncut locust Even bigger question: should I consider buying a Mac mini for this or should I s...

steer clear of VPS, local hardware plays nicer.
i had a 2012 mac mini, threw linux on it. running openclaw fine. also have a beefy set up. it depends waht you want to acheive though.

spiral vector May 2, 2026, 1:45 PM

#

If you're just doing cloud models, any mini PC, or old laptop is good enough. If you're already a mac person, the cheapest mac-mini will work well.

sly yarrow May 2, 2026, 4:56 PM

#

uncut locust Which Mac Mini is preferred to buy? Is it the base model or is it the model with...

Wait for WWDC, could be possible that Apple releases the M5 Mac mini 🤗

keen loom May 2, 2026, 7:30 PM

#

anyone knows why i end up on gateway-injected when i refresh the webui? been trying to find an answer / fix but can't figure it out

delicate hull May 2, 2026, 10:58 PM

#

keen loom anyone knows why i end up on gateway-injected when i refresh the webui? been try...

Is that a hardware question?

fallow willow May 2, 2026, 11:23 PM

#

i'm using AI to help me build a selfhosted Ollama/Openclaw Team of Agents... I talk to it through Discord.. 3 days so far, at the Discord stage with partial personalities working...

dawn cosmos May 2, 2026, 11:37 PM

#

sly yarrow Wait for WWDC, could be possible that Apple releases the M5 Mac mini 🤗

ha ha don't waste money on mac if using cloud LLM anyways

sly yarrow May 3, 2026, 7:50 AM

#

Local bro Ollama 0.19 and the dynamite goes booom 🤗

#

If you go cloud raspberry pi could do the trick 🤗

keen loom May 3, 2026, 1:47 PM

#

delicate hull Is that a hardware question?

sorry i posted in the wrong channel, was sleepy 🙂

hollow coral May 3, 2026, 11:34 PM

#

fallow willow i'm using AI to help me build a selfhosted Ollama/Openclaw Team of Agents... I t...

this sounds hard. Why would you want this?

queen gate May 4, 2026, 6:38 AM

#

i have a potato pc and i cant run the model stay thinking 10h and dont do nothing

winter lynx May 5, 2026, 1:00 AM

#

dawn cosmos ha ha don't waste money on mac if using cloud LLM anyways

If you are using cloud you could migrate to local with the M5

winter lynx May 5, 2026, 1:16 AM

#

typical Local LLM these days

#

https://tenor.com/view/star-wars-jar-jar-binks-thumbs-up-agree-yes-gif-16164511

dawn cosmos May 5, 2026, 1:37 AM

#

winter lynx If you are using cloud you could migrate to local with the M5

With that cost of M5, i could run cloud sub for 24 m atleast at highest tier and have exposed to latest frontier model. Local models don't cut it vs frontier models

winter lynx May 5, 2026, 2:21 AM

#

dawn cosmos With that cost of M5, i could run cloud sub for 24 m atleast at highest tier and...

you could run frontier models 24/7 for 24 months? pretty sure your math is faulty there... As for not being up to the tak, depends on what tasks you are doing. I would still architect with a claude pro subscription, but most of the grunt work could be done by a local model, and they are getting better all the time... you can run them 24/7 without rate limits ... while sonnet 24/7 would be
24-Month Cost Estimates
(Sonnet 3.5 API)Low Usage (Simple Agent, 24/7): ~$300–$600
Usage: Only processing data when a user acts, infrequent, short prompts.
Medium Usage (Constant Monitoring, 24/7): ~$10,000–$20,000
Usage: Constant summarization, low-volume coding, high context maintenance.
High Usage (Active Coding/Data Agent, 24/7): ~$50,000–$100,000+
Usage: Rapid, continuous coding tasks with multiple files/retry loops.

#

and that isn't even opus... thats just sonnet

#

For Opus 24/7 High-Volume Agent (API)~$9,000+~$216,000+...

#

so naw... I don't need frontier quality for every single task... there is plenty I could do with a large model on a mac m5

dawn cosmos May 5, 2026, 3:07 AM

#

winter lynx you could run frontier models 24/7 for 24 months? pretty sure your math is fault...

the problem you think is that openclaw is the only solution - which is not. I primarily use n8n for daily workflows and use openclaw for research activities, hence that $200/month is sufficient (with regular small credits that thrown across different events /partners) . Openclaw is a token hogger. For example, if you have an execl sheet to be read and based on that do some infrencing for say couple of columns, in Openclaw, eveyrthing is infrenced. In n8n, using code node you can just simply seggregate the data without using LLM and send for infrencing only the data you required. in my use case, if I use openclaw, it would take about 72K tokens/call vs 5-10K in n8n. Now if there is a new excel format, then invoke openclaw to determine best strategy, once that strategy is developed, turn that into n8n workflow. This way most of the determinintic tasks don't even use LLM. It is used only for those data that requires it. Hence, my context is lean, infrencing ability is fine tuned and I use multiple agents for specific tasks, which keeps it specific. Opus 4.6 (4.7 is bad) is used only when things are completely random.
Own hardware is great - I have a full scale home lab + self hosted cloud solutions - but i won't recommend investing in a hardware which is destined to become obsolete in next 9 - 12 months due to AI architecture improvements and hybrid scaling via vLLMs.

winter lynx May 5, 2026, 3:09 AM

#

not familiar with n8n...

#

if anything the llms seem to be getting better AND smaller

dawn cosmos May 5, 2026, 3:10 AM

#

winter lynx not familiar with n8n...

https://n8n.io/ or even https://flowiseai.com/ which can be self hosted. Before openclaw came in n8n ruled the AI agentic world for clear workflows. in Openclaw, you cannot have predetemined workflow like you can do in n8n or flowise

winter lynx May 5, 2026, 3:12 AM

#

I will check them out... what I don't know would fill a book, but I am working on learning

dawn cosmos May 5, 2026, 3:12 AM

#

winter lynx if anything the llms seem to be getting better AND smaller

exactly, and it will be hybrid approach in future - heavy generalized LLM + Finetuned small LLM (using LORA, vLORA) which could be running in optimized software for the likes vLLM or even sg-lang

dawn cosmos May 5, 2026, 3:13 AM

#

winter lynx I will check them out... what I don't know would fill a book, but I am working o...

start with n8n - flowise does not have all the bells and whistles. You could be surprized that many tasks that openclaw was infrencing can be a simple Code Node (without LLM) - hence only infrence what is truly needed and the results would be very reliable

dawn cosmos May 5, 2026, 3:15 AM

#

winter lynx I will check them out... what I don't know would fill a book, but I am working o...

while you are at it - learn docker, openclaw can also be run in docker and much better when new versions come out, since you could spin a new container while the last known good version container is still running. This way your business does not stop because there are breaking changes in new version. With reverse proxies you could also route your work to like 80% to existing proven openclaw container + 20% to new version of openclaw container. However, this requires a bit if knowledge of docker and revrese proxies like Traefik/ Caddy

winter lynx May 5, 2026, 5:26 AM

#

Ollama's Cloud models can now be used inside Claude Desktop

dawn cosmos May 5, 2026, 6:29 AM

#

winter lynx Ollama's Cloud models can now be used inside Claude Desktop

If you are self hosting - use liteLLM - its like openrouter for you.You cna configure any local or cloud. Openclaw can then only call litellm.
In litellm you can also put policies to route what when

tight hinge May 6, 2026, 6:07 AM

#

anyone here thinking much about the control model for computer use?
feels like a lot of current stuff assumes the agent should just live on the target machine and poke around from inside it.
i’m starting to think a sidecar model makes more sense:
• run the AI on one machine
• keep the target mac separate
• send input in from outside
• cleaner boundary between thinking and acting.
curious if that feels more sane to others, or if people still think direct-on-box is the better model

craggy ferry May 6, 2026, 6:45 AM

#

isn't this already how nodes work in openclaw?

#

I think to the extent possible you should in fact avoid having agents poke at the machine running them

dawn cosmos May 6, 2026, 9:59 AM

#

tight hinge anyone here thinking much about the control model for computer use? feels like a...

Openclaw already works that way. You can run openclaw as nodes (even within docker cpntianer) and let ot be controlled via other machine runnign as a gateway(again in docker contianer)

merry rapids May 6, 2026, 8:47 PM

#

I currently have 8GB VRAM and 32GB RAM, do you guys have any recommendations for which model I should use for lightweight/local agent tasks?

Currently using Dolphin-X1-8B-Q6_K in LM Studio just for testing purposes and I am getting around 30 tok/sec initially (for longer sessions it stabilizes at around 15 tok/sec), but the model feels rather dumb.

Current model/settings info:

Model: dphn/Dolphin-X1-8B-GGUF
Quantization: Q6_K
Architecture: Llama
Size on disk: 6.60 GB
Context Length: 131072
GPU Offload: 32
CPU Thread Pool Size: 8
Evaluation Batch Size: 725
Unified KV Cache: Enabled
Keep Model in Memory: Enabled
Offload KV Cache to GPU Memory: Disabled

I’d like recommendations for:

better models for OpenClaw/agent use
good balance between intelligence + speed
settings optimization for my hardware

I am willing to sacrifice some context length if needed, but I would prefer not dropping it too aggressively.

sterile dagger May 6, 2026, 9:35 PM

#

merry rapids I currently have **8GB VRAM** and **32GB RAM**, do you guys have any recommendat...

What's your current context length?

merry rapids May 6, 2026, 10:32 PM

#

sterile dagger What's your current context length?

Context Length: 131,072

sterile dagger May 6, 2026, 10:39 PM

#

Have you tried the new qwen? It's sort of designed a bit better for tool use

merry rapids May 7, 2026, 1:03 AM

#

sterile dagger Have you tried the new qwen? It's sort of designed a bit better for tool use

i'm currently using Qwen2.5-7B-Instruct Q6_K

#

it's running pretty well and I am liking it given my limited hardware

#

once I get more comfortable with openclaw I'll just rent a runpod instance then i can use whatever

spiral vector May 7, 2026, 1:35 AM

#

Have you tried Qwen3.5-4B Q4_K_M? (I haven't - just seen that recommended here before for 8GB VRAM setups). I'd guess between Qwen 3.4 4B and Gemma-4-E4B

#

I'll plug https://github.com/AIgenteur/ClawEval (not my work - no connection to the guy who built it). I think this is generally the best (what LLM for my GPU for openclaw) that I've seen.

lyric orchid May 7, 2026, 3:15 AM

#

merry rapids it's running pretty well and I am liking it given my limited hardware

On the 32 gb, try qwen36-35b-a3b or qwen36-27b, Q4_K_M works well for me on 24 gb, you could probably do higher Q

tight hinge May 7, 2026, 6:43 AM

#

dawn cosmos Openclaw already works that way. You can run openclaw as nodes (even within dock...

True — OpenClaw already supports distributed control through gateway + nodes.
What Sidecar Dot adds is a different thing: control of a separate Mac without requiring OpenClaw, Docker, or any installed agent on that target machine.
So I’d separate them like this:
• OpenClaw nodes/gateway = software-native distributed control
• Sidecar Dot = external control of a real Mac that AI can operate directly
That difference matters when the target machine is not already part of your stack.

dawn cosmos May 7, 2026, 8:04 AM

#

tight hinge True — OpenClaw already supports distributed control through gateway + nodes. Wh...

sidecarbot? you mean this https://www.sidecardot.com/ why would you want to pay for that hardware which is essentially looks like RPI Zero - besides, there is no need, if you just use the openclaw nodes

humble iris May 8, 2026, 3:13 AM

#

Do you think a Mac Pro will be able to efficiently run openclaw using qwen27 on Ollama while running Claude code because my Mac air with 24ram is struggling a lot rn

lyric orchid May 8, 2026, 5:24 AM

#

Depends on the specific models? Which ones? Google memory bandwidth for your particular models, but the airs aren't all that fast, up to 153 for m5. Pros will be in the 200 to 600 range

dull shore May 8, 2026, 2:43 PM

#

lyric orchid On the 32 gb, try qwen36-35b-a3b or qwen36-27b, Q4_K_M works well for me on 24 g...

What would you recommend for 16 gb vram?

thorny abyss May 8, 2026, 11:47 PM

#

dull shore What would you recommend for 16 gb vram?

If you go to the models on LM Studio’s website you can see the minimum ram requirements for each one. That doesn’t include KV cache etc but it gives you an idea of where to start.

humble iris May 9, 2026, 3:21 AM

#

@lyric orchid which Mac do you think I should get. I want to be able to do everything comfortably. Also what are those numbers you’re saying: 153, 200, 600

dull shore May 9, 2026, 3:30 AM

#

humble iris <@1020084715354595380> which Mac do you think I should get. I want to be able to...

The numbers are the memory bandwidth

dull shore May 9, 2026, 3:31 AM

#

thorny abyss If you go to the models on LM Studio’s website you can see the minimum ram requi...

I understand but I'm talking best performing model

lyric orchid May 9, 2026, 4:08 AM

#

humble iris <@1020084715354595380> which Mac do you think I should get. I want to be able to...

I'm really not a Mac expert, just relaying what I've read. The point is, you can have two Macs with 128 gigs of ram, but the memory bandwidth speed can be considerably different depending on which model you get. I believe the m3 ultra chip is the fastest. The higher the bandwidth the better the performance .

gusty nacelle May 9, 2026, 5:10 PM

#

humble iris Do you think a Mac Pro will be able to efficiently run openclaw using qwen27 on ...

Use https://www.canirun.ai/ to answer that. Pick your mac model in the hw select and you'll know

sterile fjord May 10, 2026, 7:50 PM

#

tropic jolt hi, can you try out: `agents.defaults.localModelMode: "lean"` should be document...

I looked at the docs and it seems to focus on the Ollama models - do you think this be a problem if I used it with a different model?

tight hinge May 11, 2026, 12:09 AM

#

craggy ferry isn't this already how nodes work in openclaw?

OpenClaw nodes are the right solution when you can install software on the target machine.
Sidecar Dot is for the cases where you can’t, or where you want out-of-band control/recovery.
So it’s not replacing nodes — it’s covering the gap nodes leave.

tight hinge May 11, 2026, 12:11 AM

#

dawn cosmos Openclaw already works that way. You can run openclaw as nodes (even within dock...

Yep — agreed. That’s already the native OpenClaw model.
If you can run a node on the target and coordinate it from a separate gateway, that’s usually the cleanest setup.
The reason for Sidecar Dot isn’t to replace that — it’s to handle the cases where you can’t install or rely on a node on the target at all: locked-down machines, third-party devices, broken OS state, or out-of-band HID/KVM-style recovery.
OpenClaw nodes = in-band software control
Sidecar Dot = out-of-band external control

tight hinge May 11, 2026, 12:22 AM

#

dawn cosmos sidecarbot? you mean this <https://www.sidecardot.com/> why would you want to p...

Sure.. it may look like a Pi-class device, but the value isn’t the board, it’s the role. The point is having an external control plane for machines where you can’t install a node, can’t trust the OS, or need out-of-band HID/KVM-style recovery. If OpenClaw nodes can do the job, great - this is for the cases they can’t.

dawn cosmos May 11, 2026, 12:46 AM

#

tight hinge Sure.. it may look like a Pi-class device, but the value isn’t the board, it’s t...

In those cases, use ansible from https://semaphoreui.com/ and invoke any machine. You might need to have python in some nodes

craggy ferry May 11, 2026, 1:42 AM

#

tight hinge OpenClaw nodes are the right solution when you can install software on the targe...

I didn’t consent to you pasting slop at me twice just in case I didn’t see it the first time. Congrats you built an IP KVM.

tight hinge May 11, 2026, 1:44 AM

#

craggy ferry I didn’t consent to you pasting slop at me twice just in case I didn’t see it th...

Apologies. I'm new to Discord Forums & still finding my way re. how to reply in the thread etc.

dawn cosmos May 11, 2026, 1:44 AM

#

tight hinge Yep — agreed. That’s already the native OpenClaw model. If you can run a node on...

Sidecar dot is only for mac. And mac is always a compromise 😃

tight hinge May 11, 2026, 4:36 AM

#

Sidecar Dot

fluid jackal May 12, 2026, 12:40 PM

#

anyone here running a mac cluster?

dawn cosmos May 12, 2026, 3:47 PM

#

fluid jackal anyone here running a mac cluster?

You have a cluster! You run local models?

analog fern May 12, 2026, 11:51 PM

#

fluid jackal anyone here running a mac cluster?

mac clusters are a fun project, but I got tired of the latency issues when sharding larger models across nodes

fluid jackal May 12, 2026, 11:51 PM

#

analog fern mac clusters are a fun project, but I got tired of the latency issues when shard...

Yeah this seems to be the standard response. Tempting but think it's not worth the time/money at the end of the day

analog fern May 12, 2026, 11:54 PM

#

Exactly. I just couldn't justify the mac tax

dawn cosmos May 12, 2026, 11:57 PM

#

Instead of macs you could get cluster of gx10 or ryzen Ai + Max 395 systems.

But even those will be outdated in an years time defeating the ROI. Hence I always state thay for now getting credits or paying for subs is best as you work out your token appetite

fluid jackal May 13, 2026, 11:39 AM

#

dawn cosmos Instead of macs you could get cluster of gx10 or ryzen Ai + Max 395 systems. Bu...

Yeah I already have a strix halo I barely use with my OC living on it because of a lack of bandwidth/power

#

Find myself with a collection of various AI/compute hardware but with minimal unification neglect the 6 3090s+6000 on one machine but that sucks up so much power it's only spun up when required.

Will probably move the 6000 to an always on system to better utilize

dawn cosmos May 13, 2026, 1:15 PM

#

fluid jackal Find myself with a collection of various AI/compute hardware but with minimal un...

If you have experience with k8s, you can build up a cluster which will provide some optimization for your disparate gpus https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/

fluid jackal May 13, 2026, 1:27 PM

#

dawn cosmos If you have experience with k8s, you can build up a cluster which will provide s...

this would help if it was all the same arch, but between Nvidia dGPUs, Jetsons, Strix Halos, Mac products, you'd still need dynamic images which is miserable to setup and maintain

#

maybe one day there will be a fully unified underlying translation layer that doesn't lost stupid amounts of performance (or at least one can dream 😂 )

dawn cosmos May 13, 2026, 1:31 PM

#

fluid jackal this would help if it was all the same arch, but between Nvidia dGPUs, Jetsons, ...

This is where k8s comes into picture- differnet nodes running diff arch/gpu can operate as single cluster unfoying all gpu, storage, etc

fluid jackal May 13, 2026, 1:33 PM

#

dawn cosmos This is where k8s comes into picture- differnet nodes running diff arch/gpu can...

it doesn't function that way if you have different architecture GPU/compute. If you have various nvidia dGPUs then yes, it can work that way. But if you have a mismash, best case is probably running Vulcan between nvidia and amd (unsure it works for intel or mac). even then you lose performance

dawn cosmos May 13, 2026, 1:46 PM

#

fluid jackal it doesn't function that way if you have different architecture GPU/compute. If...

You can't combine different gpus into a single unified gpu. But multiple GPU can be presented so that you can run multiple pods of ollama/anythingllm, etc. So with openclaw you can run different agent thay can use different ollama. Since everything is in a single cluster, it is seamless

fluid jackal May 13, 2026, 1:49 PM

#

dawn cosmos You can't combine different gpus into a single unified gpu. But multiple GPU can...

negative; not without the container images and configs needing to support multiple underlying GPU architectures -- which I am not aware of a package that fully supports all out of the box (as the image would be larger than anyone wants too support in a single image)

dawn cosmos May 13, 2026, 1:54 PM

#

fluid jackal negative; not without the container images and configs needing to support multip...

have you used k8s, do you know about how pods are run?

fluid jackal May 13, 2026, 1:55 PM

#

dawn cosmos have you used k8s, do you know about how pods are run?

yes; it's literally a core part of my job lol

#

have you?

dawn cosmos May 13, 2026, 1:59 PM

#

then probably you need more experience! if you have mutliple nodes each having a different GPU, you can run the node with specific container image with plugins and Ollama pods can use them. Hence, as a cluster you will have exposure mutliple gpu nodes. In openclaw (running as a gateway or seperate nodes) you can configure multiple providers from each of those ollama pods to run different agents. This way you setup is optimized to utilize disparate gpus. yes, you cannot comnbine as a single GPU and do a slicing

fluid jackal May 13, 2026, 2:04 PM

#

dawn cosmos then probably you need more experience! if you have mutliple nodes each having a...

Kubernetes is an orchestrator, not a magical GPU translation layer. It can schedule workloads onto different nodes, but the container images, drivers, runtimes, and configs still need to support each GPU architecture. Presenting multiple Ollama pods/providers is not the same thing as making mixed AMD/NVIDIA/Intel GPUs behave like a single GPU architecture type.

Please do your research.

dawn cosmos May 13, 2026, 2:09 PM

#

fluid jackal Kubernetes is an orchestrator, not a magical GPU translation layer. It can sched...

its not about k8s itself, it is those device plugins . - yes, I already stated the multiple GPU cannot be offered as a single GPU types, but pods can use mutliple GPUs seperately. Then there is a concept of paralellism which I have not even mentioned given that you need to understand above. Using paralelism you can run model across different GPU (yeah some would need simialar arch). So bottomline, the way you are describing that you differnet GPU are waste and non-performing, they are not. If you had good architecture knowledge of k8s, you could utiilize most of them.

fluid jackal May 13, 2026, 2:14 PM

#

dawn cosmos its not about k8s itself, it is those device plugins . - yes, I already stated ...

we are on the same page and thinking each other are not; your statement "GPU cannot be offered as a single GPU types" clarifies that.

as for the rest there was never an argument there -- just like trying to string multiple macs together there are still bandwidth issues between the nodes that render it not worth it at the end of the day when such cheap and secure compute is available online (even in high security settings)

At that point one is better off just balancing what they have with different services (k8s or otherwise) -- which is what I do (STT, TTS, rerank, embedding, VL, security, etc, etc, etc)

rough raven May 19, 2026, 12:59 PM

#

Hi guys, looking for hardware advice. Is it worth getting a MSI desktop with a rtx 5090 32gb if I can lift it for 3k USD?

fluid jackal May 19, 2026, 1:27 PM

#

rough raven Hi guys, looking for hardware advice. Is it worth getting a MSI desktop with a r...

if you can get an entire desktop with 5090, RAM, MB, CPU, PSU, SSD, etc for 3k, that's a buy and a great deal in right now -- though I am unsure your endgame (aka personally I wouldn't be trying to drive OC fully local with it, but would use it for lots of fun supplemental stuff for my OC like TTS, STT, embedding, reranking, VL, etc)

rough raven May 19, 2026, 5:58 PM

#

fluid jackal if you can get an entire desktop with 5090, RAM, MB, CPU, PSU, SSD, etc for 3k, ...

Thanks for the feedback , this is very interesting points you raised

fluid jackal May 19, 2026, 6:00 PM

#

rough raven Thanks for the feedback , this is very interesting points you raised

absolutely -- either way, right now it is a great investment for $3k; good luck!

full talon May 20, 2026, 1:33 AM

#

you can run most local models using Turbo Quant just fine on rtx 3090 which is ike 1k and you can put it in 0.5k used computer and be 95% there for local LLM Check ClawEval https://github.com/AIgenteur/ClawEval

mellow forge May 21, 2026, 2:55 PM

#

Bro I’m telling you lot, you don’t need a £3k GPU to start building AI stuff.

Everyone thinks you need one mad machine but that’s not the only way.

Build it like an organism.

One cheap PC does the routing.
One GPU runs a small local model.
Another cheap camera or old phone gives it eyes.
CPU handles logs, memory, scripts, Telegram, all that boring stuff.
Then cloud AI only gets used when the job is actually hard.

That’s the whole point.

You don’t need all the GPUs to magically become one big GPU. Most of the time it don’t work like that anyway. You split the work.

Eyes.
Brain.
Memory.
Hands.
Nervous system.

That’s how you build it.

You can start with an old office PC, a used GPU, Linux, LM Studio or Ollama, OpenClaw, Python scripts, and a camera. £300–£500 if you buy smart, maybe less if you already have parts.

It can watch a room, send alerts, run a small local AI, search its own notes, store logs, speak through Telegram, and only ask the cloud model when it really needs help.

Rich people brute force everything with one monster GPU.

Broke builders have to be smarter.

Use what you’ve got.
Split the jobs.
Make the system survive when one part goes down.

Don’t build one giant brain.

Build an organism loool.

rough raven May 22, 2026, 11:56 AM

#

mellow forge Bro I’m telling you lot, you don’t need a £3k GPU to start building AI stuff. E...

Inspiring !!!

daring rune May 24, 2026, 5:54 PM

#

mellow forge Bro I’m telling you lot, you don’t need a £3k GPU to start building AI stuff. E...

make sense, I run openclaw in rpi then local model in mac mini and things work absolutely fine https://dev.to/anup_sharma_86fa94612fe3c/i-built-an-ai-that-decides-which-ai-to-talk-to-running-247-from-my-living-room-211p

regal jay May 25, 2026, 11:43 PM

#

rough raven Hi guys, looking for hardware advice. Is it worth getting a MSI desktop with a r...

if only i had that kinda money 😭

dusk moon May 25, 2026, 11:52 PM

#

rough raven Hi guys, looking for hardware advice. Is it worth getting a MSI desktop with a r...

5090 alone goes for 3k (even as an openbox) Soph, sounds like a steal.

fluid jackal May 26, 2026, 5:10 PM

#

dusk moon 5090 alone goes for 3k (even as an openbox) Soph, sounds like a steal.

did I undersell it? lol

dusk moon May 26, 2026, 5:18 PM

#

fluid jackal did I undersell it? lol

Yes

fluid jackal May 26, 2026, 5:20 PM

#

dusk moon Yes

damn...should have lead with "Whole PC w/ 5090? Don't think, just buy"

#

I snagged a 96GB Mac Studio refurb....not sure why.... 😂

dusk moon May 26, 2026, 5:37 PM

#

fluid jackal I snagged a 96GB Mac Studio refurb....not sure why.... 😂

96 GB, say less 😄

dusk moon May 26, 2026, 5:37 PM

#

fluid jackal damn...should have lead with "Whole PC w/ 5090? Don't think, just buy"

My thoughts exactly.

#

Does it power on? Yes, Deal

crimson sparrow May 27, 2026, 7:57 PM

#

HI, im looking at getting into openclaw, not sure how to go about it. I have my main desktop at home with a 7800xt in it (im aware this could come with extra steps.) along with a 2009 macbook pro and a latitude 5410 in the mail. I looked into what I want openclaw to do for me, which would be to use my local desktop's compute power to run the llm and be able to message openclaw from my phone or interact with the web ui from my laptop away from home. How would you all go about this? I read the macbook can be used to integrate imessage without having to pay. Does anyone know if this idea is possible?

fluid jackal May 27, 2026, 8:13 PM

#

crimson sparrow HI, im looking at getting into openclaw, not sure how to go about it. I have my ...

are you planning on using a subscription or are you trying to be fully local? fully local might be a bit more than painful without at least a 20-30B parameter model ( I would personally not even fathom it)

if you do a subscription (OpenAI, MiniMax, etc), I would probably avoid the 2009 Macbook pro still unless you want to play the "will it work!?" game on hardware that's 15+ years old with only a couple of cores. I would install it on your latitude or your main desktop depending on how you're feeling it should work okay on both with a subscription.

as for your imessage, yes technically it can do iMessage ...but...I don't think it'll work here because your macbook is just too old and will lack support to install what's needed

crimson sparrow May 27, 2026, 8:15 PM

#

fluid jackal are you planning on using a subscription or are you trying to be fully local? f...

Plan was fully local and have the main desktop run everything and be able to remotely work with it as a chat or web ui for any device, and I know open core legacy exists but idk how well it will work. Only reason I brought up the MacBook was because I read you needed one for iMessage capability if you don’t want to pay

uncut sage May 28, 2026, 8:34 AM

#

Hi, I tried to set up a local openclaw agent on my pc. Specs: 32gb ddr5, RTX 4060, Ryzen 5 7500f. I don't really want to spend money. I set up Ollama's qwen 3.5:9b and it was working fine for a little bit, but now it's just replying "NO" to all my messages. I mainly want to use it to set up connections in Notion and Obsidian to track progress of things, and help me with my career in cybersec. Does anyone know why it may not be working, or what model I should run?

austere turtle May 28, 2026, 2:57 PM

#

uncut sage Hi, I tried to set up a local openclaw agent on my pc. Specs: 32gb ddr5, RTX 406...

Just replying No is weird
Where are you interacting with your agent is it Discord, telegram or where?

brittle hamlet May 28, 2026, 8:42 PM

#

uncut sage Hi, I tried to set up a local openclaw agent on my pc. Specs: 32gb ddr5, RTX 406...

I had issues with the default context size on ollama, I needed like 131k context size for it to work

uncut sage May 28, 2026, 9:24 PM

#

austere turtle Just replying No is weird Where are you interacting with your agent is it Discor...

i tried telegram, stopped working. then web ui, then tried discord

uncut sage May 28, 2026, 9:25 PM

#

brittle hamlet I had issues with the default context size on ollama, I needed like 131k context...

ahh okay, what model did you use?

austere turtle May 28, 2026, 9:30 PM

#

uncut sage i tried telegram, stopped working. then web ui, then tried discord

I was thinking making some sort of prompt or corrupted memory is just forcing it to say no

uncut sage May 28, 2026, 9:47 PM

#

austere turtle I was thinking making some sort of prompt or corrupted memory is just forcing it...

yeah maybe corrupted memory, i’m gonna delete everything and reinstall. i reset my settings and re did onboarding last night but it didn’t do anything, so i might just start from scratch. thanks

uncut sage May 28, 2026, 9:48 PM

#

austere turtle I was thinking making some sort of prompt or corrupted memory is just forcing it...

do you think qwen 3.5:9b is powerful enough to just automate things into notion and obsidian?

austere turtle May 28, 2026, 10:01 PM

#

As far as it has enough context window to remember everything which many local models lack (or maybe my machine lack power to)
The qwen 3.5 is a very powerful model for coding and light task also use a better distilled version so you can get good output

uncut sage May 28, 2026, 10:11 PM

#

yeah okay cool, what do you mean by better distilled sorry?

austere turtle May 29, 2026, 12:00 AM

#

Distilled models are basically smaller models trained using outputs or knowledge from a bigger stronger model.

Some distilled versions are done better than others, so even if two models are both “Qwen 3.5 9B distilled”, one can perform much better depending on what it was distilled from and how well it was trained/tuned.

So I meant using a well-made distilled version gives you better quality responses while still being lighter/faster to run locally.

austere turtle May 29, 2026, 12:01 AM

#

austere turtle Distilled models are basically smaller models trained using outputs or knowledge...

This is just really optional tho, usually done by the Chinese more

#

That’s why deepseek give good result for half the cost

uncut sage May 29, 2026, 12:25 AM

#

ahh okay gotcha thank you so much

brittle hamlet May 29, 2026, 5:15 AM

#

uncut sage ahh okay, what model did you use?

Currently using qwen3-coder-30b 131k context 8k output tokens

#

Two actually, the other is ollama/qwen3.6:35b-a3b-nvfp4

uncut sage May 29, 2026, 5:29 AM

#

brittle hamlet Two actually, the other is ollama/qwen3.6:35b-a3b-nvfp4

what are the specs of your pc?

brittle hamlet May 29, 2026, 5:29 AM

#

uncut sage what are the specs of your pc?

Mac mini m4 64 GB

uncut sage May 29, 2026, 7:21 AM

#

ahh okay

shy kite May 29, 2026, 8:22 AM

#

Guys, is there an open-source project that tailors LM models for OpenClaw usage?

iron stump May 29, 2026, 8:47 AM

#

who was the guy who has his own setup

uncut sage May 29, 2026, 9:51 AM

#

@austere turtle I deleted everything openclaw related and downloaded it back then set it back up and still getting the NO error. I'm stuck. I even tried using openrouter and used a free one and it instantly said I was out of tokens.

#

And the Ollama model works fine by itself

austere turtle May 29, 2026, 10:44 AM

#

If your openclaw in docker or running it normally?

#

And is your ollama running the server that connects to your model

#

Try this let’s confirm the server is running

curl http://localhost:11434/api/generate -d '{
"model": "qwen3.5:9b",
"prompt": "hello"
}'