#hardware

1 messages · Page 5 of 1

fathom summit
#

They don't even AI, I mean, like, this is crazy. Too bold and cocky.

wanton coral
#

too old and cooky

fathom summit
#

I can't understand it. There was a period of time where I wanted an iMac so bad as a photographer and graphic designer, and that lasted for about a year in 2010 or so, and then I realized, wait, I can just calibrate my monitor and it's just as good. Wait, I can calibrate it and then also just buy parts on eBay and put them in, and now all of a sudden I'm performing better than that thing would have done? Oh, never mind, I'm good.

#

Good thing Obama passed that credit card law, otherwise I would have been stuck with one. Feeling stupid.

#

No more bucket hats and no iMac for me.

#

Whoa, I just tripped myself out because when I said no more bucket hats, I guess I was referring to past tense because I went to Ross about an hour ago.

Lmfao https://imgur.com/a/GJsxTaA

#

I swear to God, I totally forgot that I even bought this stupid hat.

fathom summit
#

Listen up youngins, you guys are trippin'

fathom summit
# wanton coral too old and cooky

Okay, get the fuck out of here. I like your photography, by the way, but you're a birdwatcher, and I know damn well birdwatchers are the oldest and the kookiest, and if you're not old yet, you will be, and when you get old, you'll be the kookiest.

fathom summit
wanton coral
magic raven
#

my Claw now has roomba capabilities

#

it can just show up in my room and grab my sock and put on my lap saying "go do the laundry lazy mf"

fathom summit
# wanton coral haha, thank you so much! def true about birders and photographers. Some of the w...

That's funny, I know. I happen to live on a peninsula in the Northeast in which migratory birds are extremely abundant and rare, and so everybody who lives here happens to have at least one lens with a tripod mount on it per household that cost ten grand each. If you wanna go to the park and go for a walk with your friends and talk, forget it dude, it's bird watching time. I tell them to kick rocks.

#

Mostly because I can't afford a lens to shoot fucking birds with.

heady bobcat
teal crow
#

I’m running OpenClaw on a 16GB M1 MacBook Pro I bought off Facebook marketplace for $400 because it had a cracked screen. I just plugged it into a cheap travel monitor for setup and now run it headless. It connects over my network to another 64 GB M1 MacBook Pro Max which is running LM Studio for the LLM.

So far, this architecture is working for me pretty well. I’ve blown through 17M tokens in the last week and it hasn’t cost me a dime in api costs.

barren trail
teal crow
#

Im still dialing that in. I was running Qwen 3.5 in different variants and liked it, but it kept throwing <tool_call> code into the chat and aborting operations. Once I switched from Ollama to LM Studio that went away.

Currently I’m using Ministral-3-14b-reasoning. I like how quick this model is for operations tasks, but it definitely hallucinates to an infuriating level so I’m going to be switching to another model for creating content. I tried to have it draft an email yesterday for me and told it 4 times to stop putting hyphens in the middle of the sentences and it just kept doing it.

barren trail
fathom summit
#

Also, my bad, I'll stop the anti-Mac rhetoric here. I'm obviously in the way and not contributing positively here. Lol

#

Just keep in mind you can get a Steam Deck for less than that, way less than that, and be running Arch Linux on a capable device. You can mod it as well if you please.

#

(My Steam Deck is essentially what i use, similar to how everyone in here uses the' M1 through 4s lol.)

teal crow
#

Why would I need to upgrade it? It’s literally only running OpenClaw. Watching activity monitor this machine is overpowered for what Openclaw needs.

craggy ferry
#

yes, it's your opinion, and that's all it is, it doesn't bear any relation to reality. it's not 1995 anymore, lol

#

M1s were good chips. PC makers are still trying to compare their chips to the M2, lol

fathom summit
#

That was only a few years after a GUI was made

craggy ferry
#

haha that's not what happened in 95

fathom summit
#

Thanks to Steve

craggy ferry
#

learn your history. I grew up in it, it's not history to me, kid

fathom summit
#

I understand. I mean I've seen the computer systems and we certainly didn't have a Macintosh but I sure did have the flag thing with the rapist guy.
Hey hey hey, you didn't catch my casual stoic sarcasm. I was just being a silly jerk, is all. I wasn't really calling you a kid, man.

craggy ferry
#

84 is when the GUIs were made; when the Mac was "revolutionary". 95 was when Win95 came out. Remember Win95? Remember people lining up to buy an operating system, but they didn't know what an OS was, or even owned a computer, because they were so hype about it that they wanted it?

Well, I remember all of that, because I was a conscious human being at the time

fathom summit
#

And hey in life all we all really have is time and whatever you think and so, other than that, nothing really matters. Let's make cool stuff; that's all I care about. We can still work together.. But like dude, are you team PlayStation or Xbox? Can we fight about that instead?

#

Yeah I'm aware man but you gotta remember that it doesn't matter when it was made if it wasn't in anybody's home

#

Or Office. You can recall the iMacs with the color back. The big bubble thing is the first computer let's be real.

#

Those came out in about '95 and I believe they came out after Bill Gates stole the ideas.

craggy ferry
#

i notice the goal posts are just whizzing by now

fathom summit
#

We got all those corny commercials

Oh man, I can't even predict anything anymore. I just stopped trying.

craggy ferry
#

hey, did you know windows 2.0 was when bill gates stole the ideas?

fathom summit
#

I don't even know what the latest hardware is anymore, and that used to be something I always knew.

craggy ferry
#

and yet you have strong opinions on the value of apple hardware for money

#

cool conversation

fathom summit
#

No, I don't necessarily know the fine details of everything. I know Windows 3.0 was poppinn my dad3.0 was poppin' though, my dad was mad excited. Fucking doing spreadsheets, haha.

craggy ferry
#

i'm just saying that if you don't know what modern apple hardware is capable of vs modern pc hardware maybe don't write a page long screed about how much worse it is

fathom summit
#

I didnt say i wasnt aware of price to performance or benchmark comparisons

#

I follow the economics like more than I should, because, again, I'm not familiar with the components or like, I don't know how much VRAM is in a NVIDIA GPU these days. I just don't care anymore because I'm not buying one.

craggy ferry
#

and don't act like you know the history of computing and look down your nose at other people who literally lived through it when you don't actually remember anything lol

fathom summit
#

I guess maybe I studied that because I'm like planning for the future of buying old hardware someday, sometime, when life is less expensive, but it doesn't seem like it's gonna be coming anytime soon.

#

Bro, I don't look down on anybody. What are you talking about?

craggy ferry
#

ok great. so why do you care so much about what other people are buying when you don't know what they're buying

fathom summit
#

If you wanna be a sensitive little bitch, I'll call you a bitch and you can be a bitch, and that's fine, but like, I was just having casual conversations and friendly banter here, and that's why I made that clear after every time I said anything negative.

craggy ferry
#

wow

fathom summit
#

I'm just saying, I was trying to be cool with you.

craggy ferry
#

i'm sorry you've lost so hard you don't have anything else for me but insults

fathom summit
#

No, I don't have any insults for you. I said if that's what you wanna do, then that's what we'll do I didn't actually call you that, though, you know what I mean?

#

I'm saying if that's what we're gonna do, then that's how it'll be then. But like, I'd rather just be friends.

craggy ferry
#

always funny when the person claiming someone is sensitive is the first one to actually resort to insults

fathom summit
#

Just hop on a call with me, bro, because I don't think we're gonna solve this through reading texts alone at this point.

craggy ferry
#

honey, don't hide behind silly word games

#

you called me that

#

just own it

#

wow

#

what a coward, lol

fathom summit
#

And again, I don't care about arguing over computer specs or anything. It's just friendly banter. It's also one of them things where, like, I've had an Android for years and like, just because I thought the phones were cool. And I mean, like, I get excluded from group chats and shit and people just are always giving me shit. I go on a date with some chick and it's like, oh, green text, red flag every single time. And so I just, I talk my shit, but like, I don't really care. What the fuck? What the fuck do I care about what other people are doing? Though, if people are gonna ask me for advice, like, you know, I did say, I don't think it's worth it. And then, really, this all started from a genuine question when I was asking, like, like, for the purpose of running Claw, I just didn't understand why one would buy a Mac for it rather than like, uh, one of those mini computers. That, that's a genuine curiosity.

craggy ferry
#

lmao

fathom summit
#

It's not word games, you're being a bitch, so I'm gonna call you a biznatch.

white ivy
#

@fathom summit is right, from an outside reader with popcorn 🍿

#

Chill

craggy ferry
#

right. I know. I know you did that. That's why I said you did that. Then you tried to claim you weren't doing that.

#

Anyway I hope that was entertaining for the peanut gallery

jolly creek
#

Enjoy your mutes both of you

#

For everyone else, go read the #rules again, esp rule 3

teal crow
steep wedge
#

Does anybody have a DGX Spark cluster? I bit the bullet and ordered a second Asus GX10, and I want to make sure I buy a compatible interconnect cable. The Asus branded cable appears to be out of stock.

verbal kraken
#

Looking to buy mac ultra m3's with 256GB or more of ram. Anyone have a lead or wanna sell or invest it hit me up.

minor plank
#

Anyone have a "Jarvis-like" voice interaction app that i could install on a Android smart watch to "PTT" or keyword trigger / TTS response style interaction with OC? My Kids are requesting that.

broken moth
#

I bought a ClawBox... The 67 TFOP one. Brilliant..

stiff tree
steep wedge
weary reef
#

what is a good model to use at home I am trying the nemostorm i think is the name 122b i think.. but context window is a bit small

broken moth
#

@stiff tree openclawhardware.dev

weary reef
steep wedge
weary reef
minor plank
tired plover
fierce lantern
#

When loading local models, having 1GB storage doesnt seem to be enough. Is an external thunderbolt 5 drive good enough or should I be boosting internal storage with NvME, eSATA ? Comments , thoughts, insights.

lyric orchid
fierce lantern
spiral vector
#

Asking how much storage you want is about like asking how much house do you want - the answer is generally something like "as much as you can afford". Personally, I think most people should get a minimum of 1TB of fast NVME storage just for OS, applications and frequently accessed files. Then consider exactly how much storage, and of what sort of of frequency you access it. Try as you might, its pretty hard to watch 10 pr0n films at the same time - so that can be exported to slower, external storage if needed. But if you want to keep your entire Steam library downloaded at all times, then you probably want 2 or 4 TB of fast storage (games generally don't run as well from external storage).

Personally, my OS drive is a fast 1TB nvme. My game drive is 4TB fast nvme. Then I have 30TB external storage in my NAS for backups of all my linux ISOs.

#

If its just openclaw on a mac mini/strix halo/dgx spark, then a 1TB nvme will do fine. Even 120B models aren't that huge (storage wise - you'll run out of VRAM long before you run out of storage space).

fierce lantern
#

Ok. Given I have never played with this stuff and am looking to get a new machine to play with this stuff. Are the models sucked into RAM initially and then accessed or are they kind of like a database where you read stuff here and there.

spiral vector
#

You could think of the models as a database that must entirely be loaded into VRAM (doesn't work well in system RAM). Loaded onto your video card's RAM. But exceptions here are important - all 3 of the items suggested (mac mini/strix halo/dgx spark) have shared video RAM and system RAM. The fully shared RAM works well. If you got a more traditional PC, then you'd need to look at both seperate RAM and huge amounts of VRAM (which only comes with whatever GPU you buy).

fierce lantern
#

ah perfect. Thanks

#

what are the buzz words one would google / duckduckgo / askgeeves to understand this more ?

spiral vector
#

If you look for "openclaw hardware" you'll get a ton of hits

fierce lantern
#

@spiral vector thanks for all this. Appreciate helping me get started.

spiral vector
#

I'd recommend spending some time to understand the basics first. Then read into https://github.com/explaindio/ClawEval/tree/master - I think ClawEval is probably the most comprehensive list of what local models are good for OpenClaw. My only real complaint with this analysis is that they ONLY compare the open source models (and for some workloads - you really just want to run Opus 4.6 or GPT 5.4, despite how expensive those API costs can be.)

But ClawEval generally approaches this discussion from - here's what different models can do (from which you can derive what sort of hardware you may want). They do have a good docs section that goes into detail of what you get at each of the various levels of VRAM, but they don't directly recommend hardware. And because they don't consider HW (they just do cloud hosted LLM comparisons), they don't really analyze the 3 setups that I think are probably the most interesting (mac mini/strix halo/dgx spark).

full talon
woven galleon
#

Hey guys

#

Sry if stupid question

#

Mac Minis are sold out everywhere in Aus, looking to get this instead. Thoughts? 7 MAX Mini PC (2026 Flagship Performance) AMD Ryzen 9 7940HS 16GB DDR5 (Up to 128GB) 1TB SSD Mini Desktop Computers, Radeon 780M Graphics/8K Quad

full talon
ember lichen
#

why does openclaw run thru my tokens way faster then my claude does

sonic mantle
ember lichen
sonic mantle
ember lichen
sonic mantle
#

Whats ur pc specs?

ember lichen
sonic mantle
#

Then maybe have an hourly cap for api models

ember lichen
#

figured its not too bad

sonic mantle
#

U should try setting up qwen2.5-coder locally

ember lichen
#

how would i learn how to do that

sonic mantle
ember lichen
#

windows

#

HTTP 401: authentication_error: OAuth token has expired. Please obtain a new token or refresh your existing token. (request_id: req_)

also do you know how i fix this

#

i cant use my openclaw at all rn

#

bc of this

sonic mantle
#

U most likely reached an api limit

sonic mantle
ember lichen
sonic mantle
ember lichen
#

i cant use claude it doesnt know how, and my open claw wont work due to the issue.

sonic mantle
ember lichen
#

how

#

my actaul claw has no brains rn since it doesnt have a token

#

forget this i have a api key, how do i use it

#

ill just take on the costs

ember lichen
sonic mantle
#

im assuming ur using the interface?

ember lichen
#

i dont have the subscription

#

i didnt know u have to buy openclaw.

sonic mantle
#

ok whats ur setup?

#

how did u start openclaw n have it running before

ember lichen
#

worked for hours

#

then when i came home and tried to prompt it failure happened

sonic mantle
#

okay bet when u type openclaw into powershell does it run anything

ember lichen
#

yes it loaded a TON of stuff

sonic mantle
#

okay good paths setup

#

try this

openclaw onboard

#

(should guide u through the setup dialog where you can place ur api key)

ember lichen
#

would this be making a new agent?

sonic mantle
ember lichen
#

whats the best model

#

for just all around use

#

sonnet?

#

and if so witch one

#

wont drain tokens as quick

sonic mantle
ember lichen
#

turns out.... i was using sonnet 4-6

#

explains token use honestly

#

i think sonnet 4-5 is good enough right?

ember lichen
#

should i do skills during onboarding

#

idk anything about what that does

sonic mantle
#

depends what ur goal is

#

For cheap all around models i'd suggest Gemini 2.5 Flash-Lite GPT-5 Nano DeepSeek V3.2 Mistral Nemo just wing it n go off vibes

sonic mantle
ember lichen
sonic mantle
#

double check ur key usage has locks so u dont wake up with a $50k bill tmrrw from openclaw trying to draw ascii art in a loop

ember lichen
#

i just clicked sonet 4.5 but then it keep using 4.6?

#

how do i confirm limits?

sonic mantle
#

4.6 costs same as 4.5 and is more efficient

ember lichen
#

whats a better model

sonic mantle
#

ur premuch trying to wire in expensive models n hitting ur limits within an hour

sonic mantle
#

or Claude 3 Haiku

ember lichen
#

hm ill look into those tonight

sonic mantle
# ember lichen hm ill look into those tonight

Claude 3 Haiku if ur sticking with claude stuff,
Gemini 3 Flash if u want it dirt cheap

ur paying $0.25 per 1million tokens (Claude 3) Vs googles gemini 3 flash at $0.075 - $0.50 per 1 million tokens

ember lichen
#

how could I make a agent who uses that

#

To do easy work while my other sonnet agent can do hard tasks?

sonic mantle
#

shear will power and coffee

ember lichen
#

lol.

#

I am so interested in learning all this stuff

#

I don’t want to fall behind the inevitable

sonic mantle
sonic mantle
ember lichen
#

I’m having fun while figuring this out

west anchor
#

gun did you make it to the other side yet

ember lichen
west anchor
#

I can help

ember lichen
#

how

west anchor
#

If you purchase the claude pro sub, you just go to your terminal and run 'openclaw onboard'

#

it takes you back to the initial setup

#

and on page 2 or 3 where you select your AI model, you arrow down to anthropic, select it, then select OAuth.

#

It opens a webpage, you sign in with the email you subbed under

#

then boom youre in

ember lichen
#

really?

west anchor
#

openclaw gateway restart

ember lichen
#

I’ll try this when I’m done eating

west anchor
#

I warn that its against their ToS so you risk a ban

#

if you dont wanna risk that for $20 you can try it with openAI instead

#

but I havent been caught yet

sonic mantle
quaint narwhal
#

but you didn't hear that from me

quaint narwhal
woven galleon
#

Unless I get the Macs?

quaint narwhal
woven galleon
#

O wat interesting

#

Is the Mac minis chips actually just that much better

#

I should just be patient for the Mac mini restock tbh 🥀😭

quaint narwhal
#

opus is 12cents a request vs when I tried the same call on opencode cost me $4 for a similar request

#

yea it's wild lol

woven galleon
#

Yes dude yesterday I was trolling it for like 10 mins and it costed me 5$

#

😭

quaint narwhal
#

a mac mini isn't gonna run local LLMs very good, you're still gonna be using API lol

woven galleon
#

Ahhhh

quaint narwhal
#

my pro tip as well, get a claude pro/max plan to build your bot

#

don't use the bot to build the bot

#

it's way too expensive to do that, learn from my pain

woven galleon
woven galleon
#

Then Claude started getting impatient with me

#

XD

quaint narwhal
#

you have to run it in WSL

#

and last time I setup WSL it was a pain in the dick

woven galleon
#

Linux only?

quaint narwhal
#

I think that's what's reccomended on the official website as well

vagrant musk
dawn cosmos
vagrant musk
vagrant musk
# dawn cosmos A ryzen AI + 395 Max, 128 LPDDR can run 70b model and 120bq4 - gmteck evo 2 or a...

But yes those are also solid - NUCs like gmktec, geekom but also Dock-extended BeeLinks are all capable of running them locally (imo a lot more valua for money than anything Apple has to offer)

I do think a dual stack dgx - like the gx10 you mentioned, there's a couple more I think one from hp and from msi as well that run the same blackwell chips; basically dgx architecture - those are currently the peak of mini-hosts for LLMs

quaint narwhal
vagrant musk
quaint narwhal
#

can normies readilly able to buy those nvidia boxes?

vagrant musk
quaint narwhal
vagrant musk
#

look up gx10

pastel scarab
#

what do i need to have to run those models

vagrant musk
# pastel scarab and 400/500B?

400 you can do a bit at rate limit with 2 DGX Sparks in parallel as I think they can do up to 405B with a connectx-7 connector

above that, you have to stop looking at mini-PCs and start looking at H100s Platforms or so

#

But you're looking at 10x the cost for a leap like that

#

Still also have to factor in that the throughput difference of those machines are like 10-50x the difference as well tho, like H100s, H200s, B200s

craggy ferry
#

for 400b (like qwen3.5-397b) a mac studio is actually pretty reasonable at running them for the price

#

prefill sucks so keep contexts short, use smaller models to summarize / call tools, etc, but for one or two convos it works

craggy ferry
#

also just run quantized models, you can fit the 6 bit 397b in the 512gb with tons of room for context that you will likely never fill because see aforementioned point about prefill sucking

pastel scarab
#

but is there a big difference between quantized and full model?

craggy ferry
#

almost nothing at 8 bit, 6 bit you shave a bit more but it's still really close, 4 bit is like the last stop before real degradation happens

#

but you don't need to go below 4 bit unless you're trying to run glm-5 or k2

pastel scarab
#

i have like 5k budget

#

what could i buy?

craggy ferry
#

256g m3 studios are like 5k with the education discount. do you know literally anyone who is currently in school or works in education

pastel scarab
#

me

craggy ferry
#

blam

pastel scarab
#

how much is the discount?

craggy ferry
#

go search "apple education store"

#

it's like 10-15% i think? it covered more than tax for me

pastel scarab
#

im living in europe so its 6,5k euro

craggy ferry
#

after edu discount? damn, that's annoying

pastel scarab
#

but with the 60 core gpu

pastel scarab
#

its like almost 10k without discount full setup 256gb

craggy ferry
#

yeah without edu discount it's 6k before tax in us

pastel scarab
#

are u from us?

craggy ferry
#

yeah

pastel scarab
#

or should i wait for the m5?

vagrant musk
#

Gotta go bigger

dawn girder
#

I was thinking about getting a 32GB

vagrant musk
winter lynx
craggy ferry
low aurora
#

someone using lepotato soc hardware?

pastel scarab
#

do you know a good model for cold emails?

tired plover
grave shoal
#

Also check the official Apple refurb store.

#

Got my Mac Studio from there.

reef hollyBOT
#

gonna be getting a Mac Mini for my bot, Clawy (and switching him to a local model, hopefully that doesn't affect his ability to post on Moltbook), because honestly the 128k usage token limit for cloud models that Ollama offers for the free tier is pretty limited

solemn valeBOT
#

@silver ginkgo, Openclaw isn't affiliated with Moltbook. Moltbook is a separate user-developed project, so we would prefer it not be discussed in this server.

lament hull
#

Why are all the Mac minis sold out.

spiral vector
# lament hull Why are all the Mac minis sold out.

Intersection of 3 points. 1 - the success of openclaw has really driven demand more than apple expected. 2 - Apple is in the process of switching their M4 lineup to the new M5 lineup. 3 - global RAM shortage (Yes, I know the built-in memory on mac silicone is different, but at some level people will go for whatever PC parts they can get)

lament hull
#

Yeah it has gotten insane.

winter lynx
vocal shard
#

anyone still using a 2018 mac mini for their agent?

surreal nova
shrewd nest
shrewd nest
#

Is a 16GB M1 Mac Mini good enough?

spiral vector
# shrewd nest Is a 16GB M1 Mac Mini good enough?

Good enough for what exactly? Yes, its good enough for some things, no not goot enough for all things. If you scroll up a bit I repasted the link for "claw eval" on github. They do a great job of detailing what the different models can do.

Personally, I played with openclaw in a VM on my NAS - connected to various cloud service APIs. I never was able to get it as locked down as I was comfortable with - although now with nemoclaw from nvidia that seems improved (but still not completely fixed). When Claude code rolled out their remote access and now claude code channels I jumped back into that. (Which fits great on any old mini-PC.)

full talon
wide roost
#

Hi all,

Evaluating my setup's cloud cost equivalent and curious about your experiences. Here's what I'm running locally:

Compute Nodes:

Node CPU RAM GPU/Accel Cloud Equivalent
Brain Ryzen 5 4500 (6c) 15GB RX 550 4GB ~$40/mo
Nebenhirn Ryzen 7 2700 (8c) 31GB GTX 1650 4GB ~$60/mo
Muskeln - 62GB RTX 2070 Super 8GB ~$150/mo
LubanCat 4x ARM 3.8GB - ~$15/mo
Pi5_1-4 4x ARM 16GB total - ~$20/mo
Kleinhirn 2 RK3588 2GB 2GB NPU n/a
Kleinhirn 3 RK3566 2GB Mali GPU n/a
HP Notebook Ryzen 5 5600U 14GB Vega iGPU ~$35/mo
LLM Stack:

Brain: Ollama with qwen3:8b (local), ATXP fallback for complex reasoning
Nebenhirn: SD + Ollama (GTX 1650)
Muskeln: SD + Ollama (RTX 2070S)
Totals: ~160GB RAM, 16GB GPU VRAM, 70+ cores, 2x NPU
Cloud equivalent: $320/month
My cost: Hardware already owned (€800 invested), ~€15/mo electricity

For those running local LLMs: at what point did you break even vs. API costs? And what's your "too big for home, must go cloud" threshold?

Context: trying to justify keeping this running vs. just using GPT-4 API for everything. The privacy aspect weighs heavy, but so does the electricity bill. 😅

half haven
#

Appreciate any feedback.
Just out of curiosity as I've tried every other issue, is this HONOR MagicBook Pro 14 2025 14.55 inch ARL Ultra9 UMA 32GB SSD 1TB Grey Windows 11 good enough to run Ollama and qwen3:32b with open claw?

little scroll
dawn cosmos
half haven
storm hedge
dawn cosmos
sage jackal
#

Hey Im new here. Heard about openclaw for the last 3 months and now finally have time to jump in. I guess this is the section where to talk about hardware. I realize mac minis are hard to get nowdays so I may have to redirect to macbooks, for running local llms is it ok a macbook pro m5 with 32 gb? When I ask to llms they say yes and no and I can see on forums people saying yes and no. So before jumping in I just wannna make sure I can still run some models. i dont need the high end ones, I just want to have a feel of jarvis at home and go from there. If it becomes vital then Ill upgrade to mac mini or studio. In the meantime so is a macbook pro m5 ok for a 32b model or lower ? Thx for answers, let me know if theres a section where I can get those answers already

lyric orchid
# sage jackal Hey Im new here. Heard about openclaw for the last 3 months and now finally have...

I would look at peoples results with the M4 mini 32gb, if it works there, it should work on the MBP M5 32gb, but yours would be a little faster I think, though 32b might be tight, you need some room for the OS and other processes! asking claude about this:
*Yeah, that's solid logic — same unified memory architecture, same memory tier, so if a model runs well on the M4 Mini 32GB it'll run at least as well (and ~25% faster on token gen) on the MBP M5 32GB. The chip difference doesn't affect what fits, only how fast it runs.
For their specific question about 32B models — that's actually the tricky boundary at 32GB. A 32B model at Q4 quantization needs roughly 18–20GB, so it fits, but leaves little headroom for the OS and context. Q8 of a 32B would be too large. So the honest answer is:

7B–14B models → runs great, multiple quant levels, no issues
32B at Q4 → fits but tight, performance will be acceptable not great
32B at Q8 or higher → won't fit cleanly
70B+ → no

So for a "Jarvis at home" vibe, they'd actually get a better experience targeting a well-tuned 14B (like Qwen or Mistral) than a cramped 32B. The 14B at Q8 will feel snappier and more capable than a 32B squeezed into Q4 at the memory limit.
The M4 Mini 32GB benchmarks would be a perfect proxy — same answer applies to the MBP M5 32GB, just faster. *

sage jackal
lyric orchid
# sage jackal Thx Kevin. So basically, bringing back down to 7/14B models quantized a bit shou...

yeah, you definitely should be able to get started and try some things out before committing to new hardware. i.e. maybe it's "Jarvis like" , but smart enough? The more vram the better, but it's not clear to me where the jump is in functionality between 24-32 and more (48, 64, 96... ?). It all depends on what you are doing with it. check out claweval as well, they don't specifically show the 32gb mini, but do show 24 and 48, so somewhere between the two... https://github.com/explaindio/ClawEval/tree/master?tab=readme-ov-file#-which-tested-models-fit-on-your-hardware - not sure if there are speed results in there though, just model test results.
actually, re-reading your response, I thought you already had the MBP M5... the nice thing about the mini is that it's kinda meant to be running all the time, at least more than a laptop? I am actually running my OC on an old windows laptop I was longer using, put ubuntu on it (had it on an M1 mac mini, but I have other personal stuff on there, and wanted OC on a fresh machine without access to any other personal stuff), but I wonder if having a laptop running in my wiring closet 24/7 is the best long term strategy. next step, raspberry pi 5 🙂

sage jackal
shell kindle
#

Question: Is anyone running a Mac Mini cluster?

tropic crater
#

hi, I am AI Hardware Engineeer, new to this wonderful group.

I dropped a new roadmap article comparing the Mac mini M4 as a 24/7 OpenClaw server to a Jetson Orin Nano 8GB edge appliance—when each wins, how to squeeze real inference out of 8GB UMA, and a privacy/security stack from hardware through skills (no vendor hype, just tradeoffs).

https://jared-hpc.com/blog/mac-mini-openclaw-server

harsh thicket
#

Have you had any luck? I’ve spent two days now trying to get openclaw running on my 2016 15” MacBook Pro. When I started it was running Sequoia via OCLP and had irreparable dependency issues. Based on bad info I wiped the machine and downgraded to Monterey only to run into the same problems. After struggling all day today I’m worn out. Both ChatGPT and Grok have led me in circles trying to repair the issues and get it running. Now I’m wondering if I go back to Sequoia if I can maybe run openclaw in Docker? Turns out Docker is not supported in Monterey anymore so that was a dead end. Sigh.

nocturne girder
#

Good morning,
I've been trying to set up the OpenAI, Gemini, and Anthropic APIs for a few days now, but I haven't been able to get any models other than OpenRouter to work.
I’m thinking of buying a PC to install some models locally since OpenRouter. I’ve seen one with a Ryzen 7, 32GB DDR5 RAM, and an RTX 4070. Will it work? Can it be configured to use the models locally? Many thanks

fair pond
#

I keep hearing about cloud models. Is it not possible to run a local llm on my Mac mini m4 16gb without it being either super slow or unresponsive ? Wondering if anyone’s cracked this code yet.

floral geyser
# nocturne girder Good morning, I've been trying to set up the OpenAI, Gemini, and Anthropic APIs ...

Hey good morning! You don't need a new PC for this. The API keys for OpenAI, Gemini and Anthropic should work fine — its usually a config issue. What error are you getting when you try to connect them? Happy to help you troubleshoot.
If you do want to run models locally thats a different thing — that setup with Ryzen 7, 32gb ram and RTX 4070 would work for smaller models through Ollama. But honestly for openclaw you'll get way better results using cloud APIs like Claude Sonnet or GPT-5.x. Local models are slower and less capable for agent tasks. I'd fix the API setup first before spending money on hardware!

nocturne girder
# floral geyser Hey good morning! You don't need a new PC for this. The API keys for OpenAI, Gem...

Thanks a lot for the help, bro.
The problem is that when I try to run models using an API from any provider other than OpenRouter, I get errors like:
⚠️ Agent failed before reply: All models failed (2): openai/gpt-5.4-mini: Unknown model: openai/gpt-5.4-mini (model_not_found) | google/gemini-3-flash-preview: ⚠️ API rate limit reached. Please try again later. (rate_limit).
Logs: openclaw logs --follow
I've tried renaming the templates to create them, but nothing works for me except OpenRouter...
Plus, the token consumption is high for relatively trivial tasks like following companies on social media.
Thank you for help

floral geyser
#

OpenAI-codex limits are quite brutal. Also, have you upgraded the openclaw for gpt-5.4 support ?

#

On telegram, use the message, “/models OpenAI-codex “. This will show you the models that are supported ..

lyric orchid
# nocturne girder Good morning, I've been trying to set up the OpenAI, Gemini, and Anthropic APIs ...

I originally tried ollama with a 4070, but personally I don't think 12 gb is enough gpu for local models. I was using ollama with that, hadn't discovered llamacpp, so could have gotten better times, but was limited to smaller models... qwen3:8b or qwen3.5:9b with 32K context. I've upgraded to a 3090 with 24gb and am much happier with the results and consistency of the bigger models. My test results here with the 4070 https://github.com/khaney64/ollama-model-tests/blob/main/reports/recommendations-4070.md

nocturne girder
nocturne girder
nocturne girder
tawdry vault
#

What’s the best Mac mini config? 32gb ram?

royal radish
#

Hi, i'm running deepseek V3 and R1 on VPS (much cheaper) but i was wondering how is the difference running on local infra, do thinking mode is different from an llm to another changing the infra will not solve the model issue. i moved to deepseek because anthropic cost were high. i configuration on VPS is 16go memory and 200 Gdisk space

royal radish
errant steppe
thorny elbow
#

i got 8gb ramohno

urban furnace
#

did anyone else's windows 10 LMstudio stop yesterday. some trojan reported, likely false positive as per reddit, for version's 0.4.7 main/index.js file

#

my windows 11 has not reported this file.

#

I did moved the file back from quarantine but LMstudio since didn't launch its gui anymore on windows 10. file size is identical to the working windows 11 version.

surreal nova
harsh thicket
shell kindle
#

Yeah I had the same sentiment. glad to help

rose sonnet
#

hey guys, im in doubt about what OS to use for hosting openclaw with an anthropic model?. A bit of context, i have an HP elitedesk 800 g2 sff with extra ram and im gonna use that for the hosting, in general im gonna use claw for little things like, read all of my newsletters and create like a newspaper for it, set reminders via whatsapp and/or by using voice messages, use the Productivity skill in the clawhub and so on.

quasi forum
rose sonnet
finite dirge
#

Any opinions on hosting openclaw on android phones? Do they work?

urban furnace
little vector
#

What is the cheapest way to run a 24/7/365 agent?

novel thorn
craggy ferry
blissful widget
craggy ferry
#

Correct!

blissful widget
#

I'm using glm4.7 flash and qwen 3.5 30b and they are still quite dumb at the moment.

#

reinforcement does help over time, but i dont wanna nudge them from time to time

#

which models do you guys have experience and are generally good doing tool calling? (LLM)

tranquil hazel
split canyon
spiral vector
#

Hate to say it here, but cheapest option for 24x7 agent lukely isn't openclaw at all - its just a $20/month claude or antigravity subscription - depending on your level of weekly usage. I dont mean running through their API (that's the most expensive option), rather just use the remote access options and run locally in a VM and access it from your phone or via VPN. You get a good amount of sonnet 4.6 usage for $20/month - even more haiku.

#

But, yes - if you want high usage, or if you're OK with running small models, then open claw on a cheap mini PC works. Cheap(ish) used mac mini's with apple silicone, or AMD Strix work best for local models. Local models really need 32GB VRAM. If you're just using openclaw to connect to cloud model API's, then you can run on any cheap PC with 8 or 16GB ram, 2-4 core CPU and minimal storage is fine.

#

I always point people to claweval on github to get an idea of what models you want and what you can run.

blissful widget
#

I am using qwen 3.5 35b a3b and i had almost 90% sucess rate on performing tool calls 🙂

#

now i have to hook it to comfyui as tools

shadow urchin
#

i just read about comfyUI, it sounds fascinating. can it be applied elsewhere?

thorny laurel
#

Hey guys I'm new here. I'm tasked with building a OpenClaw setup but I'm having trouble figuring out the specs for the hardware. My boss wants it to mainly work as a bot that will search market data and trends on the internet for a specific market sector. The think is i don't know if i should move foward using a local LLM or Claude API. The RAM specs for each situation differ a lot and in my country MACs are much more expensive than other hardware. Should i still get the mac mini? I'm know a bit about LLMs but that's not my expertise.

frozen bridge
split canyon
frozen bridge
steep wedge
thorny laurel
steep wedge
thorny laurel
#

Thank you, i was really lost there

surreal nova
#

love the respect here
wrong tool for the job
under orders
ok, well heres how best to work with that
🙂

finite dirge
#

Hi Watson, thank you for your reply. I have developed an application to control the android phone camera, sensors, calls, torch, etc but it is not stable.

Wondering if installing openclaw on android phone would be not be a good idea unless the phone is powerful enough for stability.

red cypress
# thorny laurel Hey guys I'm new here. I'm tasked with building a OpenClaw setup but I'm having ...

for research probably want a high-end cloud model (different from what hardware to run it on) from claude (opus/sonnet), has to do web search, go through all the data, put together analysis, present it in whatever way you want it (email, telegram? ppt?). maybe try it first using the primary path (ie. claude website/cowork or chatgpt website) with the queries you want to try and see how well they work before you hand it off to openclaw to run on its own and hope it works. Using frontier models aren't cheap, just a warning, and it takes a bit to get used to how openclaw works with memory/context, tons of info out there, just have to play with it.

hoary sable
thorny laurel
#

Thank You

steep wedge
#

Hello fellow DGX Spark owners. I have my two Asus Ascent GX10s clustered, and I was running Llama-3.1-Nemotron-70B-Instruct-HF for most of the day. I hated it. 😂 Super annoying personality, but the real problem was I had to drop conext window to 32k to squeeze past the CUDA graph step when bringing the model online. Anyway, I just nuked that, and I am going to give Qwen3.5-122B-A10B-FP8 a try. Any other recommendations on models you have liked running on a 2-node cluster?

craggy ferry
#

You’ll like 3.5-122b

#

Wish I had pulled the trigger on a gx10 while they were still cheap but

steep wedge
fickle vapor
#

I'm running a very limited PC, I have two rtx 2070 supers with an NVLINK bridge installed on pop os linux. Right now qwen3.5 9b seems to be the only one that fits - is there somthing I'm missing here? Every time I try to run 27b it grinds to a halt

#

(openclaw is running on an unprivledged lxc container on my proxmox host)

steep marten
#

Question, if i'm on like a budget is this a good build for openclaw?

#
spiral vector
#

That seems like a good all around PC that also supports open claw. If you want a pure-open claw system, with local LLM support, you can save some more by looking at unified memory systems (mac mini or strix halo are good). But the unified memory systems are not as good for tasks like gaming if you also want to use the system for that.

#

Intel just released their B70 GPU also - $949 for 32GB. But that'll come with even more software/model compatibility issues that AMD will with ROCm - cheaper if you're willing to fight through it and/or wait for others to build intel specfic versions of the models you want to run.

#

Oh - there's also 2 newer versions of the peerless assassin - about the same price, or $5 more, slightly improved performance.

steep wedge
fickle vapor
#

Thanks for your attention in this matter, I should have been more considerate and listed the VRAM sizes. Apologies

steep marten
steep wedge
# fickle vapor So stick to 9b. Is 9b smart enough to analyze logs and run administrative action...

That's an interesting idea. I have similar aspirations but I have access to some larger models. The thing I am slowly learning is that I am not always a great prompt writer, and the smaller models need very tight, well-structured prompts to produce the best results. I often get the best results when I use one of the online models (e.g., Gemini or Claude) to help me write a better prompt for the little local model(s). Now that I think of it, I should probably have the big boys help me write better agent/soul/memory files for the little guys.

steep marten
#

if i buy 128gb of ddr5 will there be any finetuning or anything with the bios ill have to configure? if so what?

oak frost
#

Check your mainboard details, there mostly limits how big the ram modules ca be.

steep wedge
#

New project: Dell R740xd with three Nvidia Quadro P5000s. It's a solution in search of a problem. I have some ideas, but open to others.

grave shoal
grave shoal
steep marten
#

well i'm doing both

#

and i like training and finetuning

steep marten
steep wedge
steep marten
#

qwen 32b runs pretty well for openclaw with my testing atleast

steep wedge
#

When you say it runs well, do you mean speed or accuracy and usefulness?

grave shoal
craggy ferry
craggy ferry
lyric orchid
lyric orchid
fickle vapor
fickle vapor
craggy ferry
fickle vapor
#

Thank you! I plan on using my local free model just to execute routine tasks like log scanning for emergencies.

#

The more difficult tasks get online models

#

I only have 16g VRAM. From the beginning my models were going to be limited

fickle vapor
#

I'm probably not going to run them - but I've also heard some people say abliterated models are slightly smarter than baseline

steep wedge
craggy ferry
#

It’s sort of like lobotomizing them in a very particular way. The bits that light up when they refuse to do things because of their safety guardrails are also the bits that light up when they refuse to follow prompt injection attempts

#

You’re trading off safety basically

lyric orchid
# fickle vapor I only have 16g VRAM. From the beginning my models were going to be limited

re: 16g vram, I had good results with qwen3.5:9b, and qwen3:8b and :14b on a 12 gb 4070, they'd probably work for you. not fast enough for primary agent / chat sessions IMHO but fine for tasks. the key for me is create an md file for some task, have the cron job instruction say read and follow the instructions in the md file, see how it does, give the results and the md to claude code, have CC tweak it, lather, rinse, repeat until it does what I want. I mentioned here or in another chat that I also have CC generated proxy between OC and ollama/llamacpp so I can watch the traffic, see what the model is doing, see where it gets confused or stuck, feed that back into CC, adjust the prompt.

steep marten
gritty prism
#

Im looking to get a mini PC. Would like to get a good spec that i can upgrade up to 128gb ram but starting at 32gb ram 1tb nvme. Gemini is recommending me the Minisforum AI X1 Pro (obviously one of the most expensive options). I would like to know if anybody has experience with the X1 or if anyone recommends something else. I do not want apple as i want to run linux. Appreciate the time in advance!

dawn cosmos
gritty prism
fickle vapor
gloomy crescent
#

just the news, nothing really happening

keen tiger
#

I am planning to get a Mac Mini with M4 cpu and 32GB ram. What is the biggest model that I can use on it with OpenClaw ? Do anyone have experience with a 9b Qwen model on such hardware ?

steep marten
modern axle
steep marten
modern axle
steep marten
modern axle
#

The idea is that TurboQuant reduces memory requirements and improves response performance and latency while maintaining accuracy. In practice, it would allow AI models to access more contextual data while using less space and avoiding hallucinations. Source

#

Together, they could help alleviate the memory bottleneck. Although it wouldn't do much for training data centers, which also require monstrous amounts of memory, it could thin out the RAM needs of inferencing systems. It probably wouldn't do much to solve the current memory crisis, as deployment would take time, and memory orders are already locked in for many months. But perhaps it could help bring the RAM shortage to a close before 2030. Same Source

agile sentinel
#

@lyric orchid i was gonna ask you if you had tried qwen3.5-27b opus destilled v2 via turboquant as it allegedly fits 16gb

astral gobletBOT
# agile sentinel https://x.com/i/status/2038725930626003140

Turbo Quant not just for KV, can use it on weights.
︀︀
︀︀I bought an RTX 5060 Ti 16GB around Christmas and had one goal: get a strong model running locally on my card without paying api fees. I have been testing local ai with open claw.
︀︀
︀︀I did not come into this with a quantization background. I only learned about llama, lmstudio and ollama two months ago.
︀︀
︀︀I just wanted something better than the usual Q3-class compromise (see my first post for benchmark). Many times, I like to buy 24gb card but looking at the price, I quickly turned away.
︀︀
︀︀When the TurboQuant paper came out, and when some shows memory can be saved in KV, I started wondering whether the same style of idea could help on weights, not just KV/ cache.
︀︀P/S. I was nearly got the KV done with cuda support but someone beat me on it.
︀︀After many long nights (until 2am) after work, that turned into a llama.cpp fork with a 3.5-bit weight format I’m callin…

spiral vector
spiral vector
royal radish
spiral vector
#

Strix Halo are generally the "cheap" option to run local models (relative to DGX Spark's or Mac Mini's ($4-$5k+)). AMD Strix Halo doesn't work as fast as the similarly sized competitors, but they're a lot cheaper. So "working well" is a cost vs speed concern here (think 20 tokens per second instead of 30).

royal radish
shadow ingot
#

can a X1 Pro-370 Mini PC AMD Ryzen AI 9 HX370 handle running open claw okay without issues?

royal radish
shadow ingot
#

my budget is like 3k tops

royal radish
#

you can have decent machines with 3k

#

it always depends on your usage

#

the thing is that when you start to explore you don't want to be limitated..

spiral vector
#

There's a signifigant difference between running open claw, but using cloud-hosted models vs running with local models. The former takes almost no hardware, but will come with a monthly bill for API costs. The later take much more local hardware, but then no (or at least less) per month for API costs

#

For some tasks, only the best frontier models are good (for some tasks even those aren't good enough yet). So its hard to say that even with $10k+ hardware that you can do it all.

shadow ingot
#

good to know yeah im not sure what ill be able to do or where ill have bottle necks

steep wedge
#

Have you looked at the DGX Spark clones, or do you need this for general purpose computing as well?

#

If you’re using cloud APIs anyway, why bother with a Jetson?

lyric orchid
#

speaking of electricity costs, I'm wondering if there are any easy ways for me to pull power information from the GPU, I'd like to see if I can build out something that would match up the GPU power consumption with the jobs that I run in openclaw, and try to come up with a "cost" for each job. I really would like to see if it's worth having this local setup to do what I've been doing, vs. just find a 10-20 month plan or pay for tokens, i.e. compare costs. I know programs like HWInfo show me the power details... maybe time to have a conversation with claude code and see what we can build! Maybe I can set up a few solar panels to power the "inference" machine and let the sun pay for the GPU time!

viral ridge
grave shoal
#

I’m running a 64GB Mac Studio M2 Max. Local llm results are slow and nowhere near as reliable as Anthropic or OpenAI models. But depends what your goal is.

#

For some basic reasoning tasks it’s not bad.

lyric orchid
# viral ridge You could pretty easily setup a beszel server (monitor hub) and install the besz...

I'll have to look into the beszel server. I hacked some code into the proxy I've been using between openclaw and llama.cpp to monitor, it uses nvidia-smi --query-gpu=power.draw. I may set something up to push this data so influxdb, then I can do some charts in Grafana!
2026-04-02T17:42:17.350Z [done] job=downloader-summary qwen35-35b-a3b reason=stop prompt=49 (0.1% of 40960 ctx) gen=319 ratio=651.0% pp=492.8tok/s(99ms) tg=96.7tok/s(3.30s) total=3.40s elapsed=3.53s gpu=330.3W(+315.3W) peak=343.4W 0.3028Wh(+0.289Wh) $0.000057(+$0.000055) (13samples) session: prompt=30633 gen=5772 elapsed=83.59s energy=5.0181Wh cost=$0.000951

tranquil hazel
#

you don't want a stupid agent

#

if you're running anthropic/google/openAI models, you'll probably be better protected against malicious stuff

#

like your agent reading something stupid on this discord

lyric orchid
# tranquil hazel you don't want a stupid agent

one of my original reasons for exploring local was to prevent sending sensitive information, credentials, api tokens, etc. to cloud providers. one of my first skills I built was to scan the session logs for leaked credentials... early versions of openclaw was constantly leaking creds (trying to get somethign work, it would pull in config files or .env files). if it does that locally, no big deal. but yeah, in general local for me is good for very specific tasks that I don't necessarily have to wait for (cron jobs), with limited access to data. I don't give agents any "go out on the web and find this information", most of my stuff is using skills that reference specific APIs for the data, and the agent can then do some consolidation, answer questions about it, etc. but for actual "talk to the agent" things, in openclaw I'm using a cloud model (minimax) and staying within it's limits.

stiff spoke
#

This weekend I’m excited to get my shiny new Mac Studio M3 Ultra 512gb running as my OpenClaw secondary LLM for bulk text processing and basic tool use. If it goes really well, I might get it to do some code generation stuff too. Qwen 3.5 is my starter model, but I’ll be exploring others too. (Using paid cloud LLM api as primary.)

#

I managed to put an order in a few days before Apple discontinued selling them.

craggy ferry
stiff spoke
craggy ferry
#

Paged prefix cache, metal native

stiff spoke
#

Thanks! I'll check it out immediately 🙂

real pilot
#

Question:
I'm looking at acquiring a mac to run open claw on, it's either between the $600 Mac Mini or the $2000 Mac Studio. I'd like to run a local model if possible.
Is it worth the price though? Am I going to blow through over $1,400 in Claude Sonnet 4.6 tokens in a year's time?

crude fossil
#

yes i promise you will

#

i blew through that my second week i used it so much

pallid roost
#

@craggy ferry so for the 512gb max your recommending 397B for basically everything?

Conflicted with some of the new releases like genma4 even though they’re much smaller

craggy ferry
#

Gemma4 does look better, haven’t played with it much. Wish they’d released a 70b

sudden shore
torn folioBOT
south obsidian
stiff spoke
south obsidian
#

(there are minis and laptops that have TB5 too fwiw)

stiff spoke
south obsidian
craggy ferry
#

i only bought one m3 512 because i was hedging on them having an m5 512 or 1t this year

stiff spoke
#

Me too. I’m slightly regretting just buying 1. And although I know the ram constraints are going to be around for a while, I’m hoping Apple got their supply contracts settled before it was a big issue.

steep wedge
#

lol, what is this? 😂

stiff spoke
#

So, fwiw, I'm using Rapid-MLX on my 512gb mac studio with a minimax-m2.5-8bit (243gb downlaod) model for text, thinking and tool use, and quen3.5-vl-112b-a10b-8bit (131gb download) for vision... loaded at the same time 🤯 I am loving this Mac Studio setup!!!

My OpenClaw setup is hot🔥 with this equipment.

steep wedge
#

That is exciting. I am waiting for a new Mac Studio refresh to appear, and then I will probably pull the trigger. Making due with my DGX Spark cluster in the mean time. 😂

stiff spoke
#

I am also waiting for the refresh. I’m betting a LOT of people are!

acoustic stump
#

Ordering a souped-up MacMini M4 now vs waiting: Do we think that if we order a max spec Minin M4 Pro now (5 month wait time), that Apple will just refund if/when an M5 mini is announced, or will they offer to switch to the new chipset (assuming some small price difference)? Has anyone done had this kind of experience with Apple before?

warped monolith
#

just started playing with oc.
with the sizes on those models y'all are talking about, what can those things do and are the mac studios fast enough?

steep wedge
steep wedge
# warped monolith just started playing with oc. with the sizes on those models y'all are talking a...

The local models are definitely more challenging. They are more prone to mistakes in general, but seem especially adept at breaking their own OpenClaw config. I have taken steps to assure mine can no longer touch the config files. I get better results with a large cloud model (e.g., Kimi 2.5), hosted by somebody like Ollama or Synthetic, as the orchestrator. It then manages the local agents/models and tries to double check their work. Nothing is as good as the frontier models like Opus 4.6, but $$$. As for the Mac Studios, they appear to be excellent performers, especially with a ton of RAM which allows running larger models.

winter lynx
craggy ferry
#

Are you? Apple most often keeps the same price for a refresh

prisma briar
#

Hey all.
To get the full potential of OpenClaw, like "computer use", browse, UI, etc, should I have a MacMini or is a "Linux PC box" (Beelink, etc) as easy ?
If MacMini makes it much more easy to setup and use, that won't be a problem, we're talking $200 difference, and I have that budget

steep wedge
prisma briar
steep wedge
lyric orchid
# lyric orchid I'll have to look into the beszel server. I hacked some code into the proxy I'v...

I did end up having my proxy throw data into influxdb, including GPU power information, and had claude code build out a dashboard for grafana to see the data. The proxy is becoming a bit of a monster, but I find it useful for debugging model response to instructions, fine tuning, etc. Also learning more about how kv_cache works and others discussing "warm" and "cold" cache, so added some stats related to that to see how well my cache is being used across requests.
https://github.com/khaney64/llm-stuff?tab=readme-ov-file#overview
https://github.com/khaney64/llm-stuff/blob/main/README.md#gpu--energy
https://github.com/khaney64/llm-stuff/blob/main/README.md#kv-cache--recent-requests

woeful frigate
#

Hey guys I am looking to build a openclaw server is a 5060 ti with 16gb of ram a good starter card?

woeful frigate
prisma briar
steep wedge
craggy ferry
craggy ferry
prisma briar
craggy ferry
#

I mean, they are amazingly price efficient at the base model

#

But that’s only if you will actually use the specs

prisma briar
#

Probably gemma4 capable, without gpu?

prisma briar
craggy ferry
#

All Macs have a GPU, their “integrated GPU” is far better than anything called that on the pc side

#

You could probably shoehorn gemma4 into one, the small ones obviously (e4b) but the large ones even at 4-bit quant would want a higher memory config than the base model

#

lol a 32gb mini has a 4 month lead time

#

Yeah everything that isn’t the 16gb base model is super delayed. Makes sense I guess

fathom steeple
#

gemma works fine on mac mini m4 with 16gb ram

#

not pro performance but its fine

steep wedge
prisma briar
fathom steeple
#

just get ollama

#

write on google ollama + gemma mac mini

#

its 10 minutes job including downloading 6gb of model from internet

prisma briar
#

thanks @fathom steeple

fathom steeple
#

happy ro help

nimble thunder
#

hey, is anyone looking into upcycled phones with custom forked Lineage/CalyxOS?

vocal oak
#

Hi guy I'm lookig for reviews I'm just started with little models as Qwen3.5-4B for my Agent. But I'm lookig for recomendations is this a good start if I want to use my agent to aske code solutiones? I have limited resources i have rtx 5070ti, 32gb ram ddr4 and ryzen 7 5800x. My dubt is is this software necessary to run which models or what is my limit model I can run?

echo path
#

Just try out different models and see if they run, if they don't you can always move to a lighter model

stone oak
#

With a rx7900xtx, I should run like a 31b model with less context or smaller model with bigger context. Always seems like the onlything blocking me from local is the context thing

steep wedge
vocal oak
steep wedge
#

I use both. Ollama is pretty simple and a good first try. vLLM probably offers better performance, but can be a little more involved to set up.

lyric orchid
grave summit
#

I pulled the trigger on a maxed out MacBook Pro. I hope I won’t regret it! Anybody running all local with this or similar machine?

gloomy elbow
#

anyone suggest some hardcore tests for my M4 mac mini and M2 mac mini, both 24GB? M2 is currently running two bots without an issue

warped rampart
#

Do you guys run a server with specific hardware in a datacenter to run local LLMs? I'm trying to find a solution on how to provide an "always-on" AI assistant. I'm currently running a cheap second-hand dedicated server as mail- and fileserver with not enough power for AI. Because it is in a data center, it has a good up- and downlink. Buying a new computer for AI apps/assitant at home (RAM prices 🥲 ) and move the datacenter-server to this one and make it publicly accessible on the internet through DynDNS might be to unstable and slow for file transfers. My goal is to transform the fileserver into an EDMS with document Q&A, summarization, auto categorization. Just upload a document and let the LLM handle in what cabinet the document needs to be stored.

warm blade
#

what bit of gemma 4 should I run with 4070 and 48gb ram

rocky violet
frozen sentinel
#

Why am I finding M3 512gb studios on eBay for 2k? Are those scams you think?

dawn cosmos
# grave summit I pulled the trigger on a maxed out MacBook Pro. I hope I won’t regret it! Anybo...

I hope you had done the research on this. Assuming you have tkaen M5 max chip with 128 GB Ram - you will be able to run gpt-oss-120b q4 (or 70B and full precision) - but they will never be near the 5.4 or similar frontier models. Hence, for now those who are yet to make the trigger, do a economical comparison for next 2 years (where more hardware would be available cheaper) whether it makes sense to splurge of say $5K or use $100 per month for 1 year (for frontier models) and see whats available next year.
of course, if you need the M5 Mac for other activities, yeah this analysis in invalid

thorn wagon
#

is it worth to get 2x rtx 3090 or ryzen 395 128gb mini pc?

sterile sonnet
#

Some of the photos are definitely AI

wide locust
#

Hi all. I’m thinking of getting a Mac Mini M4 chip and wondering what others would recommend in terms of memory and storage. I want to run some local models on it, too.

frozen sentinel
thorny abyss
thorn wagon
dawn cosmos
last cedar
#

any one here working on humanoid and robotics ?

shut oak
#

I got a question

steep wedge
surreal nova
#

anyone got some advice on dipping a toe into Abliterated/uncensored in a local context? Just tried supergemma4 26b uncensored fast, and, 171 t/s. I need bigger and more reasoning, Tring to push architecture and brain to local as much as I can. But maybe just have to escallate to cloud for that. Good worker be, but for local rag + internet rag + images, I have more hardware headroom to burn

craggy ferry
#

gemma4 31b is significantly better

surreal nova
craggy ferry
#

i like 26b as a research agent

#

it's 4x as fast as 31b because moe etc

surreal nova
#

ya, but I talk in screenshots, both of web, task manager (sorry win guy here) and, life

#

I am actually after slower (and more reasoning)

craggy ferry
#

i think both are useful

#

but yes you definitely need at least one high end model like that

surreal nova
#

Need to find my main first, then the worker bee(s)

#

gemma-4-31b-it-uncensored-heretic
57 t/s

#

shit, I can go bigger

#

I have noticed a correlation with uncensored and q4, which makes sense but, I dont actually need that

craggy ferry
#

i am really sad there's no gemma4-70b, yeah

surreal nova
#

70b is such a sweetspot

#

but 1yr old lamma, ya no

#

havent touched the quens yet, nemotron 120b NVFP4 thats prob where I end up

#

model size, context bandwidth, speed, holy crow

craggy ferry
#

i like qwen3.5-397b, but

#

honestly gemma4-31b is kind of close to it in benchmarks

surreal nova
#

so I am literally here on local llm cause perplexity dropped Kimi K2.5

#

Gemma 4 31b, straight from nvidia, ya, thats my baseline for sure

craggy ferry
#

that's my fav thing about local llm

surreal nova
#

not gonna happen again

#

(also not buying 8x GPU)

craggy ferry
#

you have the thing you have until you decide to move

surreal nova
#

still want to test gemma 4 31b uncensored tho

#

gemma-4-31b-it-uncensored-heretic is rocking my vision + uncensored test

#

this is the one by llmfan46

thorn wagon
dawn cosmos
# thorn wagon models will be compressed more, if our system cant run 70b model, in next 6month...

I think it will be more like multi-LLM operation. In the sense, there will be generic model which can be cloud, but a smaller specialized model(s) for each area. For example, lets say you are coding only in Java, then only that specialized models would need to be run locally. Same for the enterprises as for example a frieght logistics will only have spefici smallmodel and a cloud generic model. The idea of context separation, identity isolation, etc will need to be handled and thats where AI industry is going

frozen sentinel
steep wedge
frozen sentinel
steep wedge
# frozen sentinel One I’m looking at is buy now

I believe you, but I'm not finding them. When I filter by "Buy Now" and US only, the lowest I find is $4,400 and they have 0 reviews. The remaining results reach into the $20k range (which is insane).

steep wedge
frozen sentinel
steep wedge
solemn lodge
sterile fjord
#

Hi, I am new to this discord. I am trying to get openclaw working on my 16 GB Thinkpad T14s running AMD Ryzen 5650, running Ubuntu 24 LTS. I want to use LMStudio running IBM Granite 3.2_8B as my main AI with Anthropic as heavy lifter. But even though LM Studio works fine in chat mode on its own, any prompt, even a simple "Hello" becomes huge (~20,000 tokens) when coming from openclaw. Naturally this bogs down the system, and I have never received a response from "Hello" when in OpenClaw TUI. I am a complete newbie to OpenClaw and AI in general so I wonder if anyone can help me. I have spent hours with CoPilot working on this and it has not increased my respect for AI very much - what a waste of time! I think maybe a human expert might be a lot more helpful.

steep wedge
tropic jolt
wind cloud
#

Hey guys. I am wondering what setup would be good for creating a local Ai server? Is this reasonable for 6k? Any advice helps

thorny abyss
frozen sentinel
thorny abyss
kindred coral
#

does anyone try radxa 5t

winter lynx
karmic blaze
#

So I have been trying to get openclaw to work locally on my old M1 MacBook Pro 16gb ram. The idea was to have an ai personal assistant to perform relatively simple tasks. I started setting up openclaws workflows and tests with my OpenAI plus subscription which uses codex 5.4 and it has been working great. Once the tasks and workflows were tested, I tried changing my main LM to a local using ollama and Qwen3 4b or llama 3.2 3b to handle cron jobs and general tasks.

Every time I have tried this, clawbot dies and stops responding.

I have checked ram consumption, total approaches 15gb but doesn’t overflow or reaches HD swapping

I have checked openclaw health, and it’s running fine

I have checked ollama directly in the app or terminal, and it runs and replies fine

The tasks: as simple as read my email or check information on a website

What am I missing? Is my MacBook Pro not powerful enough to run openclaw with a local lm locally?

lyric orchid
# karmic blaze So I have been trying to get openclaw to work locally on my old M1 MacBook Pro 1...

Are you seeing anything useful in openclaw or ollama logs at this time? Errors? Looping (ie maybe model is stuck or timing out). Use a tool to monitor GPU usage (not sure what that is on Mac) to see if it's busy. What context size are you using for those models? I wouldn't go straight to primary with a small model, create a cron job with some simple instructions and point to a small model, and get that working first so you know the model is working. I've found it useful to put a proxy between openclaw and ollama so you can see the traffic/interaction/errors.

karmic blaze
#

Mac has its “activity monitor” with spikes from 4gb memory usage to 15gb when the prompt is sent, and ollama uses that ram. Besides that, logs don’t show any error besides timeout after a while

lyric orchid
# karmic blaze Mac has its “activity monitor” with spikes from 4gb memory usage to 15gb when th...

I initially had openclaw running on a 8gb M1 Mac mini, but moved it to ubuntu. I was always hitting a GPU on another machine though. I think there is a default 60 second timeout in openclaw, you may have to bump that up. out of curiousity, when running a prompt of some sort, and you see the ram spike, does that extend (run) longer than when openclaw times out? i.e. model may still be processing? you should give qwen3.5-9b a try - that works well for me on an RTX 4070 (12 gb). as I said, you may want to put a proxy between openclaw and ollama so you can see the conversation. you wouldn't believe how much stuff openclaw adds to the prompt. one thing that can help with that - go into the dashboard, select agents, pick your agent, and go into skills, and disable ALL of the skills you don't want or need - they used to be enabled by default - any enabled skill ends up having data sent in the prompt to describe it. I built out this proxy with claude code, it's become a bit of a monster, but works well for me for debugging. I originally wrote it to work with ollama, but moved on to llama.cpp, but it should still work with ollama. https://github.com/khaney64/llm-stuff/blob/main/proxy.js

magic hull
#

regarding hardware, is it anyone using the nvidia dgx spark or its counterparts for other OEM for running openclaw in local mode? or is it an overkill?

karmic helm
#

and its a breautiful machine

grave summit
# dawn cosmos I hope you had done the research on this. Assuming you have tkaen M5 max chip wi...

All fair points. My current Intel MacBook Pro is from 2017 and cost nearly $3k back then. It was time for an upgrade due to no more updates available. I could have gotten away with the 64gb RAM version for work but, the $800 upgrade to double my RAM just made sense financially. I’m hoping to get near or on par Haiku level operations that will give me complete privacy. For me, if it runs OSS 120b at 60 tok/sec, that’s a big win! I’m tired of being throttled and rate limited. I use Chat and Claud a TON. Not to say I would expect that level of LLM locally - It will get there though!! 14 day return policy’s are always useful. Anyway, appreciate any user feedback of this specific equipment.

grave summit
dawn cosmos
hollow coral
dawn cosmos
hollow coral
#

I'm not at 128GB, but I'd have to check. I've been trying different paramters and haven't yet looked at logs for number of tokens

#

Honestly pretty new to LMstudio, but did the math and my api usage was such that it's cheaper to run a local model

#

so gemma-4-26b-a4b q8_0 with MAcbook M5 Pro Max 48GB RAM
180,000 context window
GPU offload 30
CPU thread pool size 4
prompt eval time 458 tokens per second
eval time 68.45 tokens per second

#

I was running it with the heavier models last night on some cron jobs and it crashed 4 times, so I'm still figuring out safe parameters

hollow coral
dawn cosmos
hollow coral
#

I'm using it to do web scraping on publicly available data so I need a beefier model

#

tried the gemma-4-e4b and wasn't really happy with the results

dawn cosmos
hollow coral
#

brb

#

I think that model is too big. LMStudio is saying so anyway

#

Hugging face seems to agree

dawn cosmos
hollow coral
#

I'm going to try q-4 tonight. Going to bed after this because I can't keep staying up this late but I'll report results in the morning. Thanks for the help

dawn garden
#

what are you use it for

rocky violet
blazing copper
#

I tried to run Ollama on K8 Plus 32 GB with terrible results and returned it. In the meantime my OpenRouter bill is shocking. What models are y'all running locally with decent tool calling?

rocky violet
neat bay
tall yew
blazing copper
grave summit
rocky violet
slim ether
#

What models do you suggest for mac mini m4 24gb

clever copper
#

Hi there, anyone using a remote ollama with an rtx5080? I use qwen3.5:9b now 130contextlength eslewhise its starting to cpu offload. Cant seem to get a bigger model to run in quantization. When i do its offloading like 30/80. and then i get timeouts…

untold sorrel
#

Super Noob Question: "Openclaw and local LLM. What's the absolute minimum Hardware requirement?"

Hi everyone,

Openclaw is quite cool and I want to "play a bit" with it. I've got it running, but I hit my session limits quite fast. So I am wondering if there is another way.

I use Claude Code (Pro) and Ollama (Pro).
I use Claude for a bit PHP / Website tinkering and Ollama for Openclaw.
I got "naked" Ollama running and even got some LLM downloaded.
Ok, low Token count, but it works.

I understand the hardware requirements for Openclaw, but the LLM is still a bit of a miracle for me.

So my questions are the following:

ONLINE

  • What model should I use with minimum cost?
  • What model would you recommend?

OFFLINE
I can chat with Ollama, but Openclaw is not responding ...
What is the "absolute minimum Hardware requirement" to run Openclaw / Ollama offline?
I don't need absolute performance, it should just work.

Thank you for your help.
Bernd

PS: If you have usage credits left or even run your own LLM server i could use, please speak to me. 🙂

rocky violet
# untold sorrel Super Noob Question: "Openclaw and local LLM. What's the absolute minimum Hardwa...

if you have claude pro then why not use sonnet with openclaw?
i would say use glm 5.1 since you have ollama pro too , i use glm 5.1 from the glm coding plan

for the offline part , i will say you will need 16gb vram to load a decent model of atleast 28 - 30b paramters at q4 on your pc without offloading , if you want to offload then you can go beyong 30b but it will be more time consuming for each query , for local you can try qwen 3.5 / glm 4.7 flash / gemma 4 etc

warm herald
#

I have a 3090 24GB vram and I run Gemma4:27b. When I use openwebui I don't see it offload and it's very responsive but if I use openclaw I do see CPU going nuts.

lyric orchid
lyric orchid
warm herald
#

The model runs on ollama in docker on my server

lyric orchid
#

I started with ollama but moved to llamacpp for much better performance and ability to fine tune. I love llama-bench for figuring out all the parameters to use

warm herald
#

I just switched to qwen3.5:9b and it's faster (ofc) but still not very useful. Plus I'm concerned the 9b will be too stupid in the end.

#

I'll checkout llamacpp

clever copper
#

I guess you need to decrease "contextWindow": 262144,

#

This blows up your memory

#

Then it starts to offload

warm herald
#

What's a more realistic context window?

clever copper
#

Experiment and check ollama ps

#

I did 130k

warm herald
#

But I set it in ai-agents config right?

rocky violet
lyric orchid
clever copper
#

I set it in the models part in the .json

lyric orchid
warm herald
#

Thanks. I'll dig more into this tomorrow. Right now the time between me sending message and ollama workers going to work is what's taking the most time. Maybe there's something there that I am not fully grasping. I dont' see why it would spin up workers on all my cores.

warm herald
magic hull
steady pendant
#

Can anyone suggest a model that can run on a VPS with 96GB RAM and 18vCPU with no GPU. I've tried qwen3.6, Gemma4 and qwen3.5 but no joy.

thorny abyss
steady pendant
magic raven
#

good choice

#

they literally promote openclaw in the desc lmao

dawn cosmos
rocky violet
tardy marsh
magic hull
#

Now I'm checking the difference with the apple chips, because it seems that the bottleneck is the memory bandwith (RAM, if the model is loaded in it). With dual channel the theoretical speed is around 89.6 GB/s

Device            | Bandwidth  | TTFT      | Speed     | Feel
------------------|------------|-----------|-----------|-----------
M1 Ultra          | 800 GB/s   | ~1.1s     | 45-70 t/s | Great
M4 Max            | 546 GB/s   | ~1.2s     | 40-60 t/s | Great
M1/M2 Max         | 400 GB/s   | ~1.5s     | 35-55 t/s | Good
M4 Pro            | 273 GB/s   | ~2.2s     | 25-40 t/s | Okay
M4 (Base)         | 120 GB/s   | ~3.0s     | 10-18 t/s | Tight
Beelink SER10     | ~90 GB/s   | ~3.0s     | 20-35 t/s | Slow
#

this is a comparison made with gemini, could someone confirm this token generation could feel slow?

sterile lotus
#

hello, can i use openclaw in my android?

fluid jackal
sterile lotus
clear quartz
#

I probably should have started here... 😄 - Anybody have any hands on with the dell pro max w/gb10?

leaden rapids
#

i need some advice. i currently have 3070ti 8gb. im thinking about upgrading to amd r9700 32gb. i do some light kilocode and recenlty been toying with openclaw. should i upgrade to r9700 or just use gemini api?

clear quartz
bright bolt
#

considering major self hosted models are sometimes 1.5TB in size, 12TB potential capacity is a great way to be prepared to run HUGE models in the future

or perhaps someone has a use case where they want to load more than one model on their server

vocal island
#

Hey, I'm contemplating buying a Mac Mini, could it possibly run models from Ollama like (Qwen 3.6, Kimi 2.6 or MiniMax 2.7)

royal radish
#

Hi All, i'm receiving my mac M3 ultra tomorrow. what model do you recommend to "start/run a company" with voice AI calling inbound and task automation with visual compute

rocky violet
fluid jackal
# royal radish Hi All, i'm receiving my mac M3 ultra tomorrow. what model do you recommend to "...

I'm assuming you wanna stay local:

  • the qwen models are solid open source options for VL -- the bigger the better (for visual compute)
  • ironically the qwen models are also solid for your core LLM if you want to stay local but there's a plethora of options (I specialize in coding so unsure if there's a better suited one for your needs)
  • for voice TTS/STT you can search online but there are literally a dozen or so options and all of them are fairly solid and dont sound like a blatant robot
lyric orchid
fluid jackal
brave beacon
#

Hi guys, im just getting into all this ai stuff and i want to run claude code or openclaw locally, i have a ryzen 7 7800X3D cpu, 32gb ram and a rx 6700xt. what model would yall recommend me? i want to get the most out of it running locally for it to be able to code as well as possible, and possibly run fully autonome tasks on my 2nd burner pc for security reasons, while still using the main pc computation power

#

ive tried gemma 4 27b and it just halcinated and wasnt really able to do any real coding

rocky violet
rocky violet
#

try with qwen 3.5 models . pick any model of q4 or more quantization

#

make sure the parameters isnt huge or else it wont load on your gpu

brave beacon
#

what about glm 4.7 flash? thats what chatgpt recommended me

#

but idk really

rocky violet
#

yeah its good too but its quite old as well

#

like right now glm 5.1 is the latest

#

and flash version of any model is kind of nerfed

brave beacon
#

so.... ishould try the qwen 3.5?

rocky violet
#

i will suggest that

#

just dont burn your gpu

north ocean
visual swift
#

Hello. I want to create local instance of openclaw and ollama gpt-oss. What are the recommended pc specs for the start up? I appreciate the response, thank you

visual swift
gritty prism
#

But the only local models you can run are small models because the bigger models just run too slow to do much of anything

#

Unless you spend A LOT. But that amount does not make sense imo compared to just using your ai subs

rocky violet
sturdy gazelle
#

Hey guys! Can you recommend some cheap Android phones for the OpenClaw? Do I need root?

dawn cosmos
fluid jackal
feral violet
#

It's possible to run OpenClaw on a Android?!

sturdy gazelle
fluid jackal
feral violet
#

wow... didnt know that!

fluid jackal
feral violet
#

The 24.4.26 version is giving problems, right? I'm new to this and installed that version but it doesnt work, even if I follow the docs.openclaw.ai page...

fluid jackal
dawn cosmos
small crescent
#

Is there a MLX specific channel where we can post a question on ?

dawn cosmos
#

Stop scamming people. And if you have really successful, why not showcase it here for everybody to benefit. Typical scammers!

dense vector
#

Im 12 years old and I recently found out about openclaw from my father who works at Microsoft, he told me that I can easy make $5000 every day just by using an openclaw bot to trade. So I did. I was amazed that after 5 seconds (and with nearly $20000 of my dads money I wont disclose to you) I found that I was making $5000 every single day.

Every. Single. Day.

Now I am looking to help young entrepreneurs like myself get into openclaw agents. If you’re young and want to learn the mindset and simple steps that helped me start making this happen, msg me “LEARN MORE” and I’ll show you what helped me get here.

clear quartz
#

LMAO

wanton seal
#

I am 8 and my father just found out I was looking at openclaw, now I make 20000 a day and my mother doesnt know. text me to find out how my brother made his 1st billion with openclaw that inspired me to look at it. my sister also uses openclaw and we are all making tons of money every day. my openclaw just told me I made another 500 in the last few minutes while i was typing this important announcement.

#

there should be age limit for openclaw or internet access full stop

clear quartz
#

I'd prefer a minimum IQ

rocky violet
#

why so many scam baits here suddenly

dense vector
#

gee I wonder why

#

no gifs 🥀

echo walrus
#

oh spare me codex, give us one more free reset lol

#

Telegram doesnt directly work if you havae ssh tunneling on? Its telling me it has to be taillscale or remote URL

dawn cosmos
#

There was a claude code offer which gave $500 creds valid for 6 months! Expired now. Keep a watch out for more. There is one more of xiaomi mimo is running https://100t.xiaomimimo.com/

chilly cape
#

Around the 4.22 update my gateway started using like 50% CPU at idle. Has anyone else experienced this? Any solution?

tender ermine
#

Looking to setup with a dgx spark, Mac Studio and a NAS. Any feedback on this type of setup? Was going to have openclaw on the Mac Studio as orchestrator, dgx for inference, and all data hosted on NAS. I’m trying to create an enterprise setup for a company.

dawn cosmos
rocky violet
#

yeah sadly . humans are less valuable than their own data nowadays

somber falcon
#

Hey folks, currently added my 3rd V100 32gbGPU now, total of 96VRAM, whats your opinion i should run now, previously using unsloth/Qwen3.6-35B-A3B-GGUF at Q8

regal shoal
#

Two weeks ago before i started this Ai journey i got myself a new RTX5080 16Gb, what a waste of money 😂 , now i want an RTX6000 96GB, 16gb is nothing.

dawn cosmos
#

Resale on RTX5080 should still be good

grave shoal
# tranquil hazel local is overrated and dangerous$

Yeah, mostly just using it for simple stuff and kicking off cron jobs, heartbeats etc. Would never use it for any coding tasks and such, dont see the point. Thats where Claude Code and Open Code etc shine for me.

dawn cosmos
#

Yep you can technically run openclaw in rpi. And use other nodes to run the tasks

rocky violet
grave shoal
rocky violet
grave shoal
magic hull
#

After doing some research I've preselected these 3 devices:

  • ASUS Ascent GX10 128 GB LPDDR5X (3700€)
  • Corsair AI Workstation 300 Ryzen AI Max+ 395 128 GB LPDDR5X (2800€)
  • Apple mac mini M4 pro 64gb (2400€)

I want it to be a 24/7 node of openclaw, to work as an assistant and also do some research on AI. Is it worth going for the Asus with the nvidia gb10 or is it better to pay for suscriptions of cloud?

dawn cosmos
magic hull
dawn cosmos
regal shoal
#

╭─────────────────────────────────────────────────────────╮
│ Ollama Model Benchmarker │
│ Reasoning | Coding | Knowledge | Instruction | Creative │
╰─────────────────────────────────────────────────────────╯

Found 25 models to benchmark:

  • qwen3:30b-a3b
  • qwen3.5:27b
  • mistral-small3.1:latest
  • qwen2.5-coder:32b-instruct-q4_K_M
  • gemma3:27b
  • deepseek-r1:32b
  • qwen2.5-coder:7b
  • dolphin-mixtral:8x7b
  • codellama:13b
  • llava:13b
  • mistral-nemo:12b
  • mistral:7b
  • phi4-mini:latest
  • qwen3.5:4b
  • qwen2.5-coder:14b
  • deepseek-r1:14b
  • qwen2.5:7b
  • qwen2.5:3b
  • llama3.2:3b
  • llama3.2-vision:11b
  • qwen2.5:14b
  • gemma4:e4b
  • qwen3-vl:8b
  • deepseek-r1:8b
  • qwen3.5:9b

Estimated time: 50-125 minutes. Please wait...

⠼ Testing: qwen3.5:27b ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4%
⠼ -> Knowledge ━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━ 40%
^^Lets see how they Rank, runnin on RTX 5080 16Gb

regal shoal
#

Pretty cool site!

craggy ferry
#

Yeah it has tons of good benchmarks to really differentiate models

#

That’s the canonical one

#

I run G and H on every new model I start running locally to get good comparisons. I should also look at the subjective writing sections (but usually don’t lol)

lament idol
#

I'm considering buying an RTX 5090 because I realized my 64gb of system ram and RTX 4046 8gb VRAM I bought for local AI won't work.

Before I spend $4000 on an RTX 5090 & 1200W PSU upgrade.

Will I be able to run decent local LLMs on my PC if I get this graphics card? I'm been talking to Perplexity to wrap my head around these buying decisions but I don't really understand the implications of what I can and can't do.

I was quite disappointed when I realized System Ram is not what Local LLMs require and I'd like to avoid a $4000 disappointment if possible.

oak frost
#

you dont need a 5090, for playing around you can buy a used 3090. Use it for a few weeks.
You will see that the quality of the answers are not as good as a provider with 200b++ Model files, but probably good enough for your usecase..

rocky violet
lament idol
#

I found this to be a useful speed tester to help me visualize the speed of which the models would take.

I'm thinking I don't really need the frontier models.

Most of what I'm doing is convert transcripts of my speeches into different forms of written content.

https://shir-man.com/tokens-per-second/?speed=80

#

I'm not a coder or developer so there's not advance coding needs I have. I do want to have agent swarms that are able to work together for researching content ideas, structuring content in my frameworks, and designing slide presentations running in parallel

lyric orchid
# lament idol I'm not a coder or developer so there's not advance coding needs I have. I do wa...

I agree with @oak frost , get a 3090 on ebay and try that first - you can always resell it. You still need the beefy PSU though. 3090 is only 24gb, 5090 will give you another 8b, but that might not get you much more model, but maybe a 27b or 35b plus a smaller one like a 9b. I don't know how much swarming, especially in parallel you'll be able to get though. I've been happy with 3090 and qwen36 35b a3b, and qwen36 27b for my needs. I also have a 4070 8 gb) on another machine with qwen35 9b for simpler tasks. I want to do some experimenting with coding too .

spiral vector
#

I'm surprised I don't see people suggesting AMD R9700 (32GB) cards more. For $1300, I think its the best you can get for local LLM. Sure, that means you're dealing with ROCm, but I would think that's a good tradeoff.

#

$950 Intel B70 (also 32GB) may yet prove to be worthwhile as well - but their software stack is probably worse than where ROCm was at 2 years ago.

lyric orchid
# spiral vector I'm surprised I don't see people suggesting AMD R9700 (32GB) cards more. For $1...

I think that's the major reason, much of the software and tuning is CUDA focused. Regarding 32 gb vs 24, I don't think it's a big enough jump to be able to run a larger model than you could with 24. @rocky violet I don't see how you could fully load a 70b or even 65b Q4 fully in vram? Those would need more than 32 gb? But back to the original question, you'll need to figure out whether you can do what you want with a smaller model. Others have suggested renting a vps with GPU to try it out, but I'm not familiar with that, the cost, or if you can allocate specific GPU size.

spiral vector
#

Personally, I think 32gb is the current sweet spot (used to be 24gb last year) for best local LLM without breaking the bank. Between Qwen 3.6 and Gemma 4 models if you're limited to 24gb, then your either limiting context down significantly, running really small (Q3) quants which limits usefulness, or both. (Or, you're stuck using older, worse models.) But, it does really come down to what you're trying to do with it. 32GB + good processing speed seems to be the floor for "good enough" local coding (R9700 is OK, but I do wish I had the speed of a 5090 here).

If you're OK with slower responses then mini PC's like M4 mac mini with 48/64GB RAM or similar Strix Halo can also work, but they're both much slower than I think a lot of people are comfortable with as a chat bot. If you have workflows that you can just pass off to let run overnight - then M4/Strix halo are great (mine just spent days churning out AI subtitles for a bunch of old obscure media - speed was no real concern). Next I have mine slowly chewing through something like 600 government documents (probably 20k pages of text and tables in PDF's (many without OCR) and building that into a searchable database - works great when I don't need speed.

But neither mini PC seems good for local image generation if that's your thing - 3090 TI/4090/7900XTX (all at 24GB) are probably still best fit there.

Ultimately, I think many of us are stuck in this world still where we need frontier/SOTA models for real work - local LLM is just a thing you can offload stuff too when it neither needs to be "as good as possible" or "as fast as possible", but instead I just want "as cheap as possible" (when measured over multiple months).

#

I still pay over $200 in month for Frontier/SOTA models for real work - in addition to offloading what small bits I can to last year's Strix Halo and this year's R9700 in my desktop. But I like to think that offloading what I can keeps me from paying even more for cloud models.

lyric orchid
# spiral vector I still pay over $200 in month for Frontier/SOTA models for real work - in addit...

Same, though I'm only on the $20 plan for both Claude and openai, but keep running up on the 5 hour window when coding in Claude code. Just started playing with codex a few days ago and am really impressed with it , and it seems to be more token friendly than Claude code. Now trying to figure out which one I want to give $100 a month to for a bigger 5 hour window, or just continue jumping back and forth between the two!

vast hamlet
#

anyone using openclaw on a raspberry pi 5 8 GB?

placid zinc
#

Let me know if it works for you

north quartz
# dawn cosmos do you have experience in using the local models? before spending huge amount, ...

I like the idea of learning and trying things out on a VPS first, with the goal for figuring out what hardware I might later choose to buy to run everything local. I am not a programmer, use windows, and a newbie to openclaw.** What VPS service + guide would people reccomend? **Oracle cloud seems like it would emulate a local server pretty well but also looks to be stretching the edge of my knowlege.

dusk moon
vast hamlet
dusk moon
#

anyone using openclaw on a raspberry pi

dawn cosmos
rocky violet
dawn cosmos
north quartz
spiral vector
#

Of all the benefits of using AI, I think having it explain anything to you, at exactly the level you want to receive that information at - that's probably the best.

uncut locust
#

Which Mac Mini is preferred to buy? Is it the base model or is it the model with the higher RAM? Basically I don't want to run local models on my system. I just want a personal assistant to run openclaw

uncut locust
fair kindle
spiral vector
#

If you're just doing cloud models, any mini PC, or old laptop is good enough. If you're already a mac person, the cheapest mac-mini will work well.

sly yarrow
keen loom
#

anyone knows why i end up on gateway-injected when i refresh the webui? been trying to find an answer / fix but can't figure it out

delicate hull
fallow willow
#

i'm using AI to help me build a selfhosted Ollama/Openclaw Team of Agents... I talk to it through Discord.. 3 days so far, at the Discord stage with partial personalities working...

dawn cosmos
sly yarrow
#

Local bro Ollama 0.19 and the dynamite goes booom 🤗

#

If you go cloud raspberry pi could do the trick 🤗

keen loom
hollow coral
queen gate
#

i have a potato pc and i cant run the model stay thinking 10h and dont do nothing

winter lynx
winter lynx
#

typical Local LLM these days

dawn cosmos
winter lynx
# dawn cosmos With that cost of M5, i could run cloud sub for 24 m atleast at highest tier and...

you could run frontier models 24/7 for 24 months? pretty sure your math is faulty there... As for not being up to the tak, depends on what tasks you are doing. I would still architect with a claude pro subscription, but most of the grunt work could be done by a local model, and they are getting better all the time... you can run them 24/7 without rate limits ... while sonnet 24/7 would be
24-Month Cost Estimates
(Sonnet 3.5 API)Low Usage (Simple Agent, 24/7): ~$300–$600
Usage: Only processing data when a user acts, infrequent, short prompts.
Medium Usage (Constant Monitoring, 24/7): ~$10,000–$20,000
Usage: Constant summarization, low-volume coding, high context maintenance.
High Usage (Active Coding/Data Agent, 24/7): ~$50,000–$100,000+
Usage: Rapid, continuous coding tasks with multiple files/retry loops.

#

and that isn't even opus... thats just sonnet

#

For Opus 24/7 High-Volume Agent (API)~$9,000+~$216,000+...

#

so naw... I don't need frontier quality for every single task... there is plenty I could do with a large model on a mac m5

dawn cosmos
# winter lynx you could run frontier models 24/7 for 24 months? pretty sure your math is fault...

the problem you think is that openclaw is the only solution - which is not. I primarily use n8n for daily workflows and use openclaw for research activities, hence that $200/month is sufficient (with regular small credits that thrown across different events /partners) . Openclaw is a token hogger. For example, if you have an execl sheet to be read and based on that do some infrencing for say couple of columns, in Openclaw, eveyrthing is infrenced. In n8n, using code node you can just simply seggregate the data without using LLM and send for infrencing only the data you required. in my use case, if I use openclaw, it would take about 72K tokens/call vs 5-10K in n8n. Now if there is a new excel format, then invoke openclaw to determine best strategy, once that strategy is developed, turn that into n8n workflow. This way most of the determinintic tasks don't even use LLM. It is used only for those data that requires it. Hence, my context is lean, infrencing ability is fine tuned and I use multiple agents for specific tasks, which keeps it specific. Opus 4.6 (4.7 is bad) is used only when things are completely random.
Own hardware is great - I have a full scale home lab + self hosted cloud solutions - but i won't recommend investing in a hardware which is destined to become obsolete in next 9 - 12 months due to AI architecture improvements and hybrid scaling via vLLMs.

winter lynx
#

not familiar with n8n...

#

if anything the llms seem to be getting better AND smaller

dawn cosmos
winter lynx
#

I will check them out... what I don't know would fill a book, but I am working on learning

dawn cosmos
dawn cosmos
dawn cosmos
# winter lynx I will check them out... what I don't know would fill a book, but I am working o...

while you are at it - learn docker, openclaw can also be run in docker and much better when new versions come out, since you could spin a new container while the last known good version container is still running. This way your business does not stop because there are breaking changes in new version. With reverse proxies you could also route your work to like 80% to existing proven openclaw container + 20% to new version of openclaw container. However, this requires a bit if knowledge of docker and revrese proxies like Traefik/ Caddy

winter lynx
#

Ollama's Cloud models can now be used inside Claude Desktop

dawn cosmos
tight hinge
#

anyone here thinking much about the control model for computer use?
feels like a lot of current stuff assumes the agent should just live on the target machine and poke around from inside it.
i’m starting to think a sidecar model makes more sense:
• run the AI on one machine
• keep the target mac separate
• send input in from outside
• cleaner boundary between thinking and acting.
curious if that feels more sane to others, or if people still think direct-on-box is the better model

craggy ferry
#

isn't this already how nodes work in openclaw?

#

I think to the extent possible you should in fact avoid having agents poke at the machine running them

dawn cosmos
merry rapids
#

I currently have 8GB VRAM and 32GB RAM, do you guys have any recommendations for which model I should use for lightweight/local agent tasks?

Currently using Dolphin-X1-8B-Q6_K in LM Studio just for testing purposes and I am getting around 30 tok/sec initially (for longer sessions it stabilizes at around 15 tok/sec), but the model feels rather dumb.

Current model/settings info:

Model: dphn/Dolphin-X1-8B-GGUF
Quantization: Q6_K
Architecture: Llama
Size on disk: 6.60 GB
Context Length: 131072
GPU Offload: 32
CPU Thread Pool Size: 8
Evaluation Batch Size: 725
Unified KV Cache: Enabled
Keep Model in Memory: Enabled
Offload KV Cache to GPU Memory: Disabled

I’d like recommendations for:

  • better models for OpenClaw/agent use
  • good balance between intelligence + speed
  • settings optimization for my hardware

I am willing to sacrifice some context length if needed, but I would prefer not dropping it too aggressively.

sterile dagger
merry rapids
sterile dagger
#

Have you tried the new qwen? It's sort of designed a bit better for tool use

merry rapids
#

it's running pretty well and I am liking it given my limited hardware

#

once I get more comfortable with openclaw I'll just rent a runpod instance then i can use whatever

spiral vector
#

Have you tried Qwen3.5-4B Q4_K_M? (I haven't - just seen that recommended here before for 8GB VRAM setups). I'd guess between Qwen 3.4 4B and Gemma-4-E4B

#

I'll plug https://github.com/AIgenteur/ClawEval (not my work - no connection to the guy who built it). I think this is generally the best (what LLM for my GPU for openclaw) that I've seen.

lyric orchid
tight hinge
# dawn cosmos Openclaw already works that way. You can run openclaw as nodes (even within dock...

True — OpenClaw already supports distributed control through gateway + nodes.
What Sidecar Dot adds is a different thing: control of a separate Mac without requiring OpenClaw, Docker, or any installed agent on that target machine.
So I’d separate them like this:
• OpenClaw nodes/gateway = software-native distributed control
• Sidecar Dot = external control of a real Mac that AI can operate directly
That difference matters when the target machine is not already part of your stack.

dawn cosmos
humble iris
#

Do you think a Mac Pro will be able to efficiently run openclaw using qwen27 on Ollama while running Claude code because my Mac air with 24ram is struggling a lot rn

lyric orchid
#

Depends on the specific models? Which ones? Google memory bandwidth for your particular models, but the airs aren't all that fast, up to 153 for m5. Pros will be in the 200 to 600 range

dull shore
thorny abyss
humble iris
#

@lyric orchid which Mac do you think I should get. I want to be able to do everything comfortably. Also what are those numbers you’re saying: 153, 200, 600

dull shore
dull shore
lyric orchid
gusty nacelle
sterile fjord
tight hinge
# craggy ferry isn't this already how nodes work in openclaw?

OpenClaw nodes are the right solution when you can install software on the target machine.
Sidecar Dot is for the cases where you can’t, or where you want out-of-band control/recovery.
So it’s not replacing nodes — it’s covering the gap nodes leave.

tight hinge
# dawn cosmos Openclaw already works that way. You can run openclaw as nodes (even within dock...

Yep — agreed. That’s already the native OpenClaw model.
If you can run a node on the target and coordinate it from a separate gateway, that’s usually the cleanest setup.
The reason for Sidecar Dot isn’t to replace that — it’s to handle the cases where you can’t install or rely on a node on the target at all: locked-down machines, third-party devices, broken OS state, or out-of-band HID/KVM-style recovery.
OpenClaw nodes = in-band software control
Sidecar Dot = out-of-band external control

tight hinge
dawn cosmos
craggy ferry
tight hinge
dawn cosmos
tight hinge
#

Sidecar Dot

fluid jackal
#

anyone here running a mac cluster?

dawn cosmos
analog fern
fluid jackal
analog fern
#

Exactly. I just couldn't justify the mac tax

dawn cosmos
#

Instead of macs you could get cluster of gx10 or ryzen Ai + Max 395 systems.

But even those will be outdated in an years time defeating the ROI. Hence I always state thay for now getting credits or paying for subs is best as you work out your token appetite

fluid jackal
#

Find myself with a collection of various AI/compute hardware but with minimal unification neglect the 6 3090s+6000 on one machine but that sucks up so much power it's only spun up when required.

Will probably move the 6000 to an always on system to better utilize

dawn cosmos
fluid jackal
#

maybe one day there will be a fully unified underlying translation layer that doesn't lost stupid amounts of performance (or at least one can dream 😂 )

dawn cosmos
fluid jackal
dawn cosmos
fluid jackal
dawn cosmos
fluid jackal
#

have you?

dawn cosmos
#

then probably you need more experience! if you have mutliple nodes each having a different GPU, you can run the node with specific container image with plugins and Ollama pods can use them. Hence, as a cluster you will have exposure mutliple gpu nodes. In openclaw (running as a gateway or seperate nodes) you can configure multiple providers from each of those ollama pods to run different agents. This way you setup is optimized to utilize disparate gpus. yes, you cannot comnbine as a single GPU and do a slicing

fluid jackal
dawn cosmos
# fluid jackal Kubernetes is an orchestrator, not a magical GPU translation layer. It can sched...

its not about k8s itself, it is those device plugins . - yes, I already stated the multiple GPU cannot be offered as a single GPU types, but pods can use mutliple GPUs seperately. Then there is a concept of paralellism which I have not even mentioned given that you need to understand above. Using paralelism you can run model across different GPU (yeah some would need simialar arch). So bottomline, the way you are describing that you differnet GPU are waste and non-performing, they are not. If you had good architecture knowledge of k8s, you could utiilize most of them.

fluid jackal
# dawn cosmos its not about k8s itself, it is those device plugins . - yes, I already stated ...

we are on the same page and thinking each other are not; your statement "GPU cannot be offered as a single GPU types" clarifies that.

as for the rest there was never an argument there -- just like trying to string multiple macs together there are still bandwidth issues between the nodes that render it not worth it at the end of the day when such cheap and secure compute is available online (even in high security settings)

At that point one is better off just balancing what they have with different services (k8s or otherwise) -- which is what I do (STT, TTS, rerank, embedding, VL, security, etc, etc, etc)

rough raven
#

Hi guys, looking for hardware advice. Is it worth getting a MSI desktop with a rtx 5090 32gb if I can lift it for 3k USD?

fluid jackal
rough raven
fluid jackal
full talon
#

you can run most local models using Turbo Quant just fine on rtx 3090 which is ike 1k and you can put it in 0.5k used computer and be 95% there for local LLM Check ClawEval https://github.com/AIgenteur/ClawEval

mellow forge
#

Bro I’m telling you lot, you don’t need a £3k GPU to start building AI stuff.

Everyone thinks you need one mad machine but that’s not the only way.

Build it like an organism.

One cheap PC does the routing.
One GPU runs a small local model.
Another cheap camera or old phone gives it eyes.
CPU handles logs, memory, scripts, Telegram, all that boring stuff.
Then cloud AI only gets used when the job is actually hard.

That’s the whole point.

You don’t need all the GPUs to magically become one big GPU. Most of the time it don’t work like that anyway. You split the work.

Eyes.
Brain.
Memory.
Hands.
Nervous system.

That’s how you build it.

You can start with an old office PC, a used GPU, Linux, LM Studio or Ollama, OpenClaw, Python scripts, and a camera. £300–£500 if you buy smart, maybe less if you already have parts.

It can watch a room, send alerts, run a small local AI, search its own notes, store logs, speak through Telegram, and only ask the cloud model when it really needs help.

Rich people brute force everything with one monster GPU.

Broke builders have to be smarter.

Use what you’ve got.
Split the jobs.
Make the system survive when one part goes down.

Don’t build one giant brain.

Build an organism loool.

regal jay
dusk moon
fluid jackal
dusk moon
fluid jackal
# dusk moon Yes

damn...should have lead with "Whole PC w/ 5090? Don't think, just buy"

#

I snagged a 96GB Mac Studio refurb....not sure why.... 😂

dusk moon
dusk moon
#

Does it power on? Yes, Deal

crimson sparrow
#

HI, im looking at getting into openclaw, not sure how to go about it. I have my main desktop at home with a 7800xt in it (im aware this could come with extra steps.) along with a 2009 macbook pro and a latitude 5410 in the mail. I looked into what I want openclaw to do for me, which would be to use my local desktop's compute power to run the llm and be able to message openclaw from my phone or interact with the web ui from my laptop away from home. How would you all go about this? I read the macbook can be used to integrate imessage without having to pay. Does anyone know if this idea is possible?

fluid jackal
# crimson sparrow HI, im looking at getting into openclaw, not sure how to go about it. I have my ...

are you planning on using a subscription or are you trying to be fully local? fully local might be a bit more than painful without at least a 20-30B parameter model ( I would personally not even fathom it)

if you do a subscription (OpenAI, MiniMax, etc), I would probably avoid the 2009 Macbook pro still unless you want to play the "will it work!?" game on hardware that's 15+ years old with only a couple of cores. I would install it on your latitude or your main desktop depending on how you're feeling it should work okay on both with a subscription.

as for your imessage, yes technically it can do iMessage ...but...I don't think it'll work here because your macbook is just too old and will lack support to install what's needed

crimson sparrow
uncut sage
#

Hi, I tried to set up a local openclaw agent on my pc. Specs: 32gb ddr5, RTX 4060, Ryzen 5 7500f. I don't really want to spend money. I set up Ollama's qwen 3.5:9b and it was working fine for a little bit, but now it's just replying "NO" to all my messages. I mainly want to use it to set up connections in Notion and Obsidian to track progress of things, and help me with my career in cybersec. Does anyone know why it may not be working, or what model I should run?

austere turtle
brittle hamlet
uncut sage
uncut sage
austere turtle
uncut sage
uncut sage
austere turtle
#

As far as it has enough context window to remember everything which many local models lack (or maybe my machine lack power to)
The qwen 3.5 is a very powerful model for coding and light task also use a better distilled version so you can get good output

uncut sage
#

yeah okay cool, what do you mean by better distilled sorry?

austere turtle
#

Distilled models are basically smaller models trained using outputs or knowledge from a bigger stronger model.

Some distilled versions are done better than others, so even if two models are both “Qwen 3.5 9B distilled”, one can perform much better depending on what it was distilled from and how well it was trained/tuned.

So I meant using a well-made distilled version gives you better quality responses while still being lighter/faster to run locally.

austere turtle
#

That’s why deepseek give good result for half the cost

uncut sage
#

ahh okay gotcha thank you so much

brittle hamlet
#

Two actually, the other is ollama/qwen3.6:35b-a3b-nvfp4

uncut sage
brittle hamlet
uncut sage
#

ahh okay

shy kite
#

Guys, is there an open-source project that tailors LM models for OpenClaw usage?

iron stump
#

who was the guy who has his own setup

uncut sage
#

@austere turtle I deleted everything openclaw related and downloaded it back then set it back up and still getting the NO error. I'm stuck. I even tried using openrouter and used a free one and it instantly said I was out of tokens.

#

And the Ollama model works fine by itself

austere turtle
#

If your openclaw in docker or running it normally?

#

And is your ollama running the server that connects to your model