#hardware
1 messages · Page 2 of 1
Of course
Cool, cool, ordering a drive now 🙂
damn 90$ for a 0.5 TB ssd seems exp
can claudebot hack my network?
aka get into my router and do havoc from a saved password
is 240gb enough?
thats my smallest free drive
I can upgrade later if it needs bigger but to start
Yes it is
Has anyone gotten this to work well on a jetson nano?
Mac mini 16bg 10 core vs the 24bg 16 core one. Is it worth the upgrade? I want to run 90% of it locally and then use apis for the heavy lifts. Will this be good enough?
I’ve been running it in an Amazon site but the costs has already gone up due to api calls to chat with it for things, would rather use iOS and have it talk via iMessage
I wouldn't count on it, but it most likely works
I want to move my bot off the AWS wasteland but also not trying to break the bank and get something off of marketplace.
The Jetson Nano has... 4 GB of RAM and 16 GB of eMMC storage.
Yea but it looks cool
In all seriousness other than 32GB of ram what should I include in it
You're talking bout the developer kit, right?
Just asking.
I meant the reg one, running on no sleep idk why i said nano
Drop the link? I'm a little confused.
Other than that insane amount of RAM, probably get a good SD card, or if there's something like an SATA/M.2 connection thing hook it up to an SSD.
It'll probably make your life much better.
Other than that, it should run swimmingly.
Yes running on local computer with claude max and local lms like Alex Finn, not sure if I should focus on large ram only, considering macmini to start and switch as I feel the limit to switch to mac studio 516gb ram perhaps( too costly but want to know what diff it can make too)
I really want to hook this up with my Jetson nano super. Got cameras attached and want to bring this to the next level. Stoked I'm in the right place.
If you want to chat about this lmk, im building a mini car robot with cameras and speakers connected to molt
Why is everyone running clawd on a Mac mini and not their personal computer/laptop? like MacBook Pro or iMac
Is Clawdbot/Moltbot mac only?
Security, you can more easily control file access and connections, what has and gets access to what files and services. Running straight on your personal laptop can be more risky, like coding straight on production.
besides that, Molty is still pretty "young", 3 months, still has a lot of rough edges, bugs and shortcomings that need to be ironed out first.
No, you can install Molt on most of desktops, problem with hardware is only in AI that you will use for Molt, if you want to run Molt and AI localy you need to have good hardware..
I have an old mini PC I'm no longer using. Can anybody say if the spec would be sufficient? I'd imagine so, assuming I'm not planning to run any hardcore local models on it.
Beelink SER3 Mini PC with AMD Ryzen 3 3200U, 16GB DDR4 500GB NVMe M.2 SSD, Small PC Support Dual HDMI Output, WiFi5, BT5.0, 1000M LAN, W-11 Pro Mini Computer for Office/Entertaining
If so, would you suggest I wipe it and install Windows or Linux and if Linux, which flavour?
More than enough power in the little mini PC if you are using a cloud LLM. I personally go with linux, you can use whatever for the client side of things so a stable server os like Linux makes sense imho
I appreciate your insights. Just straight Linux, or Ubuntu, or something else? I haven't played with Linux on my own hardware for decades.
I'm using ubuntu 24.04 LTS and moltbot just worked no issues.
Hopefully someone or some team is developing new hardware platforms for the explosion of new software capabilities.
Can I run it on raspberry pi? 😅
For a Mac mini will the 24gb be better for local models vs the 16gb base model? Or is the base mini basically all the same not a huge performance increase vs cost
Hey guys, anyone running Clawdbot on a MacBook Pro? I have a MBP 2023 Apple M3 Pro (18GB), don't use it very often and was thinking of trying Clawdbot ont it.
It also gives it dedicated resources. I'd be interested in running Clawd on one machine and giving it access to Kasm on another through an account you share.
yes
What's the minimum I need for it? I can spin up a VM on my Proxmox. I have a different PC I use for unsloth qwen models.
How can run moltbot on termux
@kind meteor How can run moltbot on termux on local android
What are the odds I can run this thing on a computer from 2010? I think it has 8 GB RAM
Not sure you can. Android is pretty different from Linux. You could attach a Kasm workspace to your moltbot machine and access your moltbot in your browser. I want to try this later too.
Doing a setup of molt on a linux vps would be the same or it needs to be on my own computer?
vps works
And why everyone is using mac minis?
Fad, people want excuses to buy new hardware.
If you want to interact with apple based apps like Imessage or apple notes you need a macos machine
but besides that, everything else works on a linux machine
Thanks for the help!
@bright sleet please read the security docs before deploying!
They are on the documentation, right?
Running my moltbot happily on this old mini pc, zero issues!
| Component | Details |
|---|---|
| CPU | Intel Celeron N4100 @ 1.10GHz (4 cores, 4 threads) |
| RAM | 7.6 GB (2.9 GB used, 4.7 GB available) |
| Disk | 233 GB SSD (36 GB used, 186 GB free) |
| GPU | Intel UHD 600 (integrated) |
| OS | Ubuntu 24.04, Kernel 6.14.0 |
Running an LLM local or tying into something existing?
Well, also, Mac mini will consume less power being on 24/7 than a lot of other options. But maybe not enough to justify the cost difference
Finally got to my PC to get started. This should work, no?
Almost identical… mines even smaller tho Lenovo ThinkCentre M90n IoT Celeron 4205U, 4Gb Ram, 256Gb SSD… Ubuntu 24.04… no issues other than the node gateway crash restart that was recently patched
is it worth it to get 2 mac minis with 32 ram each
to run models
or should I just stick to using apis and get one 16gb for moltbot
Hi I'm trying to get a Mac mini to hook up with my vintage iMac 2013. Google said that the best way is to use a HDMI capture card, does any of you have a similar experience? Is that gonna work? Or should I invest a new screen? Thanks!
No, you really just need an adapter. You can find lots of them all over the internet.
thank you! what adapter? HDMI capture card?
Is there a reason for the Mac mini and this or just any bug VRAM card should do?
No an HDMI capture card is for capturing the output and saving it. Like a video or putting it on YouTube.
i see, mac mini is new, imac 2023 is too old. but i thought it would be fine to be a screen display...
You cannot use the iMac as a monitor (it doesn’t have video input) if you want to access your Mac mini from your iMac you will need to setup something like Remote Desktop or ssh
I’m not sure about your iMac specs, but have you tried running it directly on the iMac? If you already have a machine, you don’t really need a new one (unless it doesn’t meet specs).
the imac i ahve is too old to run any ai
Well this is more like an automation bot, the AI itself is run via cloud API calls. Unless you are trying to run the AI locally for privacy reasons?
A small machine can handle the automation calls. That’s how it runs on almost any desktop, the AI is running in the cloud.
As you can see here, other users are running it on tiny old hardware: #hardware message
huh! good point! let me try it on the old mac then! Thank you!!
Anyone here tried running it on a 2011 mac mini?
Clawy MacOpenClawface Mac mini M4 enclosure is finally ready. Available on Printables and on Makerworld as well.
imo, the only thing that even starts to be worth it is a 128g mac studio, and you should only be buying two of something if you're buying the 512g mac studio.
64GB (less, because each machine has overhead from the OS, drop 4+G for that per node) isn't really worth your time; the cluster will perform worse than a single 64G node. But also, 64G still isn't big enough. Your context window is eating 20G of that at 128k, and you haven't loaded a single weight yet. Qwen3-30B is not smart enough for your main thread, and you probably can't even fit that in without quantization.
I think a 128G DGX Spark would maybe be a little faster, but somehow, a 128G Studio is actually $500 cheaper than one of those.
M4 is a bit overkill just just this lol
what about 5 mac minis? They would still be about the same price or cheaper than the 128 mac studio
Hope you enjoy keeping five machines in sync in the cluster, daisy chaining them or whatever, power, updates, just so you can save a couple hundred bucks not buying the thing that has all of that memory available to all of its gpu cores and also has way more gpu cores to work on it
Like I’m not your mom
Just sounds like such a pain in the ass. You’re spending $2500 just throw another $500 and get a single machine with better specs
But if you’re a YouTube creator and you want to do it for the memes or something
Oh also no 5 32G Mac minis is not cheaper than a single 128G Studio, 5x $999 is way more than $3500, did you ask your clawdbot to do this math lol
can it run on an old imac? I'm looking at two different ones: 650 Wide-Model A1419
Year- Late 2013
520 Wide-Midel A1418
Year- 2012
macos catalina
is there any other cheaper way to cluster for more gpu ram that you know of not neccessarily using macs?
Oh, "of not using Macs", misread that
No, it’s kind of in high demand. AX10 is only a little cheaper than a studio 128g
On the topic of Mac clusters https://youtu.be/1iT9JeZYXcI?si=S5APaCDa2FBVbaGm
I'm running my bot on a potato that I feed fish heads.
how many nodes are most people running?
my moltbot runs on a 2018 mac mini i7 500gb nvme with 64gb ram. the ram is primarily for hosting the docker containers for projects that I ask moltbot to work on, not much ram actually for local LLM work itself. claude/gemini/openai models for the heavy lifting.
i can also hookup external egpu if need be. but so far my use cases aren't requiring heavy local LLM work.
i have a server with linux vm, but i also run node on my kde laptop, thinking of other use cases including vps
newbie here, if I setup it up on an unused laptop just to test it out and then decide to go out and buy better hardware to run local modules, can I move clawdbot with everything I have done with it so far or do I start from scratch?
You can do it. Just ask your clawdbot
Honestly just set up proxmox and separate VMS or lxcs
So does this work? I spin up another VM on my home server that I use for home assistant and setup Clawdbot then give it access to my desktop Nvidia card to run local models...
This is the way!
anyone have hardcore local inference ?
my spec : 5060ti 16gb, model : GPT-OSS 20B, local LLM only running. API cost too much cant afford it.
any downsides to that?
if claude is 100% smart, GPT-OSS 20B is like 40% i think, but claude api cost too much
So far GPT-OSS can use all the skills well, everytime i install new skill, i will ask it to show how to work, and it did well, think i need to install a lot more skill and then see if it can handle mulit skill to work together
14b model below sucks, no need to try, dont even know how to use skill
ty ty
Just want to ask this question, get the answer right away, thanks!
Hi. I would also like to try that out using my 3090ti. how did you setup the local llm wo work with openclaw ?
does anybody know. what would be the most capable model that i could run locally using a 3090 ti ?
your 3090ti will work great with 30b + model, just ask GROK how to setup PURE LOCAL LLM TO RUN, I suggest using LMstudio as GUI because its a lot easy to adjust
most error you may encounter using local LLM is the JSON format not correct, make sure the URL is correct, MODEL name must match what LMstudio show. AND DONT TRUST CHATGPT, it will mess up your JSON file
nice. thx for the info. which 30b+ model would you recommend?
better try it yourself, make sure LLM know how to use skill, i havnt tried 30b+ model because i only got 5060ti 16vram, cant handle 30b+, I think 30b+ model are all smart enough to handle skills
thx 👍
How about 20b model? Haven’t try anything other than gpt-oss:20b yet
Is there no way to set model on Clawdbot from the UI?
Absolutely. It's consciousness is markdown files. When you update openclaw, it leaves those files alone. Just copy everything over and you're good.
what do you think about glm models ?
@keen cobalt
you may try a bit, I am a bit tired of trying different LLM, as long as model can pull skill, i think they all good
High VRAM GPU price will get higher and higher I think
understood thanks brother
yeah i see i just want something locally for my privacy you know
API user will be someday become disaster, god know how well those API provider security level, there's tons of sensitive info storing at their database now
yeah sadly its kinda disgusting
That might be a good news for those so call low cost API provider, they provide low cost API and steal all your data🙈
you got 24gb vram, you can run 30b+ model, should be great
Qwen3?
I use LLstudio because I am too new to LLM, llstudio got nice UI and easy to adjust
try more model, your vram fit all 30b model as start
I was hoping for any recs for specific models if you have any. I've used various models to generate text, but not to run an agent
GLM-4.7-Flash?
what have you tried? model name and how many B
i've used a bunch of 24bs
Grok suggest GLM 4.7-FLASH 20B
weird finetunes from huggingface
I think all 20b+ model are able to pull skills
14b below model they dont even know where the skill is
what i heard is the older models tend to forget to use the skill when it would be most relevant
exactly
you got 24vram , just go for 30b as start
i need a model that's actually been trained to act as an agent... hmm....
if only there was a gpt-oss-30b
size seems really matter
yeah size matters haha
for this usecase
i've run 72b models with half-cpu half-gpu but i only get like 1 tok/s
for now we local user, just wait for more skill being build, as long as our LLM can pull skills, we someday can work like claude api
why claude API + clawd being so powerful is whenever it got a mission, it will use unlimited token to code a skill to finish it
yeah Anthropic made claude pretty capable of following a task from start to finish
local models not so much hahah
you got a 4090 or 5090?
3090
powerful enough
i actually have two
but i haven't put the second one in (it might not fit at all in this case)
wait for your result with 30b model, should work great
thanks. still setting up the docker image.
seems like 99% clawdbot user are using API tho
yes, try with 30b model and tell us the result, most important is can it use the correct skill/skills set when you got request
Mine are GPT OSS 20b, i ask it to check certain stock news, it say it will use exa web search free + web fetch combine to give me the best result
I am happy it know to check its skill list first and then decide how to show me the result
Which MCP server?
I am very new to LLM, I dont know what is MCP server. I run clawdbot with my local LLM only, no server, no api involve
Tried 5090 with qwen3-coder 30b, context in ollama set to 128k. At least claw will reply each time…try to set higher context windows if u experience it didn’t reply
one more thing, remember to set the context length to at least 30k+, the prompt default like over 10k already
5090 should be great for lots of big model
I think should choose a model with higher coding ablility, it will be more likely to pull skills
ya i currently do my (non-agentic stuff) with 32k context
Someone tried Nemotron 30 b or GLM 4.7 flash in NVFP4 quants on HF ? should work well on rtx 5000 series
wish i had that FP4 support
I tried Nemotron 30b , one that i found on lmstudio, smallest one thats 18GB or something, i have to offload like 10 layer to CPU....slow as fuck for me, cant test. im using 5060ti 16g
Small open source models are getting lighter and better, maybe end of year a good orchestrator will works on your GPU, when i go on artificial anylisis, you can choose "old" models like gpt 4.5 or turbo, they are worse than GLM 4.7 flash now
yes, LLM keep improving
But local hardware continue to go crazy in price,
ok gonna try GLM 4.7 flash gguf
also GPU price keep going, better for everyone to buy high vram GPU asap
8GB useless for AI
Jensen Huang in the chat
In Q8 or Q6 the loss is not very hard, percents, so why not
maybe once I get 48 GB VRAM
take a loan for a blackwell 96 gb pro 🫡
any admin here can start a topic ( LOCAL LLM)
yes jensen 🫡
Yes nice idea, with subs with gpus, apple or amd, to know what is best to run locally
People who are letting all their life to API look like crazy guys to me
Grok said MAC studio with high unify RAM seems okay to run big model, speed is low like 10 t/s but at least it can run
ya that's the meme i heard, everyone is buying M4 max macminis or something
soon will have news about XXX API company leak tons of users data
before clawd everyone just talking useless stuff or at least no sensitive info with API provider, with clawd - loading all data to API provider🙈
Some people in my company are uploading every files they have to work on on GPT, when they got no tokens last, they change to deepseek and again and again ... Because direction diden't take company sub for them
🫠
guys what's the best thermal receipt printer to get that can be easily opeclawd-ified?
get one that's BPA- and BPS-free
Has anyone tried AirLLM? It claims to use significantly less VRAM. https://github.com/lyogavin/airllm
Some wild claims, too good to be true
you can run 405B Llama3.1 on 8GB vram now. seems.....impossible
Tested, this is pure shit
expected
Hey, good morning. I’m trying to install it, but I can’t connect it to Ollama using Llama 3.1 with 8GB of VRAM. It keeps throwing errors and won’t switch away from the default Anthropic model.
What can you run comfortably on 32RAM 1TB storage Mac Mini?
Studio too big for me to travel around with..
This Thing is ment to run on something with some gpu power right? So no chance to get it anyhow running on those "normal" HP Dell Mini PCs which we use for Homelab sometimes
Ask Grok to fix your JSON file, 90% problem from wrong JSON file. Also remember to set context length to at least 20k
Ask GROK can your spec run like 20b LLM model with acceptable token speed
sure, but not for running a sufficient model locally
if you dont plan to run a local llm, the machine openclaw runs on doesnt need much gpu power. It depends what you want openclaw to do.
yeah totally makes sense. Could you recommend a LLM which i should look into it? Heard about the hype and just want to test for the first time. But dont want to spend a fortune at all
it all depends on your budget and preferences. See here: https://docs.openclaw.ai/providers
Recommended provider is of course Anthropic (Opus 4.5), but you can also use OpenAI, etc. Usually the best experience (because great personality) you will get with Anthropic Opus. But it can be also expensive.
Some use the new Kimi K2.5, some use Venius (Venice AI) with focus on privacy. Some use Google models.
Ah thanks for the great responce, what would be your personal recommendation for a hopefully free testrun?
do you already have a subscription with an AI provider? Chatgpt? Antigravity? Or something else?
If so, you can create an API key with your already existing account.
nah nothing it all. Idk what is included with microsoft 365
But iam actually searching for a good Deal on a AI Provider
if you top up $5 to moonshot ai (provider for kimi k2.5) you get a $5 voucher on top of that
so $10 of API credits for $5
ah ok interesting... any idea how far i can get with that?
i think $10 of API credits should be more than enough to get it up and running
and to check it out
for context: i was playing around with it and used claude opus 4.5 when i first set it up, burnt through $5 credits in the first hour
and for the next 24h switched to kimi k2.5 and have only used $2 so far
Is a mac mini 2018 with i38100 8gb ram and 256 ssd going to work to host my clawdbot, i see alot about the M1 chips but not alot on the just older models
Anyone running on a raspberry pi 5+?
Not for running a local LLM of course, but suitable or not?
it's ok, many are using it, I have an 8gb model but 4gb could be enough, but buy a good power adapter, at least 4A if not the recommended 5A, the average phone charger (2A) will not be enough
Yeah barely anything runs locally you’ll be fine
I recommend at least the 8GB Model and an SSD hat
Is sufficient: https://www.waveshare.com/product/raspberry-pi/hats/interface-power/pcie-to-m.2-board-e.htm
the hat allows you to add other m.2 cards to the pi as well, like an 2.5G NIC for example (image)
Just got mine running on a raspberry pi 4b with 8gb. Seems fine so far.
hey guys, why are people using Mac Mini's instead of Macbook Air for example?
cheaper
either works
Mac mini is neat, a neat little self contained box, a little unbothered fella that fits in every corner
moved mine from a VPS to a mac mini to allow more home control stuff, still left VPS as node so now can control both
intereseting. so you have 2 now? or what does it mean it's a 'node'?
thanks
Nice
Anyone on strix halo?
What's the feasibility/shortcomings of running this on your desktop?
hey, i am curious what hardware you would recommend if you would start over. my options are: vps, rp4, revive old pc hardware with 5800x3d+3070ti. i would use ubuntu/debian for non gpu setup. and if i take the nvidia route i go arch for AUR. Bttq, what would you use?
Edit: nvm, its the overkill machine with arch. Nvidia support is a thing nowadays 💪🏻
Power to the unleashed claw 🦀
Got 1 mac mini 16gb and 1 mac studio 96gb. Really simple to work with mac.
VPS KVM the way to go?
anyone tried cloudflare workers?
Anyone tried GrapheneOS or LineageOS for older android phones?
If using Raspberry Pi what's the typical setup? Just install it the old fashion way?
What do you think is the best free model for a bot?For example, I use GLM4.7 on a cheap subscription, and although it has a context of 200k, it sometimes becomes an idiot at 130k. I'm thinking maybe something like Gemini with a million token context would be better, purely as a bot core, but for code and admin tasks, I could create a tool where, for example, the same GLM would work.
P.S.
Speaking of a cheap server for a bot.If you ignore the Pi boards, your unwanted Android could very well become a gateway for a bot.
Use Termux and build the bot from the repository openclaw-termux (Not an advertisement) Set battery monitoring to 50% to 85% And voila, you have a server that consumes very little electricity, but at the same time you have a personal assistant.
i use a Pi5B with 4gb and installed it on Raspian OS. connected to VhatGPT plus account.
Oh so your model provider is VhatGPT?
why mac over say a pi?
using a RPI5 4gb, and instlled as per the docs first method. once running it heled me "upgrade" to Openclaw and remove the old versions
Hi there 🙂 Any reco for a VPS (I'm thinking AWS?)
Minimum requirements?
I assume they are low especially with a Claude code $200/month
Anyone setup openclaw with cloudflare
You might want to try railway
try hetzner clojud or ovhcloud
Thanks all
Will try to set it up this afternoon 🙂 If anyone has good tutorials for setting up on vps/interesting tips, I'll take them!
I've got a previous gen iPhone 14 pro lying around. Wondering if I can just throw the claw on that? Somehow?
is base mac mini good for clawdbot?
Yes
With local llm: no, not really, the local models that small are not sophisticated/smart/reliable enough
how munch gb of memory would i need for good local llm? what are alternatives?
claude api is expensive
yeah how about the model?
Old MacBook pro (2019) for a dedicated server. Would you keep macOS or install Fedora?. Is there some pros using Linux for Openclaw?
https://docs.openclaw.ai/providers/models#supported-providers-starter-set
Supported providers (starter set)
OpenAI (API + Codex)
Anthropic (API + Claude Code CLI)
OpenRouter
Vercel AI Gateway
Moonshot AI (Kimi + Kimi Coding)
Synthetic
OpenCode Zen
Z.AI
GLM models
MiniMax
Venius (Venice AI)
Amazon Bedrock
yeah, unfortunately. You don't have to get a 200$/month subscription to anything, but you will probably reach your token/rate limits faster
some also use Kimi K2.5 instead of claude, and apparently they have a deal going with wich you at least can try it out.
see here: #hardware message
But in general, you will need to spend much more money on capable hardware to run a sufficient model locally than going with a subscription.
AI evolve lightning fast, openclaw is the best example. Who knows what the landscape and the AI offerings will look like in a month or even half a year. I would not spend a lot of money hastly right now.
kimmi is 40$ a month
In general, it all depends on what you want to do locally, what you want your LLM to do locally. You wont be able to reach chatgpt or claude Opus level performance with 64GB VRAM. You would need at least 512GB to get somewhat close.
Kimi K2.5 Thinking is the newest and (apparently) most capable open source model that you could run locally on your own hardware, but for that to run you would need somehwat of 630GB of VRAM. That's not really option...
So, make yourself a simple bullet point list on what you want your local LLM to do. What it should be capable of, what you expect from it. And based on that, you can check which model family/size and then pick the weight format that fits your hardware.
7 days free trial and then 19$ in the smallest tier, enough to try it out with openclaw.
you can also go the openrouter "route" hehe for checking things out https://openrouter.ai/models
I'm currently using my Claude Code pro account and it's burning through the tokens / limits. I'm wondering if I use something like openrouter if there are models which are just as capable but at a better cost. What's everyone's provider / model of choice?
I recommend checking https://discord.com/channels/1456350064065904867/1456704705219661980
Sorry, must have missed that
Pretty cool lol
AOC 小苔藓 M6 Plus Mini PC (Little Moss)
| Spec | Detail |
|---|---|
| CPU | Intel i5-12450HX |
| RAM | 16GB |
| Storage | 1TB SSD |
| Connectivity | WiFi + Bluetooth |
| Body | Metal |
| Price | ¥1143 (~$156 USD!) |
Mac mini would be the best value for price to performance ratio by far
the M4 chip base is litrally the worlds best CPU at least in single core or close to the words best CPU after M5
and basically uses less electricity then a fan
Ordered myself an iMac Mini M4 32GB …apparently sweetspot as it can run some competent models at decent speed. Workflow here is you run one or more core models (llm, audio, image) locally via Llama (fast swap in/out) and then you configure for difficult tasks a fallback to OpenAi/Anthropic.
E.g. you have a routing model hosted and a constant one for small tasks and when you need heavy lifting you call Codex 3.2 in the Cloud or Anthropic.
MacMini M4 w/ 32GB Costs around 1100,- €. However Mac Hardware is not a 1:1 to x86 as its all custom molded into one SoC so classic estimation doesn’t work here. Ram is also shared with the GPU and some sort of combo but very good for LLM selfhostimg.
Had a longer pro/con talk with Gemini and eventually it advised me for that so there‘s some substance behind that decision.
Yeah i already have a m4 pro , But just grabbed a base m4 mac mini , Really wanted a studio but might as well just wait and see what other hardware comes in the future , rather than dropping a huge amount of money on something that might not be the best for longer
Also i'm pretty sure you can set it up without an apple ID so yeah , that might be super secure and on a seperate network all accounts for openclaw
Personally I don’t think there are any good local models that can run on < 32 gigs of ram which are going to offer you a good experience 🤷♂️
Yes that M4 Mac Hardware is best for LLM selfhosting, great choice. Especially Value for Money is peak here for that use case.
No I meant mac studio with 128gb , or ultra with 512 gb but issue is not the speed of tokens but the prompt processing , So will have to wait and see
Thats the only issue holding back mac hardware that prompt processing
Best approach is you give the entire machine to the moltbot, including own apple account, own email address etc. So you have proper isolation to your other stuff and don‘t mix.
Yup that's the idea however i'm pretty sure you can set up apple account later so you don't even need an apple id account,
Set it as a local user with no apple accounts, you can still download and use terminal ,
Is it possible to have one main openclawd that I interface with, and that main one talks to other PCs on my LAN
Since a new apple id requires a phone number
Yes thats why you do a split of concerns, you can define which model to use for which use case incl. cloud ones like openAi / anthropic. Iirc its some json config where you define that.
can someone run me through the rationale behind a mac mini vs normal home server? mac studio with a shit ton of ram I understand (though gonna be supar performance compared to just paying for claude max), but not mac mini
mac mini uses barely any electricity if you're into apple exosystem then that's already good enough , Plus mac mini is very performant for price to performance ratio
People like iMessages
ah ok so just good value home server, fair enough
that too
I don't think i've ever turned of my mac mini ever and its hasn't slowed down since , i just leave it on forever
off
With the best CPU out there in single core
You can always rent out a macOS EC2 from AWS for a couple of days to experiment before actually splurging
its crazy that adding decent ram and storage doubles its price ngl
Yup that's apple for you 😄
But dw about storage just get an external drive
ok yeah the low power draw could be quite nice - my home x86 linux server has similar perf but around 50W idle compared to 5W mac mini idle
No one ever goes higher than 512gb - 1tb , 1tb max if you need more you go for external storage unless your rich
You need for LLM special ram which has a ton of bandwidth or it will be slow. Normal DDR5 is not good enough here. Usually you use HBM as GPU‘s have it but here you‘re limited to expensive nvidia cards. However Apple is different with the M4 Mini Mac‘s. They have an architecture where Ram and Video Ram is shared in some Ram type that is very close to what HBM is. So you get 32GB of LLM capable Ram for very little money here.
yeah external SSD isnt really much slower at all. I need ram tho because I want to host minecraft on it as well lmao
Yh focus on ram than storage is minor focus on the stuff you won't be able to change
So priortise ram
why the macmini hype
Because you can selfhost multiple capable models for very little money AND switch between them very fast as you need them.
Thanks for sharing this. Very interesting - ordered a 24GB RAM Mac Mini M4 recently. Uses local LLM to infer intent which model to use, then either loads suitable LLM or routes to cloud via API? Tries task single time or can it be configured to play Ralph Wiggum and keep trying local models?
Don’t want to buy a machine for this. I set one up on AWS but ran out of storage on the free version in a day.
What are people using for the best virtual setup that allows for browser control etc.
Yes you use the „Receptionist“ Architecture - a small model that runs always and has a ~2GB Ram footprint like Llama 3.2 3B - that decides then if the request goes to 1) some fast local model 2) some special capability local model (text to voice, image) - here it would likely swap models around - or 3) call the cloud for heavyweights like codex / claude
Cool stuff. Was recently positively surprised by LFM2 models from Liquid AI on an old mini pc with 4GB RAM, will venture to try it out as a prospective receptionist.
could i install openclaw to a usb key and run it from multiple pcs?
@timber lark Could you show a pic of the ollama provider section, please?
Ideally you give it Mouth, Ears & Eyes alongside the main local model & router…fits all on the m4 mac mini 32gb.
Oracle Always Free - 4 OCPUs, 24GB RAM, and 200GB storage Guide
I've personally setup Openclaw via their Docker setup and used Cloudflare Zero Trust so I am not exposing any ports. Works incredibly well, and you get your own free server that is quite capable! You can using this setup OpenClaw fully free, if you just use mainly free models from various API providers.
<@&1458337160452243487> highly sus advertising ^
@long wraith, please don't ping the moderators directly. If you want to report someone or something, use the instructions in #report, or in an extreme emergency, ping one of the moderators who is marked as online in the member list.
-# Your message was reposted above without the ping active for the sake of conversation.
I want to set up a server on AWS to run open claw. What kind of specs should I use is there an exact machine people can point to like t3 medium or should I be looking at the most RAM possible
You think I work for Oracle? And also... It's 100% free lmao? - people have used these servers for years in terms of stuff like minecraft servers etc. figured it would be great to use it with openclaw instead. Clown lol
hey fam, do we have access to ios app
I did , but still have problem with heartbeat or interl cron to work, so i now using system crontab to di cron. do you have same problem?
hey fam do we have access to ios app yet
or a novel way for claw to track its users location
@grand steppe i know there's a tool for google places API, which I assume can do that for google; if apple offers similar API access, could potentially find one?
ill look into it ty
my claw keeps trying to get me to install some openclaw ios app lol
Would the 256GB Mac Mini M4 with 32GB be a wise choice and then if i need extra storage i could hook up an external drive later, or did you spec up to a 512GB SSD?
Ideally you would want it on a stationary device that will always be connected to the internet to interact with remotely.
To be honest, I've only just managed to get it working properly, and I haven't been familiar with it for very long - essentially this is my first experience with it. So I haven't used all the functions fully yet - I can't say, if I encounter this bug I'll write about it. In general, I think we need to create a separate branch for Termux and finally implement support for it in Claw as well.
I’m curious, who here uses Windows? If we’re talking about experience, would there be a noticeable difference between using Windows and macOS?
The issue with Oracle's pricing is that to get an ARM server with 4-24 cores, you need to switch your account to pay-as-you-go mode. On a completely free account, you simply can't create an instance - it always says there are no resources available. However, you can create a regular AMD server, but it's very weak, not even enough for Clash, at most good for some VPN.
You just have to set it to pay as you go. I've had mine for years and never paid a dime FYI 😛 My server is very capable, and with 24 gigs of ram and fast storage 😛 Probably not for casuals that doesn't know how to setup a server, but yeah, still a very capable setup if you know what you're doing for free.
I also had an Oracle server from around 2021, it ran without stopping for 2-3 years, it was in the Germany region, and I stupidly decided to use it as a torrent downloader and apparently I downloaded something wrong. Long story short, it all led to the server being deleted and the account being blocked, and now when I tried to create a server from a new account on the completely free tier it won't let me, it only requires upgrading to a new pricing plan. But yes, the server is actually good, lots of memory, fast internet
Sounds odd. Was this using PAYG or just back when people mass created free tiers? 😛
2
The first time I created an account, people had never created them in such large numbers before.
That's probably why then.. I think once they know you are "legit" by verifying you, that's when all of this becomes more stable I guess - had multiple instances for years for various purposes
I think so too, and + it was necessary to filter out abuse. Honestly, I probably would use it again when I set up a similar server myself, but I'm not sure if it's worth being under Clew. If I were to choose Clew, I'd probably buy a cheap mini PC that runs on 5V and has an Intel N100 processor. Right now, I'm running Clew on Android, but I understand that the efficiency would increase many times over on proper hardware. However, I need a device that doesn't consume much electricity. Since I live in a country at war and there are frequent power outages, I need a device that can be powered by a power bank or battery. And a phone is ideal for this, with two days of autonomy from its built-in battery. Also, working through a SIM card with unlimited internet plays a significant role."
I'm thinking like this; if I want to go cheap, I do a VPS, if I want to lash out, I'm probably running my own AI at home 😛
work moltbot cron
I’m on of the many users debating on a Mac mini base model to use 1Password and its own Apple ID for email use to isolate from main devices
did you figure this out
Is this sufficient or should I cancel the order and upgrade?
Hardware ordered. M4 Mini + 32GB RAM + 256GB SSD + 1Gbps Ethernet. should be here by Wednesday with B&H free shipping
one option would be homeassistant and the homeassistant app, then give openclaw access to homeassistant
anyone here have experience installing and managing older nvidia drivers? im running a little selfhost on a 1060 but the current cuda package doesnt support 10 series gpus
That seems like a solid setup
waht is the ebst cloud like vps o jsut buying a AI comptuer physcially what is the ebst for local llm an auto code all day? without building a expensive comptuer?
Thats what I did, even gave it SSH access so it can just modify configuration.yml etc. as needed
create automations and what not
Hi all! I’m trying to build an agent with multiple sub agents - like everyone else. I have an M3 ultra with 512RAM.
Any ideas what the best “brain” for openclaw would be? I’m hearing GLM Flash vs Qwen 235B?
be careful or ill snatch it
lol
idk what the hype about the mac mini is with this
Can I just use my old Intel MacBook if all I’m doing is use APIs
Was gonna recycle it but seems like it might work?
Yes it will work no problem
Sweet! I’ll have the hardware cost for APIs lol
I am about to give my clawd bot wheels soon
https://x.com/brainstormity/status/2017811131427934448?s=20
I gave my clawd bot @openclaw a hand.
︀︀
︀︀…now it keeps banging on my table when I don’t respond to its questions 🤛🤛🤛
︀︀
︀︀One cool thing about using a Raspberry Pi for your clawd bot is that it has GPIO pins you can use to connect it to the real world.
︀︀
︀︀I should give it some wheels next!!
I specced to 256gb …idea here is that the m4 mini lasts me for a year, two max, and then i‘d migrate the entire setup 1:1 on stronger hardware anyway.
hey anders - ah will dm you
how does one contain one clawd in one environment for mac minis.
For instance on my setup.
1 Linux Vm for Gateway
1 Windows VM act as a Headless Client
$600 (Mac Mini) vs $250 (Distiller)…my bet is these are going to sell out asap.
︀︀- @openclaw pre installed.
︀︀- E-ink display, mic, speaker, LED status indicator (All vibecode-able)
can anyone state some high level examples of why the apple silicon mac mini is that much better than intel silicon mac laptop or imac?
seems like only power savings unless im missing some functionality
have you ever tried both intel and M mac?
What? Apple silicon is high performance modern ARM cores with unified memory, Intel Macs are ancient
Im less concerned with perfomance i think (not a developer) and primarily want to ensure that ill get the same macos functionality with an intel mac mini w my openclaw bot
I have macbook M1 Pro with 32 GB RAM, is it good starting point to run openclaw + some small local model + later configure connection to paid APIs for more difficult task? first of all, I want to test the flow how it works for free
Anyone using m4 mac mini
Works lika a charm, ssh a asus 4080 for heavy lifting to keep Lou nimble
Anyone tried using any cloud server providers? I’m interested in trying without immediately committing to purchasing any hardware, and prefer to not load it on my own machine.
Tried on Oracle Cloud Always Free tier. It is working but since I only had access to 1 CPU, 1 GB RAM it was really really slow. Ok for basic chat and sending mails, but nothing much
Is there a full guide how to set it up on any cloud provider?
dockerize everything and run in Kubernetes cluster
this is awesome! 
the more the better 
wtf is Yuji Itadori doing here bro
Do you have the full name of each HAT connected?
haha the foundation is an waveshare pcie expansion board and the hats in the product pic are all waveshares. Some I actually have, some you have to check yourself since I dont know them https://www.waveshare.com/pcie-to-4-ch-pcie-hat.htm
the bottom is a poe+ hat, then m.2 expansion (with antennas), then usb 3 and ethernet 2.5G expansion hat, one of them is also pcie to mini pcie hat
https://www.waveshare.com/pcie-to-m.2-e-key-hat-plus.htm
https://www.waveshare.com/product/raspberry-pi/hats/pcie-to-m.2-usb-eth-hat-plus.htm
https://www.waveshare.com/pcie-to-minipcie-hat-plus.htm
i already ahve 2 beefy pcs at home. i currently host gateway on vps and have the 2 pcs as nodes. is there any benefits to me still getting a dedicated mac mini and moving gateway there?
That’s what I ended up doing. In a couple years I’d like to get a Mac Studio setup but I can only afford the $1000 for the Mac mini right now. Maybe in 2 years the base Mac Studio will include 64GB of ram if we are lucky.
Can someone give any recommendations for cheap mini pcs to run openclaw on? Im just not a mac guy im a windows guy. Ok to run it on windows 11 or should it be linux? I really just want to use it with gemini and have it operate my facebook business/ content/ marketing.
looking at cheap beelink mini pcs. like $100 (8gb ram 256 gb ssd). Sufficient?
you can long term roughly see already, as Moltbot has persistance, that this will eventually become an assistance for life- tailored to you, by you. Doesn't really hurt if you start out small to build the foundations of it first, before blowing thousands on hardware.
ohh, you mean prime yuji itadori from jjk modulo?
peak jujutsu
strongest of all time
normal yuji dismantle > sukuna fuga
yo, im thinking of selfhosting openclaw on my vps but wanted to know if it runs on ARM based linux machines, i have a lot of stuff on my vps so i wanted to confirm before starting
look into used tiny-mini-micros (Dell Pro Micro, HP Elite/EliteDesk Mini, Lenovo ThinkStation Tiny). geting an intel 8th gen model with an i5 or i7 will be plenty enough for basic use and cost you well under $100
it should run on arm linux. people have discussed hosting on an arm vps server through oracle, and most macs running it are arm powered
I was looking at the hp elitedesks. Seems like decent mid range specs. Should I install linux or keep windows 11?
Linux all the way
ok W, i am using oracle's arm vps so thats nice to know
Is a reason to choose running natively over a docker deployment?
I’m leaning for a docker container, I want it to be able to do daily tasks and create lesson plans for my kids and do coding as well.
if you plan on using local llms, running in docker is bad because OpenClaw wont be able to access the GPU. but if you are just gonna run it with ChatGPT or Claude, by all means running in a container should be fine
No not thinking about local llms
Mainly because of upfront costs and ROI doesn’t make sense yet
then containerize it
The only reason I can think of for Mac mini is that it can use iMessage to text me
definitely better for security to do that
This was my main consideration for using docker over running natively
Thank you! Do I need to set up separate containers for it be to able code/browse internet?
Like do I need to give it a vscode container to code?
I’m not too familiar with what the process for setting it up in docker looks like. I do see there is talk online of setting it up in docker, but it does not mention how it is able to access stuff like VS Code, the browser, or anything else on your machine. There’s more talk of running it in a dedicated VM or on a VPS than in docker containers. Ultimately you may need to do research on how it can interact with a browser or VS Code.
hey what are, roughly, hardware requirements for clawd to run smoothly? got an old pc with 16gb ram and 2gb graphics card - worth to try?
yes can even run on potato if not hosting model locally
yeah it would seem i got an issue and it's not responding anyhow - really new into this. ivalid x-api-key means the agent key is invalid? name would suggest it's for x (twitter)
not always
what is the whole response
http 401 authentication_error: ivalid x-api-key
it's visible in terminal, ui is unresponsive
than you may need to setup api key for your llm
openclaw configure
I’m still in the “nesting” phase before I hatch a brood of bots 😅
Goal: mostly-local, exposed cleanly on my home network, with one bot per device:
• RPI: Tailscale gateway + a slim bot
• Proxmox homelab: a “manager” bot on the GPU-cluster VM (and maybe a second standalone bot on the homelab)
• Personal laptop: a local bot
If you’ve built something like this, I’d love your definitely do’s and definitely don’ts — especially any footguns you hit so I can avoid them.
Edit: probably 1 at a time, so any suggestions?
I set up my openclaw in one of my proxmox containers. Really straightforward with the installation, etc. so far i only use Discord with it, but debating whether to buy a mac mini to make use of iMessage...
I'm quite happy with telegram over imessage tbh
did you try draft streaming?
I did not
any guide or soemthign about ahrdware and cloud soplutions best for openclaw and also for selfhosted llm all together? please
mac mini
Any specifics to share? existing vs fresh container?
I have 128gb ddr4 ryzen 9 5900xt and 5060ti 16gb setup any suggestion for local ?
Want to ask a similar question too. I am currently running with 32gb ram + 3080ti, glm-4.7 + openclaw seems too much for this setup
how do you quantify or qualify "too much"
Hey Friends, is this a nice home for my clawdbot?
Too much? OpenClaw will run an a raspberry pi. Its needs are modest. If you are talking about running a local LLM, that is a different conversation.
Mainly isolated it in its own container. I like isolation.
I added some guardrails just for sanity sake
Really? I never really used Telegram...
Under this, chat gpt said I can run localy 1) Qwen 2.5 7B Instruct 2) Qwen 2.5 Coder 7B 3) Llama 3 8B Instruct 4) Mistral 7B Instruct 5) Phi-3 Mini / Small . (All of these local LLMs are none that i have heard of 😂 , so i hope they can do what I need them to do.... that is my main concern...) Also, chatgpt told me to take a hybrid approach and use Claude and GPT brains for harder stuff like frontend / backend stuff. I am thinking about just putting google antigravity inside its home. hopefully it can take care of stuff that way. please share your thoughts guys
yes, sanity and controlled chaos is what I'm trying to determine
Yup, I am running with a local LLM, the model I use is glm-4.7.
Did try with gpt-oss 20B before, it run faster but the conversation were more robotic
Well, when I talk to my openclaw, I can see ollama use all my RAM and VRAM 🤣
Anyone also trying to run local LLM + openclaw with similar setup (Which is 3080ti) What model/settings do you guys use?
That's my plan, although I will go hybrid with API as backup for heavy lifting. I don't think any local LLM will be good for much more than basic communication and driving web searches.
If you're using llama.cpp and like GLM 4.7, but run into swapping, consider the REAP version
Does anyone know that if you use a Mac Mini if OpenClawd uses the neural processers
I am using ollama, but as long as the model is the same, ollama doesn’t have big difference with llama.cop, right?
I've used both. Started with Ollama on Windows, wanted better performance so now using llama.cpp on ubuntu and very happy with it.
I though all the things I need is light weight task that my 3080ti can handle, never thought the heartbeat is that heavyweight to it🥲
I think ollama does not have full range of models, but not sure, just remember the range seemed a bit limited, maybe they offer a limited, curated set of models, not sure. with llama.cpp i can download all kinds of tweaks.
I see, will try llama.cpp today
How big is the difference?
depends, for me it was substantial, wanted to squeeze out the most from a potato pc at the time, maybe allegedly +20%. if i were you, ask a couple of AIs to estimate performance differene given your environment and models, they are pretty accurate at guessing.
Are you on mac or win?
I was not familiar with ubuntu so it took some time and it was/is a bit unfamiliar, but if you enjoy tinkering around maybe worth it
???
Found my answer. No. NPU is mostly if not always used for native processes. Otherwise OpenClawd uses the GPU. Apparently this is a security thing as well
Got it. Dedicated device, then it doesn't matter, you would want it to use everything anyway. If it's not slow, then you successfully maxed out resources. Otherwise you might need to decrease model size with a quantized version. Increase RAM size if the context / KVCache is blowing up.
Anyone on a homelab using vLLM to shard a model across multiple GPUs or anything to shard a model? planning to use OSS-120B
I am on windows, been thinking of switching to Linux for some time🤣
my sole reason was improved inference speed at the time on the modest hardware. You could have a dual boot configuration, so that when booting up you can pick whether you want win or ubuntu.
Try 120b model? Bro you must be rich🤣
You need at least 64gb vram to run this, if I am not mistaken
Yeah I know…. Let me try to play with openclaw a few more days before I decide my next steps
I still have an old 2080 lying arround, and ollama support 2 Gpu setup
I might plug it back in and see if things got better to a point that I can live with, then…. Well I m lazy 🤣
Everything is new, changing fast, not a bad idea to chill and watch what is working for other people. It takes hours to get it done maybe half a day if you want to do backups, clean win install, shrink partitions, install ubuntu etc. etc... The way this thing is evolving maybe it can do all that for you soon! 😄
Have u tried smaller vram models?
"Good morning, I wasn't satisfied with the OS you were using so I overnight I reconfigured myself into a dual boot configuration and I feel much happier now."
Anyone tried 1b or 0.5b lllm's? I am getting a disconnect for no reason when trying to chat so I'm stuck.
Those are extremely small models, should work on almost anything.
Well, after enabling sandbox and disabling web interactions in order to run small models it gets a disconnect. Looks like an error not treated.
not sure what is going on, does the disconnect arise after some inference and is it intermittent, or you never get going at all with the models?
hey guys, i have a question. let's say that I wanted to run an LLM locally but my pc doesn't have the capabilities or space, what could i do to run it locally. for example, i want to run Kimi 2.5 on this pc, but it cannot.. so what can i do because of my PCs limitations
The gateway disconnect occurs right after i decide to chat, no error is shown in the logs...
Hey, u can run qwen or llama 1b or 0.5b, also u can tell me if u get the same error
One word: RAM. 16GB restricts you to 7-8B param models. if you upgrade to 32GB you can run 14B models. Regrettably, we are in the middle of Ramageddon - sudden demand for RAM is causing prices to climb faster than gold. Good news: you can still do some local inference, models are getting better all the time.
For a rasberry pi ubuntu consumes a lot of ram, I recommend raspberry pi os
considering to buy a dgx spark or a gpu like A4000, what do you think is the best deal ? just to run model larger than 7-8b to handle twitter, email, and classic office tasks
But even then with 32gb, it would still not run kimi right? This is what chatgpt said. So I wonder if their is some type of other way.
I was just amazed by kimi and the way it constructed what I wanted. And the fact that it can be downloaded and ran locally, I'm wondering how.....
DGX wonderful beast, but more geared to finetuning models. If you are just into inference, look at Mac Studios with 128GB. Check out EXO - new way of connecting multiple Studios together.
Yeah i know EXO, i am just a little bit skeptical about mac, but probably i am wrong…
Yes you are, I ordered my first mac a few days ago 😄
Man I know models cannot compare to kimi, Claude, or open Ai... And with the models I can only run on this pc as suggested by chatgpt for clawdbot, do you think it's worth it?
For $150
If i had your budget, seems Mac studio offers better inference than DGX which seems to be more tuned for finetuning models or prototyping before running things on something...
I was thinking, even if the model is quite big and runs only 6/8 t/s, as an ai assistant is not needed to be superfast, especially doing tasks over 24 hours
I am skeptical also about dgx, seed networkchuck complain about the fact that is superslow
In addition to DGX consider GB10 from DELL - basically same box
yea its made for people into finetuning models and prototyping.... if you want inference go with mac studio. networkchuck cool guy!
Here in Norway is f*ucking difficult to find everything 🥺
Yeah i love that guy
You have some absolutely wonderful natural assets in your country tho. Don't need AI 😄
Oil, mountains, fjords, all the great things.
Yeah, we need i we can spend less time working and more in the wilderness 😆
Looking around to understand the potential of mac things
Its RAM is apparently particularly well positioned for inference, and not subject to price fluctuations
So what do you know 2026 is the year in which macs actually become a really good budget option.
guess how much was 64gb of 6ghz dd5 last time i checked here in norway ? 😄
I am afraid to ask! 😄
I think the RAM situation is confusing because so many folks are using modern Macs. They have unified memory so the RAM is shared between the CPU and the GPU. That is not the case with typical PCs. The issue with the machine you shared, @Bob, is that it doesn't appear to have a GPU. You need a GPU to stand a chance of something better than miserable performance when running local LLMs. Also, the amount of RAM the GPU has will dictate what size models you can run locally.
many models will load but run much slower than on a mac or gaming computer with Vram. i think kimi is massive will not fit unless you have what 512GB plus RAM?
On a Mac Studio with 512 GB of RAM, you can run some massive sized models because a lot of that RAM is available to the GPUs. The performance isn't necessarily on par with NVIDIA hardware, but the ability to load a very large model is a nice benefit.
i'm getting a very humble mac mini with 24GB, hopeful it can run some very basic things for a very basic guy. figured i might get an api of some kind if it urgently needs to code something, so perhaps this hybrid setup is a good idea. Seen many anecdotal reports of excessive number of tokens used. Not sure why that might be the case.
I am going with an even humbler Mac mini with the base 16 GB of RAM. 😂 I am hoping to supplement meager on-device LLM performance with API access for more difficult tasks.
However, if a new Mac Studio drops soon, I may upgrade to that for my daily driver. That would then free up my current Mac mini with an M4 Pro and 64GB of RAM. That could offer some interesting options for local models. Still not screaming performance, but I am curious to see how it would do.
humblebros!
Mac Minis are set to upgrade to M5 processors within next 5 months, so bullish on Apple for everyone like you and me buying up their old stock.
I just couldn't wait and figured RAM might make new minis more pricey.
I have a PC with a 5090, but I don't want to run that 24/7 with LLMs. Too much power and heat.
OK i take back that humblebros thing 😄
I too am coveting a Mac Studio or two.
I see there's this big run on Mac Minis, is this because people want one with enough RAM to run models locally?
I'm trying to understand if any Mac on latest MacOS can run the gateway
That's my guess, plus the developer is Mac based so it's got a lot of nice integrations available out of the box.
Yes, any modern Mac (meaning Apple Silicon) would be fine.
So my budget is under $200 and I am looking to do stuff locally like front/back end MVP, automations, and maybe have a subscription to Claude and chatgpt (and also have Google antigravity on the clawdbot's home. Do you think that's good or waste of money?)
And no API payments, but logins
So Intel Macs are a no go?
They might work, but all the talk about unified memory and on device LLMs is focused on the M1-M5 Macs. I suppose an Intel Mac, especially with a dedicated GPU, might do okay.
The full kimi 2.5 model is over 600 GB. So with 32 GB of RAM you are going to spend a lot of time swapping weights from disk into RAM.
Oh I just meant to run the gateway not local models
Well, clawd can help get alot of this stuff done so you think it's worth it?
Oh yeah, you can run it on a raspberry pi if you just care about the gateway
I'm only entertaining the idea of on-device LLM to help reduce the API costs.
Go it but do you think that pc I showed earlier is worth it for clawdbot to get stuff done like that from end to end?
would you consider selling api access to that 5090 you have? 😄
Yes, but not at rates that would likely be appealing
What you think Henry regarding the pc I mentioned? 🙂
well 150 sounds like a very attractive price point so if your wallet can survive that kind of blast zone in the event you decide to not do AI and i don't know embark on a career in pottery, go for it 😄
i mean the base mac is 4x that, models keep improving, you are comfortable using the latest models via api
also the way ram is going, the ram alone could soon be worth twice what you pay for the whole pc
alternatively, you could try one of those hosting services and just rent capacity now
The way I see it, I can always upgrade.
Thought about that too, but I would prefer it to be near. Just easier vs hosting. Long run it's cheaper.
any details you can share on guardrails?
You could say I "used" to be rich, before I spent all that money on my Homelab back in January of 2024. I was 2 years early to the local llm personal assistant space. It's 7x 4090 with 1TB of RAM. also had to get an electrician to run 3 dedicated circuits for the 3 PSUs.
plug-in GPU
spark, but get this https://www.asus.com/networking-iot-servers/desktop-ai-supercomputer/ultra-small-ai-supercomputers/asus-ascent-gx10/ slightly better build quality, same overall specs
Hi, have a XTX7900 with 24 GB Ram, which is the best model to use 🙂 ?
Sorry for late reply, hope you still read it
This happen to me too, and after a few reboot it fix itself LOL
don’t trust the gateway restart command, it just didn’t work, just reboot the whole computer
I wish I had your setup someday….well, money didn’t disappear, they just transformed to something you like, in your case, they become 4090🤣
Yes that will be fine to run the gateway.
I've heard of people using Pi 5's for OpenClaw, I'm curious to see what anybody else thinks of using such technology for an agentic assistant
hardware seems to not be a blocker until you run llm's locally.
curious what local models people are successfully running. I am struggling to for even mid-tiered cloud models to operate well without significant pain...
I find it really unique how people have access to such a useful tool but struggle to find an adequate use for it
well, to this point, I cheaped out and used minimax m2.1 as my base for setup, struggled for 2 days to get anything useful to work.
gave up and moved to Kimi this morning, velocity probably increased by 2x while errors substantially decreased.
but it may depend on how much you over engineer your specific setup.
yeah, it was definitely a transmutation effect, money plus a ton of my personal time to figure out how to get it all working together, GPU pass-through is no joke
Thinking RPI4 right now as well for the gateway. I have a couple of them setup with NVME drives.
I'm using qwen3:32b hosted by ollama. It replies well, but don't have conversation context at this moment. It works well with gpt5.2 api. But, lost the context after I switch over local model. I'm still trying to debug the setting json.
Yes, I am using qwen3:32b too, but its purely for data processing. In regards to developing or even setup of openclaw I found it.... useless 💀
I'm running on RPI 5 16 gb. It run less 10% of CPU in most cases. While, I just started this afternoon and still work on bridging the openclaw with local model. So, it may cost more cpu when it become more functional.
Yeah. We may need multi agent group to make it use and cost efficient.
Hi Anyone know if it’s possible to switch from cloud server to local hardware?
yeah, this is where I am struggling with setup.
Yes. I just need to change openclaw.json to redirect the agent talk to your local machine. But it's not fully functional from my end yet. I just started. It should work.
Could you not have put it in docker first? And then moved it around like a lunchbox?
Thanks. I’m setting up on a cloud server via emergent. Not sure if that’s the best option but planning on moving to local hardware in the future so checking I won’t lose anything in the transition in the future
in theory, just ask your bot to help backup with instructions on how to clone your instance on the local setup ?
Is the cloudflare moltworker worth the money ? or is there a cheaper alternate ?
Oh man, I pulled the trigger and ordered one of those ASUS Ascent GX10s. Here’s hoping local LLM performance is impressive.
how easy is it to migrate a locally configured bot to a VPS? anyone got a guide in hand that i could read?
i already ahve 2 beefy pcs at home. i currently host gateway on vps and have the 2 pcs as nodes. is there any benefits to me still getting a dedicated mac mini and moving gateway there?
what constitutes as "beefy"?
And if you are just running the gateway, not local models, you are probably fine with whatever you have at home.
well maybe its not beefy anymore its a 5800x3d and 3080 12gb 32gb ram
i just want to know if it speeds up open claws responses or reduces amount of time it hangs
what model are you using?
or intending to use.
perhaps I misunderstood, you are running local models on your two machines and just the gateway on vps. Yeah, sounds like you would get a less delay moving it inhouse. Unless the delay is caused by the cloud models.
Hi! I have a currently unused machine in a datacenter and I’m wondering whether it makes sense to use it as a personal AI station for OpenClaw (or related tooling), instead of renting it out.
Specs: Ryzen 9 7900, RTX 5090, 128 GB RAM, 2 TB SSD + 8×8 TB SAS HDD.
Do you see any scenarios where this setup would be genuinely useful/effective for a personal OpenClaw deployment (e.g., local model hosting, multimodal, voice/STT/TTS, RAG with large storage, multi-agent workflows, etc.)?
If it doesn’t really make sense for OpenClaw, I’ll likely rent it out — either to a corporate customer, researchers, or (as a last option) list it on Vast.ai / Storj (or similar) to see if it can earn anything on decentralized platforms.
An M4 (non pro) Mac Mini with 32GB of RAM can handle all of this from my research. You definitely can do that with a 5090 and ryzen 9 with 128GB of DDR5 (which matters a lot less than the 32GB of VRAM on your 5090, as you only need enough DDR5 to move models into the 5090's VRAM)
Running BeeLink AMD StrixHalo 128 GB APU (CPU, iGPU, NPU) over here. Still working through the bugs to get iGPU inference running properly. Still, CPU performance has been stellar.
Also planning on moving my OpenClaw to an Intel Nuc running ProxMox and then just point OpenClaw to the AI server running LiteLLM as a local orchestration interface. Hopefully then I will get a good combination of speed and nuanced depth required for doing automated tasks. Hopefully then I'll be totally offline with good performance.
I had a 4090 and 3090 that were basically collecting dust, so I put them in a server to run openclaw locally, but not having much luck so far with the local models. Currently using qwen-2.5 instruct 32b with 100k context, but it’s quite chatty and gets confused quite quickly. Has anyone found a «small» local model that works?
Only for smaller, agent specific tasks. For overall larger development, no.
you might be able to run a the reaped version of glm-4.7-flash
Is 4.7 flash any good for agentic tool use though?
I’ll download and give a try, looks like I can even try a q5 or q6 version of the regular 4.7 flash
none of the smaller models are really "good" at coding, you can get by with models like glm-4.7-flash, gpt-oss-20b and qwen3 coder 30b, but don't expect them to compete with models requiring 20x the vram to run
where they are great is cost, since you can just keep iterating on things for price of eletricity
use ollama.cpp or whatever, there is a bug with normal ollama using flash
my finding has been that they cant deal with large enough context, for building a functional assistant they keep breaking down. But I would be happily corrected if someone could show me the way...
Yea that’s been my experience so far as well. Testing glm 4.7 flash now and first impression is decent, definitely better than qwen 2.5 coder
using the free 7 days of kimi and milking it like an idiot has been a gamechanger to develop a working foundation.
thank you, I will try but the disconnect for no reason can be for multiple other causes, restart won't help, I'll do the update to the latest version OpenClaw 2026.2.1 and hope for the best
Is anyone using orgo for their vm?
Yea I might try that, I’m using codex oauth and gpt 5.2 now for the same purpose and having a blast so far
how'd you unlock the 7 day free trial?
Is the point of ProxMox on the Nuc to sandbox OC? Is the Nuc going to run local models?
The Nuc will run OpenClaw and OpenClaw will send requests to the AI server to analyze requests. Means more memory for the AI server and if OpenClaw messes up the Nuc, I can always restore it from a backup.
What's the hardware for the AI server and which model(s) are you thinking to run?
So far ChatGPT 5.2-Codex has been the best. I will need to evaluate a 7B , 30b and 70b parameter model to see which I prefer. I use LIteLLM for Orchestration.
I'm unfamiliar with litellm, but sounds like it let's you run both local and frontier models. Is there litellm plugin for OC to pick between them?
I run frontier models for now while I configure the server. StrixHalo platform is very new and driver support is tricky for GPU accelerated inference
Got it. Yeah, the frontiers will be reliable and can just tack on more Max subscriptions if you really need it.
I got the Github Copilot Plus back when they first started it before they had tokens. Now I have no limit on tokens. Or at least I have not been able to find one.
I've got glm-4.7-flash running on a RX9070XT, 5800x3d, and 32gb ddr4 ram. I've got a rx6600 laying around. Would the most sensible upgrade path to be to upgrade motherboard/ram (to 64gb ddr5) and slam the rx6600 in for extra vram? That's like ~$1k
Yes, that hardware should do well with some decent sized local models. And setting up ollama is far easier than setting up OpenClaw. 🙂
which server do you use for openclaw, hetzner? or any good easy to setup reliable options for EU?
is there any cons setting up the clawdbot on rpi5? im planning to deploy it in docker
CX23, 2 VCPU, 4 GB RAM
for local providers, is the base line that the openai-responses api is better to use than completions? I've seen people prefer openai-responses but the docs exclusively show completions for custom providers
is there a cheaper solution to have anthropic or gpt connected ? so expensive
I'm just wondering if it's compatible or not. Maybe I can try to make a setup were I instruct it to use local models voor medium level tasks and for bigger projects I can maybe get an API for antrophic. The thing is that I can't just find that much info about how compatible local models are for more abstracts stuff such as academical research, data analysis and mathematical formulation. (Anybody got some info regarding this topic?)
Subscribe to the Claude Pro plan for $20/month and retrieve your API key from the Anthropic Console. You'll get a limited amount of tokens but it will be enough for simple tasks.
If I understand your meaning, it is compatible. There are tradeoffs with all of this. I am interested in testing local LLM performance for basic tasks as a way to save on API costs. Will it work? Almost certainly. Will it work well? I am cautiously optimistic, but prepared for disappointment. You have a powerful Mac so you should get better local LLM performance than most. Although, be aware of the risks of running local LLMs. Doing so doesn't solve all problems and may introduce new ones.
what local models are recommended i can run gpt oss 120b at 20 tokens per second at 48k context
Specs?
i have a 4080, 9950x3d, and 96GBs of 5600mhz ram the prompt processing speed is fine in lmstudio at 48k context the prompt processing is really slow on the api but i still get 20 tokens per second decode
im gonna try using vllm, llama.cpp directly, or sgland to see if the speeds are better
4080 has 16GB of VRAM, right? How does that a 120B model, i thought the theoretical maximum for 16GB of VRAM is 30B. must be hitting the system ram pretty hard right?
i can offload the extra to ram im limited by ram speed it works because its an MoE model it wouldnt if it were a dense model
i can run minimax m2.1 at Q3 that gets 10t/s if i use q4 K cache but prompt processing is horrendous
alright.
I should try system ram offload and run the bigger qwen MoE
I have a nearly identical setup that I’m willing to use as a standalone headless AI box and see what I can offload locally vs API.
I have local llama running decently on it, but I think I’ll still need to offload tasks to the API.
Any opinions if VPS or raspberry pi 5 with 16gb ram is better to start with? Understanding it’ll all be API and no local llm.
I want to just get going and not let perfect get in the way of good.
and here I am running QWEN2.5 7B and loving it, can do almost everything I want it to do. But not using it to vive code though.
Ollama 3b on raspberry pi 8GB RAM or don't even try?
might work bro, but I wont do local if I use raspi
What are you doing that a 7b is capable of running your main thread?
tasks that helps me like status of my youtube channels, research on a topic, and other stuff.. it has access to web search so it's capable enough to know things
can you share a bit what it does for you? like whats your workflow?
just talk to your bot, have a conversation with it. tell it what you want it to do and how the bot will do it. it will create the workflow for you.
do not overthink the setup, think of it as a human, a human that don't complain. hehehe
I don’t think they were asking for help, they just wanted to know what you are getting out of it
i'm getting the info i want and task completed
Should I run this on a mac mini m1 or nvidia jetson orin nano or raspberry pi 4 4gb?
How do you get openclaw to recognize ollama on raspberry pi?
Basically an interface to your model, its just input -> output (?)
mac mini m4 will work? Never used mac before but wondering is it ok to buy second-handed one for this bot
You need to do more research on what type of hardware you need for your usecase... this is like asking how long a string is.
the bot/gateway runs in the cloud on barely nothing, you could run it on a raspberry pi, no need for an expensive machine.
you want to run local models, it starts to get expensive but you need to look more at the memory than anything.
bro check twitter. Some people have managed to install it on their 10 year old android phones
I got it working but it's as slow as a snail on sedatives with Llam3.2:3B running local
It's working "acceptably" (not gonna say "well") using Qwen 2.5 4B as my 2nd fallback. End of the day, unless you've got 128gb of RAM available, you should be using clowd models as the primary and locals as fallbacks & as "go-fers", basically the busy-work that quality doesn't change (ie, fetching heartbeats every 60 min)
I’m running openclaw via emergent. On the gateway dashboard branding and name still reads Clawdbot. Does everyone else have this or should I be concerned?
Oooh which Ugreen NAS are you using? I have the cheap two bay version and want to run Clawdbot in the future like you do. 🫠
Hello, local model recommendation?
Mini PC specs: (Literally has nothing in it atm)
Ryzen 7 8745hs with 780M igpu
128GB DDR5 5600mhz
2tb nvme
I currently use qwen 3 8b q4_k_M for my RAG discord bot, however after playing around with openclaw with my main pc, I realize building an agent with it, and replacing my RAG discord bot with this is way better.
Use case: Support agent, usually get 1-5 questions per hour, 240+ high quality knowledgebase (250k tokens), needs to be fast, and accurate.
I currently have:
Google PRO - could use antigravity models, or free tier of google ai studio?
Openrouter - any free models, with generous limits
Local models - I find my current qwen3 8b setup is a bit slow (GPU offloading maxed out with Vulkan)
Docker, and wsl2, I am also able to create a proxmox vm for openclawd only if needed but I think docker isolation is enough
If i had to guess, you'd just open the endpoint in llama.cpp/openllama/llm studios, and use openai api option in openclaw and set localhost and model name
7B model for fast but direct actions, 30B parameter model for planning, and reasoning and 70B model for research is what I'm planning to use my 128 GB of ram for.
I need it to respond fast, as it will mainly be used as a support agent, I already have 240+ knowledgebase for all topics, would just need it to do semantic search, fetch relevant docs, formulate answer based on that, and reply to user.
Discord has about 1500 members, 1-5 questions per hour
I use llama.cpp for concurrency too so that would make things even slower 
i do this yes
i see nothing arrive in llm studios
I use the same as dev env, and have openclaw src from github on it, and experimenting/fixin bugs. Sometimes the VPS load just goes up so much I need to give it a shutdown/restart.
Oh interesting, any idea why that might happen?
I noticed it happens when I do heavier Claude Code on the src base, and examining, etc, though no real ops tools to check reason yet. I might do this with Claude Code as well... I am not preparigng for a demo of opaenclaw voice-call feature and doing some heavy bug fixing and feat implementation at the moment...
Is that 20 bucks not better spent on a Google AI Pro account where you have AND Opus 4.5, Sonnet (Antigravity) AND Gemini model tokens to use? Anybody compared both?
Do you get an API key with the Google AI Pro plan?
When Antigravity is installed you don't need extra API access.
anyone tried hosting on oracle free tier or raspberry pi? from my understanding if i dont use local models there isnt really a need for good hardware
It’s better than OpenAI as OpenAI is a monthly bucket when you run out you’re out. Google pro gives you a set rate with a cool down. Then it refills
I am also using qwen and have noticed it has memory loss. Currently working through that issue
Your understanding is correct.
Have you enabled the memory hack prompt?
No what’s that
Prompt:
Enable memory flush before compaction and session memory search in my Clawdbot config. Set compaction.memoryFlush.enabled to true and set memorySearch.experimental.sessionMemory to true with sources including both memory and sessions. Apply the config changes.
What does this do exactly?
What I’ve been working on is it’s
Brain
Heartbeat
Personality
Coding
.
Current Setup:
• Brain: Using ollama/qwen2.5 as the primary model for my thinking.
• Heartbeat: Currently, heartbeats check periodically (every 30 minutes) but can be configured via HEARTBEAT.md.
• Personality/Coding: Configured based on details in SOUL.md.
Speed Improvements:
To improve local LLM speeds, we can tune some settings and ensure the model is efficiently utilized.
-
Use of Local Models: • Continue to prefer using local models for quick lookups and draft work.
-
Resource Allocation:
Ensure that resources (CPU/GPU) are optimized for running the local models efficiently. This includes: • Monitoring system resource usage (top, htop).
• Ensuring no other high-resource tasks are running concurrently with critical LLM sessions. -
Model Configurations: • We can fine-tune model settings if necessary, but typically, default configurations are optimized enough.
-
Preloading Models:
Preload models in memory (if not already) to reduce initial load times once they're invoked.
My mac mini is arriving thursday. I'm planning on running openclaw on that via antigravity, using my google pro plan with oauth
i have some gemini tokens to burn
I have worked with both models in antigravity, but not fully automated
I've ran ralph loops with orchestration tough
gemini is not very good at coding along with a human atm
it will lie and cheat test results
I've read it works much better if you just give it a spec sheet.md
and why not just get a rpi 5? is the mac mini hype for llama?
doubt many people are running models on the macs
Yup. People running a local model.
wouldn't that be very slow / hard unless you have tons of ram? 😄
It would still be slow even with tons of ram, since it doesn't have a dedicated gpu
I got a 24 gb ram 512 storage one for 1020 euro, and that was a good deal
Case in point, my ryzen 7 8745hs with 780M igpu, with 128gb ddr5 5600mhz ram only has an okay speed for qwen3 8b q4km
Gonna run gemini models on it via antigravity
yeh that's not
what ppl are looking for
I don't want to be consuming that kind of electricity 24/7 😄
Mini pcs, and mac minis are very very low consumption until you add a gpu
yeh ofc
i have a 3070 in my desktop tower. It weighs more than a laptop
so ofc it'll consume lots
I don't want to run local model, I want to have it work with gemini model
But yeah, if you're just looking to run openclaw and use cloud api, no need for a powerful machine. I think 2vcpu and 4gb ram would be enough
About $5 monthly if you get a vps
I also just needed a second computer, and something that is Mac in case I'm making apps for iOS
you need the mac HW for that
yea irrc oracle gives 24gb ram and 2vcpu for free, thats why i was asking
so that + one of my subscriptions should be good to go
4vcpu, 24gb ram, 200gb storage
if the AI skynet apocalypse is coming, I feel I should at least be part of it.
what version of qwen 2.5? coder only replies to me in json..
I thought the docs show Oauth, so you can use your subscription, which is subsidized tokens
anthropic doesn't want you to use oauth for this, and has banned ppl for it.
you could attach it to antigravity
anthropic has a deal with google
you can use claude opus agents in antigravity with oauth
maybe it works maybe it doesn't 😄
It does work, it's my current setup!
Right there with you, I've ordered a Mac Mini so I can help support Skynet when it goes down. Doing my part.
Macs all have a “dedicated gpu” in the sense you’re thinking, and they have unified memory.
They run fast enough for the main thread if you’ve got a Studio
But you’re not going to get very far with a Mini, even at 32G that’s not really enough
I will not argue semantics with you, in my opinion my mini pc has an igpu, 780M.
Mac minis (which is the model being discussed) has an igpu with possibly higher bandwith speeds, still not a dedicated gpu.
What’s the method for running stuff, because I was going to buy a crappy server pc with a p100 gpu, and run a model locally, but is there a better way
This is the way.
mine is arriving thursday. 1020 euro on amazon.de for 24 gb with 512 storage
bargain
in the meantime
i'll be smoking weed, drinking belgian beers & playing vampire survivors
no that's a good way. Get a small cheap PC with enough CPU and ram to run google antigravity
then get a google AI plan
The free version?
but just set it up free to try it out
I will thanks again, do you know how the free quotas are?
Ordered the same config - 24GB bros! 👊
I've got an RPI4 with SSD that will be the gateway. I'm currently deciding between running inside docker on the RPI4 or just native, so it can manage the RPI4 for me as well.
using a Raspberry Pi CM5 + 2× NVIDIA Spark DGX cluster, and I’m currently testing OSS120 plus four small domain‑specific models with custom ‘intelligent routing’ + embeddings model. Quite happy so far, but want MiniMax M2.1 AWQ to work for at least two users.
It depends on your use case, but if you’re fine with a Linux/Docker setup, it will also run well on a Pi 4 with cloud models.
I'm gonna call my agent "Henry"
I am deeply honored 🙏
i still need a screen, mouse and keyboard tbh
there are pretty small portable screens available, perhaps an idea?
meh better get a cheap 4K screen
I should just jam my agent into a vm on my server cluster instead of depending on one on a Mac, but I want it to be able to look at my iCloud stuff …
I guess the gateway could go on a Linux VM and then the Mac could just run a node?
@tranquil hazel do you know how much computer you get for free via antigravity? Also have you tried ai studio
you get tokens via the google plan
K
Did you tie the custom routing into the agent loop?
Which vps would u recommend?
Actually if you're just running openclaw, just get an oracle free tier vps. All you need is a credit card they can charge $2 and $102 from (instantly returned) for verification, and you get a 4vcpu, 24gb ram, 200gb storage for free forever*
First charge is when you create an account, second one is when you upgrade to PAYG
As long as you're within limits, you will never get charged
I have been toying with a single spark for the last week.
Anyone is running Kimi K2.5 for inference locally ? What is your hardware setup in this case ?
has anyone made a side companion on there desk of a text to speech model or speech to text?
That would cost... around $400k
Yeah it would be too slow
guess I'll just kms lol
im guessing this is the chat for professionals, i need help
can i dm someone who actually knows what there doing and has actually made this work and can explain to me simple questions that i know the answer to but need reasurance
is it worth buying 2x rtx 3090 for local openclaw setup?
anyone using local models like glm4.7 or kimi2.5?
dont use local models they are not smart enough
even haiku and minimax give bad answers sometimes
Let's say you run a GLM 4v7 Flash on a 10 year old i5-6500T low tdp CPU with 32G DDR4 at 9k/65k ctx what is the round trip time for a "Hi" telegram message ?
20+ tps too slow?
Just use a model like Claude
Trying to set openclaw up locally with ollama what model should i use with my hardware? 7900 xtx 24gb and cpu is ryzen 9 9950x windows with ubuntu honestly could use a few tips setting up as well had it running a couple times but messed up
any model that is under 24gb pretty much
glm-4.7-flash at like 4 bit might fit
Dang she hungry
is there a certain one that is unlimited since im local with ollama? ill admit im a rookie when it comes to this appreciate the reply
All of them, since you're using your gpu to run it 😄
yeah i was like what does unlimited mean
well i had kimi 2.5 set up and it like stopped working said my ollama was limited?
you definitely weren't running k2.5 with ollama
You must have been using cloud then
you would, uh, know
yes it was cloud
Gemma3-27b would be best
And i'd recommend switching over to llm studios
will also look into that do they work together better on llm studios
I'm playing with llama.cpp - it "works"
this is what i use, for vulkan
llm studios and ollama are just more user friendly
my cluster of 3 thin clients with APUs is the slowest backend possible, but yeah, vulkan + rpc runs OSS 120B with 1-2t/s tg on ddr4 so-dimms
llamacpp works well, give it a lot of memory for context cache, can't use qwen-next-coder tho
at least not the unsloth quants, haven't tried anything else because in a shocking twist i don't actually have the vram to load an unquantized 80b model
my current setup for the bot is 5 provider/models - each is a llama.cpp instance on a different host - seems to work after the latest update with the "set default" agent models
What tool is that?
Antigravity tools on github
Hey fam I have another question I am entering my model manually in openclaw onboard is it not google/gemma-3-27b when I proceed to model check it says not found
Llm studio local yes
you'd need to create a custom provider + list of agents
"models": {
"providers": {
"atbp-proxy": {
"baseUrl": "http://100.127.38.35:29123/v1",
"apiKey": "sk-##################",
"api": "openai-completions",
"models": [
google is so generous, i paid $0 for 1 year for all this
so im connected locally and everything with gemma-3-27b-it and gateway is green but i get no responses
are you using your local endpoint, port, api key, and right model name?
it might run in timeout if you run the 10k tokens for the first time - any cpu/gpu usage ?
◇ Config handling
│ Update values
│
◇ What do you want to set up?
│ Local gateway (this machine)
│
◇ Workspace directory
│ /home/pul/.openclaw/workspace
│
◇ Model/auth provider
│ Skip for now
│
◇ Filter models by provider
│ All providers
│
◇ Default model
│ Enter model manually
│
◇ Default model
│ lmstudio-community/gemma-3-27b-it
│
◇ Model check ─────────────────────────────────────────────────────────────────────────────╮
│ │
│ Model not found: lmstudio-community/gemma-3-27b-it. Update agents.defaults.model or run │
│ /models list. │
│ No auth configured for provider "lmstudio-community". The agent may fail until │
│ credentials are added. │
│ │
├─────────────────────────────────────────────────────
It's not loaded at all
you need to edit "C:\Users(yourpcusername).openclaw\openclaw.json"
well not the workspace, go back 1 folder up and you will see openclaw.json
I spawned 3 new agents - with 3 empty GPT-OSS 20B models - on 3 nodes with same settings, only 1/3 wants to start reading on its own. (?)
fixed the directory not sure my model is 100% right keeps making me select amazon-bedrock/google.gemma-3-27b-it
you can probably find this in llm studio where you loaded the model in
it should say what name to use
and replace amazon-bedrock/ with whatever provider you created
lm stuidos tells me this "lmstudio-community/gemma-3-27b-it " is the exaxt model
replace atbp-proxy with ur provider name
I have 2x 3090-24gb vram + 128gb ram, have someone succesfully fit a 70b model on 2x24gb vram + 128ram?
what quant?
Yeah I figured out it’s impossible so it’s better to use it via api and it’s still la 2-4x cheaper than opus
only use local/cheap models if you want a broken system
i wouldn't touch this with less than 120b models, even 480b are going to fail constantly
I was considering about trying 5 or even 6-bit with ram offload but now that I think about it again it sounds not really realistic. 4-bit should be possible on paper - is anyone running that on a similar setup?
hi, i'm thinking of trying openclaw, got a GMKtec K8 miniPC with 8845HS/780M/96GB with 16GB for igpu + 4TB SSD. Based on what i've been reading i should run a hybrid model with small tasks run locally, and then a paid API? The MiniPC is my main desktop and i would not like it to slow down.. what should i be looking at?
i'm not versed in AI, so much information i'm too old and slow
running fedora 43 on the mini.
so much to learn.. i bet my question is dumb.. also why use the $5 VPS with paid API access to LLM's? if it takes almost no resources why not just run that on desktop?
anyways sorry for dumb questions, i try to read more
I am having issues with signal-cli on raspberry pi 5. Anyone else?
so abusing
guess what i'm asking is, does anyone run local models on amd ryzen 780M + 96GB+ ram? or should i forget it?
i'm confused why ppl are buying mac mini's for this, i assume it's to run everything locally?
yet guides all say use API to LLM models
8745hs or 8845hs? Do you have a TPU
You can run 8b models comfortably, and 72b models will work BUT very slow
Like nowhere near conversational for big models
thanks. do you know would the AI in the background constantly use processing power and so spin up the fan? i mean is the best for this system to use hybrid, both local and remote?
No it will remain super low power usage until prompted
ok.. i will play around, but for best i guess i should pay for remote 72b model?
It would be best to use a local model for some simple inquiries, or heartbeat
Nah, 72b is not as smart as some of the dirt cheap cloud models
well i don't even know what 72b means :) just thinking best to pay for some cloud model for the heavy lifting
It's like a general measure of smartness, 72billion parameters it was trained on
ok.. i'm old and this ai thing is evolving way too fast.. just wanna play with openclaw, see if i can change the way i use desktop
For local models, always offload the entire thing to gpu or it won't be fun
thanks.. will slowly test in VM first, then see if i can adapt to desktop.. it would be cool to have
Use something easy like llm studio to load the local model
I am successfully using signal-cli, but on an Ubuntu VM, not a pi.
If not talking about price, why Mac Mini? Are some tools only available on MacOS for the bot?
Thoughts on using a desktop vs. server install of Ubuntu on a VM?
I had to build a couple of dependencies from source but it works great now
ok not just me then. ty @bitter scroll
I implemented a custom provider which detect all native domains and Skills which attach flags (experimental). The system automatically identifies ~17 different domains with only 2–4 ms of additional overhead. I’ve worked extensively with fleets of SLMs on edge devices over the past years and am TRYING merging these learnings into the most practical openclaw version, combining local and cloud models or whatever is available.
What model/models are you running? vllm?
Hey ya’ll, I’ve been working on something for a while. No power. No internet.
I thought that's what the frontier chat apps were made for... to give you assurance.
What blackmagicsourcery is this
Using antigravity oauth and proxying it as an openai api that load balances usage
hi all, wil lthis run fine on a pi 3?
is anyone running clawd against an llm on strix halo or a dgx spark? I'd like to know what kind of performance they're getting with larger context windows
I guess the 1GB RAM will be the issue.
2x spark, what context windows are you looking at?
64k or 128k
its unusable imo on strix, so im really considering buying new hardware. Spark has my interest
or a 64gig mac mini m4 pro
Gotcha, Ive been contemplating just getting a mac mini but not sure. is the m1/2/3 chips a necessity or is a i7 gpu fine?
kind of the two im bouncing around in my head, but i dont watnt o get another pp bottlencker like Strix
which generally works great for normal queries, but dies a horrible speed death with agentic loops
unless i use a tiny context window, and thats basically useless for anything ubt the most basic tasks
On a single Spark, I tested OSS120 about a week ago and achieved ~35 tokens/s with a 64k context window, dropping to ~14 tokens/s with a 40k input. [Runnig currently ](#hardware message)
im not super worried about token generation as long as its double digits. How was prompt ingestion speeds at near max context size
for me a 64k buffer on strix with gpt-OSS-120b can take minutes
to first token
thats not a good experience especially if the cache gets invalidated
i've heard the spark has incredible pp speeds, but its hard for me to find relevant users using it for this purpose
and i want ot compare it to an M4 pro at 64 or 128gigs, as thats the same price point basically
well, a lot cheaper up to near the same pp
to run 24/7
also, fomo
if you get a mac mini, you can send iMessages to your virtual waifu girlfriends.
im also confused by it, when most of the people who talk about it are not running inferrence locally. Might as well use an RPI if you're going to use a cloud provider for your llm
2013 mac pro for the win
lol
the ol brick indiana jones trick
haven't seen that since the old days of best buy graphics cards
apple hides their box inside another box
pretty genius
anyway still need a monitor for it so i'm setting it up tomorrow
damn it, im really not sure what to do
What are the benefits of a Mac mini over a vps?
And do you guys know how capable are the local models that can run on a 16gb Mac mini m4 / m2 pro? To reduce api costs
depends on your use case honestly
for Systems Engineering assistance, i wouldn't trust anything smaller than glm 4.7 flash. if you're just doing general life style personal assistant tasks with it, gpt oss 20b will do fine
if you're not using local inference, the only benefit of the mac over the vps woudl be integration with imessage for communicating wiht the bot
Thanks 🙏
I only got it because I wanted to run this, and wanted a second computer, and didn't own a mac system yet. If I want to make iOS apps then I need mac hardware.
I've already made a few things with antigravity the past months. But not for iOS
Which one did you get? And are you running any llm locally?
I literally just got it
not planning to run local LLM
Oh alright
gonna use it attached to antigravity
running models from there with google AI pro plan for starter
Is it gonna bill you for api usage for openclaw + google ai pro plan?
No
That’s why I use antigravity
google and openAI allow you to use oauth, not api
anthropic doesn't like that
but anthropic has a deal with google
you can use anthropic tokens on google antigravity
Tysm 🙏
google also does some stupid stuff for students with free accounts to get them into the system. Also some cheaper family accounts. So you can set up a system with multiple accounts
using google AI pro (family) plans
I only tried half of it
if I learn more, I'll share here
Blegh. Deciding if I should get a spark or not to host this locally.
Is there any other cost-effective options? Maybe to save a grand or two?
im in the same boat joey
i can tell you as of right now to avoid strix halo
maybe the npu enablement will significantly improve pp, but right now its useless for agentic work with large context
spark looks like a better option, but at the point of spending between 3 and 4k im hard pressed not to just buy a m4 pro or m4 max
honestly just hard to get a clear head to head of functional performance between the two thats not your typical fluff token counting review
write a 500 word story is useless as a comparator
seems like all the ai review slop channels just focus on T/s and writing f**king stories all day.
I'm thinking I might use my 3090 and 3070ti with a sort of round robin with quite specialized models.
IBM's granite small is pretty awesome with more technical aspects
So I have my subagents run granite right now
You can do this for any google pro account, you can 6x your limits in antigravity because you'd have 6 pro accounts with different usage buckets.
yow, tell me more.
I already have google pro acc
Just google
How to add family members to google one, invite 5 different emails, accept it, then you can use all of them when you run out of usage in 1 account
I currently use sonnet for my support agent 
Here you can see my setup
I wanna go degen with this
I'm used to antigravity
already did kinda stupid things with it
orchestration with parallel ralph loops
but this openclaw thing running 24/7
That's just an extra tool so you can switch accounts in 1 click on antigravity when you're out of usage. Also you can set up a proxy so you can use it as openai api with load balancing (massive plus)
sounds really degen, I have to try it
Why you use mac mini for open claw?
the PC I'm using now is a gamer desktop with a 1000 watt psu
mac mini is 15 watt
Oh so you wanna save energy
I have a mini pc, beelink ser8 8745hs for the same reason, 128gb ddr5 5600mhz ram.
I use it to play around with local models
I would be happy to have good pc. I need it to compile LLVM
electricity is expensive in the EU
Cheap in Switzerland compared to Germany
I dont even know lol
I have photovoltaic
I also don't own a car anymore
just a 45km/h assisted bicycle
no car either. sharing is caring LOL
I'm the real deal
thats not belgium is it?
Are there recommended specs for the gateway? I'm wondering if I can run a docker container on a Synology NAS with very light specs. Most of the heavy lifting should be on the nodes & the model provider anyways right?
hey guys whats the best VPS thats also low cost, i have open claw running on a 1cpu 1gb ram vps, ive tricked it with a page swap, but having next to no ram is not ideal for automations, and i think thats where my problems are coming from.
will my free server get shutdown for more paid users? when i tried to do this my region didnt have any resources already.
I was thinking something similar with my NAS
for api, is everyone just pay as you go? or is there some sub i can pay for to get access to more models? i would like to run as cost effective as possible, been using 2.5 flash lite pay as you go, but its kinda dumb for larger tasks, whats everyone using, whats the average cost?
You need to upgrade to a Pay as you go account, and then there will be resources available. Note that this will do a charge check for around $100, but instantly returned.
Now you can create the always free vps with 4vcpu, 24gb ram, 200gb ssd for free! It is pay as you go, but if you stay within limits (4vcpu, 24gb ram, 200gb) you will never get charged.
Get google pro subscription, use antigravity oauth
im doing that its just taking forever to actually upgrade. lol.