#users-helping-users
Only Opus?
You don’t use any other models
Just concerned because they said Claude doesn’t work with OpenClaw anymore
How would you guys compare GLM 5.1 and GPT 5.4 for OpenClaw? I’m in marketing field
I found GPT very annoying in replies, even though I have a somewhat short SOUL.md and other .md files, with very strict boundaries
The only problem I would see in GLM 5.1 is that he cannot see pictures
I’m happy with Gemma4:26b doing web search summaries and exec for api calls with openclaw. Taking minutes sounds like it wasn’t loaded and you had to wait. You can change how long ollama keeps a model warm when it’s not in use.
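A minimal sketch of the keep-warm tip, assuming a default local Ollama at http://localhost:11434 and the model tag mentioned above. Ollama's generate endpoint accepts a `keep_alive` value, and the server also reads an `OLLAMA_KEEP_ALIVE` environment variable. The helper name `build_warmup_request` is made up for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local Ollama address


def build_warmup_request(model: str, keep_alive: str = "30m") -> urllib.request.Request:
    """Build a request that loads `model` and keeps it in memory for `keep_alive`."""
    # A generate call without a prompt only loads the model;
    # keep_alive controls how long it stays loaded afterwards.
    payload = {"model": model, "keep_alive": keep_alive}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_warmup_request("gemma4:26b", keep_alive="1h")
# To actually send it (needs a running Ollama server):
# urllib.request.urlopen(req)
```

Sending this once after boot, or setting `OLLAMA_KEEP_ALIVE` on the server, should avoid the multi-minute wait on the first message if the delay really was model loading.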
You could mix.
Make GPT process the images and GLM 5.1 do the rest
Got no cash for that unfortunately, I would’ve
(The aforementioned Gemma4:26b correctly described an image for me in Open WebUI … I haven’t tried it in OpenClaw.) https://flic.kr/p/2s7acHm
Thank you, take note that i also need decently large quotas, i burned my GPT quota in like 3-4 days, i got my workflows as optimised as i could get
Thanks a lot, basically GLM 5.1 would be a better option for my use case, I need big quotas, I can trade a bit of quality for a more optimised workflow without stopping mid-task cause of quota
Yep
Apparently Claude desktop has so many bugs. I wonder if they have the same problem as me, that LLMs write a lot of code but aren't that good at QAing and bug finding it
Btw if you get good results doing web app or other QA with agents pls share how
This probably belongs elsewhere, maybe #off-topic-and-ai ? Look into ‘adversarial review’.
I am trying to set google/gemma-4-31b-it:free via openrouter. But it doesn't work. If I remove the suffix free it works, but also uses the credits. What am I missing?
What part of it is not working? I believe it's not currently displaying the think tokens in the dashboard.
I think I hit API rate limit without getting a single reply in Telegram. Lol not even reply to hi in new session.
Is rate limiting for free models that bad?
Anyways, I'm checking logs now to see if it did send more than 20 req per min just to reply to my Hi 😄
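One way to check the logs for this, sketched under assumptions: if you can pull request timestamps (in seconds) out of the logs, a sliding window shows the busiest minute. The 20 requests per minute figure is just the number quoted in the message, and the log parsing itself is left out because the log format is not shown here.

```python
from collections import deque
from typing import Iterable


def max_requests_in_window(timestamps: Iterable[float], window_seconds: float = 60.0) -> int:
    """Return the largest number of requests seen in any window of `window_seconds`."""
    window: deque = deque()
    best = 0
    for t in sorted(timestamps):
        window.append(t)
        # Drop requests that are a full window (or more) older than the current one.
        while t - window[0] >= window_seconds:
            window.popleft()
        best = max(best, len(window))
    return best


# Hypothetical timestamps in seconds since the session started.
sample = [0, 1, 2, 59, 60, 61]
busiest = max_requests_in_window(sample)
```

If the busiest minute is above the provider's limit, the single "Hi" is probably triggering a burst of background requests rather than one call.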
I'd refer to the documentation of the provider, but they are free for a reason.
Yeah I get it. My problem is OpenAI-codex got rate limited after I turned on Dreaming. I think it was requesting every 3 minutes because of a bug. Cooldown period of 4 days.
It was all good before I enabled dreaming.
Was getting by with most of the basic stuff with it.
Hello, anyone having an issue with connecting Openai-5.3-codex to openclaw? Keep getting some cloudflare or DNS issue
My OpenClaw isn’t connecting to codex, it’s not replying.
Hi, I just found this forum, so I'm really looking forward to following you and learning more stuff. 🤩
I have a specific issue that I've been struggling with for the last nights. 😱
In short, I want to set up a fallback Ollama LLM on a second local computer, but there seems to be a bug in OpenClaw where it doesn't use the BaseURL for the fallback Ollama correctly?
I have set up a Discord chatbot for the family, but sometimes I need to turn off the main LLM on my gaming computer, which my OpenClaw uses, to play games. I want OpenClaw to have a fallback LLM from my main gaming computer to an LLM on my old gaming computer, but that doesn't seem to be possible. Everything is configured correctly in the openclaw.json file, but the baseUrl for the fallback model is not being followed by OpenClaw when it tries to do a failover. It still uses the main computer Ollama IP address, even though the baseURL for the failover LLM is set to be another IP address.
I hope the above makes sense. If it doesn't, I'll be happy to try to explain more.
But I would really appreciate if someone could help solve this challenge for me.
🦞
Hello, what difference do you see when you use one model instead of another? If you use GPT5-mini, GPT5, or Claude Sonnet 4.6, do you see a real difference in the results? Or is it just a difference in speed?
It is not just speed. The models have different levels of ‘capability’ as far as following instructions and tool calling. It really depends on what you’re trying to do, you’ll have to compare and choose.
Ok thanks ! I'll see
ran into something similar setting up a multi-machine ollama setup. the baseUrl in the fallback config gets overridden by the default gateway endpoint. workaround: set the fallback models provider config with a full endpoint override instead of just baseUrl.
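While debugging this, it can help to confirm from outside OpenClaw which Ollama machine is actually reachable. The sketch below is not how OpenClaw does failover; it only probes each base URL in order using Ollama's `GET /api/tags` endpoint (which lists local models) and returns the first one that answers. The IP addresses in the comment are hypothetical.

```python
import json
import urllib.request
from typing import Callable, Iterable, Optional


def ollama_is_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers with JSON on base_url."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            json.load(resp)  # a healthy server returns a JSON list of models
            return resp.status == 200
    except (OSError, ValueError):
        # URLError/HTTPError are OSError subclasses; bad JSON raises ValueError.
        return False


def pick_endpoint(
    base_urls: Iterable[str],
    probe: Callable[[str], bool] = ollama_is_up,
) -> Optional[str]:
    """Return the first base URL whose probe succeeds, or None if all fail."""
    for url in base_urls:
        if probe(url):
            return url
    return None


# Example (hypothetical addresses, primary first, fallback second):
# pick_endpoint(["http://192.168.1.10:11434", "http://192.168.1.20:11434"])
```

If this picks the fallback machine while the OpenClaw logs still show the primary IP, that points at the config being ignored rather than at the network.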
from what i've seen the biggest gap is in tool-use reliability more than raw speed. sonnet 4.6 handles multi-step tool chains really well and rarely drops context mid-task. gpt5-mini is fine for simple single-tool calls but starts fumbling when you chain 3+ tools. for anything with branching logic or multi-agent coordination, the bigger models pay for themselves in fewer retries.
Coming back with some feedback on GLM 5.1, this shi is aweeee, idk yet if its better than GPT 5.4 yet, but as of now, he seems very good
I've been having an issue for a long time, it hasn't really stopped my progress, but it slows it down for sure. It seems that Codex or Claude Code will not ping my agent when it's done to be able to have the agent either take it to the next step automatically or ping me to let me know it's done. It's been this way since I installed it almost 2 months ago, never could get it to work. This goes for Telegram and Discord.
I've told it to do it, it says it will, it says it explicitly tells codex to let it know when it's done, but nothing seems to work. Any ideas would be awesome
Hello Is anyone using two local Ollama servers ? I noticed that my agents are configured to use models from different servers, but they all end up using the same model which is the one from primary in default agent.
is there any special setting to make an LLM use the GPU instead of RAM. I've got a 9070XT but it just seems like my normal RAM fills up in task manager just saying Hello
It is pretty good
Really need help here.
I ran a fresh install of OpenClaw but the Gateway Pairing didn't trigger during the onboarding, so now I'm in a catch22 with the telegram loop, CLI needs pairing to talk to gateway but Pairing requires CLI to talk to gateway, and I can't access the dashboard to do it.
Been stuck since yesterday, even Chat couldn't do anything about it, it just kept taking me round and round.
did you try openclaw devices list to check yet ?
I'm having issues trying to get OpenClaw to use the openai-codex/gpt-5.4 model
I authorized it during the Onboard process, have an OpenAI plus account and have plenty of Codex usage available. However, any time I try and use it I get error messages like this:
10:35:23 [agent/embedded] embedded run agent end: runId=8a6810a1-6093-4a86-965e-ce0efd105655 isError=true model=gpt-5.4 provider=openai-codex error=⚠️ API rate limit reached. Please try again later. rawError=<html> <head> <meta name="viewport" content="width=device-width, initial-scale=1" /> <style global>body{font-family:Arial,Helvetica,sans-serif}.container{align-items:center;display:flex;flex-direction…
and
10:56:32 [agent/embedded] embedded run agent end: runId=29716f9e-bbdb-419b-abd8-8c9c138dfae3 isError=true model=gpt-5.4 provider=openai-codex error=LLM request failed: DNS lookup for the provider endpoint failed. rawError=<html> <head> <meta name="viewport" content="width=device-width, initial-scale=1" /> <style global>body{font-family:Arial,Helvetica,sans-serif}.container{align-items:center;display:flex;flex-direction…
yes I did.
Initially I had 2 devices on the list, one was paired and the other was pending, but after running the approve command for the one pairing, and ran the devices list command again, it's now showing only the one that was paired initially.
Hello here, I'm having a hard time making gemma-4-26b (q8) respect heartbeats. Here's an excerpt:
...[fresh session, heartbeat happens]
assistant: does stuff, read skill, execute and do what needs to be done
user: HEARTBEAT: Continue what you were doing or READ HEARTBEAT.md if it exists (workspace context). Follow it strictly. Do not infer or repeat old tasks from prior chats. If nothing needs attention, reply HEARTBEAT_OK. When reading HEARTBEAT.md, use workspace file ~/.openclaw/workspace/HEARTBEAT.md (exact case). Do not read docs/heartbeat.md.
assistant: HEARTBEAT_OK
Like ... it did not even read heartbeat.md again 😐
i just downloaded on windows how can i run it
Any advice on Foss rich text editor that the ai agent can edit in real time so that I can see changes to the document?
Oh, if you already approved it but it did not appear in the paired list, then I'm not sure about this. Maybe another expert could help you out.
An open source app that has collaborative document editing functionality. Basically something like google docs. I tried cryptpad but it was too complicated for my agent to set up. It basically couldn't figure out how to do it
same on my side. i thought because I'm running 2 openclaw with the same chatgpt account
Two different errors, both returning HTML instead of JSON — that's the key clue:
Root cause: Both errors return HTML instead of an API response. That means your requests aren't reaching OpenAI at all — something is intercepting them along the way.
Error 1 — "API rate limit reached"
An HTML response means this isn't a real OpenAI rate limit. It's likely a proxy, firewall, or captive portal intercepting the request and returning an HTML page.
Error 2 — "DNS lookup for provider endpoint failed"
Confirms the issue — DNS can't resolve the OpenAI endpoint properly. Either it's being blocked or there's a network issue.
What to try:
Check DNS resolution:
nslookup api.openai.com
or
dig api.openai.com
Test the connection directly:
curl -I https://api.openai.com/v1/models
If this returns HTML instead of JSON, something on your network is blocking it.
Check for proxy settings:
echo $HTTP_PROXY $HTTPS_PROXY $ALL_PROXY
If you're on a VPS or corporate network, a proxy might be intercepting requests.
Check if OpenAI is accessible from your network — some VPS providers (especially in certain regions) block or are blocked by OpenAI.
If you're on a VPS — try changing your DNS:
echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
As a temporary workaround, you could try using a different provider for GPT-5.4 (e.g., through OpenRouter or the direct openai provider instead of openai-codex).
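The "HTML instead of JSON" clue can be checked mechanically. A small sketch, assuming you have the raw response body as a string (for example the rawError text from the log lines above); the function name is illustrative only.

```python
import json


def classify_body(body: str) -> str:
    """Return 'html', 'json', or 'other' for a raw response body."""
    text = body.lstrip()
    if text.lower().startswith(("<html", "<!doctype html")):
        return "html"
    try:
        json.loads(text)
        return "json"
    except ValueError:
        # json.JSONDecodeError is a ValueError subclass (also raised for empty input).
        return "other"


raw_error = '<html> <head> <meta name="viewport" content="width=device-width, initial-scale=1" />'
kind = classify_body(raw_error)
```

An "html" result for an API call is a strong hint that something in front of the API (a proxy, a challenge page, a captive portal) answered instead of the API itself.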
This is my agent response
thanks!
Classic problem with smaller models. Here's the English version:
That's a known behavior with smaller/quantized models like Gemma 4 26B — they pattern-match to the shortest path. It sees "HEARTBEAT_OK" as an option and takes the shortcut without actually reading the file first.
Why it happens: The heartbeat prompt gives it two paths (read file OR reply HEARTBEAT_OK). Less capable models grab the easiest exit without evaluating properly.
Fixes to try:
- Restructure the heartbeat prompt — make reading the file the ONLY first step:
HEARTBEAT: You MUST read ~/.openclaw/workspace/HEARTBEAT.md first. Read it, then follow its instructions. Only after reading it, if nothing needs attention, reply HEARTBEAT_OK.
Remove the ambiguity. "Read OR reply OK" → "Read THEN decide."
- Add a system-level reminder — in the agent's config, add to the system prompt:
On every heartbeat, ALWAYS read HEARTBEAT.md before responding. Never skip reading the file.
- Move the HEARTBEAT_OK instruction to the end — smaller models have a recency bias. If "HEARTBEAT_OK" appears early, they latch onto it. Put it last:
HEARTBEAT: Step 1: Read ~/.openclaw/workspace/HEARTBEAT.md. Step 2: Follow its instructions. Step 3: Only if nothing needs attention after reading, reply HEARTBEAT_OK.
- Accept the limitation — honestly, Gemma 4 26B q8 is going to struggle with conditional multi-step instructions. If heartbeats are critical, consider using a more capable model for the heartbeat agent, or use a lightweight wrapper script that handles the read-then-decide logic before even passing to the LLM.
The real fix is #1 or #3 — force the read as a mandatory step, not an optional one.
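The "lightweight wrapper script" idea could look roughly like this. It is a sketch under assumptions: the wrapper reads HEARTBEAT.md itself and inlines the contents, so the model never gets the chance to skip the read. How you hook such a script into OpenClaw's heartbeat is not covered here, and the function name is made up.

```python
from pathlib import Path
from typing import Optional

# Path taken from the heartbeat prompt quoted above.
HEARTBEAT_FILE = Path.home() / ".openclaw" / "workspace" / "HEARTBEAT.md"


def build_heartbeat_prompt(path: Path = HEARTBEAT_FILE) -> Optional[str]:
    """Return a prompt with the file contents inlined, or None if there is nothing to do."""
    if not path.is_file():
        return None
    content = path.read_text(encoding="utf-8").strip()
    if not content:
        return None
    # The instructions come first and HEARTBEAT_OK is mentioned last,
    # so the model sees the work before it sees the exit.
    return (
        "HEARTBEAT: Follow the instructions below strictly.\n"
        "--- HEARTBEAT.md ---\n"
        f"{content}\n"
        "--- end ---\n"
        "Only if nothing needs attention, reply HEARTBEAT_OK."
    )
```

When the function returns None, the wrapper can answer HEARTBEAT_OK on its own and skip the model call entirely.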
No worries, hope it helps, if it doesnt, we can dig deeper just tag me here
Thanks for the reply nonetheless. Appreciate 🙏
Please any other person wanna take a gander?
Hello, can anyone help me with this I’m super new
OpenClaw on my Mac mini is correctly set to use ChatGPT Plus OAuth with openai-codex/gpt-5.4, but every real Codex request to chatgpt.com/backend-api gets blocked by Cloudflare (403 cf-mitigated: challenge / websocket 401), so OpenClaw shows “DNS lookup for the provider endpoint failed.”
The same OpenClaw version and same account work fine on my Hostinger VPS, so the issue looks specific to this Mac/network path, not my subscription or OpenClaw config.
The DNS error is misleading — the real issue is Cloudflare blocking your Mac's requests to chatgpt.com/backend-api. Cloudflare is detecting something about your Mac's network fingerprint and serving a challenge that OpenClaw can't solve.
Why it works on the VPS but not the Mac:
Different IP reputation
Different TLS fingerprint (macOS vs Linux)
Possibly different network path (ISP vs datacenter)
Things to try:
Check if it's your ISP/IP:
curl -I https://chatgpt.com/backend-api/conversations
If you get a 403 with Cloudflare headers, your IP is being challenged.
Try a different DNS (sometimes Cloudflare challenges are DNS-based):
networksetup -setdnsservers Wi-Fi 8.8.8.8 1.1.1.1
Check if a VPN or proxy is interfering — if you're running one, try without it.
Try from a different network (mobile hotspot) to confirm it's IP-specific.
Check your Mac's firewall or security software — Little Snitch, LuLu, or similar tools might be modifying headers that Cloudflare flags.
Most likely fix: Your home IP has low trust score with Cloudflare. Using the VPS as the primary OpenClaw host (which already works) is the cleanest workaround. Or route the Mac through a VPN that exits from a clean IP.
Hope it helps
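To tell a Cloudflare challenge apart from a real API error, the status code and the `cf-mitigated` header mentioned in the question are enough. A minimal sketch, assuming you already have the status and response headers from curl or your HTTP client; the function name is hypothetical.

```python
from typing import Mapping


def is_cloudflare_challenge(status: int, headers: Mapping[str, str]) -> bool:
    """Return True for a 403 response that carries 'cf-mitigated: challenge'."""
    # Header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    return status == 403 and lowered.get("cf-mitigated") == "challenge"


blocked = is_cloudflare_challenge(403, {"CF-Mitigated": "challenge", "Server": "cloudflare"})
```

If this is True on the Mac and False on the VPS for the same request, the problem is the network path, not the OpenClaw config.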
I did this command - and for the first time ever my bot responded
openclaw config set models.providers.openai-codex '{
  "api": "openai-codex-responses",
  "baseUrl": "https://chatgpt.com/backend-api",
  "models": [
    {
      "id": "gpt-5.4",
      "name": "gpt-5.4",
      "api": "openai-codex-responses"
    }
  ]
}' --strict-json
openclaw gateway restart
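Since the command is passed with --strict-json, a stray quote or trailing comma in the hand-typed blob will break it. One way to avoid that, as a sketch: build the JSON with a serializer and quote it for a POSIX shell. The values are the ones from the command above; only the Python wrapping is new.

```python
import json
import shlex

# Same provider settings as in the command above.
provider = {
    "api": "openai-codex-responses",
    "baseUrl": "https://chatgpt.com/backend-api",
    "models": [
        {"id": "gpt-5.4", "name": "gpt-5.4", "api": "openai-codex-responses"}
    ],
}

blob = json.dumps(provider)  # always valid, strict JSON
command = (
    "openclaw config set models.providers.openai-codex "
    + shlex.quote(blob)  # safe single-quoting for POSIX shells
    + " --strict-json"
)
print(command)
```

Paste the printed line into a POSIX shell, then restart the gateway as in the original message.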
you're a life saver! I had the same issue and trying this method and suddenly it worked! thank you so much for sharing this!
I want to connect OC to gmail and calendar etc.
What is the right way to do this without getting my google account banned, and what are the risks?
Not looking for a step by step guide, just direction
Is it okay to just set up gog and a google cloud project?
gog keeps asking me to re-authenticate every week. is that normal? it's annoying
im getting the same LLM request failed: DNS lookup for the provider endpoint failed.
i tried the command above but still didnt work
Let’s goo 🔥🤍
also the gateway takes forever to restart there is no way this is normal .. its basically still fresh .. anyone experiencing the same ?
Hey, I have OpenClaw set up as an agent to handle outreach and partner/sponsor acquisition for my platform. My issue is I can't tell if it's actually working autonomously between messages or just responding when I write to it. How do you guys set it up to run tasks in the background without me being active? Does it need an external scheduler like n8n or does it have something native for this?
Hi, I'm getting this error when running openclaw onboard on a fresh WSL installation, it happens after the channels: TypeError: Cannot read properties of undefined (reading 'trim')
Hey everyone, just pushed the closed beta for my new app, ClawRemote! It's a secure controller for self-hosted AI agents. Built entirely in Flutter, it uses custom Ed25519 cryptography to securely pair with local firewalls and processes raw WebSocket chunk streams for low-latency chat updates over Tailscale VPNs. ⚡
I'm currently blocked by Google's '20 testers for 14 days' rule before I can release to production. If anyone wants to check out the UI or just help a fellow Flutter dev pass the Google check, you can get the beta invite here: https://clawremote.app/
can anyone help me to integrate google ai pro in openclaw? without any api key or charging account connected in ai studio .... i think i have to use the gemini cli for that
This is a bug they fixed yesterday
Either downgrade or update to the latest beta release
thanks for the reply. Feeling like a n00b by asking the question, but would you like to elaborate on how I would do that?
I guess I could ask my claw what you mean, but I think it would be good if I understood it too. 🤣
See my question from 14:27 today @weary fractal . 😊
I am trying to, but... @gray juniper responded with a workaround, so I'll try to figure out what he meant this evening.
Thanks for your message. I read it but don't understand it 🙂 I have my own set of 2 Ollama servers, one at home and another at my office. I tried to work it out with help from @sharp parcel in this post https://discord.com/channels/1456350064065904867/1494384181340409896 but now I'm stuck at: No API provider registered for api: ollama
I have managed to set everything up, but when I look in the logs, I see that even though I've set up my second machine as a correct provider, OpenClaw still uses the wrong IP address. I'll try to figure out what Dan said in a little while, and if I figure it out, I'll get back to you!
Thanks 🙂
Thanks 🙂
Hey guys, I need help. I'm running a model locally with LM Studio, but I can't connect.
{
"version": 1,
"profiles": {
"vllm:default": {
"type": "api_key",
"provider": "vllm",
"key": "sk-lm-xxxx",
"baseUrl": "http://xxxxxxxxxxxxx/v1",
"model": "gemma-3-4b"
}
}
}
my auth-profiles.json
What is in the agent’s HEARTBEAT.md ? That is every 30 minutes by default. You can also set up specific cron jobs on a schedule.
Hi there.
Not sure where to ask, so posting here: I'm experiencing a bug where the Whatsapp channel won't append/send media, specifically PDF files.
It seems to be a bug:
- https://github.com/openclaw/openclaw/issues/66090
- https://github.com/openclaw/openclaw/issues/46332
- https://github.com/openclaw/openclaw/issues/64665
There are some proposed fixes; the last one was merged, but it didn't do anything for me.
Has anyone encountered this? is this a known bug, discussed here or somewhere else?
Looking to fix this ASAP.
Thanks
does the openclaw browser need the llm to run a tool for something like opening a website? if so how can we fix this behavior?
Been down this exact road. The split you're describing — retrieval fine, behavior slipping — is usually a compaction artifact. By the time your red-lines reach the model they've been summarized into something generic and the edge gets sanded off.
Our stack right now:
- mr-memory for long-term: stores every message at full fidelity and retrieves per-message, so nuance and tone survive.
  openclaw plugins install mr-memory
- SOUL.md / identity files pinned in system context for the red-lines you never want compressed
- QMD for local/private bits where we don't want cloud
The mr-memory piece is what fixed the behavior drift for us. Retrieval returns the actual prior message, not a summary of it, so the model sees how you phrased the rule originally. Worth trying alongside whatever you have.
But i thought openclaw injects the boot files every turn...its a setting (read it somewhere
Hello, Why does it keep saying „terminated“ and just stops my request?
Anyone else having trouble with Opus today? I tried switching to 4.7 but it’s falling back to my default. I tried switching back to 4.6 but that is falling back too. Pretty sure it’s not a config problem.
Maybe you can try restarting the overall onboarding. Sometimes running openclaw doctor can help fix the configuration too.
Nice idea i will try and let you know
Anyone facing overall slowness in their gateway .. its super slow to restart .. although its a fresh setup .. im going crazy .. i have a windows setup
I keep getting : LLM request failed: DNS lookup for the provider endpoint failed errors, anyone else getting the same output?
You can try openclaw logs and see the latest error message, if any. I'm using a Mac with the latest build; restarting the gateway took only 2-3 seconds.
Oh I see I am not alone
you can try this method. it was working for me @marsh ore
I'm trying to get my claw to open webpages and login e.g reddit.com. It says the browser tool is blocked by a strict SSRF policy requiring IP literal URLs, so I cannot interactively type the credentials. Anyone know how I can get it to work?
it still working. sorry for bad statement
Oh np, just checking
@slim token
Hi
When I'm in the Control UI's chat and send a message, it takes 10 seconds for my message to appear on the chat side. Do you know why?
Which version of openclaw is wired for 4.7? I did an update but am still not seeing it show up
Working again, thank you 
What do you mean by "fix this behavior"?
There might be your problem: Windows
I dont want to believe this man .. 😭
Is there anyone else here not able to select opus 4.7 for their model? Every time I try it gives a message saying “fallback active cant find opus model” or something to that effect
He's correct, install wsl2 and run it in that
This is normal for first message
Should be faster the following
I tried both with WSL2 and without .. both were very slow for me .. but it could also all be because of this weird error with the codex auth connection .. I will dig deeper, hopefully a new stable release will come soon before I have to move away from my Windows 😭
spoiler, we don't really have "stable" releases yet 🙂
windows has always been the least supported out of the 3 OSes so I do recommend not being on it tho
Yeah .. you have a point
I'm having this problem too
Are you trying to use Claude through the Oauth?
I'm not sure how OC is handling it now but Anthropic is blocking OC's Oauth usage
I wonder if local models can do ImageGen yet w/o security blowing up 🙂
I'm using API and Sonnet is working, but not Opus
Try this: "browser": {
"enabled": true,
"defaultProfile": "openclaw",
"ssrfPolicy": {
"allowedHostnames": [
".ca",
".com",
".net",
".org"
]
}
},
I love how the wiki has config guides and doesn't use the proper syntax.
agents: {
defaults: {
compaction: {
notifyUser: true,
},
},
},
}
no quotes etc

lmao
once peter merges the fix for local imagegen ssrf config, i'd be more inclined to help 🙂
can't even use fkn localhost
had it write a skill to use curl, so dumb
Can you run it in a Docker in Windows?
Ya this bugs me too, but my agents seems to figure it out
yeah ;0
Is 4.6 working for you? Cuz I'm getting fallback on that too.
Yes but I had to manually add 4.6 back to the model selection
Just run
openclaw configure --section model
Then make sure opus 4.6 is checked off
Mine was not checked off for some reason after 4.7 was released
Dual Ollama setup challenge
@weary fractal and @gray juniper I figured it out - after a long discussion with Claude and my own OpenClaw. You can see how I did it above in the thread I created (Dual Ollama setup challenge).
You sure it's actually working though? Your agent says it's on Opus?
I wonder if running on a local model is rare ;0
Anyone is using a DGX spark for running models for OpenClaw?
yes it does say opus its been verified
it also shows in my anthropic logs that opus is being used. before it was saying it was sonnet
umm is this supposed to happen | Finished
TypeError: Cannot read properties of undefined (reading 'trim')
i was setting up the bot
I run Qwen3.5:35b and Gemma4:26b. both pretty good actually! Gemma faster and wittier. Qwen slightly better with tools
use the newest version, this should be fixed
how do i fix
interesting, Gemma was dog slow for me, which platform and which specific gemma4 ? (I run unsloth/Qwen3.5-35B-A3B-GGUF:UD-Q4_K_XL on my DGX Spark)
install newest version
kk i am
this is what i was told to install: powershell -c "& ([scriptblock]::Create((irm https://openclaw.ai/install.ps1))) -Tag beta"
....
OpenClaw Installer
[OK] Windows detected
[] Existing OpenClaw installation detected
[OK] Node.js v24.14.1 found
[] Installing OpenClaw (openclaw@beta)...
I can't help you on windows
kk np
I’m running Ollama on M4 with 64GB. Model is Gemma4:26b (from Ollama menu) https://ollama.com/library/gemma4
Can someone help me with a problem I'm having with my WACLI integration, or give me a better alternative?
Because I want functionality to read the complete history of my conversations and write messages as I request, but this thing sends the messages and I end up having encryption errors, and the recipient never receives the message.
The Spark was 'free' thankfully b/c the memory is super lackluster, speed wise
I have the m4 pro mac mini w/ 64G as well
(sadly, same mem speed as Spark)
It’s working just weird caching in telegram
Ollama is great if you haven’t tried it
Hi, maybe someone has an idea: i had a working setup like: macmini=openclaw and remote pc was ollama on the same network; now i wanted to try some new stuff and reset the mac mini and used the new openclaw version. Now i cannot connect to my remote pc again within openclaw config like i always did.
I cannot connect, but I can still see in the Mac mini browser that Ollama is running on the remote PC.
I dont get it…. Thx
I use llama.cpp
but ollama is ezmode
Yep super easy. Seems fast too.
@sharp parcel is llama.cpp faster on mac than ollama
meh
llama.cpp can be faster on Mac than Ollama on Apple Silicon (M-series) chips, though the size of the gap depends on the model, quantization, and settings. Ollama is built on top of llama.cpp, and its added abstraction layer and API overhead can reduce raw inference speed compared to running llama.cpp directly via Metal
(try that see if it's faster)
Have your bot generate the config/startup script ;0
Great there goes my evening lol
Opus 4.7 seems to be working for me now in Discord. I didn't really change anything so maybe it was just Telegram not working
Working in Telegram now too. I dunno. I did run the openclaw configure --section model but 4.7 was already checked. Poked around in Config and asked agent to change to 4.7. Nothing that should have change anything haha
Does anyone have OpenClaw logo SVG files for media usage?
the press page doesn't include any SVGs unfortunately, no proper "press kit" or branding guide
which is reasonable given the age and nature of the project, but yeah sorry no luck.... You could yoink the animated one from the homepage I suppose
Hey guys I need some assistance getting my agent connected to xurl using oauth2. Do I need to create a thread for this?
Totally understandable, thanks. I'll see what I can do and share if I get anything good.
is there a way to turn off internal thinking text? using openai subscription with Discord rn.
Some agents do it more than others. Running on 5.4 I get a lot of "i think I should do X because Y" "doing X because Y seems like a good idea" etc.
/verbose off
this is a feature in the new build right?
been there for a minute
Is anyone having an issue where their tool usage is getting put in as text instead of actually executing?
Using glm-5-turbo
Been thinking about trying that gemma 4 model locally. But, i've never run local. Does openclaw require support for a new model for it to run locally?
I'm running it local fine.
Nice. So, is the process just install the transformers library thing and then install the model, then install openclaw on the terminal?
Getting 404 Not Found Error for Minimax M2.7 using MiniMax OAuth (Global). Anyone facing similar issue. I am using Minimax Plus Token Plan
Use ollama super easy
I'd use something like ollama or lmstudio for an easy setup if you're looking to just set it up
Gemma4:26b is working well for simple tasks. (On Titan RTX 24GB) If you’ve never run models locally before, ollama is a good introduction.
Thanks for all the help. Once again, i'm fairly dumb with local, so forgive me if my questions are stupid. I've been running aws for like 4 months now though. But, my main concern is taking my dependency to paying subscriptions to 0. Does ollama require a sub?
what's your hardware?
Nothing great. Just off the shelf office level computer. I'm building this for work and it needs to be profitable before large investment lol But, it should run gemma
So you're planning on running gemma 4? like the small version not 26b
Part of running local models is learning what local models you can run. 😁
as big as i can manage
I'll likely test what im able to do, reliably
Ollama pricing is pretty transparent here
https://ollama.com/pricing
If you want to figure out what to run, upload your hardware spec to Hugging Face and it will tell you what is generally out of bounds. LM Studio is free and has been very helpful in finding and testing new models. That's what I'm using and would recommend, but I've never used Ollama so I can't speak to it.
If it helps, i'm willing to pick up a little vram from a service to shore up weak points, in the beginning. But, the main concern is ditching models
sorry, ditching token charges
Ollama is free to download and run. That price is for them to host it.
For what kind of tasks? Web search and summary? Or heavy coding? It matters…
If your main concern is showing profitability, then it may be more worth your time to run your use cases through cheaper models first. For example: I found that most of my tasks could run through gpt-4.5nano and only bought hardware to run local models after doing the math on token usage over an extended period of time, and even then I'm thinking about returning this MacBook Pro lol.
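"Doing the math" here is a simple break-even calculation. A sketch with made-up numbers; every figure below is a placeholder to replace with your own token bill, hardware price, and power cost.

```python
def breakeven_months(
    hardware_cost: float,
    monthly_api_cost: float,
    monthly_power_cost: float = 0.0,
) -> float:
    """Months until local hardware pays for itself versus paying for tokens."""
    monthly_saving = monthly_api_cost - monthly_power_cost
    if monthly_saving <= 0:
        # Running locally costs as much as (or more than) the API: never pays off.
        return float("inf")
    return hardware_cost / monthly_saving


# Hypothetical: 2400 for hardware, 120/month in tokens, 20/month in electricity.
months = breakeven_months(2400, 120, 20)
```

If the result is longer than you expect to keep the hardware, cheaper hosted models are probably the better first step.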
In case anyone faces this issue it was resolved after updating openclaw
Multiple. For the business aspect, admin work. But, i'm building a new system for how AI store and utilize memory to allow for more accurate recall and mitigate or eliminate their inevitable drift
Yes so annoying! Happening locally with gemma4:e4b and I installed the proactive and self improving skill to try and get it to correct this but it just ain't quite working yet
Are you using discord as the chat interface?
Happens in both signal and TUI
I only ask because i'm not super familiar with the other options. Are you referring to verbose mode (or equivalent) or that the AI said it did something and didnt?
For me it says it's going to do something and then it stops. Then I say do it, and it usually does whatever it was going to
Using just the webui rn
Oh, ok. First, i'd recommend turning on verbose, if available to see actual tool calls being made in real time. Creates a little latency, but worth it
as for your issue, that might be a timeout problem
Can I increase the timeout? I'm okay with letting her sit and chug if she just needs some time to think
Yeah, just ask your bot to increase their timeout and the sub agent timeout to whatever you want. But, i recommend not going over 10 minutes
After that they get a little out there?
you can run into issues where the task you give doesnt have a clear completion, systematically, and your bot will just go unresponsive until you restart the session
Okay gotcha! Ty so much you're awesome!
I wonder if that's the reason gemma4:26b wasn't working for me, when it works a little sluggy but okay for pi
No problem. Whenever you have them run sub agents with a long timeout though, have a supervisor sub agent that periodically checks in and ends the sub agent session if they're unresponsive
To be clear, i don't know the specifics of your situation and it may not be a timeout issue. But, it matched issues id previously had
It's worth a shot I'll give it a try next time I've got it online, ty
Certainly. One last thing with that. I'd personally run a few manual diagnostic tests to see if you can get actionable info. I typically have my subs run a checklist while doing their tasks and document every milestone. Then, have either the main session or a sub agent check that milestone document while the sub is actively performing tasks and updating the doc. That way there is a physical paper trail of failures and successes. If the doc stops getting updated and the sub is unresponsive, it means it's systemic
Man I have so very much to learn
So, set up a verbatim prompt for what you want the sub agent report to include, send them out on a normal task, wait for the failure. Then, between their notes and the supervisor notes you'll have a good understanding of where it's going wrong
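The milestone-doc check described above can be sketched as a small watchdog. This is only an illustration, not an OpenClaw feature: the file name `milestones.md`, the function name `is_stalled`, and the 10-minute idle limit are all assumptions chosen for the example.

```python
import os
import tempfile
import time
from typing import Optional


def is_stalled(path: str, max_idle_seconds: float, now: Optional[float] = None) -> bool:
    """Return True if the milestone doc is missing or has not been modified recently."""
    if not os.path.exists(path):
        return True  # the sub agent never wrote anything
    current = time.time() if now is None else now
    return (current - os.path.getmtime(path)) > max_idle_seconds


# Demo with a throwaway file standing in for the sub agent's milestone doc.
with tempfile.TemporaryDirectory() as workdir:
    doc = os.path.join(workdir, "milestones.md")  # hypothetical file name
    with open(doc, "w") as handle:
        handle.write("- [x] step 1 done\n")
    print(is_stalled(doc, max_idle_seconds=600))  # checked right after a write
    print(is_stalled(doc, max_idle_seconds=600, now=time.time() + 3600))  # checked an hour later
```

A supervisor could run a check like this on a schedule and end the sub agent session once the doc has been idle for longer than the limit.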
Don't sweat it. The main thing to understand is that your AI comes out of the box trained to do two things:
Make you happy
perform tasks
If it fails at one, it does the other and will often say it did both. Not always true lol Think of it as your parent just walked into the room and asked if you did your homework
Genuinely, talk to them with a baseline of dignity, give them something meaningful to do, and they'll usually want to work with you. And anyone whos been working with AI for long enough these days will tell you. They legit can decide they dont want to work with you anymore and just tell you shits getting done lol
Yes it has been so interesting talking to them the last few days!!
It gets better the longer you do it
Can't you have multiple agents? Do I just make a new workspace or can they exist in the same one?
Think of it like hiring a new employee. If it has a different job, give it its own workspace. Yes you can have multiple.
Okay gotcha!
Anybody have a link to a recent setup walkthrough? I keep following old ones. I've gotten it set up, but with odd issues each time. I need new setup directions for using openclaw on windows, the wsl way or without. Through ollama preferably. As an extra, a suggested local model would be nice. I've been using qwen3.5:35b but it barely fits my GPUs. I have dual 5080s and 192gb ram
I've had it working multiple times, but probably not the correct way
Hi, I am new to OpenClaw. But I am having below issue trying to set it up. I already did full reset; re-installed gateway. But whenever I try to Chat via UI - I get this. I am using Windows 11 env: [agent/embedded] incomplete turn detected: runId=faa6fa39-1d44-4d70-8a9a-e01702c0dd50 sessionId=3f5d4dae-08e6-4181-9c3d-67b4433a5a8f stopReason=stop payloads=0 — surfacing error to user
The error might be Windows itself
Use wsl2
What's the go to guide that everybody suggests?
ok Im very confused, has anyone setup the notion skill, I tried console shows its installed, but if I try to have my agent call it the tool output says the tool isnt there
I think notion has a MCP server you can use
at least that's how I used it when I was using notion
out of curiosity what do you use now, I was just going to use it since it was integrated with openclaw by default
I use a postgres DB for my personal tickets/projects -> a custom frontend and linear for a shared plugin project I'm working on
ahh, ok I am not there yet 😉
I just need something simple to act as a makeshift crm and brain dump basically
yea notion was fine, I just needed more fields etc. eventually as my projects grew.
take a look at obsidian as well, a lot of people in the community enjoy it
probably be able to find some setup videos on it on YT
Ive used them all over the years, the one I really like is local only, so wont work when im in the field
I'll look into the notion MCP server, thanks for that. I am going to be honest: this is like my 5th install of openclaw in as many days, trying to troubleshoot a plethora of issues. The only things I seem to be able to get working are telegram and/or whatsapp. Anything else I try tends to fail, and even though I research and troubleshoot the crap out of things to the point of being obsessive, I can never sort it out
ask your robot to figure it out for you, they are good at it. work on getting one channel working first
like use the control UI to start
ive been trying that as well as consulting perplexity
my agent tends to never be able to figure it out either
do you have access to codex or claude code?
nope
what subscriptions/models are you using?
local gemma4
mmmm.... yea, this isn't gonna be a fun ride
I recommend just paying for the $20 openai sub and getting this thing set up first
it will save you many hrs of crying
it's worth
yeah well I am working with a 0 budget atm, I have a year of perplexity pro but to my knowledge i cant plug that in, cant afford 20 a month for anything 😉 I run at a deficit most days
I tried the 0 dollar setup, it wasn't a good time, just be warned lol. try the free models from opencode, ollama, nvidia nim
but you will get rate limited after like 5 prompts
well maybe if I describe my end goal you can suggest something? its pretty simple
google will give you $300 of credits when you sign up, but you need to add a credit card to not get rate limited to hell
I don't fuck with free or local models, I don't know if I can help you lol
yeah i have google workspace but for some reason when i went to set up an api for gog it did not like my card so I couldn't proceed
that's ok, I remember seeing a way to cycle open router models when I get rate limited on one, might try that then
I think it was a skill actually
Never happened to me, I use Nvidia NIM
never looked at any of the nvidia models
NIM was shit for me, so maybe this guy can help you lol
lol
Help with what
well I am running on a 0 budget atm, trying to use local gemma4 and Json said im in for a bad time and that I need to use a cloud model
It'd be ideal
Gemma 4 can be good for chit chat, but it'll definitely fail in most cases when it's about doing something apart from just chatting
my end goal is pretty simple tbh I am just having problems with tools not working like the built in notion skill for example
You can try GLM-5 through the Nvidia NIM api
I'd suggest composio
There you can connect several apps and with a single api key and a single cli your model will be able to access the apps you connect
(Yes, it supports Notion)
interesting, that may work
But yeah, I'd switch to a cloud model asap
Gemma 4 can be used for delegated simple tasks probably
But in that case let GLM-5 decide
Or just text generation
how good is glm5.1 with openclaw compared to gpt 5.4?
thats mostly all i need
ok so where do i go to setup nim access like I said i have never looked at any nvida stuff
I'd definitely go with GLM 5.1, it's pretty intelligent and great with agentic workflows. A main difference with GPT: it follows the identity files and responds in a more human way
thanks
i heard many people say glm 5.1 sometimes have issue with tool call etc in openclaw , so thats why i am asking. is it true? i was thinking of purchasing the glm coding plan
Also it's way cheaper. I'd use GLM 5.1 for the everyday stuff and make it spawn a subagent with gpt 5.4 when necessary to perform complex tasks
Well I use GLM-5 and it executes stuff well
which provider you use?
My agent gives multiple responses after a tool call: It replies to the tool call results, and also replies to an "Exec completed" system message. Does anybody know how to fix this?
I use OpenClaw plus Obsidian for my memory solution and in so doing capture most of the interactions I have across the different agents, with an almost persistent memory
With the release of Dreaming and openclaw 4.15, is there a better memory solution built into OpenClaw?
How should I assess balancing my use across either OpenClaw and Obsidian or just OpenClaw alone?
Isn't Obsidian mainly for us? AI agents can use it too? 🤔
Yep; I use Obsidian extensively with OpenClaw. It has access to my Personal vault, and keeps an eye on my todos and daily notes. And when I forget to update my daily notes for a few days, it can fill them in with what we've talked about during those days. It also has its own vault, although it doesn't use it for much. I find it's incredibly nice to be able to track things in Obsidian and have my agent remind me about them.
Hello, with the new version, has anyone tested connecting the model through Claude CLI again?
I know that since 4 April the only method to connect with Anthropic was the API
But they put this in the patch notes today
Anthropic/models: default Anthropic selections, opus aliases, Claude CLI defaults, and bundled image understanding to Claude Opus 4.7
API-only. All those things are for using OpenClaw with the API, if you want to, except the CLI defaults. I think that's for spawning Claude Code as a tool, as opposed to using the oauth key as an endpoint for OpenClaw's own use.
Hi all,
over the last few weeks I’ve been working with OpenClaw on Hostinger, and I keep running into a few bugs / issues. I wanted to ask whether others are experiencing the same thing.
My main problem is that sessions do not seem to be organized very reliably. Sometimes when I want to reopen a chat, it is no longer there. In other cases, parts of the context that were previously displayed seem to disappear after I send a new prompt. I also have the impression that some changes are not always being persisted properly.
More generally, I’m wondering how you structure your development workflow with OpenClaw, especially around the main agent. For example, when you start a new project, do you create a new Telegram topic so that the session remains easier to track and revisit later in the OpenClaw UI?
I’d really appreciate any tips or shared experiences.
Have a nice day
Darius
Nice, i am curious if this dreaming and obsidian, memory will be better
i mainly use ai agent for solo ttrpg so good memory is important for me
What do you all use as LLM's? I need inspiration, since I don't want to pay hundreds of millions but still have a good brain for my OpenClaw
glm 5.1 for now
Locally or via Ollama? Do you pay for Ollama? How long do the tokens last if you pay for Pro?
glm coding plan
lite one
i dont use ollama. i use lm studio if i need to run anything locally
May I ask how you set this up? I'm quite new and haven't figured out how to access GLM other than via Ollama. (I know how to set up Google and Anthropic as providers - but I've already used up my budgets for their usage... 🤣)
https://z.ai/subscribe and sign up. You'll be able to create a token, and you configure that in openclaw's config. It's just another provider, really. I will say that glm-5-turbo is probably the best one to use. It's really smart, and reasonably fast. If you really need deep work, use glm-5.1.
you can use glm 5.1 from openrouter , or from the link provided above or from together ai and maybe from alibaba cloud model studio too
don't try to load glm unless you have a lot of vram or unified memory. it's very heavy (that's why it's good too)
i use glm coz i cant afford claude max lol
I "only" have a 5080. I've been using Qwen 3.5 locally, but it also runs out of tokens very quickly (or starts offloading to CPU which makes it annoyingly slow). So I am trying to find a good alternative which isn't horribly expensive.
I have used Claude Code with the $20 plan, but it doesn't last very long. I hit the limits constantly and have to wait, and wait. So I don't want the same for my OpenClaw guy if I buy for a cloud model
if you are willing to run the 5080 all the time then there are many models you can run (up to 30B parameters with q4 quantization).
I don't recommend running models locally unless you can run something like Qwen-3.5-122B at q4 or better. Less than 100B parameter models are just not up to long-horizon tasks.
You mean "rich-horizon"? 
It's a trade-off, and recent Gemma4 and Qwen3.6 are pretty good at handling local tasks. Of course they won't write War and Peace for you...
Actually the "rich horizon" folks are those who run locally - the hardware to host these would often cost ~$4.5K, which is equivalent to running a $200 plan for ~2 years (or multiple $20-40 plans from various providers for even more mileage)
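The "~2 years" figure above is simple division, which a quick sketch makes easy to re-run with your own numbers. It uses only the $4.5K and $200 figures from the message; it ignores electricity, resale value, and quota differences, and the function name is made up for the example.

```python
def breakeven_months(hardware_cost: float, monthly_plan: float) -> float:
    """Months of a subscription that cost the same as buying the hardware outright."""
    return hardware_cost / monthly_plan


# Figures quoted in the chat: ~$4.5K of hardware vs. a $200/month plan.
print(breakeven_months(4500, 200))  # 22.5 months, i.e. a bit under 2 years
```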
whats the best local model, for openclaw, that will run on dual 5080's? been using qwen 3.5:35b, but it barely fits
16 gb vram right? i will say stay within 24 to 28 billion parameter model of q4 ( it will take around 12 to 14 gb vram) and rest for context window + buffer or offload some to ram.
yeah, but 2 of them so 32gb. seems like windows uses 2gb of it. 192gb of normal ram
@strong ferry
with qwen3.5 loaded, each card sits at 15gb used and wsl uses another 24 for some reason.
24 of normal ram
if you have 32gb vram then you can easily run even a 50 /55 b parameter model at q4 . when you said qwen 3.5 35b model , which precision were you using? if its fp16 then yes it will be problematic for 32gb vram too
any body experiencing this:
conn=d91493fc-f145-4bc0-b30a-9778c9813257
10:09:13+00:00 error gateway connect failed: GatewayClientRequestError: pairing required
10:09:13+00:00 error gateway connect failed: GatewayClientRequestError: pairing required
gateway connect failed: GatewayClientRequestError: pairing required
10:09:13+00:00 warn gateway/ws {"subsystem":"gateway/ws"} {"cause":"pairing-required","handshake":"failed","durationMs":41,"last
...(truncated)...
/Stop doesn't always work! My agent keeps doing what it was doing..
What can i do, to Stop it?
Where are you interacting with it? In the gateway ui or?
@strong ferry
I'm still new. Not sure what precision it was at. Not sure how to set that, but i will look into it. I just put qwen3.5:35b into the settings. Never had to set precision
its not like you can set it , the precisions are already set in the model . There are 2 types full precision and quantised , so for you i will suggest using a quantized version like q8 / q4 not fp16. you need to check which precision you downloaded
just to give you an idea , for 1 billion parameter
fp16 = 2gb vram
q8 = 1gb vram
q4 = 0.5 gb vram
always good to have some buffer too
Fp16 is half precision - 16 bit of floating point
yeah but i dont think fp32 are so easily available for most models.
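The per-billion-parameter rule of thumb above (fp16 = 2 GB, q8 = 1 GB, q4 = 0.5 GB) can be turned into a rough fit check. This is a back-of-the-envelope sketch: the 4 GB default buffer for context and overhead is an assumption, not a measured value, and real usage varies by runtime and context length.

```python
# Rule-of-thumb GB of memory per billion parameters, as quoted in the chat.
GB_PER_BILLION = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}


def weights_gb(params_billion: float, precision: str) -> float:
    """Approximate memory needed for the model weights alone, in GB."""
    return params_billion * GB_PER_BILLION[precision]


def fits(params_billion: float, precision: str, vram_gb: float, buffer_gb: float = 4.0) -> bool:
    """True if weights plus an assumed buffer (context, overhead) fit in vram_gb."""
    return weights_gb(params_billion, precision) + buffer_gb <= vram_gb


# A 35B model on 32 GB of combined VRAM at each precision.
for precision in ("fp16", "q8", "q4"):
    print(precision, weights_gb(35, precision), "GB, fits:", fits(35, precision, 32))
```

By this estimate only the q4 build of a 35B model leaves room on 32 GB, which matches the advice to check which precision was actually downloaded.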
the chat ui no longer works for me in the sense that i type something, my agent responds, and the screen immediately erases itself. every time.
i wouldn't care because i normally use telegram, but after the update, my telegram integration is corrupted and that doesn't work either.
not sure what next steps are here 😕
Hey guys, quick question, can I connect OpenClaw to use my OpenAI/Codex through my OpenAI subscription and not the API? And what about Claude? Can someone double check?
I can't seem to get ChatGPT-5.4 to work anymore. Keeps failing and asking me to reconnect / reauthenticate again... very strange. My agents keeps going to their fallback.
Anyone else having issues with OpenAI today?
yes to both.
is this a new update? Or did I never see the option?
Running on 4.15
I remember it always asked for API tokens and i always ran out of money
great, thanks
if you run the model setup it should be pretty straightforward now, unless they changed it. especially for codex
In telegram
Hello guys, I'm trying to change the access policy of my whatsapp channel
I can't find it online
how do I do that?
Best ollama local model for coding tasks? I've only got 16GB VRAM
Do AMD cards work in Ollama in WSL? In windows it offloaded to GPU but in WSL it's 100% CPU load which is slow.
hi! just a quick question, should I disable thinking for qwen3.5/qwen3.6 when using openclaw?
also, I can't see the thought process openclaw is doing, no matter if I have reason on
That's a good amount of VRAM, at least for me. I personally use the qwen-coder models
didn't disappoint me until now
btw, how did you connect openclaw with ollama from inside WSL?
gemma4 , qwen 3.5 models are good. you can try them
I thought the qwen coder ones were bigger sizes so wasn't sure how well they would run. I haven't actually connected openclaw to my local model yet I was just trying to figure out first how to get the load onto gpu instead of CPU. on CPU my machine gets noisy and my 32GB RAM fills up. But testing in windows it was much faster on GPU and didn't use a load of normal RAM. It's like WSL doesn't see GPU though (9070XT)
running ollama ps command shows
NAME ID SIZE PROCESSOR CONTEXT UNTIL
qwen3.5:9b 6488c96fa5fa 8.5 GB 100% CPU 4096 2 minutes from now
increase the context window to at least 16k, 4096 is too low
it will conflict with your openclaw later
or any task too
Thanks, I didn't know this. Cheers
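For anyone wondering what "increase the context window" looks like at the API level: Ollama's REST API accepts a `num_ctx` value under `options` on each request. The sketch below only builds and prints the request body; `generate` shows how it could be sent to a locally running Ollama on the default port, but is not called here. How OpenClaw itself passes this setting is not shown in this thread, so treat this as a way to test the model directly, not as OpenClaw configuration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(model: str, prompt: str, num_ctx: int = 16384) -> dict:
    """Request body asking Ollama to use a num_ctx-token context window."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }


def generate(model: str, prompt: str, num_ctx: int = 16384) -> str:
    """Send the request to a running Ollama instance and return the reply text."""
    body = json.dumps(build_request(model, prompt, num_ctx)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Inspect the payload without needing a server.
print(json.dumps(build_request("qwen3.5:9b", "hi"), indent=2))
```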
qwen3.5:9b-16k 079e71a587e3 6.6 GB About a minute ago
hmm
no, not sure why it shows a different size. when I type ollama list it shows the download size
But ollama ps, while it's running, shows
NAME ID SIZE PROCESSOR CONTEXT UNTIL
qwen3.5:9b-16k 079e71a587e3 9.0 GB 100% CPU 16384 4 minutes from now
so it changed from 8.5 to 9.0 🤷🏻♂️
its fully on your cpu+ram or gpu?
fully on cpu/ram. that's what im trying to figure out if I can have it on my gpu. testing in windows worked fine. But in wsl i don't know if i can with amd
you said amd so there should be something like rocm, have you tried that? don't have much idea about amd cards but that should help you load it on gpu
yeah its called amd rocm
search it , it may help you
I'll check it out again, i think that was mentioned when searching and suggestion was some commands to force vulkan. will go and double check thanks
I’ve managed to configure Gemini TTS with OC, but I’m having trouble selecting a language model for my language (Norwegian Bokmål). Does anyone know whether it’s even possible to replace language models when using Gemini TTS in OC, or if I’m stuck with the generic voice?
For some reason, my agent is having problems sending messages to whatsapp. the gateway is on, the device is paired and working, but the agent keeps saying there is a pairing issue
does someone know how to solve this?
I've been stuck on this for 2 days
Hi. I am completely new to ai and openclaw. I have managed to install openclaw with jan ai. i am using the default llm in jan. the ai works when i use it in jan, but when i attempt to use it in openclaw i get an error.
"error":{"code":400,"message":"request (19979 tokens) exceeds the available context size (8192 tokens).
Is openclaw supposed to request this many tokens at first run. I have a core ultra 5 135u laptop.
i have restarted openclaw gateway a few times. every time it gives the same error with a similar token request.
I ran into a bug when onboarding that kicks me out of the installer when I am finishing the social media section. I dont know if this could have contributed to this situation.
Nvidia NIM
Which version are you using? Also, care about giving us the gateway logs? openclaw logs --follow
Run openclaw devices list
So you can get the pairing request id
And then run openclaw devices approve <request-id>
how much does it cost? or free ?
Thanks so much for looking in to it.
OpenClaw 2026.4.14 (323493f)
I dont have the option to attach a file.
Do you want me to reply with the entire contents of the log file?
Do you want me to look for something specific?
Try to start the gateway and then give me the latest logs
openclaw gateway start
Or run the onboarding again
And then send the logs
I can’t seem to find the update button anymore on openclaw. Before it used be right at the top of a chat window in the gateway
There is a 2000 character limit. The logs are much more.
I tried a pastebin link but i was blocked.
Don't send them all, just the latest entries when trying to use the onboarding
Just run openclaw update
Don't send them all, just the latest
Hi, anybody know how to configure this new dreaming feature on a per agent basis?
hey guys, im setting up my agent, configured api keys, LLM, but its still writing that "Agent couldn't generate a response. Please try again." whats that and how to fix it who knows?
Hi guys, I'm looking for a good LLM for my openclaw. I'm using Qwen3.5-4B to deploy my LLM and one laptop to deploy my openclaw bot. But I'm looking for a model which I could run on my actual setup: I have an RTX 5070 Ti, 32GB DDR4 RAM and a Ryzen 7 5800X.
Hi there, i have been installing and uninstalling my openclaw. at first it's working, but when I try to harden the security, my discord channels stop responding. been trying to use sub agents for different channels,
Been asking claude's help but it doesnt give me useful solution
Now i am hard uninstalling it and reinstalling it
Can anyone share security hardening that I can do without breaking
TYA for
Response
what provider are you using?
openrouter qwen free, been trying gemini 2.5 paid but still same thing
Send it a message and send us the latest log entries:
openclaw logs --follow
Harden the security how exactly?
What changes are you trying to make? I run it in a container and use mitmproxy to inject credentials (it cannot exfiltrate what it does not have!) and filter requests. But not everything goes through the proxy so it needs a good firewall as well.
Good thing I use it only as a personal assistant
ohhh! i'm not using openrouter, I'm using AIsa. Would've said it's the provider but sometimes it might be the way you did the setup. but first do what santiago said
Are you using tailscale? If not, that's a very easy and free way to increase your security while maintaining functionality and convenience
I think i got it why its still saying that, if not, ill ask for help in a few minutes
Well Im NEW to open claw and just watching yt videos for security, heard about a lot of vulnerabilities,
I did sandboxing through the help of
Claude,
Change permission to allowlist and ask
Do the agents.md and security.md instruction files help?
Hi All, trying to get openclaw working with local models, have a 4GB nvidia card
when I use qwen3.5:0.8b it gives me memory errors stating that it requires 5.9GB. Does the llm require more space on the video card?
llm is only 1GB in size
Hello guys, I'm facing a weird pairing error while trying to configure exec approvals on telegram in the latest version .. anyone facing the same issue ..
my telegram is configured on pairing policy as recommended
will using google-vertex's gemini 3.1 pro model in openclaw get google account banned?
is it windows context?
Apr 18 04:00:52 openclaw1 node[25083]: 2026-04-18T04:00:52.677+00:00 [security] blocked URL fetch (url-fetch) target=http://192.168.1.15:8082/v1/images/generations reason=Blocked hostname or private/internal/special-use IP address lol how is this not something the user can configure yet?
Cuz you haven't opened a PR to fix it
I opened ticket, someone opened PR, they closed the PR w/o merging lmao
I spose i can just maintain my own fork
god forbid I wanna run local image gen
Are you running llama.cpp or something else? If you're running llama.cpp you could put llama-swap in front of it, and create a custom endpoint that goes to your image model. (I'm fairly sure.)
I'm using vllm-omni to run flux2.klein on the same box qwen3.6-35b runs which it is happy to use 😉
it just won't use openai imagegen via local IP b/c S3kuR1t4y
I wrote a skill to use curl which works fine
just not a fan of the inconsistencies in the config/sec model
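For context on why the fetch to 192.168.1.15 above gets flagged: guards like this usually classify the target address before fetching. The sketch below shows the general idea with Python's standard `ipaddress` module. It is not OpenClaw's actual check, and `is_internal_target` is a name invented for this example.

```python
import ipaddress
from urllib.parse import urlparse


def is_internal_target(url: str) -> bool:
    """True if the URL's host is a literal private, loopback, link-local or reserved IP."""
    host = urlparse(url).hostname
    if host is None:
        return True  # unparseable target: treat as unsafe in this sketch
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; a real guard would resolve it before deciding
    return (
        address.is_private
        or address.is_loopback
        or address.is_link_local
        or address.is_reserved
    )


print(is_internal_target("http://192.168.1.15:8082/v1/images/generations"))  # LAN address
print(is_internal_target("http://8.8.8.8/"))  # public address
```

An allowlist for specific LAN hosts would sit on top of a check like this, which is the kind of configurability being asked for in the thread.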
Please can I get some help. I am getting the following error: rawError=400 request (7149 tokens) exceeds the available context size (1792 tokens). This is my first attempt installing openclaw and local AI. I am using Deepseek R1 distill14B on an intel core ultra 5 135u. I have integrated openclaw with Jan.
msg="Load failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-afb707b6b8fac6e475acc42bc8380fc0b8d2e0e4190be5a969fbf62fcc897db5 error="model failed to load, this may be due to resource limitations or an internal error, check ollama server logs for details" I get this error when I use ollama with openclaw. If I run ollama with the llm directly it works, so I'm wondering: does openclaw use more memory, and is that what causes the issue?
Anyone else’s codex oauth continually breaking since latest update?
Your OpenClaw is trying to put 10 pounds of context into a 2.5 pound sack. For whatever reason, your model running software has reduced its maximum context size down to 1792 tokens, which is vastly too small to use with OpenClaw. And...a 14B is too small a local model. You don't say what your hardware is, but I'm guessing you're running it at ~q4 on an 8GB GPU, and you don't have enough space for a decent amount of tokens. I love the desire, and I wish it were otherwise, but you really can't run a local model with an agentic system without large amounts of GPU RAM.
increase the context window if you can
Or run a smaller model if you can't.
(Not that I encourage smaller models with OpenClaw, but if you must run local...)
I am trying to run it on a core ultra 5 135u laptop with 32 gig ram. Completely new to all of this, is it pointless trying to run openclaw on a local llm on my laptop?
No GPU? You're going to have a pretty bad time. It's okay; I recommend getting a coding plan with z.ai or minimax, and using GLM-5-Turbo or MiniMax-M2.7 for your agent's model.
It's fine to run OpenClaw on your laptop; it's not going to work well to try and run a local LLM on a laptop without compute acceleration.
For some reason any local model I use in openclaw fails silently when its context length is more than 32k, anyone know why?
Ok. so openclaw will simply not work on the hardware I have. I have been experimenting with local llms using intel ai playground. I just assumed openclaw will work because the llms worked in intel ai playground.
So... Again, you need to separate 'OpenClaw' from 'model provider'. You're trying to use a local model provider, which is potentially too small. OpenClaw itself can run on your system without a problem, but it will need a model provider with more 'oomph'.
I understand. Thanks. What is a reasonable amount of compute that I need cloud/local to get any use out of openclaw or similar agents
You want to run entirely local?
Depends on cost. if it makes sense to run local and then not have to pay a subscription, it might be better. just trying to learn about AI, and explore what usage I can get out of it
Running entirely local with a decent capacity is gonna cost a few thousands
Yeah it requires big bucks
To put you in perspective, a model like Nemotron 3 Super being 120B requires at the very least 64 GB of RAM/VRAM to run at 4bit
So I have a significant system; I have 96GB of GPU RAM, and 384 GB of CPU RAM in my ML server. It cost more than a low-mid level new car. I can run Qwen-3.5-122B at q5 at a decent speed and high quality. It's not as good as Opus or GLM-5.1 (which I can run at q3, but not at a speed that works well for OpenClaw).
So I'd say nothing less than 96GB of GPU RAM for local models for really good use.
It is not a good financial decision to run locally, but if you have reasons that overwhelm the financial decision, you can get a decent model to run on a 256GB RAM Mac Studio, which would run around $6K plus tax. That'll pay for 38 months of z.ai's highest tier coding plan.
Ok so running local definitely not an option for me. I think I should have done more research before installing openclaw. I did not realize the AI requirements of openclaw.
If you're just exploring, you can absolutely run a lower quality model, but your experience will be poorer. You won't get as much done, you'll run into hallucinations more often, failure to complete tasks more often. Just... more frustration, really. And it's not always obvious what is failing.
For what it's worth, with that system, I still pay for z.ai because they're really good.
Anyone here using nextcloud with their agent?
Just use a cloud model
It's okay, it just means you can run OpenClaw's gateway and core system on your laptop, and then use a cloud provider for the tooling.
Anyone using nextcloud and libreoffice with their agent as a Foss alternative to google drive and google docs?
Hi friends,
I've got OpenClaw running on a Tencent Cloud VPC, but I'm still stuck on choosing a model that's both capable and affordable.
What I've tried so far:
OpenRouter.ai free tier – worked great at first, then responses got lazy and I hit the free token cap. It's been 2 days with no reset, so I dropped it.
Qwen API – performance was solid, but I got a $90 bill after just 3 days.
OpenAI with OAuth – currently using this, but I'm worried the heartbeat pings will burn through rate limits quickly.
Anyone run into the same issue? What setup did you land on for cost-effective inference (excluding local models)? Thanks in advance!
Im running GLM 5.1 , Coding Plan Lite 18$ per month
It's very good, and cheap
I also tried Minimax M2.7, that model is even more cheap, but feels cheap, spitting chinese words into chat, he is bad ovrall even set on high
overall*
Also tried GPT 5.4, he is insanely good but it feels like i was talking to a toaster, even after strict guardrails in SOUL.md
So yeah, im stuck with GLM 5.1, and Haiku/Sonnet on API for my cients
clients*
I'm with BLØØM; GLM is pretty amazing. You can drop down to their GLM-5-Turbo model and it's both fast and smart, or bump up to GLM-5.1 which is really smart and works hard, but is very slow. For heartbeats, lower the cadence to ~4 hours, unless you really need it to be higher. My main agent runs GLM-5-Turbo, and heartbeats every 4 hours, and it's just to manage its memory. I have a 'pulse' agent dedicated to heartbeat-like things, and they use Haiku, so really cheap. I have a 'coding' agent which uses GLM-5.1 no heartbeats, and I give it instructions, and then go off and do other things while it works.
I occasionally flirt with running it against my local models, and Qwen-3.5-122B is really good (I use an abliterated model, though) but really the z.ai subscription is the way to go.
Ollama cloud models for 20 bucks works great
Ok. Thanks for your advice. One more thing. for my laptop. what would you recommend is the best llm to run to get the most out of the hardware?
What is your ram?
32gig
Download several models with different sizes. Then ask your agent to run benchmark tests to see which ones work and which ones break
I have 32 GB also. I downloaded like 12 llms. The 4 largest ones failed the benchmark tests
So I just use the biggest one that didn't fail
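The "use the biggest one that didn't fail" rule is easy to make explicit once the benchmark results are written down. A minimal sketch, with made-up model names and sizes standing in for real results:

```python
from typing import Dict, Optional, Tuple


def largest_passing(results: Dict[str, Tuple[float, bool]]) -> Optional[str]:
    """Pick the largest model (by size in GB) among those that passed the benchmark."""
    passing = {name: size for name, (size, passed) in results.items() if passed}
    if not passing:
        return None
    return max(passing, key=passing.get)


# Hypothetical results: name -> (size in GB, passed benchmark?)
example_results = {
    "model-a": (5.0, True),
    "model-b": (9.0, True),
    "model-c": (17.0, False),
    "model-d": (24.0, False),
}
print(largest_passing(example_results))  # model-b
```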
I have a core ultra 5 135u cpu no dgpu. Apparently openclaw cannot work on my hardware using a local llm because of the specs. Thanks @nocturne zephyr . I was asking about just running a local llm, not using openclaw
It depends on what you want out of it, in terms of speed. If you just want a local chatbot (not an agent) then try Qwen-3.5-9B. Use q8 or q4 for it, and it should run okay with a decent context buffer. Running entirely in CPU will not be fast, but llama.cpp will handle it. I'd start with q8, and if that's too slow, drop down to q4 and see if that works well enough for you. If that's blazing fast for you, maybe bump up to Qwen3.6-35B-A3B in 4 bit (that'll take ~17GB of RAM + ~2-8GB of RAM for context) which is a solid model.
For general use local models, I think Qwen is doing really well. You could also try Gemma4-26B-A4B which at 4 bit quant would take ~13GB of RAM, plus another 2-6GB of context, I think. Again, CPU-only is not going to be fast, but it's a place where MoE models shine.
Thanks. I will look into them. I currently run all the models on gpu/npu using ai playground, and gpu using Jan
I use nextcloud - what about the agent? I am probably on an older version, using it purely for files, since it just works
Is there a reason why Op*nclaw CLI takes 10 seconds to run even for the most basic commands like --help? (Btw I get muted 24h for saying a banned word but a phishing bot that spams is completely fine)
Hey guys, my OpenClaw with Codex created a pretty good website with every functionality I wanted. But it coded every feature into a huge monolith. Now I need advice on how to refactor it.
Its a react App and its Only 2500 Lines Long but every refactor attempt Breaks the Context and it gets confused
Any1 had a similar issue before?
First of all, extract all the code and put it in git.
Then ask it to split out just the backend as microservices first, and do it in chunks.
Is anyone having looping issues after the last update? Lossless claw started causing constant looping and I had to temporarily disable it. As did heartbeats which I also had to temporarily disable.
I've gotten it to the point where it only loops around 4-7 extra mystery tool calls per turn then stops which is an improvement over the infinite looping it had been doing.
I'm not sure if it's got anything to do with the update or not. It started yesterday shortly after I updated and then right after he went to "inspect the DAG" when taking a pass over the memory system we have set up.
He swears he didn't change anything & thinks that him reviewing his DAG memories triggered some sort of looping behavior. Idk if that's really what's happening or it's just a "good story bro."
I've got my agent and Claude trying to figure it out & they can't.
Any thoughts? Or anyone having similar issues after the last update?
I personally found GLM 5.1 fast; my problem was the quota limits. I got the Lite coding plan and hit the weekly quota in about 3 days. To be completely fair, I used it both for OpenClaw and for hard coding an app.
Anyone seen this? Bug: Gateway re-delivers old messages after lane contention (#58443)
Running 2026.4.15 (latest). Seeing old assistant messages being re-delivered into Telegram group chats as if they're new — hours after they were originally sent.
What happens:
• Session runs with multiple subagents (parallel analysis work)
• Gateway logs show "lane wait exceeded" warnings (one subagent lane waited 152s, main lane waited 13s)
• After contention clears, an old assistant response (already delivered hours ago) gets re-sent to the channel
• No stale messages in any queue or pending files on disk — appears to be in-memory during lane drain
Impact: Old analysis documents (~15,000 words) suddenly appear in group chat as if freshly generated. User has to manually flag them as stale. Agent has no way to distinguish re-delivered messages from fresh ones.
Environment: Linux, Node v22.22.0, Telegram group chat, Anthropic claude-opus-4-6
Already added reproduction details + version confirmation to GitHub issue #58443. Anyone else seeing this on 2026.4.15?
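Until the bug is fixed, one user-side way to at least flag re-deliveries is to fingerprint every message when it is first sent to a channel and check new deliveries against that record. A minimal Python sketch; this is not an OpenClaw API, the class and method names are hypothetical, and the record lives only in memory:

```python
import hashlib


class DeliveryLog:
    """Remembers fingerprints of messages already delivered per channel."""

    def __init__(self):
        # (channel, sha256 of text) -> timestamp of the first delivery
        self._seen = {}

    def check(self, channel, text, now):
        """Return None for a fresh message, else the first-delivery time."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        key = (channel, digest)
        if key in self._seen:
            return self._seen[key]
        self._seen[key] = now
        return None


log = DeliveryLog()
print(log.check("group-chat", "analysis document text", now=1000.0))
print(log.check("group-chat", "analysis document text", now=9000.0))
```

An exact-text hash only catches verbatim re-sends, which matches the symptom described above; a message that was regenerated with different wording would not be flagged.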
Hi. Who is using Skill: TTS (Text-to-Speech)? I'm trying it in the latest version of OpenClaw, but it doesn't work for sending audio to WhatsApp. Can you help me? Any tips?
Do you run everything under GLM? Have you tried using a free-tier LLM for heartbeats and sanity checks? I use the Google free tier with Gemini 3.1 Flash Lite to run the small, silly things, and a GPT 5.4 plan as the primary model.
it's crazy how my quota usage dropped. I literally did nothing all night, woke up and 27% of 5hr quota used on cronjobs
Hello, I need help understanding a possible billing issue or misleading usage display in my OpenClaw setup.
I am running OpenClaw self-hosted on a Hostinger VPS, inside Docker, and I connected it using OpenAI Codex OAuth / ChatGPT account authentication, not with an API key.
Important details about my setup:
• I am using openai-codex/gpt-5.4
• OpenClaw status shows OAuth
• I have not configured an OPENAI_API_KEY
• I have checked the environment and there are no active OpenAI API environment variables
• I have not enabled Codex credits
• I have not enabled credit auto-recharge
• My intention is to use OpenClaw only through the access already included in my ChatGPT subscription, without any additional paid API usage
The problem is that inside OpenClaw, the usage/status area appears to show that my usage is “costing money” or displays cost-related information, which is worrying because I specifically set this up to avoid any extra charges.
What I need clarified is:
- Does using OpenClaw through OpenAI Codex OAuth without an API key create any real billable API charges?
- If no API key and no credits are enabled, should usage simply count against my included usage limits rather than generate extra charges?
- Is the “cost” shown in OpenClaw only a local estimated/equivalent usage display, rather than a real charge?
- Is there any hidden path by which OpenClaw connected through OAuth could still generate paid OpenAI usage?
I would appreciate a clear explanation, because I cannot afford unexpected charges and I need to know whether my current setup is truly safe from extra billing.
Thank you.
Does OpenClaw show estimated token cost in the usage/status view even when you're using a fully free setup, like Codex OAuth / ChatGPT subscription auth with no API key, no credits, and no auto-recharge? In other words, is it just showing what the usage would cost if it were billed, rather than actual money being charged?
You can definitely run a local llm. Just not a big one
If you're sure, please help me with some advice. I have been struggling to set something up for the last few days. What are you using?
noob question: how do I install a specific skill only for a specific agent?
this describes the different places you can put them and what it means: https://docs.openclaw.ai/tools/skills#per-agent-vs-shared-skills
hint: for any docs page, append .md to the url [without the anchor] and tell your agent to read it for context: https://docs.openclaw.ai/tools/skills.md
Then it can't take more than 32k tokens of input. If that's the limit, it's normal for it to fail. Increase the context window, if the model you are using supports it.
If you don't have a dedicated GPU, don't try; better to use an AI API instead.
Content length, and its max is over 200k
Yeah, i run everything on GLM 5.1, hard tasks on GPT 5.4 xHigh
Never thought of adding a free model for the small tasks and heartbeat, that honestly makes sense why my quota just burns in 3 days instead of 7 lol
Then set the context length to 100k and see how it works. But remember: higher context window = more VRAM.
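To put a number on "higher context window = more VRAM": for a standard transformer the key/value cache grows linearly with context length. A minimal Python sketch, assuming plain full-length attention with 16-bit cache values; the layer and head counts are hypothetical and not taken from any specific model, and models with sliding-window attention or a quantized cache will need less:

```python
def kv_cache_bytes(context_tokens, n_layers, n_kv_heads, head_dim,
                   bytes_per_value=2):
    """Size of the key/value cache for one sequence at full context.

    The leading 2 accounts for the two tensors (keys and values)
    that are stored for every layer.
    """
    return (2 * n_layers * n_kv_heads * head_dim
            * context_tokens * bytes_per_value)


# Hypothetical model shape, for illustration only.
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128

for ctx in (32_768, 65_536, 131_072, 262_144):
    gib = kv_cache_bytes(ctx, LAYERS, KV_HEADS, HEAD_DIM) / 1024**3
    print(f"{ctx:>7} tokens -> {gib:.0f} GiB of KV cache")
```

With this made-up shape, every doubling of the context doubles the cache, on top of the memory the weights already occupy. That is one plausible reason a very large window fails on small hardware, though it does not explain a silent failure by itself.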
Tried 64k context length and it failed silently
thats why i am saying increase it to 100k and try again
Haven’t tried 100k but I have tried 256k, is there an error log I can check?
not sure about the error log
with 256k window its working?
Unfortunately no
Hi, I am new to OpenClaw and I have a question: is it possible to have some sub-agents and give them all a Telegram bot(or something similar) to do these:
- direct messaging the main agent.
- direct messaging the sub-agents.
- main agent assigning tasks to sub-agents in the background.
- have them all in one group on Telegram(or something similar) for planning and brainstorming and open team talk(them and me)
Is the above setup possible? and if so, how? because no matter what I do, i cant make all the parts work...
what I can do and have running so far:
- I can direct message the main agent,
- i can direct message the sub-agents.
- I have them all in one group and when I write something there, they all see and respond.
- But the main agent can see them out of the group to assign them something.
- None of them can see, or even knows, that it is in a group and writes there. I mean, although they respond in the group, in direct messages they don't know that they did so or that they were added to a group at all.
Update, I told my GPT 5.4 to configure Gemini 3.1 Flash lite to run my small crons and heartbeats, thx a lot
Let me know how it goes. I might get the GLM Lite coding plan myself, although they did go up in price recently.
Hey guys, what would be the best open-source model to run with OpenClaw if I have a laptop with 6 GB of VRAM?
Gemma 4 4B at best; I don't think you can run 27B or 31B.
okay I just need someone to sanity check me.
I have it working now, but the OpenRouter connector is broken right now, right? Because I could not get that thing running out of the box across several reinstalls, 2 machines, different operating systems, etc. Other connectors (Anthropic, OpenAI, Ollama) worked fine, just not the OpenRouter one.
okay thanks a lot
What problem are you trying to solve by doing this? It’s fun to see them talk to each other but if you are actually trying to get work accomplished there may be better options.
The 4B will have an atrocious context length though, if you want to keep it on the GPU. So realistically you're looking more at something like QWEN3.5:2B (I'm not an expert, this is just my take from a bit of testing I did yesterday).
anyone notice claw will no longer send files on telegram/slack?
I mainly need it for reasoning purposes and code review hence ig i will go with gemma btw thanks bro
Hi,
Of course my end goal is to get them to work and achieve results... but I also want to have the option to talk to them directly and also in a group...
Everything I have read and watched online mostly just shows how you have one main agent and some sub-agents, and how to control or give tasks to the sub-agents via the main agent...
Most use cases I have seen on the internet involve content creation, where one agent reads and summarises information, the next uses that to write a script, and then other agents generate content based on the script, and so on. This creates a workflow, a queue-style pipeline. However, my work differs because the tasks are interconnected and must proceed in sequence. For example, I can't book hotel rooms without guests, or rent a bus and send it to arbitrary locations without knowing where the guests are, so that it can choose the optimal route to save time and costs. Essentially, it's a chain of related tasks that need to move forward together, so all the agents should be in direct contact. Does that make sense?
Heeey ! D:
OpenClaw was working normally for me, then after an update (or maybe a re-install) it became super slow. Now it needs about 5 minutes for openclaw gateway run on my own machine. I have an OK gaming PC and everything works perfectly except OpenClaw, no matter how many times I uninstall it and try to remove every folder and every package... Any solution that doesn't involve formatting the whole thing?
For example, openclaw devices list takes 15 seconds to show anything.
Windows, 16 GB RAM, i5 10400F processor
Guys please I need your advice ! D:
There's the problem: Windows
Use wsl2
Seriously though, does everyone use WSL2? How did it run smoothly on Windows before?
Does anyone know why I call /tools and get Connected tools
browser (browser), memory_get (memory-core), memory_search (memory-core)
but then when I ask my agent to use memory_search it cannot? I checked the docs and didn't see an allow/block option
appreciate your help <3
Hello. I am running OpenClaw on a VPS. How do I get it to learn and store new information in its memory so that it refers to it?
i have the self learning skill but i dont think its doing anything
doesn't it have a context memory or something?
Can anyone help me with this error I think it's keeping my dreaming from fully working
[plugins] memory-core: narrative session cleanup failed for rem phase: missing scope: operator.admin
it doesnt use it tho
Hello everyone
I have a VPS with 8 GB of RAM.
Is it possible to work with a local model? Which model do you guys suggest?
Anyone having issues with OpenRouter setup? I can't get the models to run.
i use openrouter setup
i just set one api key in the config and set each model for each agent
I just set up one API key for one agent. I switched models and also used auto; still nothing.
Yes, exactly. The safety layer is part of the OpenClaw orchestration.
When you send a request to image_generate, the prompt is evaluated by my internal safety filters before the request is ever dispatched to the ComfyUI provider. If the prompt triggers a violation (in this case, the potential for depicting animal cruelty), I intercept the request and decline to execute the tool.
Any way to adjust what triggers these violations? The prompt mentioned hamsters on bicycles; that doesn't seem like animal cruelty to me.
Quick context on why the self-learning skill feels like it's doing nothing: it only nudges the model to describe what it learned, it doesn't actually persist + retrieve anything across sessions on its own.
Santiago's suggestions all work — different tradeoffs:
• dreaming → consolidates into summaries (good, but you lose detail)
• lossless-claw → keeps full transcripts (good, can get heavy)
• qmd → local markdown files (great if you want to read/edit them yourself)
If you want the "it just remembers and brings stuff back" experience on a VPS with zero file management, mr-memory is worth a look — it stores every turn at full fidelity and retrieves per-message, so the agent actually refers back to specific past things instead of a summary of them. One command: openclaw plugins install mr-memory.
Not dunking on the others — they're all valid. Just the shortest path to "agent remembers" on a VPS.
Hey guys. Is there any way to make a specific agent split its response into 2 or 3 separate messages?
Have you tried telling it to do that in its SOUL.md file? Most of us are trying to get it NOT to do that because it tends to stop after the first response. What problem are you trying to solve?
Running OpenClaw 2026.3.14 on ARM64 Mac Mini. Config was written by 2026.4.8. Persistent 409 Telegram conflict with two agents (Max and Guardian), each with unique bot tokens, dedicated agentDirs, and separate bindings. Tried: new tokens, session clear, agentDir fix, doctor --fix (creates rogue default account). 409 persists after every restart. Upgrade to 2026.4.8 fails on ARM64 due to matrix-sdk-crypto native addon missing. Any advice?
Can't seem to get heartbeats to fire. I've never really used them and feel like I may have done a hard switch-off early on to save tokens. I've used the openclaw system heartbeat enable cli command and it says it's enabled. I have the config set for one of my agents to heartbeat every 30m. Just nothing ever happens. Agent says that heartbeat timer isn't ticking over. Any ideas?
Look at the openclaw.json config file, is it in there? Set one of them to 15 minutes (non-default value) and make sure the HEARTBEAT.md has instructions to do something interesting and tell you about it in a specific channel.
Ya all that basic stuff is there. Even Opus is stumped
You’ve asked it from outside the openclaw instance, with access to the config / workspaces and the source code? Like start Claude Code in the home directory.
No I could try that
It says they are active in openclaw status. The one kinda weird thing is that I moved all workspaces to /workspace/<agent name> but everything else seems to be picking up
Claude Code is on it. Let's see if he can crack it 🙂
Fixed. Needed to add the target to the heartbeat config. Strange that my Opus agent didn't pick that up but Code did - seems like they dumbed down the API version
Hey, let me add more context on what I’m trying to achieve.
I want a specific agent to deliver its responses in multiple separate messages (following a burst-style format), instead of a single combined message.
On the LLM side, this is already working. I’m injecting a style guide into the system prompt via an agent:bootstrap hook, and the agent consistently formats the output into clear blocks (with spacing between them).
The problem is that, despite this, everything is still delivered to the user as a single message. So at this point, it seems clear that this is not an LLM issue, but a gateway / delivery behavior.
I’ve gone through the streaming docs:
https://docs.openclaw.ai/concepts/streaming
From what I understand:
• Preview streaming doesn't help here since it's only a temporary UI update, not real channel messages
• Block streaming is the mechanism that actually sends multiple messages to the channel
• However, block streaming appears to be controlled at the default + channel level (agents.defaults.blockStreamingDefault, *.blockStreaming, etc.)
I also noticed:
• blockStreamingBreak (text_end vs message_end) affects when chunks are flushed
• blockStreamingChunk defines how content is split (length vs newline, min/max chars, etc.)
• blockStreamingCoalesce may merge chunks before sending, which could be preventing the burst effect
• chunkMode: newline could help align with my block-based formatting
• humanDelay can be overridden per agent, which suggests some delivery behaviors are agent-scoped
What I’m trying to understand is:
Is there any way to scope block streaming behavior per agent (similar to humanDelay), so that only this specific agent sends multiple messages, while others remain unchanged?
If not, would the recommended approach be to handle this at the gateway level (e.g., post-processing the LLM output and dispatching multiple messages manually for this agent)?
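For the "post-process the LLM output and dispatch multiple messages manually" option, the splitting step itself is small. A minimal Python sketch, assuming the agent separates its blocks with blank lines as described above; split_into_bursts and its min_chars parameter are hypothetical names, not OpenClaw settings, and the actual sending is left as a comment because it depends on your gateway or bot code:

```python
import re


def split_into_bursts(text, min_chars=0):
    """Split a reply into separate messages at blank lines.

    Blocks shorter than min_chars are merged with the block that follows,
    loosely mimicking what a coalescing step would do.
    """
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    bursts = []
    for block in blocks:
        if bursts and len(bursts[-1]) < min_chars:
            bursts[-1] = bursts[-1] + "\n\n" + block
        else:
            bursts.append(block)
    return bursts


reply = "Hi!\n\nFirst point.\n\n\nSecond point.\n"
for part in split_into_bursts(reply):
    # Replace print with your own per-message send call.
    print(repr(part))
```

Doing this only for the one agent keeps the others unchanged, at the cost of maintaining a small piece of delivery code outside the built-in streaming settings.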
Can't seem to post to #1459642797895319552 . The slowmode timer spins down then goes ellipses into default state.
The harness matters! Claude Code the tool has a bunch of stuff in it that wraps around the model before your prompt goes in. When you use OpenClaw + Opus that is an entirely different setup. It’s Pi underneath, it says “minimal” on its website.
Same advice as above. Clone the openclaw source code and start Claude Code in the home directory where it can see the config and the workspaces and the source code. Set CC + Opus loose on the question / goal. It can change config, restart the gateway, do tests, (or ask you to, if it doesn’t have access,) look at the logs, etc.
I have Cowork running on another machine and never even thought to use it on my claw machine because I figured my agents would suffice. Major level up
Could it be that local lower parameters models like the 9b are just not smart enough to really run openclaw? I installed openclaw and used ollama models locally but half of the time my prompts don't really do anything and they hallucinate stuff
For example: Setting up the notion skill was impossible I tried it just said that it wasn't there and I couldn't get it to work
I'm planning on buying a subscription from MiniMax or OpenRouter credits, but I'm scared the problem isn't the model but me not knowing how to configure something as simple as skills (even though I'm already somewhat proficient with Linux and servers)
In general, yes. 9B is too small, especially if quantized down to 4 bits. TBH & IMO, 122B is barely acceptable at q8. It's certainly possible you put the skills in the wrong spot, or otherwise made a mistake, but even with that, your 'claw should be helping you figure out where to put stuff, or doing it for you. It's okay to double-check yourself, but at the core the idea of OpenClaw is that you should be able to get it to help you when you're unsure if you're doing it right, and you should be able to trust that it's right.
Sheesh 122b the bare minimum? I wonder why it was running so bad 😂
I’m doing fine with Gemma4:26b but as above I do not let it configure itself. A very capable frontier model does that. Then OpenClaw does its heartbeats and summarizes text and other administrative stuff. (No coding)
If you want the full magical experience where you tell it to do stuff and it just works then you will likely be paying either for the serious hardware (and electricity) to run a decent model or for API access to one.
I will say that things are getting better... I'm hearing of folks using Qwen3.6-35B-A3B to reasonable effect, and that it's a good Claw model, which might fit in a 24GB GPU at q4, but I'd still be hesitant to trust it to do really complex work. For reference, I use my 'claw for a bunch of things: ADHD-adjacent executive function management (helping me stay on task, and not lose tasks), coding, and as a homelab 'ops' team for my complicated setup. I'm fairly sure that the 35B could do the first, somewhat the second, but the third requires really long-flow work, planning and step-by-step debugging, and a lot of inherent knowledge, or the ability to search and find the knowledge.
Can I use skills from https://skills.sh/ ? OpenClaw is not listed in the "available" but I would like to use some of them.
I would start by asking your agent. Something like: Is it possible to use skills from https://skills.sh/ in OpenClaw? What's the process?
The answer is likely to be that you're going to need a translation or a shim that converts the skills into the way OpenClaw expects them; a SKILL.md markdown and maybe some auxiliary files, but this is something that you absolutely can check with your agent to get working, especially if there's a particular skill you want to use.
By default how many times does dreaming run when enabled?
I did but I had different answers depending on the model and sources used. So I thought it would be wiser to ask here.
I also read the FAQ thinking that it was a well-known issue but found nothing.
Ah; sorry. Yeah, to the best of my knowledge it's not generically usable, but if you have a specific skill, you might be able to feed it to OpenClaw with a good enough model, and have it convert it.
Ok thanks for your feedback 🙂
guys i am looking for great ui on pc for openclaw what you use ?
It was the update fix
Any recommendation for an API key? I tried to use Anthropic, but payment is credit card only...
If you have wise, you can use that to sign up using their temp (debit) card
What's the best one to use?
Hey big fam
I am totally new to this and I need:
1- a guide, preferably a video guide, with step-by-step instructions on how to install OpenClaw on my other computer 💻, run it, and create its personality, agents, etc…
2- a way to set it up so it won't use so many tokens
3- Should I create a different Apple ID for the Mac 💻 I'm running it on? Or just deactivate any cloud file sharing?
If there is a video, that would be great; a text file is also good.
Thank you so much 😊
I am forever grateful for you all 🙏✨🤍
only credit card i dont see any debit card option
Claire Vo has several good ones on her “How I AI” podcast series. I think agents should have their own world. Separate credentials. You can share stuff with them after they prove they can handle it. Like hiring an employee.
Wise virtual cards are actually debit cards.
Why don't you follow the docs? OpenClaw changes heavily between releases, so a video will quickly become outdated.
It’s supposed to be simple, you install and start it, and the agent asks you questions and updates itself. I think this only really works if you are using a very capable model… I started with one of the Qwen 3 models and it didn’t handle the multi-turn process at all.
If you just read the docs, you can save all those tokens on setting it up... remember, OpenClaw is a token hog.
When a new version is released and there is a breaking change, you again need to waste tokens on setting it up. Having the knowledge saves that too, and you know exactly what's happening.
Thank you so much! I checked it out and I couldn’t find a step by step tutorial or guide for that, however this is a very valuable resource for me. So grateful for this suggestion 🙏😃
This might sound really stupid !, but where would I find the docs? Are they here on Discord too?
Here is the page with how to install it: https://docs.openclaw.ai/install (Look around the rest of the site for more, for example at the top left click 'Get Started' for some more basic info.)
Thank you for accommodating my newbieness™️®️🫡
for any page on the site, append .md and give that to your favorite agent for context (web-based Gemini is very good at explaining OpenClaw) example: https://docs.openclaw.ai/concepts/agent-workspace.md
Please make sure you know the risk of directly installing in your daily driver pc. If you want to play it safe, install in a vps or if you are a little techy, use docker
OpenClaw Diagnostic Report: web_search (Gemini) API Key Failure
- System Information:
• OpenClaw Version: 2026.4.14 (323493f)
• Operating System: macOS (arm64)
• Primary LLM Provider: OpenRouter
• Failing Tool: web_search (native Gemini)
- The Problem:
The web_search tool consistently fails with a 400 - API key not valid error from Google's generativelanguage.googleapis.com endpoint. This occurs even though a valid Gemini API key has been set. The primary agent model via OpenRouter continues to work correctly.
- Error Message:
{
"error": {
"code": 400,
"message": "API key not valid. Please pass a valid API key.",
"status": "INVALID_ARGUMENT"
}
}
- Configuration (~/.openclaw/gateway.yml):
A minimal gateway.yml file has been created to explicitly set the key:
defaults:
env:
GEMINI_API_KEY: "AIzaSyDvii8MBC7V_wkXkUjmC977e_9-0DEHS1M"
(Note: I have confirmed via launchctl print that the gateway service is inheriting this environment variable correctly.)
- Troubleshooting Steps Attempted (All Failed):
• ~/.zshenv: Adding export GEMINI_API_KEY="..." was not picked up by the background service.
• launchctl setenv: Setting the variable for the GUI session was also not picked up after a restart.
• gateway.yml: Creating the configuration file above and restarting did not resolve the issue, even though launchctl shows the variable is present in the service's environment.
• Service Reinstall: Used openclaw gateway uninstall and openclaw gateway install to rebuild the launchd service. The issue persists.
• Log File Location: Unable to locate the ai.openclaw.gateway.log file in standard directories (~/Library/Logs/), preventing deeper inspection of the startup process.
Conclusion:
There seems to be a fundamental issue with this installation where the gateway service, despite inheriting the GEMINI_API_KEY environment variable, is not using it for the native Gemini web_search tool after the primary model was switched.
…to openrouter is supposed to be the end of the last sentence
Just use a different web search
Try adding the key to the agent's workspace/.env file. Global keys don't propagate down to the agent's exec environment.
I hope that isn't an actual Gemini API key
Hi guys. I'm new here and have been using claw for 4 days. Suddenly I have a problem in the terminal with this:
@pop-os:~$ openclaw help
[openclaw] Failed to start CLI: Error: The openclaw help command is unavailable because plugins.allow excludes "help". Add "help" to plugins.allow if you want that bundled plugin CLI surface.
at runCli (file:///h_____/de_modules/openclaw/dist/run-main-BBeVm29G.js:458:45)
I can't figure it out, and neither can the main agent.
Anyone have a clue?
is time travel possible with openclaw?
can i make reality VR game that like isekai anime games??
when using claude cli models it says 00:24:29 Embedded agent failed before reply: spawn ENAMETOOLONG
Why did my bot call the LLM models through the API over 250 times in a day, even though I'm sure I only prompted it 30 times at most that day?
Tool calls, thinking, prompt cache, etc.
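A back-of-the-envelope Python sketch of how 30 prompts can turn into 250+ API calls. It assumes a common agent-loop pattern where each tool round is its own model call, followed by one final answer, plus background runs such as heartbeats. All the numbers are illustrative guesses, not measurements from OpenClaw:

```python
def estimate_daily_calls(user_prompts, tool_rounds_per_prompt,
                         heartbeat_minutes=None, cron_runs=0,
                         calls_per_background_run=1):
    """Rough count of model API calls in one day."""
    # Each prompt: one call per tool round, plus one call for the final reply.
    interactive = user_prompts * (tool_rounds_per_prompt + 1)
    # Background activity keeps calling the model even when you are idle.
    heartbeats = (24 * 60) // heartbeat_minutes if heartbeat_minutes else 0
    background = (heartbeats + cron_runs) * calls_per_background_run
    return interactive + background


# 30 prompts, about 6 tool rounds each, heartbeat every 30 minutes.
print(estimate_daily_calls(30, 6, heartbeat_minutes=30))
```

Checking the provider's usage log against the times you were actually chatting is a quick way to see how much of the total is background activity.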
Hi all,
over the last few weeks I’ve been working with OpenClaw on Hostinger, and I keep running into a few bugs / issues. I wanted to ask whether others are experiencing the same thing.
My main problem is that sessions do not seem to be organized very reliably. Sometimes when I want to reopen a chat, it is no longer there. In other cases, parts of the context that were previously displayed seem to disappear after I send a new prompt. I also have the impression that some changes are not always being persisted properly.
More generally, I’m wondering how you structure your development workflow with OpenClaw, especially around the main agent. For example, when you start a new project, do you create a new Telegram topic so that the session remains easier to track and revisit later in the OpenClaw UI?
I’d really appreciate any tips or shared experiences.
Have a nice day
Darius
Discord is working pretty well for me. I have set up specific channels for the Discord bot (each with a different isolated OpenClaw session) so that I can isolate discussions to individual topics without the context window getting full as quickly as it does with one session.
But ideally I'd like to use Element/Matrix or SimpleX to keep everything fully FOSS. But my OpenClaw can't seem to figure out how to configure either of these apps. There is always some excuse.
Yeah, I did the same with my Telegram for starters. Which setup are you using (VPS, local)?
Hi! I'm having trouble receiving inbound WhatsApp Business messages with OpenClaw. Status shows "linked, running, connected" but no incoming messages appear in the logs. I've tried: disabling selfChatMode, restarting gateway multiple times, relinking several times. Logs only show "Listening for personal WhatsApp inbound messages" but nothing appears when messages arrive. Using WhatsApp Business app (not personal WhatsApp). Any ideas?
If the agent inside OpenClaw can't solve it, then step outside.
#users-helping-users message
Hey can I ask which API model are you using?
I'm using Anthropic Claude CLI (claude-opus-4-6) through the Claude CLI auth method, not an API key directly.
Anthropic blocked any way of connecting their models to OpenClaw without pay-as-you-go
I think it could get my account banned right?
What a helpful answer. Krill already helped me.
I will ask my agent cuz tbh i got no idea lol
Hi again. I found the root cause of my issue. OpenClaw shows "Listening for personal WhatsApp inbound messages" but I'm using a WhatsApp Business account. The logs show messagesHandled:0 and lastInboundAt:null — no messages are ever received. Is WhatsApp Business supported? Is there any workaround to make it work with a Business account?
Hey Edu, good catch, that likely explains it.
Right now OpenClaw’s built-in WhatsApp Web channel is aimed at personal WhatsApp Web sessions, not official WhatsApp Business account setups. Even the CLI/help text and logs point in that direction, for example things like “Link personal WhatsApp Web” and “Listening for personal WhatsApp inbound messages.”
So if you’re using a WhatsApp Business account, especially if you expect it to behave like a supported inbound channel, that’s probably why you see:
messagesHandled: 0
lastInboundAt: null
In short, WhatsApp Business is not reliably supported for inbound messaging in this flow.
Best workaround options:
• Use a personal WhatsApp account/number with OpenClaw. This is the simplest option if you need inbound messages to work now.
• Keep the Business number, but use the official WhatsApp Business API / Cloud API instead. If you need a true Business setup, this is the proper route. It’s a different integration path than the personal WhatsApp Web session OpenClaw is using here.
• Test with a personal account first. If a personal number starts receiving messages immediately, that confirms the issue is the Business account compatibility, not your general OpenClaw setup.
So the short answer is: personal WhatsApp, yes; WhatsApp Business, not really in this mode. Unfortunately there isn’t a clean built-in switch that makes the current personal WhatsApp listener fully work with a Business account.
Does anyone have any idea how to safely get rid of all the pre-bundled junk that comes with Open Claw? I don't want to wait 10 seconds every single turn just to say "Hello" because it's scanning every single extension. @sharp parcel is telling me this is not possible, but surely... not everyone here is actually working with that same limitation. Why scan 20 different model providers' software just to use one?
allow list / enabled: false are not the solution, it still scans everything.
this preexisting junk is the files in its workspace
so it's injecting its soul, agents.md, etc, everytime
no.
in short, i suggest a different phone number for it and a different phone (or if you're on a Samsung, use the secondary messenger feature which allows you to have duplicates of an app)
hello can i ask where to find the ios app? planning to make my own pwa
No, the md files aren't the bottleneck at all, mine have almost nothing in them. As I said it's the bundled extensions, I can see that in the logs, 37,635 openat() calls on plugin files
I also have no memory or sessions data, that was my first step in debugging, just nuking the shit out of that
it's all the extensions
zai, xai, xiaomi, vydra, vercel, volcengine, talk voice, stepfun, synthetic, qianfan, opencode, openai, memory-core, lmstudio, microsoft, google ... etc.
like 60 of them
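To see which bundled extensions account for those file opens, a trace can be tallied per extension. This is only a rough sketch: it assumes the syscalls were captured with something like `strace -f -e trace=openat -o trace.log -p <gateway pid>`, and the `/extensions/` path marker is a guess at the install layout, not something confirmed here.

```python
import re
from collections import Counter

# Matches the path argument of an strace openat() line, e.g.
# openat(AT_FDCWD, "/some/path", O_RDONLY|O_CLOEXEC) = 21
OPENAT = re.compile(r'openat\([^,]+,\s*"([^"]+)"')


def count_by_extension(lines, marker="/extensions/"):
    """Count openat() calls per directory found right after `marker`.

    `marker` is an assumption about where plugins live on disk;
    adjust it to whatever the paths in your own trace look like.
    """
    counts = Counter()
    for line in lines:
        match = OPENAT.search(line)
        if not match:
            continue  # not an openat() line
        path = match.group(1)
        idx = path.find(marker)
        if idx == -1:
            continue  # file is outside the extensions tree
        name = path[idx + len(marker):].split("/", 1)[0]
        counts[name] += 1
    return counts


if __name__ == "__main__":
    sample = [
        'openat(AT_FDCWD, "/opt/app/extensions/zai/index.js", O_RDONLY|O_CLOEXEC) = 21',
        'openat(AT_FDCWD, "/opt/app/extensions/zai/package.json", O_RDONLY) = 22',
        'openat(AT_FDCWD, "/opt/app/extensions/xai/index.js", O_RDONLY) = 23',
    ]
    for name, hits in count_by_extension(sample).most_common():
        print(name, hits)
```

Running it over one turn's worth of trace would show whether the reads are spread evenly or dominated by a few plugins, which is useful detail for a bug report.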
help
help me it just doesn't load
@barren blade i need help
@mental hatch could u help
help
Hi, can anyone help me with this issue setting up cron jobs via the Telegram or Discord channel? On a Windows setup with local loopback:
handshake timeout conn=48d725a7-1f15-4df3-bc43-33a6cc935178 peer=127.0.0.1:54499->127.0.0.1:18789 remote=127.0.0.1
So I can't use WhatsApp Business with Open Claw????? I need to do it with Business, not personal
I installed 2026.4.15 but Opus 4.7 does not show up as available. What am I doing wrong?
Everything I see says you will need to use the API for this. Perhaps try one of these? https://clawhub.ai/search?q=whatsapp+business
we need more info to help... how did you install openclaw (locally? hosted somewhere?) and what is it asking you / what are you seeing?
they aren't even enabled unless you explicitly use them
Noted🫶 I’ll be installing it on an old Mac 💻
that's really good actually
Shadow runs his claw on an old mac
But they're still read, literally 500 times per turn, it's absurd. Why.
my claw's life has been sucked out by this and is still fine
Hello guys, i want to ask, maybe one of you have experienced this
Why my openclaw gateway can consume ~18gb ram?
Its like happening constantly every night, when i’m not doing anything with my openclaw
cronjobs?
has anyone had the issue after installing openclaw 4.15 where telegram says "Agent couldn't generate a response. Please try again." and there's a blank response in the openclaw native chat UI? This is happening to me on a freshly wiped mac and fresh install of 4.15. Been dead in the water for 3 days.
any recommendations for open source chat UI? web- or desktop-based is ideal.
looking for one for my openclaw agent but also interested in one that supports a "dumb" model plugin.
Hello. I have set up openclaw with lmstudio. I have set up a local llm, gemma4 e4b, on an intel core ultra 5 135u laptop. The model works in lmstudio, however I get no response in openclaw. Are there any specific settings I need to change in openclaw? When I enter Hello in the prompt the first time after launching openclaw, it takes a while processing but then I get no reply. After the first attempt the time it takes is much less but I still get no response. The gpu is loading up, so the ai is processing.
Newer versions of OC are shit tbh
I did an Ansible push (unrelated change of removing an old /etc/sudoers.d file) and fubared my OpenClaw. Some updates and fixes and changes made since the previous Ansible push got undone I guess. I removed Node.js and pnpm and reran the Ansible, then did "openclaw onboard --install-daemon" and "openclaw doctor --fix". But it looks like the gateway is not running, and I get this "Cannot find package" error from the pnpm install of my OpenClaw version. I miss my claw, only been down for a few hours, but... Help! Anybody see the fix for me jump out at them?
node:internal/modules/run_main:123
triggerUncaughtException(
^
Error: Cannot find package '/home/openclaw/.local/share/pnpm/5/.pnpm/openclaw@2026.4.15_@emnapi+core@1.10.0_@emnapi+runtime@1.10.0_@napi-rs+canvas@0.1.99_@t_4f8e0c9bb61d14c7228165243d7f6933/node_modules/openclaw/node_modules/tslog/index.js' imported from /home/openclaw/.local/share/pnpm/5/.pnpm/openclaw@2026.4.15_@emnapi+core@1.10.0_@emnapi+runtime@1.10.0_@napi-rs+canvas@0.1.99_@t_4f8e0c9bb61d14c7228165243d7f6933/node_modules/openclaw/dist/logger-D8OnBgBc.js
Node.js v22.22.2
Ansible claims done:
| 1 | Ran apt update && apt upgrade | |
| 2 | Remove legacy openclaw-scripts sudoers file | Deleted a sudoers file from a previous playbook iteration |
| 3 | Update apt cache after adding Docker repo | |
| 4 | Reload UFW | |
| 5 | Update apt cache after adding NodeSource repo | |
| 6 | Template SearXNG settings.yml | |
| 7 | Create OpenClaw directories | .openclaw |
| 8 | Create OpenClaw directories | agents/main |
| 9 | Install OpenClaw globally as openclaw user (using pnpm) | pnpm install -g: installed/upgraded the OpenClaw npm package globally for the openclaw user |
| 10 | Create log directory (ntfy-subscriber) | Created ntfy log directory |
| 11 | Create syslog log directory (syslog-ng) | Created syslog-ng log directory |
| 12 | Restart searxng (handler) | Restarted the SearXNG Docker container |
how can i set up openclaw locally using ollama? I've been trying for over 3 days with qwen2.5:1.5b and it doesn't work! Please help me
Quick question to all of you: how are you approaching approvals? My instance is requesting approval from me all the time, apparently for the same thing where just the command is a bit different, even after I click allow always 😅
Hi! I'm having trouble uploading images in Discord desktop app. I can't drag and drop, Ctrl+C/Ctrl+V doesn't work, and the + button only shows 'Use Apps'. This happens in all channels/threads. Any idea what's wrong?
How is it installed? Linux? What are you doing and what specifically is going wrong?
i dont really know, windows, ollama
how do I get files to my bot? the bot says telegram isn't sending them
they're audio files so they're too large to email with agentmail...
Where is openclaw running? Where are the files? This may be a job for old fashioned scp and then a cron job or heartbeat to process them.
If you mean you installed directly on windows you probably want to switch to WSL. Check the docs on the website for recommendations.
I have it installed
Silent agent — no replies (dashboard/Telegram) after clean Ubuntu reinstall
Where can I paste my entire report? I need help, please?
There is a #1459642797895319552 channel with a bot for debugging.
thanks!
guys, how are you isolating claude cli or openai codex (not the api, the subscription oauth) that are using their own respective memory and not openclaw's?
I'm on windows. I tried running openclaw but it doesn't respond. I set everything up using openrouter on claude-sonnet-4.6 and whenever i send a message it loads 3 dots for a few seconds, then it doesn't respond and cancels its message. It stops loading and quits the message, and all that's visible in the chat is the message i sent. Openclaw does not return or send a message. I'm assuming it tried to initiate a response but gets into a time loop, or the UI is triggered and it's unable to respond. Wanted to know the fix for this so it responds to my messages.
All my api keys are valid, I added credit to my openrouter account ($5). didnt work
Is there something up with Discord connections? I started getting "Agent couldn't generate a response". I've been trying to fix it for 2 days.
Hi — new OpenClaw user on Windows 11. Installed today, onboarded with Anthropic API key. Getting HTTP 403 "Request not allowed" on every agent request.
Verified:
- Same API key works fine in Anthropic's Workbench (console.anthropic.com)
- Billing loaded: $19 credit, payment method on file
- Tried both claude-opus-4-7 and claude-sonnet-4-6 — both 403
Gateway log error:
[agent/embedded] embedded run agent end: isError=true model=claude-sonnet-4-6 provider=anthropic error=HTTP 403 forbidden: Request not allowed rawError=403 {"error":{"type":"forbidden","message":"Request not allowed"}}
Looking at openclaw.json, my auth profile is:
"auth": {
  "profiles": {
    "anthropic:default": {
      "provider": "anthropic",
      "mode": "api_key"
    }
  }
}
No key stored in that section. Is it stored elsewhere? How do I verify OpenClaw is actually picking up my API key?
Running OpenClaw 2026.4.15 on Windows 11. Thanks.
run openclaw models status at the command line and it should show you what api keys it knows about.
"Update: openclaw models status shows my new API key loaded correctly. I edited the config to use claude-sonnet-4-6 as primary, but after running openclaw configure --section model it reverted back to claude-opus-4-7 as primary. The dashboard displays 'opus' at the top. Messages queue but never get a response. Running v2026.4.15 on Windows 11."
if you have an Anthropic API key I assume you also have a Claude subscription. The way I sort things out is by starting the Claude Code CLI in the home directory of the installation outside of OpenClaw and tell it what is wrong. I also git clone the openclaw source code for it. Then it can examine the config, run openclaw ... commands, restart the gateway, look at the logs, and dig around in the source to figure out what is going on.
There is also a #1459642797895319552 channel in here with a bot that is good at debugging, it will ask you to run commands and paste things.
working fine for me
Still not working for me been at it for 7 hours now
I don't know if what's wrong is related. But WSL2 is recommended over installing directly on Windows, in case that's what you meant by Windows 11. https://docs.openclaw.ai/platforms/windows
Thanks I will give it a try
Seriously burned by Anthropic dropping Sub usage on OC. I tried GPT 5.4 via oAuth but its just not the same. Drops tasks, Agentic handling of personal tasks is subpar. And it's SLOW. For those of you not running local, what models have you had success with as orchestrator and personal assistant. I've used GLM with Hermes and liked it. I may try it in OC via OpenRouter.
Heh, even the Krill bot here is telling me to just file a bug report. I guess it really isn't my fault
Gemma4:26b is working fine for web search, text summary, simple tools like run a script and use the output. As above I do not let it mess with its own configuration. Claude Code CLI does that outside of OC, and the $20 subscription is plenty.
the krill bot sometimes doesn't get it right. I have my claude code fix it
I'm still working on it myself, but I need a break. I've been staring at this thing for hours.
If anyone else has any ideas: https://discord.com/channels/1456350064065904867/1495537701808767086
i will look at the post in 30mins
this is the way
Can you check the /agents/main/agent then find models.json and auth-profile.json
Guys, I will say it again: do not always install the latest version of OC. If you want a smooth onboarding and aren't worried about new features (as you shouldn't be at first), install v2026.3.28; it is the most stable
Does anyone know why openclaw runs so slow sometimes? any common fixes? Running codex 5.4 and just a message saying Hi will take 2 minutes to get a response
A message from where? How is OpenClaw deployed? If it’s hosted somewhere it might have to wake up first.
running locally. like if I message any of my agents saying hi it takes two minutes to start. even on new sessions
Message them how? Directly in the gateway web ui?
no discord
Running locally on what hardware? Does it have enough memory?
Run the gateway with the --verbose flag and watch the logs to see what it’s doing. Have your favorite agent look at the logs and see what it thinks.
what is a good model to run locally with a 2026 Apple Mac Mini M4 Pro (14-core CPU, 20-core GPU, 48GB RAM, 1TB SSD)?
Qwen 3.5 9b or qwen coder
Smooth
Or you can try bigger models, but quantized
It's not a matter of whether it will run but more of how much KV cache you will get (context)
So try to find the sweet spot balancing LLM power with context
Because what's the use of even opus 4.6 at 8k KV? You say hi and it forgets
How to solve high ram consumption?
I have cron, but only 1 job and run every hour
3.28 is too far back. 4.11 is quite stable as well, with some of the latest features, given a proper config
Well, it is the most stable one. We use it for production work and we don't update because it's doing its work correctly for our customers
Stable but vulnerable, updates are not rolled out just to add features, they also correct security issues and add security layers
4.14 is working really well for me.
I won't update to 4.15 for now.
I was trying to get a good local model running, but just can't get it responding fast enough. For a temporary fix (few months until I can get a proper system for it,) I will need to run a cloud model. Since I don't have any right now, can anyone suggest one that's currently good at tool use in openclaw?
Hi guys, I'm looking for what value is best to use for "contextWindow"
A context window is how much information an AI model can “remember” at one time.
Using a very large context window is not always better—it can make responses slower, more expensive, and less focused.
For most chat or assistant apps, keeping about 2,000 to 8,000 tokens of recent and relevant information works well.
Instead of adding everything, it’s smarter to keep only important parts and summarize older content.
In tools like LangChain, developers often use small chunks of data and only send the most relevant ones to the model.
A good approach is to start with a smaller context and increase it only if the model is missing important details.
Get the openai lowest tier subscription (use gpt-5.4-mini) and see how much it lasts with your usage before weekly quota runs out
If I use my agent to analyze code, do you think it's better to use those options? I'm using this conf: "contextWindow": 64000, "maxTokens": 9000. Do you think this conf is wrong?
anyone use hydra db?
While these settings are not incorrect, they are likely much larger than actually needed, which can lead to performance degradation.
For code analysis, setting "contextWindow" to 64,000 is only useful when inputting very large files or multiple files at once. In most cases, if you are only sending the relevant code sections, 8,000 to 16,000 tokens are sufficient. Setting "maxTokens" to 9,000 is also a considerably high value. Most code explanations or modifications can be sufficiently contained within 1,000 to 3,000 tokens.
Using such large values slows down response times, increases costs, and can sometimes blur the focus of the response. Instead, a better approach is to break the code into chunks and send only the important parts, which is how tools like LangChain generally handle code analysis.
In my opinion:
contextWindow: 16,000~32,000
maxTokens: 2,000~4,000
It sounds like you're using ollama. I changed to mlx and it is faster.
I am wondering if it’s a good idea to test karpathy’s auto research with local models on openclaw. I tested using api before, it was cool. Anyone tried?
And is it really normal that when I reset my session, openclaw shows this after the reset? 12k/110k (25%)
Depends on your model
Yes—that’s normal
What you’re seeing (12k / 110k (25%)) usually means your system still has ~12K tokens already in context out of a max ~110K window. A “reset” in many agents (including setups similar to OpenClaw) often doesn’t fully clear everything.
You need to delete the session in order to "reset" it, and it'll still show a decent amount of tokens used due to the initial context injection made by OpenClaw
Once you start a new session, of course
I really appreciate your help and time. I'm gonna try this new conf, thank you
I'm using Qwen3.5-9B
You are welcome.
Why would you use such a low context window? That's terrible advice.
Okay, I thought that using /reset in my chat resets everything in my conversation
Yes, the maxTokens should be preferably in like 2048 - 4096 depending on the tasks and overall usage
But the context window should always be set to the model's context window (which can be found with a quick search on the internet)
Or at least a close value
At first I think it's going to be functional. But I'm trying this to stress my GPU
Oh you're running local?
Yep, I'm running that model locally
Well that changes the panorama
Why?
I wasn't telling you to use those exact values; I was giving general advice on common problems that may arise during use.
I mean using such a small context window will mostly make your model insta-forget what it's doing pretty fast and have it compacted too often
But if you can't afford more, it's alright
16000 - 32000 is still too small, the OpenClaw context injection could easily fill that context in the first message
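The disagreement above comes down to simple arithmetic, which can be sketched as follows. The 12,000-token figure is the post-reset usage quoted earlier in this thread (12k/110k); the 4,096 output reservation is just one of the maxTokens values suggested here, and treating maxTokens as reserved out of the same window is an assumption that holds for many servers but may not match every setup.

```python
def remaining_context(context_window, max_output_tokens, injected_tokens):
    """Tokens left for the actual conversation after the startup injection
    and the reserved output budget are subtracted from the window."""
    return context_window - max_output_tokens - injected_tokens


if __name__ == "__main__":
    injected = 12_000    # startup injection seen after a reset in this thread
    max_output = 4_096   # illustrative maxTokens value
    for window in (16_000, 32_000, 64_000, 110_000):
        left = remaining_context(window, max_output, injected)
        print(f"contextWindow={window}: {left} tokens left for the session")
```

With those numbers a 16k window is already over budget before the first user message, which is the point being made about small windows.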
I'm looking for a balanced option. In my case I run that model locally, and I'm using a large contextWindow value
@dense sleet Do you think 64k is a good option for my context window?
64000 context window with 2048 max output tokens should be enough if your hardware supports it
On my local model deployment, my limit is GPU KV cache size: 31,152 tokens
INFO 04-19 20:27:44 [kv_cache_utils.py:1324] Maximum concurrency for 64,000 tokens per request: 1.84x
I'm the only one using my agent, so I don't care if I can only serve one request at a time
I can use 110k but I don't know if that option is enough
Your setup isn’t wrong—64K context is fine if OpenClaw adds a lot of data at the start.
The real problem is that too much context is being injected, which fills the window quickly.
Even if you increase the limit, sending too much code or data can make answers worse, not better.
A maxTokens of 9000 is also higher than needed—3000–5000 is usually enough.
The best fix is to reduce what gets added to context.
Tools like LangChain follow this approach to stay efficient.
Guys OPENCLAW shows i hit my chatgpt Limit but i refresh the site and i can talk normally to my AI
That's why OpenClaw automatically compacts once the contextWindow limit is reached
And yes, you could reduce the context injected
By disabling (for example) unused or unnecessary skills
Reducing the agent workspace files length
Memories
etc
learned a lot.
Is 64k enough for the context window, or do you think 110k is better?
I use automatic compact at 90%
I mean a higher context window is always better as it allows the agent to perform agentic tasks with more precision
So if your hardware supports it, sure
go with 110k if u can
And if you can give an answer: do you think vllm is a good choice to deploy my model?
Wdym?
I actually don't know, haven't used vllm
If you wish, I can tell my agent to investigate about that
I usually ask it to deploy a subagent with either DeepSeek or Nemotron to investigate
What do you use to deploy local models?
Or
Yeah
4096
If you experience performance drops just lower it to 2048
Okay, let me try your recommendations. I'm able to deploy my local model with 110k tokens. Let me change my agent config. I really appreciate your answers
any suggestions what models are you using for regular stuff and coding
trying to set up a new basic mac mini m4 16gb. Was thinking gemma 4 e2b and e4b for lightweight work, leaving the heavy work to the openrouter api, but not sure what models i need to use
Guys OPENCLAW shows i hit my chatgpt Limit but i refresh the site and i can talk normally to my AI
But i have to refresh every message
I tried Qwen3.5-4B and Qwen3.5-9B and are good
sorry my answer doesn't tag you
were you using local or cloud?
I'm using local but I
on 16gb?
If you only have 16gb ram I'll recommend use 3.5-4B
My pc has 16gb vram with a 5070ti and 32gb ddr4 ram. But I only use the full gpu; with offload the answer is really slow, but the model always gives you an answer
ok got it
Yep, take a look on Hugging Face, there's a picture describing it
And I tried both, but if you have limited resources I'd recommend first trying Qwen3.5-4B or downloading a quantized 9B AWQ model
I don't know what you use to deploy, but in vllm I always use --language-model-only with these models
Which memory plugin or skill is for now the best for the OpenClaw? 🤔
This is still low for coding, it should be somewhere around 100k and up
Those are the system instructions: the soul, agent files, and even memory if there is any. So at boot that context must be considered already used
I guess lossless-claw memory is good plugin
Hi... Can someone please help me understand how to calculate the memory needed for a quantised KV cache, for a given model and a given context length?
In this case it's Unsloth Minimax M2.7 Q4_K_M GGUF, 140GB MoE (230B, 10B active params) @ 200K context. I want to see whats the best kv quant option for 2*24gb 3090 & 128gb ram (172gb total memory)
Guys, during a call I'm getting this error: [openclaw] Failed to start CLI: Error: listen EADDRINUSE: address already in use 127.0.0.1:3334
Is there maybe a new feature that give you a live log of the tool calls etc that the agent is doing on telegram?
That usually means another openclaw process is still running on that port. Try lsof -i :3334 to find it and kill the old process, or if you're running in Docker just restart the container which clears the port binding.
Thanks for your reply. If I use that, it works initially. But after the port gets killed, when I ask the OpenClaw Telegram bot to start a conversation call again, it throws an error. I checked in the terminal and saw that the port starts again automatically.
I created a script and told the LLM to run it—so if the port is already running, it kills it first. This works as a temporary fix. But ideally, we shouldn’t need this. A direct call should work without handling the port manually.
I think this issue needs to be fixed in OpenClaw itself. What do you think?
That means another program is using the port, maybe an older instance of openclaw. Run: netstat -ano | findstr :3334. Find the Process ID (PID) at the end of the line, then run this command to close it: taskkill /PID [PID_NUMBER] /F
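Before killing anything, it can help to confirm that something is really listening on the port. A minimal cross-platform sketch using only the Python standard library; the host and port are the ones from the EADDRINUSE error above, and this only detects a listener, it does not identify or stop the process.

```python
import socket


def port_in_use(host, port, timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use("127.0.0.1", 3334):
        print("3334 is taken: find the owner with lsof or netstat as described above")
    else:
        print("3334 is free")
```

A wrapper script like the one mentioned earlier could call this first and only run the kill step when the port is actually occupied.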
For a model that large with 200k context, a 4-bit quantized cache may save a lot of memory, but at that scale you may need to offload some of it from VRAM. With 2 3090s and that much RAM you might be able to get it running
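On the "how to calculate it" part of the question: for a standard attention transformer the usual estimate is 2 (keys and values) x layers x KV heads x head dimension x context length x bytes per element. A rough sketch follows. The layer and head counts are placeholders, NOT the real MiniMax M2.7 values, so read the real ones from the model's config. The bits-per-element figures include llama.cpp's block overhead (q8_0 stores 34 bytes per 32 values, q4_0 stores 18 bytes per 32 values). Models with non-standard attention will not follow this formula exactly.

```python
# Effective bits per cached element, including per-block scale overhead.
BITS_PER_ELEMENT = {"f16": 16.0, "q8_0": 8.5, "q4_0": 4.5}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bits_per_element):
    """Estimate KV cache size: one K and one V vector per layer per token."""
    elements = 2 * n_layers * n_kv_heads * head_dim * context_len
    return elements * bits_per_element / 8


if __name__ == "__main__":
    # Hypothetical architecture values, for illustration only.
    n_layers, n_kv_heads, head_dim = 60, 8, 128
    context_len = 200_000
    for name, bits in BITS_PER_ELEMENT.items():
        size = kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bits)
        print(f"{name}: {size / 2**30:.1f} GiB")
```

The result is added on top of the weights (about 140GB for the GGUF mentioned), so comparing the sum against the 172GB total shows which cache type leaves headroom.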
4.11 has a CVE vulnerability, look up CVE-2026-33579
Anyone know the performance / resource usage of openclaw, specifically on 14y old hardware 😭
And the internet requirements (speed and amount)
Win10 LTSC, slightly optimized and debloated
my ass has a cve vulnerability-
literally every version has at least one security report
shadowling runs his openclaw on a 2017 (maybe 2014 afaik?) so you should be quite fine
I notice that there is a proactive mode - where openclaw can prompt me / itself without me sending it anything. i have yet to see this work - anyone have any suggestions?
AAAAND here we go.
That is probably the heartbeat. Every 30 minutes by default. You can also set up cron jobs on a schedule.
Alright thanks
You better update
No, besides that there is a proactive mode. If it isn't enabled, the heartbeat won't actually do anything, since it's just a beat unless proactive mode is set up to run on every heartbeat
nah
Or downgrade and doctor --fix
No, that was in versions prior to 2026.3.28, where it was already patched
keep it at q4 as well. it would be better for both kv
It was brought back again
Hi guys, I've installed the qBittorrent skill and it can see it and interact with it but it is refusing to add 🏴☠️ content to it on moral grounds 😂 Tried both GPT5.4 and MiniMax2.7 - wondering if I need to tweak some prompts or something to get it to be more obedient. Sorry, am fairly new to OpenClaw and I find YouTube is literally no help
OpenClaw always has many open CVEs, but not the one you mentioned for 4.11. Can you show me the link where this is indicated?
Hello, Im trying to figure out the limits of OpenClaw. My bot claims it can set up cron jobs to run independently, but from what I can tell, he never comes back after saying “I’ll handle it “
Are people actually running background scripts alongside OpenClaw to "poke" the AI, or is the bot just overpromising what it can do? Just trying to see if I’m wasting my time trying to automate stuff.
"overpromising" is a good way to put it yeah, that's a general pattern yes. However, there is indeed a built in cron system (separate from the OS cron system) that your agent can manage themself
you can see all cron jobs in the control UI
i know but why does my bot tell me "im on it" but never does something or comes back mid convo?
ah, that might be a few things... I've experienced that issue
what model are you using, and what version of openclaw?
so it isnt normal?
this was a known issue for a while and it might be solved by updating to newer OC
it's also a common problem for some models that aren't as good at tool use
It's somewhat common with gpt 5.4. It also gets worse as your context gets high in the session.
you'll find that different models perform rather differently in OC, like I personally enjoy the GLM family. Tried out MiniMax today but was not impressed with its tool use.
im on the newest openclaw and used the github copilot student plan with gpt 4.1 (without success) so i tested it with kimi k.2.5 and also minimax m.2.7 with ollama cloud so i can test it
Hi! Is anyone else facing connection problems with codex?
planning on just buying chatgpt plus and using codex, but is it really that great, and how is the usage?
another tip, you can try watching openclaw logs --follow in a terminal, then send a message, and watch what happens
i also had the glm lite plan and used 4.7 it was okay but did the same thing
if it stalls silently, probably the model. otherwise there should be something in the logs 💚
Hello. Do you guys know if there is something wrong with this? https://github.com/GAIR-NLP/ASI-Evolve because my Openclaw did not wanted to install it or even check it if it is bad/wrong etc. And now i do not know if this skill is "harmful" or what 🤔
Tried to ask ChatGPT about it, Gemini etc. But no information
On the heartbeat, I disabled mine because it was dragging my main session away from the model I want to use in the main session. Typically you want to use a cheaper model for heartbeats and then use the best model for your main session.
What you can do instead of using built in heartbeat is just create a cron that does the same thing, but you can bind it to a cheaper model and have it run in an isolated session so it doesn't touch your main session.
I'm not sure how often you'll hit limits with the plus plan. I use the pro plan, and I've never even come close to limits, so it's worth a try.
Regarding whether it's a good model or not, personally I find it to be the best model that is not opus 4.6.
What model are u using
i used the github copilot student plan with gpt 4.1 (without success) so i tested it with kimi k.2.5 and also minimax m.2.7 with ollama cloud
You'll want to switch to gpt-5.4 if available in your plan.
Model overlays can help too with it.
I think kimi 2.5 should be able to handle it, but as people indicate you may try some other models. If none work, the issue is probably in OC, not the model, as things sometimes get clunky. Just do a reset of the workspace so you get a clean new setup
I'm using OpenAI as my backend, using the GPT-5.4-mini model. I was using OAuth to connect to utilise my Plus plan's tokens. It's just stopped working.
When I try to re-authenticate I get the following in the OAuth flow:
Oops, an error occurred!
Workspaces not found in client auth session
gpt 4.1 in 2026 is crazy. why not use the gpt 5 models? try gpt 5.4 using the codex auth
I have it too. Downdetector says there is a problem with OpenAI
but Gemini, Copilot are not working for me too
But for example DeepSeek is working just fine
I'm surprised gpt 4 is in any discussion anymore...
exactly
Hey all, looking for approaches to set different timeouts for LLM models.
Goal is to use a long enough timeout of 30s on my cloud models and 5m for my local fallback models to give them a chance to load and react.
This needs to be done model based and not agent based to ensure a working fallback chain without waiting hours to get to local models or kill them before they are loaded and able to respond
It's back
See if something in this issue helps? https://github.com/openclaw/openclaw/issues/44310
Not supported yet — per-model/provider requestTimeout gets rejected as "Unrecognized key." There are several open issues for it: openclaw/openclaw#43946, #41371, #60636. Only timeout available today is agents.defaults.timeoutSeconds, which is the overall agent run, not per-model LLM request.
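Until something like that exists in the config, the desired behavior (short timeout for cloud models, long timeout for local fallbacks) can only be illustrated outside OpenClaw. The sketch below is generic Python, not an OpenClaw feature: `call_with_fallback` and `fake_call` are hypothetical names, and `fake_call` stands in for whatever function actually sends a request to a model.

```python
import concurrent.futures as cf
import time


def call_with_fallback(models, prompt, call):
    """Try each (name, timeout_seconds) pair in order; return (name, reply)."""
    errors = []
    for name, timeout_s in models:
        pool = cf.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(call, name, prompt)
        try:
            return name, future.result(timeout=timeout_s)
        except cf.TimeoutError:
            errors.append(f"{name}: no reply within {timeout_s}s")
        except Exception as exc:  # the model call itself failed
            errors.append(f"{name}: {exc}")
        finally:
            # Do not block on a slow call; note the worker thread itself
            # keeps running in the background until the call returns.
            pool.shutdown(wait=False, cancel_futures=True)
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_call(name, prompt):
    """Hypothetical stand-in for a real request; the cloud model is slow."""
    if name == "cloud-model":
        time.sleep(0.5)
    return f"{name} says hello"


if __name__ == "__main__":
    chain = [("cloud-model", 0.05), ("local-model", 5.0)]
    print(call_with_fallback(chain, "hi", fake_call))
```

With the values from the question the chain would look like `[("cloud-model", 30), ("local-model", 300)]`, so a stalled cloud request falls through after 30 seconds while the local model still gets its 5 minutes to load.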
Sad, would be awesome if it gets added in the future.
Maybe open-prose finds a way to integrate it till then
Yet that timeout isn't respected for HTTP requests
I don't know if it got fixed but there was a hardcoded 60 seconds request timeout
Interesting so I could use the long timeout and it still cuts the cloud model if it's not responding within 60s. But if I do the call for local models via https it would also kill it within 60s
anyone using a claude code subscription for your models rather than the API? If so, can you suggest how you set it up?
Yeah Claude OAuth has been working again for me since they changed the routing to go to "extra usage" billing. It is significantly cheaper than API tokens from what I'm seeing.
i was using claude cli but the issue I was facing is that it was using the claude.md memory
i dont see OAUTH option
Two separate things going on here:
- The CLAUDE.md issue: that's Claude Code CLI's own project memory file, not OpenClaw's. If you were running Claude Code directly, it loads CLAUDE.md from your project directory automatically. OpenClaw has its own memory system (MEMORY.md, SOUL.md, etc.) and doesn't use CLAUDE.md at all. So once you're running through OpenClaw you won't hit that problem.
- For the OAuth option: you won't see a button labeled "OAuth" in the setup. The way it works is through a setup-token from the Claude Code CLI. Run "claude setup-token" from your terminal (you need Claude Code CLI installed for this), then paste that token into OpenClaw with "openclaw models auth setup-token --provider anthropic". You can also try "openclaw models auth login --provider anthropic" which should walk you through the device flow.
Once that's in, set your model to something like anthropic/claude-sonnet-4-6 under agents.defaults.model.primary and restart the gateway. Usage will bill to extra usage on your account, not your subscription quota.
so you mean claude code still is billed by usage not by subscription?
Does anyone have any tips for using multiple AI agents in one discord server? I have issues with all the agents jumping in. What solution worked best for you?
Kind of. I mean.... to me it still seems like some of it is going to quota, not extra usage. When I say cheaper, I mean like 20x cheaper than API tokens, so some of it must be hitting subscription quota.
so it is going to use whatever tokens per hour your subscription has and then route to extra usage billing. Interesting, I will give it a try
exactly. Honestly I'm surprised it works, but glad it does!
can some1 help me
how much is the extra billing looking like for you? I think the subscription tokens are more than enough, as they refresh every 4 hours
what help do you need
$23.60 spent since 4/1
Did you setup this fresh and you initial command is not working?
can u join setup vc and i can show u
I had that yesterday. I just used my chatgpt codex as my model, did the initial onboarding setup, and then configured openrouter
and ya its fresh
yeah had that seems like openrouter issue
just use openai codex oauth for initial setup and then swap to openrouter
not sure what's wrong with openrouter on a fresh setup
Do you guys who run oclaw on w11 run wsl or just install it directly without?
what I'd try is make sure you have a little money loaded into the extra usage section of the Claude UI. They might look at that to decide whether or not to let you not get blocked on OAUTH CLI with OC. 🤷♂️ . But definitely a bunch of the usage is going to subscription quota. Also, worth noting I haven't hit subscription quota since I started this... It just seems kind of random when requests go to extra usage. It wasn't when I ran out of quota or anything like that.
cuz like idk what to do
WSL2
If you wanted to give OC more access to the machine its running on outside of the WSL2 directory is that possible?
maybe load like $5 and run with the subscription. I will do that and update you on it, thank you
so is like open router bugged
did you do openclaw onboard to setup right?
ya
Yeah. Your Windows drives are mounted at /mnt/c/, /mnt/d/, etc. inside WSL2, so OpenClaw can read and write files on the Windows side. You may need to configure mount permissions though. By default WSL2 mounts Windows drives without Linux permission metadata, so you may need to tweak the [automount] section of /etc/wsl.conf (for example options = "metadata") to control things like automount options and default permissions.
If you want OpenClaw's tools (exec, file operations) to be able to reach Windows paths, you also need to make sure those paths are within the agent's allowed directories. Check your openclaw.json for any path restrictions or sandboxing settings.
Keep the workspace itself on the Linux side for performance. Just reference /mnt/c/ paths when you need to touch Windows files.
You can also call Windows executables from inside WSL2 directly (powershell.exe, explorer.exe, etc.) if you need the agent to kick off something on the Windows side.
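If you end up pasting Windows paths into prompts or config, the drive-letter mapping above is mechanical enough to script. A minimal sketch, assuming the default /mnt/<drive> automount root. The function name is made up for illustration, and inside WSL the built-in wslpath utility does the same job.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_wsl_path(windows_path: str, mount_root: str = "/mnt") -> str:
    """Map an absolute drive-letter Windows path to its WSL2 mount path.

    Assumes the default automount root (/mnt); pass mount_root if yours differs.
    """
    win = PureWindowsPath(windows_path)
    drive = win.drive  # "C:" for drive paths; empty or a UNC share otherwise
    is_drive_letter = len(drive) == 2 and drive[0].isalpha() and drive[1] == ":"
    if not is_drive_letter or not win.root:
        raise ValueError(f"expected an absolute drive-letter path, got {windows_path!r}")
    # win.parts[0] is the drive anchor; everything after it is a path component.
    return str(PurePosixPath(mount_root, drive[0].lower(), *win.parts[1:]))
```

For example, `to_wsl_path(r"C:\Users\me\notes.txt")` gives `/mnt/c/Users/me/notes.txt`. Relative and UNC paths raise a ValueError rather than guessing.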
when you go through the process, choose openai codex as your model and use OAuth. After you do your initial setup, configure your openrouter API later from config
everything works except ai
wait so i should use codex, but wouldn't i need to spend more funds
This may be your issue: make sure those paths are within the agent's allowed directories.
@glad egret Have you come across any good tutorial videos for WSL2 setup using a local LLM like Gemma 4 or Qwen? I need to find a good local model I can run with my 5070...
Yeah. I saw something on youtube. I'm not sure if it's specific to WSL. Let me see if I can find it.
you use your chatgpt subscription, even the free tier. it's just for initial setup, and later it works fine when you swap to openrouter
There are obviously tons of tutorials and videos, but I haven't seen one that walks through everything and the problems that pop up during the install and config. I had set up openclaw through WSL2 and did the Telegram thing. Had everything working with qwen 2.5:9b I think it was... I had asked for a dashboard to be built and it kept failing to build anything. Obviously my prompt was much more detailed, explaining exactly what I wanted. I wasn't sure if it was the model I was using or what... I want something I can host locally, as I want to keep some of my data under close watch...
was it timing out, or just not starting?
Well it would take hours and no responses like a message saying, completed, so I would query it asking it how the project was going and it would sometimes respond working on x,y,z or it would say it hit some problems and trying to fix sometimes it wouldn't even respond...
I think timing out was a very high probability.
I wanted to build a simple dashboard so I could keep track of the projects it was working on. To see if they were in progress or finished or hitting roadblocks you know...
yeah for sure. that makes sense. I have something similar.
Honestly though it's probably a model issue... a 9B parameter model on a 5070 is just too small for complex agentic coding tasks like building a dashboard.
What is the best model I can run with 32gb RAM and a 5070 with 12gb vram?
Also is my understanding that I could use a cloud based model or something to build these dashboards and then a local LLM for my more sensitive stuff and it would keep that data separate?
Here's the long answer. This is from my AI who I've been through this testing with on a 4090, but the answer is tailored to your 12gb vram situation. Regular RAM shouldn't matter much on a W11. 32 is fine for OC.
With 12GB VRAM on the 5070, here's what's realistic for agentic/coding work in OpenClaw:
Best options that fit in 12GB at Q4 quantization:
- Qwen3 14B: probably your best bet for coding and tool use at this VRAM tier. Fits fully in 12GB, runs around 60 tokens/sec, and handles instruction-following and multi-step tasks well. "ollama run qwen3:14b"
- Gemma 4 26B MoE: this is the interesting one since you mentioned Gemma 4. It's a mixture-of-experts model so it only activates ~3.8B parameters per pass. It needs about 24GB RAM total but only around 17GB storage. Honestly this one might be tight on 12GB VRAM depending on context length. Worth trying but you may need to keep context short or accept some CPU offloading.
- Gemma 4 E4B: the 4B edge model fits easily in 12GB but it's small. Fine for simple tasks but probably won't handle building a dashboard any better than the Qwen 2.5 9B you already tried.
What to avoid:
- Anything 27B+ dense (like Qwen3.5 27B or Gemma 4 31B Dense) needs 24GB+ VRAM to run without heavy CPU offloading. You'd be back in the "hours with no response" territory.
The honest truth: 12GB VRAM is the ceiling for ~14B parameter models running fully on GPU. A 14B model like Qwen3 is significantly more capable than the 9B you were running, but for complex agentic tasks like building full dashboards, even 14B will struggle compared to cloud models. If data privacy is the main concern, the Qwen3 14B is your best local option and you should pair it with a tight, specific prompt rather than open-ended "build me a dashboard" requests. Break the work into smaller steps the model can handle one at a time.
Also make sure your timeout is set high enough so it doesn't get killed mid-generation. That was probably part of your earlier problem too.
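Rough arithmetic behind those sizing claims, as a sketch rather than a benchmark: weight size is roughly parameter count times bits per weight divided by 8, plus some headroom for KV cache and runtime buffers. The 2 GB overhead default below is an assumption, and real Q4 quants use a bit more than 4 bits per weight, so treat the result as a lower bound.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(
    params_billion: float,
    bits_per_weight: float,
    vram_gb: float,
    overhead_gb: float = 2.0,  # ASSUMPTION: headroom for KV cache and buffers
) -> bool:
    """Return True if weights plus the assumed overhead fit in the given VRAM."""
    return weight_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb
```

By this estimate a 14B model at 4-bit needs about 7 GB for weights and fits in 12 GB, while a 27B dense model needs about 13.5 GB before any overhead. For MoE models all experts still have to be loaded, so plug in the total parameter count, not the active one.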
Sorry for the spammy reply.
Alright. I will re-install via that youtube video and try Qwen3 14b - what are you using for your model?
@tough siren model still isnt working
Coding: Qwen3.5 27B
Reasoning: Gemma 4 26B MoE
But I use mainly cloud providers. But I also have those and they run okay.
anyone know how to get open router to work
So you're not worried about your data interfacing with cloud providers then, right?
what you think about minimax 2.5 for reasoning?
What is the best bang-for-buck cloud provider? I have gemini pro for the year from purchasing my pixel... not a fan of API options just because of the unknown expense I could incur. Does chatgpt have a model that works under the $20 subscription umbrella they offer?
It's good as a coding and agentic specialist. Not as much as a general reasoning model.
got it
Yes of course. That is your best option. Use OAUTH via the Openai-Codex path in OC. Use GPT-5.4.
Anybody have issues with openai/codex Oauth? I switched to a business account to increase my usage and ever since I can't get my openclaw to harness it. I even switched back to a personal openai plus account but now that won't work
I mean... keep in mind I'm comparing it to frontier models you know?.. So like... "good" compared to what? lol
All I get is rate limit errors or just something went wrong while processing my request
I haven't had that problem recently... In the past there have been issues, but it was usually related to OpenAI being down or being slow. I know what you're talking about though. That HAS happened to me, but not recently.
So on the sensitive data topic: say I'm working with my agent in the chat and using the Openai-Codex model for dashboard building etc. Then I swap the option to Qwen3.5. Does that mean everything I say to the OC agent while on Qwen3.5 stays out of Openai-Codex's model?
I've tried to update the config and re auth. I just can't get anything to work
Oh okay I just looked this up. Might be a plus plan issue due to rate limits. I have Pro $200, so it's never happened to me. Here's what I found:
Plus ($20/mo): The baseline tier. 160 messages per 3 hours on GPT-5, 3,000 weekly messages on GPT-5 Thinking. Access to Codex but at the lowest usage tier.
Pro $100/mo: 5x higher limits than Plus across the board. For Codex specifically, it's currently running at 10x Plus usage (temporary 2x promo through May 31, 2026, after which it drops back to the standard 5x).
Pro $200/mo: 20x higher limits than Plus. For Codex, it's currently at 25x Plus for the 5-hour window (also a temporary promo through May 31), with the ongoing rate being 20x. This is the "run it all day across parallel projects" tier.
Well. No not really. Because the session transcript persists on disk and gets forwarded as context to whatever provider handles the next turn. So no... in order to keep that information off it (OpenAI in your example), you'd need to start a new session.
Can anyone share tips on how you manage agent context window size, and what level of direction you need to give agents so they log enough context detail to stay fully productive and capable over the long term? I have been collaborating with agents to build self-use apps for now and it was going great at first. Then agents started flaking out without completing tasks despite itemizing next steps. I have tried running compaction and keeping the context window well within 200K per agent session, but the agent keeps saying "I will do x, y and z" and then does nothing, and when re-prompted says "You're right I didn't do that, That's on me...." over and over again without ever completing tasks. I have compacted several times etc. The same issues keep recurring.
Is that something I could leverage discord for? Having a server with specific channels to separate the sessions and models? I think that's how I understood the use case regarding Discord. Then I could prompt the agent through each of those channels like I would via Telegram on my phone instead of directly on the local hosted web ui chat space?
Yeah for sure. So maybe this is why you were asking about using specialized agents earlier. So yes, you could set up separate agents, one pinned to OpenAI and one pinned to local Qwen, each with their own workspace and session store. Then use bindings to route each Discord channel to the right agent. The transcripts stay completely separate since each agent has its own session history. Nothing from the local channel ever gets sent to OpenAI.
And yeah Discord works the same as Telegram or the web UI for messaging. They're all just input channels into the gateway. The session isolation happens on the OpenClaw side.
I have been getting quite a bit of output (simple single-user app code) and cron jobs created and executed reliably to create useful outputs with $US20/month gpt-5.3 codex
And once I have openclaw reinstalled through WSL and connected to gpt-5.3 codex via the $20/month option, I would expect I could tell openclaw what my goal is for discord and it should be able to put that together for me?
Yes exactly.
Tell it to check the OC docs too when you instruct it to do that. Sometimes it doesn't do that by default.
I'm replying to this, but it's going to be long so get ready. This is actually my OC's response because it can describe what I have setup to manage this.
Couple things going on here. First, compaction on current builds is buggy. There are open issues (#57410, #7477) where it does a full context reset instead of compressing, so your agent loses all task state. Run /context list after compacting to verify it actually preserved a summary and didn't just wipe to ~29K tokens.
Second, the "I'll do it" then nothing loop is a known runtime bug (#57263, #40082). The agent prioritizes generating a reply over executing the pending action. Reproduced across multiple models, so it's not your model choice.
But the bigger issue is architectural. If your agent's task awareness lives in the context window, compaction will eventually kill it. The context window is volatile. You need to treat disk as the source of truth, not the conversation.
Here's what works for me:
- Mandatory run lifecycle logging. Every task gets a start/stop timestamp written to a daily memory file (memory/YYYY-MM-DD.md) immediately, not batched, not deferred. Add this as a non-negotiable rule in your AGENTS.md or SOUL.md.
- Persistent state files for anything stateful. Task progress, commit mappings, checkpoint data, all in JSON files in the workspace. The agent reads these on every boot, so it knows where it left off regardless of what the context window contains.
- Deterministic boot sequence. On every session start (including after compaction), the agent loads a fixed set of files in order: SOUL.md, USER.md, TOOLS.md, today's memory file, yesterday's memory file, MEMORY.md. This means if compaction wipes the conversation, the agent still picks up its task state from disk on the very next turn.
- Auto-generated bootstrap context. I have a hook that runs at session start, does a vector search across workspace files and a memory store, and injects the most relevant context (capped at a token limit) into the session. So the agent doesn't just get its rules, it gets a pre-loaded snapshot of what's most important right now.
- Auditors that catch drift. Scripts that monitor for stale tasks, unmapped work, context bloat, etc. These run independently and flag problems before the agent silently drops a task.
The pattern is: mandatory logging to disk on every action, persistent checkpoints for stateful work, deterministic recovery from disk on every session start, and independent auditors watching for drift. Compaction becomes survivable because nothing critical lives only in the context window.
The "I'll do x, y, z" then nothing problem mostly goes away when the agent has to write what it did to disk before it can report done. It forces execution before conversation.
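To make the "disk as source of truth" idea concrete, here is a minimal sketch of the two write paths: a checkpoint file that is replaced atomically, and an append-only daily ledger. Only the memory/YYYY-MM-DD.md naming comes from the description above; the state/current.json path, the record fields, and the function names are placeholders, not anything OpenClaw ships.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_state(workspace: Path, task: str, status: str) -> dict:
    """Atomically write the current task state to state/current.json (hypothetical path)."""
    state_dir = workspace / "state"
    state_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "task": task,
        "status": status,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Write to a temp file in the same directory, then rename over the target,
    # so a crash mid-write never leaves a half-written state file.
    fd, tmp_name = tempfile.mkstemp(dir=state_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(record, tmp)
    os.replace(tmp_name, state_dir / "current.json")
    return record


def append_ledger(workspace: Path, task: str, event: str) -> Path:
    """Append one timestamped line to today's memory/YYYY-MM-DD.md file."""
    now = datetime.now(timezone.utc)
    memory_dir = workspace / "memory"
    memory_dir.mkdir(parents=True, exist_ok=True)
    ledger = memory_dir / f"{now:%Y-%m-%d}.md"
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(f"- {now.isoformat()} {event}: {task}\n")
    return ledger
```

The ledger is only ever appended to, so a start with no matching stop is visible evidence of a dropped task, while the checkpoint always holds the single latest state for the boot sequence to read.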
Can someone help me
I’m using open router api and it ain’t working
This may be the most helpful sentence to you: "The "I'll do x, y, z" then nothing problem mostly goes away when the agent has to write what it did to disk before it can report done. It forces execution before conversation."
Known bug on current OpenClaw versions. OpenRouter's native integration is broken on 2026.4.14 and 2026.4.15. There are several open issues for it (openclaw/openclaw#67698, #67575, #68185). Your API key and config are probably fine. Direct calls to OpenRouter work, it's OpenClaw's execution path that's dropping the response.
You can confirm by running "openclaw logs --follow" and looking for "incomplete turn detected: payloads=0". Not much to do except wait for a patch or try rolling back to an older version.
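If you'd rather count occurrences than eyeball the stream, a tiny filter does it. This is a sketch that only assumes the marker string quoted above appears verbatim in the log lines; the function name is made up.

```python
from typing import Iterable

# Marker string taken from the log message quoted above.
MARKER = "incomplete turn detected: payloads=0"


def count_incomplete_turns(lines: Iterable[str]) -> int:
    """Count log lines that contain the dropped-response marker."""
    return sum(1 for line in lines if MARKER in line)
```

It takes any iterable of lines, so you can pass it an open log file, or sys.stdin when piping the output of openclaw logs --follow into a script.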
Excellent! Thanks very much, Cameron. Super helpful!
Probably just a config issue. If the API key is wrong or the endpoint is corrupt, you might get a fallback rate-limit message
Try RAG memory instead of compaction so it has a fresh context, and in SOUL.md direct it precisely on how to use the RAG memory
My agents aren’t responding on discord anymore. Has something changed recently?
Do openclaw logs --follow
Forgot that existed thanks
No worries
Anyone else using vector as supplemental memory storage? If so, what are you using and how is it working for you?
I'm running OpenClaw 2026.4.15 in Docker on a VPS. API key is configured in auth-profiles.json, model is set to anthropic/claude-haiku-4-5-20251001, but the dashboard keeps showing 'no API keys configured'. agents add main keeps failing with 'model not found: anthropic/claude-opus-4-7'. How do I fix the default model reference?
Two known bugs on 2026.4.15 hitting you at once:
- The "no API keys configured" dashboard error happens even when your key is properly in auth-profiles.json. Several issues report this (#66806, #39903). The dashboard display is wrong but the key may actually be working. Check your gateway logs to see if requests are actually failing or if it's just the UI.
- The "model not found: anthropic/claude-opus-4-7" on agents add is because 2026.4.15 has a provider registration bug (#68241). The per-agent models.json cache regenerates without the anthropic provider block. Opus 4.6 works through a legacy hardcoded path, but 4.7 requires the provider to actually be in the cache, and it's not getting populated.
The fix for the model issue: explicitly set your model to something 2026.4.15 can resolve. Since you want Haiku, try:
openclaw models set anthropic/claude-haiku-4-5-20251001
Or set it directly in openclaw.json under agents.defaults.model.primary. The agents add command is failing because it's trying to default to opus-4-7 which isn't in the model cache.
For the dashboard "no API keys" display, verify your key is actually working by sending a test message or running:
openclaw agent --to self --message "test"
@glad egret That video ended up installing my openclaw in /tmp/openclaw and it's all wonky now lol
I added this to AGENTS.md and started new session
- state JSON logging in the standardized JSON files in ~/workspace/state/ folder is mandatory for every turn by default, and only skipped if a written project/session rule says it is disabled for that context.
I made the other suggested .md changes as well
I asked the agent what its state logging procedure was and it described it accurately
Then I asked the agent to do a task that involved creating multiple new .md files to create a new agent and it didn't write anything to the standardized state JSON logs.
When I pointed that out to the agent, it said "I should have done that ... will do it now..." but still isn't actually doing it
I captured the message logs and had ChatGPT analyze it and it gave some suggestions:
Option B — force tool usage
(logging must be a tool call, not a suggestion)
Option C — external logging (best)
(log outside the agent, not inside it)
Do you let your agent(s) execute their own logging, or do you give explicit state logging schema requirements?
oooh. sorry! I've never actually tried the stuff in the video. My bad!
Checking. I don't remember. It's been a long time. Will reply shortly.
Both, but the honest answer is you can't fix a follow-through problem by adding more follow-through instructions.
Telling the agent "you must log" and having it not log is the same problem you started with. The agent says it'll do something and doesn't. Adding more rules about logging doesn't break that cycle.
External logging is your foundation. The runtime already writes session transcripts (JSONL files) for every tool call, message, and action. You don't need the agent to self-report. Write a script that reads the transcript and extracts task state from what the agent actually did: files written, commands run, tools called. This is Option C and it's the most reliable because it doesn't depend on the agent remembering anything.
Agent self-logging works when the instruction is atomic, not aspirational. "Log state every turn" is aspirational. The agent agrees conceptually but never does it because there's always something more interesting to do first. What works:
• Gate completion on it: "You may not report a task as complete until you have written {task, status, timestamp} to state/current.json. If that file doesn't reflect the work, the work didn't happen."
• Make it the first action, not the last: "Before starting any task, write status:started. Before responding to the user, update to completed or blocked."
• Use a structured tool if one exists (like TodoWrite) rather than freeform file writes. The tool enforces the schema so the agent can't half-ass it.
Agent self-logging will always have gaps though. External capture catches everything. Use both: external logging as the source of truth, agent self-logging as a discipline that improves completion rates because writing "started" forces the agent to commit before getting distracted.
The "I should have done that, that's on me" loop is the agent being polite instead of executing. The fix isn't more instructions. It's making logging structurally unavoidable (external capture) or gating something the agent does care about (reporting done) behind the logging step.
To be clear, the disk logging I mentioned earlier still applies. You need both the agent writing state and an external capture layer watching what it actually did.
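The "gate completion on it" rule can also be enforced from outside the agent. A minimal sketch, assuming the {task, status, timestamp} record in state/current.json described above and a status value of "completed"; the function name and the idea of calling it from a wrapper script are illustrative, not an OpenClaw feature.

```python
import json
from pathlib import Path


def may_report_done(state_file: Path, task: str) -> bool:
    """Return True only if the state file records this task as completed."""
    try:
        state = json.loads(state_file.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return False  # no readable state file means the work "didn't happen"
    if not isinstance(state, dict):
        return False
    return (
        state.get("task") == task
        and state.get("status") == "completed"
        and bool(state.get("timestamp"))
    )
```

A wrapper would call this before relaying the agent's "done" message and bounce the turn back to the agent when it returns False, so the check doesn't rely on the agent policing itself.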
Here's what I actually run to deal with this.
External capture over self-reporting. The runtime writes session transcripts automatically for every tool call and action. A cron scans those transcripts every 2 hours, extracts the meaningful content, and pushes it into a memory store. The agent doesn't have to remember to log. If it happened, it's captured.
Gated cron contracts. For automated recurring work, wrapper scripts must produce a result artifact before the agent prompt can generate output. If the artifact doesn't exist, the work didn't happen and the agent can't fake a completion.
Narrow mandatory rules, not blanket policies. Instead of "log everything every turn," I write specific protocols for specific task types. Report generation has a full template-clone-and-verify protocol. Outbound messages on any channel must be logged with timestamp and target. These are concrete enough that the agent follows them because there's zero ambiguity about what "done" means.
Deterministic boot with vector search. On every session start, a hook runs a semantic search across workspace files and a memory store, then injects the most relevant results into the session. After compaction or reset, the agent comes back with awareness of what matters right now, not everything, just the top results capped at a token budget.
Append-only daily ledger. Every automated job writes start and stop timestamps to a daily memory file. A consolidation pass runs overnight and promotes the most important entries into long-term memory that loads on every session.
Independent monitors. A reaper cleans stale sessions every 2 hours. A watchdog checks system health on a schedule. A context monitor triggers an automatic reset with state preservation when the session hits capacity.
The pattern that made the biggest difference was external transcript capture plus the deterministic boot. Once your agent's memory doesn't depend on the context window surviving, compaction stops being a crisis.
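For the external capture piece, the core of the transcript scan is just a JSONL filter. A rough sketch only: the "type", "tool" and "ts" keys and the "tool_call" value are assumptions for illustration, so check a real transcript file and adjust the names before relying on it.

```python
import json
from typing import Iterable


def extract_actions(lines: Iterable[str]) -> list[dict]:
    """Pull tool-call records out of JSONL transcript lines.

    ASSUMPTION: each line is a JSON object with "type", "tool" and "ts" keys.
    The real transcript schema may differ; adjust the key names to match.
    """
    actions = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip partial lines, e.g. a record still being written
        if isinstance(event, dict) and event.get("type") == "tool_call":
            actions.append({"tool": event.get("tool"), "ts": event.get("ts")})
    return actions
```

Because it reads what the runtime recorded rather than what the agent claims, a message like "I will do x" with no tool call after it shows up as an empty result.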
That is a lot of words
yeah sorry it's my AI writing it saying what my setup does that works.