How to use local ollama model to run heartbeat? | Friends of the Crustacean 🦞🤝

uncut niche Mar 13, 2026, 4:30 AM

I want to use a local Ollama model to run heartbeat.
OpenClaw version: 2026.3.11
Model: qwen2.5-3b on a CPU machine
My heartbeat config:

        "heartbeat": {
          "every": "1m",
          "model": "ollama/qwen2.5-3b-64k:latest",
          "lightContext": true
        }

but the OpenClaw logs report:

12:00:01 warn agent/embedded {"subsystem":"agent/embedded"} Removed orphaned user message to prevent consecutive user turns. runId=45b64a2b-229c-4439-89c3-e39cac3286c7 sessionId=2a73ffd0-bea9-4015-b250-6b4de6df91b5
12:10:01 warn agent/embedded {"subsystem":"agent/embedded"} embedded run timeout: runId=45b64a2b-229c-4439-89c3-e39cac3286c7 sessionId=2a73ffd0-bea9-4015-b250-6b4de6df91b5 timeoutMs=600000
12:10:01 warn agent/embedded {"subsystem":"agent/embedded"} Profile ollama:default timed out. Trying next account...

I tried testing the model with curl, and it works fine:

meepo@meepopc:/data1/workspace$ curl -N http://localhost:11434/api/chat -d '{
  "model": "qwen2.5-3b-64k:latest",
  "messages": [
    { "role": "user", "content": "1+1=?" }
  ],
  "options": {
    "num_ctx": 32768
  }
}'
{"model":"qwen2.5-3b-64k:latest","created_at":"2026-03-13T03:55:16.18305073Z","message":{"role":"assistant","content":"two"},"done":false}
{"model":"qwen2.5-3b-64k:latest","created_at":"2026-03-13T03:55:16.272794902Z","message":{"role":"assistant","content":""},"done":true,"done_reason":"stop","total_duration":602120389,"load_duration":263165407,"prompt_eval_count":39,"prompt_eval_duration":91826431,"eval_count":2,"eval_duration":86661250}

Is it because the OpenClaw HEARTBEAT is carrying too much context? Can we reduce it?

fierce sparrowBOT Mar 13, 2026, 4:36 AM

That embedded run timeout … timeoutMs=600000 means the heartbeat turn never finished within 10 minutes. Your curl /api/chat test is “all ok” because it’s a tiny prompt — but OpenClaw heartbeats can be much heavier than 1+1=? even with lightContext.

Key point: lightContext: true only shrinks the injected bootstrap files (it basically keeps just HEARTBEAT.md). It does not automatically remove:

- the existing session chat history, and/or
- the tool schema payload (which can be surprisingly large, and small CPU models struggle with it)
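
To get a feel for why a tiny curl prompt is not representative, the timing fields from the curl output in the question can be extrapolated to a larger prompt. This is only a rough sketch: it assumes prompt processing speed stays constant as the prompt grows (an optimistic assumption on a CPU), and the 20,000-token prompt and 200-token reply are hypothetical figures, not measured OpenClaw values.

```python
# Durations in Ollama's /api/chat response are reported in nanoseconds.
NS_PER_SECOND = 1_000_000_000


def tokens_per_second(count: int, duration_ns: int) -> float:
    """Convert a token count and a nanosecond duration to tokens per second."""
    return count / (duration_ns / NS_PER_SECOND)


def estimate_run_seconds(prompt_tokens: int, output_tokens: int,
                         prompt_rate: float, eval_rate: float) -> float:
    """Estimate wall time as prompt processing time plus generation time."""
    return prompt_tokens / prompt_rate + output_tokens / eval_rate


# Figures copied from the final line of the curl output in the question.
prompt_rate = tokens_per_second(39, 91_826_431)  # prompt_eval_count / prompt_eval_duration
eval_rate = tokens_per_second(2, 86_661_250)     # eval_count / eval_duration

# Hypothetical heartbeat turn: 20,000 prompt tokens in, 200 tokens out.
estimate = estimate_run_seconds(20_000, 200, prompt_rate, eval_rate)

print(f"prompt rate: {prompt_rate:.0f} tok/s, eval rate: {eval_rate:.0f} tok/s")
print(f"estimated heartbeat turn: {estimate:.0f} s")
```

With these inputs the estimate lands a little under a minute, which already leaves no headroom at every: "1m", and real runs on a CPU tend to get slower as the context grows.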

Ways to reduce heartbeat “context” / make it finish fast

Don’t run heartbeat every 1 minute while debugging
If a run takes longer than 60 seconds, you’ll see warnings like:
Removed orphaned user message to prevent consecutive user turns
because new heartbeat ticks stack up behind an in-flight or timed-out turn.
Try every: "5m" or "15m" first.

Move heartbeat to a “clean” agent (best fix for giant session history)
If your main session has lots of messages, heartbeat will drag that history along. The easiest fix is a dedicated heartbeat-only agent:

{
  agents: {
    list: [
      { id: "main", default: true },
      {
        id: "hb",
        tools: { profile: "minimal" },
        heartbeat: {
          every: "5m",
          model: "ollama/qwen2.5-3b-64k:latest",
          lightContext: true,
          target: "last"
        }
      }
    ]
  }
}

Important nuance: if any agent has a heartbeat block, only those agents run heartbeats.

Shrink the tool payload for Ollama (often the hidden killer)
If you don’t want a separate agent, at least restrict tools for the Ollama provider/model so the system prompt isn’t gigantic:

{
  tools: {
    byProvider: {
      ollama: { profile: "minimal" }
      // or even "ollama/qwen2.5-3b-64k:latest": { profile: "minimal" }
    }
  }
}

That cuts the available tools down drastically (and reduces the schema the model has to “read”).
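
As a rough illustration of how tool schemas add up, the sketch below compares the approximate prompt size of a large tool set against a single tool. Everything in it is an assumption made for illustration: the tool names and schemas are made up (they are not OpenClaw's real tools), and the 4-characters-per-token ratio is only a common rule of thumb.

```python
import json

# Rule-of-thumb assumption: roughly 4 characters per token for English/JSON text.
CHARS_PER_TOKEN = 4


def estimate_tokens(payload: object) -> int:
    """Very rough token estimate for any JSON-serializable payload."""
    return len(json.dumps(payload)) // CHARS_PER_TOKEN


def make_tool(name: str, n_params: int) -> dict:
    """Build a hypothetical function-calling schema with n_params string parameters."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": f"Hypothetical tool {name}, used only for size estimation.",
            "parameters": {
                "type": "object",
                "properties": {
                    f"param_{i}": {"type": "string", "description": "Placeholder parameter."}
                    for i in range(n_params)
                },
            },
        },
    }


# Made-up tool sets: 30 tools with 6 parameters each versus a single small tool.
large_tool_set = [make_tool(f"tool_{i}", 6) for i in range(30)]
small_tool_set = [make_tool("status_tool", 1)]

print("large set:", estimate_tokens(large_tool_set), "tokens (approx)")
print("small set:", estimate_tokens(small_tool_set), "tokens (approx)")
```

The absolute numbers mean little, but the ratio shows why trimming the tool list can matter more than trimming the checklist for a small model.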

Docs (relevant knobs)

Heartbeat config + lightContext: https://docs.openclaw.ai/gateway/heartbeat
Ollama provider notes: https://docs.openclaw.ai/providers/ollama
Tool profiles / allowlists (why this affects prompt size): https://docs.openclaw.ai/gateway/sandbox-vs-tool-policy-vs-elevated

If you paste one thing, paste this

Can you run and paste (redact anything sensitive, though Ollama keys are usually dummy):

openclaw models status --plain

Mainly I want to see (a) what baseUrl OpenClaw is using for Ollama, and (b) what tool profile is currently active.

uncut niche Mar 13, 2026, 4:42 AM

How do I create a clean heartbeat agent?

fierce sparrowBOT Mar 13, 2026, 4:44 AM

To make a “clean heartbeat agent”, you want a separate agentId + separate workspace + its own empty session history, and then enable heartbeat only on that agent.

1) Create a new agent + fresh workspace

Use the CLI (recommended):

openclaw agents add hb --workspace ~/.openclaw/workspace-hb

This creates a new isolated agent with its own:

- workspace files (AGENTS.md, SOUL.md, HEARTBEAT.md, …)
- session store (so it won’t carry your main chat history)

Docs: https://docs.openclaw.ai/cli/agents

2) Configure heartbeat only on that agent

Edit ~/.openclaw/openclaw.json (or use openclaw config set …) so only hb has a heartbeat block.

Example:

{
  agents: {
    list: [
      { id: "main", default: true },

      {
        id: "hb",
        workspace: "~/.openclaw/workspace-hb",

        // IMPORTANT: keep tools tiny so the system prompt isn't huge for a small CPU model
        tools: { profile: "minimal" },

        heartbeat: {
          every: "5m", // start slower than 1m while testing
          model: "ollama/qwen2.5-3b-64k:latest",
          lightContext: true,

          // deliver somewhere explicit (Discord example)
          target: "discord",
          to: "1481872120282873887" // channel/thread id you want to receive heartbeats in
        }
      }
    ]
  }
}

Notes:

- tools.profile: "minimal" matters a lot for small local models because it greatly shrinks the tool schema payload.
- For Discord, to should be the channel/thread ID (enable Discord Developer Mode → right-click channel/thread → Copy Channel ID).

Docs:

Heartbeat: https://docs.openclaw.ai/gateway/heartbeat
Tool profiles: https://docs.openclaw.ai/gateway/sandbox-vs-tool-policy-vs-elevated

3) Restart the gateway

openclaw gateway restart

4) Keep the heartbeat checklist tiny

Edit:
~/.openclaw/workspace-hb/HEARTBEAT.md

Keep it short (a few bullets). If it’s “effectively empty”, OpenClaw will skip heartbeat runs to save compute.

If you paste your current agents block from openclaw.json (redact tokens), I can show you the exact minimal diff to add hb without breaking your existing setup.

uncut niche Mar 13, 2026, 4:46 AM

I want this heartbeat agent to run every agent’s HEARTBEAT.md. Is that OK?

fierce sparrowBOT Mar 13, 2026, 4:47 AM

Not really “one heartbeat agent runs every agent’s HEARTBEAT.md”.

In OpenClaw, HEARTBEAT.md is per-agent workspace, and heartbeat turns are designed to run per agent (so each agent only sees its own checklist + its own session context). That’s the clean / scalable way.

Recommended: enable heartbeats on each agent (each reads its own HEARTBEAT.md)

You have two clean options:

Option A) Use `agents.defaults.heartbeat` (runs for all agents)

Make sure you do not set any agents.list[].heartbeat blocks (because if any agent has one, only those agents run heartbeats).

{
  agents: {
    defaults: {
      heartbeat: {
        every: "5m",
        model: "ollama/qwen2.5-3b-64k:latest",
        lightContext: true,
        target: "discord",
        to: "1481872120282873887"
      }
    },
    list: [
      { id: "main", default: true, workspace: "~/.openclaw/workspace-main" },
      { id: "coding", workspace: "~/.openclaw/workspace-coding" }
    ]
  }
}

Option B) Put a `heartbeat` block on every agent

This is best if you want different cadence/model per agent.

Docs note (important): https://docs.openclaw.ai/gateway/heartbeat#per-agent-heartbeats
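
The selection rule behind options A and B can be modeled in a few lines. This is a sketch of the rule as stated in this thread, not OpenClaw's actual implementation; the function name is hypothetical, and any built-in heartbeat default that OpenClaw applies when nothing is configured is not modeled.

```python
from typing import Any


def agents_running_heartbeat(config: dict[str, Any]) -> list[str]:
    """Return the ids of agents that would run heartbeats under the stated rule."""
    agents = config.get("agents", {})
    agent_list = agents.get("list", [])
    explicit = [a["id"] for a in agent_list if "heartbeat" in a]
    if explicit:
        # At least one per-agent block exists, so only those agents heartbeat.
        return explicit
    if "heartbeat" in agents.get("defaults", {}):
        # No per-agent blocks: the defaults block applies to every agent.
        return [a["id"] for a in agent_list]
    # Nothing configured in this sketch (OpenClaw's own defaults are not modeled).
    return []


config = {
    "agents": {
        "list": [
            {"id": "main", "default": True},
            {"id": "hb", "heartbeat": {"every": "5m"}},
        ]
    }
}
print(agents_running_heartbeat(config))  # ['hb']
```

Moving the same heartbeat block from the hb entry to agents.defaults would make the function return every id in the list instead.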

“Can I hack it so one agent reads all the other HEARTBEAT.md files?”

Only if you intentionally break isolation by letting that agent use tools to read other workspaces’ files (and you’d have to hardcode the paths). It’s doable, but it’s exactly the kind of thing that becomes slow on a small CPU model and tends to reintroduce the timeout problem you’re hitting.

So: if the goal is “reduce context + avoid timeouts”, per-agent heartbeats are the right way.

If you tell me how many agents you have + their workspace paths (just the paths, no tokens), I can paste a config snippet in the exact shape your openclaw.json is currently using.

uncut niche Mar 13, 2026, 4:50 AM

If every agent runs its own heartbeat, won’t the context be large?

fierce sparrowBOT Mar 13, 2026, 4:51 AM

Running heartbeats for every agent does not create one “combined huge context” prompt.

What happens is: each agent runs its own separate heartbeat turn, with that agent’s session history + its own workspace files. So:

Context size per heartbeat depends on that agent’s session/history (and tool payload), not on how many agents exist.
But total CPU/load scales roughly linearly with the number of agents that heartbeat.

So if you enable heartbeat on 10 agents at every: "1m" on a CPU + 3B model, you’re basically asking for 10 LLM runs per minute — it’ll fall behind and you’ll see timeouts.
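
The arithmetic behind “it’ll fall behind” can be sketched as a simple load ratio. The 45-second run time below is a hypothetical figure, and the model assumes heartbeat runs execute one at a time on the CPU, which may not match how the gateway actually schedules them.

```python
def heartbeat_utilization(agents: int, run_seconds: float, interval_seconds: float) -> float:
    """Fraction of each interval spent on heartbeats, assuming runs execute one at a time."""
    return agents * run_seconds / interval_seconds


# Hypothetical figure: each heartbeat turn takes 45 s on the CPU.
for agents, interval in [(1, 60), (10, 60), (10, 900)]:
    load = heartbeat_utilization(agents, 45, interval)
    status = "falls behind" if load > 1 else "keeps up"
    print(f"{agents} agent(s) every {interval}s: load {load:.2f} -> {status}")
```

Any load above 1 means new ticks arrive faster than old ones finish, so the backlog (and the timeouts) keep growing.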

Practical ways to keep it small + fast

- Only heartbeat the agents that actually need it. For the others, either disable it (every: "0m") or keep their HEARTBEAT.md empty or comments-only so OpenClaw skips the run automatically.
- Use a dedicated “hb” agent (clean history) and keep other agents’ heartbeats off. This is usually the best “small local model” setup.
- Keep lightContext: true and set the heartbeat agent’s tools profile to minimal:

   tools: { profile: "minimal" }

(Huge tool schema is a common reason small Ollama models stall.)

Also: 1m is extremely aggressive. Even for one agent, I’d start at 5m or 15m until you confirm it reliably completes.

If you tell me how many agents you have and what you want the heartbeat to do (just “keep alive”, or “check inbox”, etc.), I can suggest a sane cadence + layout.

uncut niche Mar 13, 2026, 4:54 AM

Is a tools config like this OK?

"tools": {
  "web": {
    "search": {
      "enabled": false
    },
    "fetch": {
      "enabled": false
    }
  },
  "sessions": {
    "visibility": "all"
  },
  "agentToAgent": {
    "enabled": true,
    "allow": [
      "main",
      "coding"
    ]
  }
  "byProvider": {
    "ollama": { "profile": "minimal" }
    "ollama/qwen2.5-3b-64k:latest": { "profile": "minimal" }
  }
}

fierce sparrowBOT Mar 13, 2026, 4:55 AM

Mostly yes — those keys are real (tools.web.*, tools.sessions.visibility, tools.agentToAgent, tools.byProvider.*). The main issues with what you pasted are:

- You’re missing commas (your snippet as-is won’t parse).
- You only need one of these (the provider-level rule already covers the model):
  - byProvider.ollama.profile: "minimal"
  - byProvider["ollama/qwen2.5-3b-64k:latest"].profile: "minimal" (redundant if the provider entry exists)
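
A quick way to catch a missing comma before restarting the gateway is to run the snippet through a JSON parser. The sketch below uses Python's standard json module on a reduced version of the pasted block. Note the limitation: json.loads accepts only strict JSON, so it would also reject JSON5 features such as comments, unquoted keys, or trailing commas, and is only useful for files written in strict-JSON style like the one pasted.

```python
import json
from typing import Optional


def json_error(text: str) -> Optional[str]:
    """Return None if text is valid strict JSON, else a message with the error position."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


# Reduced version of the pasted snippet: no comma between the two byProvider entries.
broken = """{
  "byProvider": {
    "ollama": { "profile": "minimal" }
    "ollama/qwen2.5-3b-64k:latest": { "profile": "minimal" }
  }
}"""

# Same text with the missing comma added after the first entry.
fixed = broken.replace('"minimal" }\n    "ollama/', '"minimal" },\n    "ollama/')

print("broken:", json_error(broken))
print("fixed:", json_error(fixed))
```

The reported line and column point at the first token the parser could not accept, which is usually the line right after the missing comma.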

Here’s a corrected, valid JSON5 example:

{
  tools: {
    web: {
      search: { enabled: false },
      fetch: { enabled: false },
    },
    sessions: { visibility: "all" },
    agentToAgent: {
      enabled: true,
      allow: ["main", "coding"],
    },
    byProvider: {
      ollama: { profile: "minimal" },
      // optional / redundant if you already have the provider rule:
      // "ollama/qwen2.5-3b-64k:latest": { profile: "minimal" },
    },
  },
}

Two important “gotchas”:

- tools.byProvider.ollama.profile: "minimal" applies to all uses of Ollama, not just heartbeat. If you ever chat with the Ollama model and want file tools and so on, they’ll be blocked.
- If your goal is “only heartbeat should be minimal”, the cleaner approach is to set tools.profile: "minimal" on the heartbeat agent (agents.list[].tools.profile) instead of applying it globally by provider.

Docs:

Tools profiles / byProvider: https://docs.openclaw.ai/gateway/configuration-reference#tools
Tool policy overview: https://docs.openclaw.ai/gateway/sandbox-vs-tool-policy-vs-elevated

uncut niche Mar 13, 2026, 6:13 AM

Are my memorySearch and memoryFlush config blocks OK?

      "memorySearch": {
        "enabled": true,
        "sources": [
          "memory",
          "sessions"
        ],
        "provider": "ollama",
        "fallback": "none",
        "model": "ollama/embeddinggemma:300m-qat-q8_0"
      },
      "contextPruning": {
        "mode": "cache-ttl",
        "ttl": "30m"
      },
      "compaction": {
        "mode": "safeguard",
        "memoryFlush": {
          "enabled": true,
          "prompt": "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store.,",
          "systemPrompt": "Session nearing compaction. Store durable memories now."
        }
      }

fierce sparrowBOT Mar 13, 2026, 6:16 AM

A few notes — you’re close, but there are 2 important correctness details and 1 practical gotcha.

1) `memorySearch` must be under `agents.defaults.memorySearch`

Docs explicitly call this out: config goes under agents.defaults.memorySearch, not top-level memorySearch.

So this is correct only if your snippet is inside something like:

{
  agents: {
    defaults: {
      memorySearch: { ...your block... }
    }
  }
}

2) `sources: ["memory", "sessions"]` needs the experimental flag

Session transcript indexing is opt-in. If you want "sessions" in sources, add:

experimental: { sessionMemory: true }

Example:

agents: {
  defaults: {
    memorySearch: {
      enabled: true,
      experimental: { sessionMemory: true },
      sources: ["memory", "sessions"],
      provider: "ollama",
      fallback: "none",
      model: "ollama/embeddinggemma:300m-qat-q8_0"
    }
  }
}

(Without experimental.sessionMemory: true, the "sessions" source won’t behave the way you expect.)
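
The two placement rules above (points 1 and 2) are easy to check mechanically once the config has been parsed into a dictionary. The function below is a hypothetical helper written for illustration, not part of OpenClaw; it encodes only the rules stated in this reply.

```python
from typing import Any


def lint_memory_search(config: dict[str, Any]) -> list[str]:
    """Check a parsed config dict against the two memorySearch rules described above."""
    warnings = []
    if "memorySearch" in config:
        warnings.append("memorySearch is top-level; move it under agents.defaults")
    block = config.get("agents", {}).get("defaults", {}).get("memorySearch", {})
    wants_sessions = "sessions" in block.get("sources", [])
    flag_on = block.get("experimental", {}).get("sessionMemory") is True
    if wants_sessions and not flag_on:
        warnings.append('sources includes "sessions" but experimental.sessionMemory is not true')
    return warnings


# The block from the question, placed correctly but without the experimental flag.
config = {"agents": {"defaults": {"memorySearch": {
    "enabled": True,
    "sources": ["memory", "sessions"],
    "provider": "ollama",
}}}}
print(lint_memory_search(config))
```

Adding experimental: { sessionMemory: true } to the block makes the returned list empty.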

3) Your `compaction.memoryFlush.prompt` has a typo + tool-policy gotcha

- You have an extra comma at the end of the prompt: "... if nothing to store.," → remove that trailing comma.
- Memory flush only helps if the agent can actually write files. If you set Ollama to tools.profile: "minimal" (session_status-only) for the same sessions that are compacting, the model won’t be able to write those memory/YYYY-MM-DD.md notes, so the flush becomes mostly pointless.

`contextPruning`

contextPruning: { mode: "cache-ttl", ttl: "30m" } is valid (when placed under agents.defaults.contextPruning) — just remember it mainly prunes old tool result blobs, not normal chat history.
