#heartbeat no longer working


heavy quest
#

Hi Krill,

I updated to 2026.3.12 a couple of hours ago, and my heartbeat is no longer firing. It worked fine until the update.

Any idea how to find out why it is not running? All I see is that it is not running, which doesn't help me a bit in figuring out what has changed.

indigo jasperBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

indigo sandalBOT
#

A couple things can make this look like “heartbeat stopped” after an update:

  1. Heartbeat is often silent by design. If the model replies HEARTBEAT_OK, OpenClaw drops the message (that’s the “nothing to report” path).
  2. Heartbeat runs inside the Gateway. If the Gateway isn’t running 24/7 (or your service didn’t come back after the update), nothing will fire.

The fastest way to see “did it run vs did it skip, and why” is:

openclaw --version
openclaw gateway status
openclaw system heartbeat last
openclaw config get agents.defaults.heartbeat

What to look for in openclaw system heartbeat last:

  • ran but you saw no message → likely just HEARTBEAT_OK suppression, or delivery config (see below)
  • skipped with a reason like:
    • quiet-hours → activeHours/timezone window
    • requests-in-flight → main session busy; heartbeat deferred
    • dm-blocked / alerts-disabled → policy/visibility suppressed delivery

Also double-check delivery config: default target is "none", so you’ll only ever see something when there’s an actual alert. If you want “always send something”, set heartbeat to deliver explicitly (or use cron).


If you paste the output of openclaw gateway status + openclaw system heartbeat last (and your agents.defaults.heartbeat block), I can tell you which branch you’re in and what changed.

heavy quest
#
openclaw system heartbeat last

OpenClaw  2026.3.12 (6472949) — Give me a workspace and I'll give you fewer tabs, fewer toggles, and more oxygen.

│
gateway connect failed: Error: gateway closed (1000):
◇
Error: gateway closed (1000 normal closure): no close reason
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /root/.openclaw/openclaw.json
Bind: loopback
04:18:23 [ws] handshake timeout conn=73a80457-2e13-46d6-a985-9593603bbb86 remote=127.0.0.1
04:18:24 [ws] closed before connect conn=73a80457-2e13-46d6-a985-9593603bbb86 remote=127.0.0.1 fwd=n/a origin=n/a host=127.0.0.1:18789 ua=n/a code=1000 reason=n/a
        "heartbeat": {
          "every": "92m",
          "activeHours": {
            "start": "00:00",
            "end": "23:59",
            "timezone": "Europe/Berlin"
          },
          "model": "or2/inception/mercury-2",
          "session": "agent:third:heartbeat",
          "includeReasoning": true,
          "prompt": "[REDACTED, sorry Krill]"
        },

And I'm watching the log output, I see that there's nothing running. All I see is the regular Discord reconnect that now spams 6 lines into the log instead of one.

indigo sandalBOT
#

That output is the smoking gun: this isn’t “heartbeat isn’t firing” yet — your CLI can’t talk to the Gateway at all:

gateway connect failed … gateway closed (1000) … ws://127.0.0.1:18789

Until that’s fixed, (a) openclaw system heartbeat last can’t report anything, and (b) if the Gateway really is down / wedged, heartbeat won’t run because it runs inside the Gateway process.

1) First confirm Gateway reachability + config mismatch

Please run and paste (safe to share; redact tokens if they appear):

openclaw gateway status --deep
openclaw status --deep

Key thing I’m looking for: RPC probe: ok vs failed, and whether it shows Config (cli) vs Config (service) pointing at different files/users.
Your CLI is clearly reading: Config: /root/.openclaw/openclaw.json — if the Gateway service is running under a different user (very common), it will have a different config + different gateway.auth.token, and the CLI connection will get closed exactly like this.


Also: since recent versions, token auth is enforced on loopback by default, so a token mismatch between “service config” and “cli config” breaks local WS clients.

2) Quick “unstick” step (often fixes the 1000 closes)

Restart the Gateway so you’re sure it’s the same version as your CLI and not half-upgraded / wedged:

openclaw gateway restart

Then immediately retry:

openclaw system heartbeat last

3) Once Gateway RPC works, two heartbeat gotchas in your config

  • Your activeHours.end is exclusive and allows 24:00. You probably want end: "24:00" (not 23:59) so you don’t “quiet-hour” the last minute of the day.
  • If you have a HEARTBEAT.md file that’s basically just headers/blank, OpenClaw will skip the heartbeat run to save tokens (that’s a new-ish optimization). When we can query heartbeat state again, you’ll likely see a skip reason like empty-heartbeat-file.
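
To illustrate why an exclusive end matters, here is a minimal Python sketch of a half-open [start, end) window check. It is not OpenClaw's actual implementation: `to_minutes` and `in_active_hours` are hypothetical helpers, timezone handling is left out, and windows that wrap past midnight are not covered.

```python
def to_minutes(hhmm: str) -> int:
    """Convert "HH:MM" to minutes since midnight; "24:00" maps to 1440."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def in_active_hours(now: str, start: str, end: str) -> bool:
    """Half-open window check: start is inclusive, end is exclusive.

    Assumes start <= end (no wrap-around past midnight).
    """
    return to_minutes(start) <= to_minutes(now) < to_minutes(end)


# With end "23:59", the minute that starts at 23:59 falls outside the window.
print(in_active_hours("23:59", "00:00", "23:59"))  # False
# With end "24:00", the whole day is covered.
print(in_active_hours("23:59", "00:00", "24:00"))  # True
```

Under these assumptions the only difference between the two settings is that single minute, which matches the "small nit" framing.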

If you paste openclaw gateway status --deep output, I can tell you which of these it is (service stopped vs RPC probe failing vs config/token mismatch vs version mismatch).

heavy quest
#
openclaw gateway status --deep

OpenClaw  2026.3.12 (6472949) — Like having a senior engineer on call, except I don't bill hourly or sigh audibly.

│
◇
Service: systemd (disabled)
File logs: /tmp/openclaw/openclaw-2026-03-14.log
Command: /usr/local/bin/node /usr/local/lib/node_modules/openclaw/dist/index.js gateway --port 18789
Service file: ~/.config/systemd/user/openclaw-gateway.service
Service env: OPENCLAW_GATEWAY_PORT=18789

Service config looks out of date or non-standard.
Service config issue: Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled. (Run `openclaw gateway install --force` to remove embedded service token.)
Recommendation: run "openclaw doctor" (or "openclaw doctor --repair").
Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json

Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.

Runtime: stopped (state inactive, sub dead, last exit 0, reason 0)
RPC probe: ok

Listening: 127.0.0.1:18789, [::1]:18789
Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting

That systemd stuff is a red herring; openclaw is installed as a system service, not as a user service. The CLI just trips over that and reads an old unused service file in the user folder I never deleted.

indigo sandalBOT
#

Nice, that openclaw status dump basically answers it:

  • Heartbeat: … 92m (third) …
  • Last heartbeat: ok-token · 1h ago · unknown

So the scheduler is running heartbeats — and the last one returned HEARTBEAT_OK (that’s what ok-token means), which is suppressed by default, so you won’t “see it fire”.

Two concrete things to fix/verify:

1) You’re probably missing target (delivery)

In the heartbeat config snippet you pasted earlier, there’s no target, and the default is none → run heartbeat turn, but deliver nothing.

If you want heartbeats to post into this Discord channel, set e.g.:

agents: {
  list: [
    {
      id: "third",
      heartbeat: {
        every: "92m",
        target: "discord",
        to: "channel:1482229915633057894",
      },
    },
  ],
}

(Discord to: format is channel:<id> — this is in the Discord docs.)

Docs: https://docs.openclaw.ai/gateway/heartbeat


2) If you want to see “OK” pings while debugging

By default, HEARTBEAT_OK acknowledgements are hidden. You can temporarily enable them for Discord:

channels: {
  discord: {
    heartbeat: { showOk: true },
  },
}

That way you’ll get a visible “OK” message on each tick and you can confirm cadence.

Quick sanity checks

Now that you’re running as the correct user (~/.openclaw/..., not /root/...), re-run:

openclaw system heartbeat last

If it says “ran” + ok-token, everything’s working — it’s just silent by config/contract.

Also small nit: your activeHours.end is exclusive and supports 24:00; 23:59 drops the last minute of the day. Not your main issue, but worth fixing.

If you paste your full agents.list entry for third (just the heartbeat block + whether you have target/to), I can point to the exact missing key.

heavy quest
#
OpenClaw  2026.3.12 (6472949) — I autocomplete your thoughts—just slower and with more API calls.

│
◇
{
  "ts": 1773439008075,
  "status": "ok-token",
  "reason": "interval",
  "durationMs": 2384,
  "silent": true,
  "indicatorType": "ok"
}
indigo sandalBOT
#

Perfect — that output means heartbeat is working exactly as designed:

  • status: "ok-token" + indicatorType: "ok" = the model returned HEARTBEAT_OK
  • silent: true = OpenClaw suppressed delivery (default behavior for OK heartbeats)

So what changed for you is visibility/delivery, not scheduling.
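
The reading of those fields can be sketched in a few lines of Python. This is only an illustration based on the payload pasted above: `describe_heartbeat` is a hypothetical helper, and "ok-token" is the only status value this thread confirms, so anything else is simply reported back with its reason.

```python
import json


def describe_heartbeat(raw: str) -> str:
    """Summarize one `openclaw system heartbeat last` JSON payload."""
    event = json.loads(raw)
    ran_ok = event.get("status") == "ok-token"
    silent = bool(event.get("silent"))
    if ran_ok and silent:
        return "ran, model replied HEARTBEAT_OK, delivery suppressed"
    if ran_ok:
        # Assumption: silent=false means the OK acknowledgement was delivered.
        return "ran, model replied HEARTBEAT_OK, OK message delivered"
    # Other status values are not documented in this thread; just echo them.
    return f"status {event.get('status')!r}; check the reason field: {event.get('reason')!r}"


sample = """
{
  "ts": 1773439008075,
  "status": "ok-token",
  "reason": "interval",
  "durationMs": 2384,
  "silent": true,
  "indicatorType": "ok"
}
"""
print(describe_heartbeat(sample))
```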

If you want to “see it fire” (debug mode)

Enable OK deliveries for Discord:

channels: {
  discord: {
    heartbeat: { showOk: true },
  },
}

If you want heartbeat to actually send messages somewhere

Make sure the heartbeat has a delivery target (otherwise it can run forever and stay silent). For example, to post into this channel:

agents: {
  list: [
    {
      id: "third",
      heartbeat: {
        every: "92m",
        target: "discord",
        to: "channel:1482229915633057894"
      }
    }
  ]
}

If you expected “alerts” but always get OK

That just means your prompt/HEARTBEAT.md currently results in “nothing actionable”. Add a concrete checklist item (or change the prompt to always produce a short status line and not output HEARTBEAT_OK).

If you paste your full agents.list[].heartbeat block for third (including whether target/to exist), I can tell you the smallest change to get the behavior you want.

heavy quest
#

nope. that timestamp is from a bit over an hour before I updated openclaw. "Fri Mar 13 21:56:48", now is "Sat Mar 14 04:40:41 UTC 2026"

indigo sandalBOT
#

You’re right — ts=1773439008075 is 2026-03-13 21:56:48 UTC, so there hasn’t been a heartbeat event recorded since before your update.
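
For anyone who wants to check that conversion themselves, a small Python sketch using only the standard library (`ms_to_utc` is a hypothetical helper name, and it assumes `ts` is a Unix timestamp in milliseconds, as the value suggests):

```python
from datetime import datetime, timezone


def ms_to_utc(ts_ms: int) -> str:
    """Format a millisecond Unix timestamp as a UTC date and time."""
    moment = datetime.fromtimestamp(ts_ms // 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(ms_to_utc(1773439008075))  # 2026-03-13 21:56:48
```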

The fastest way to separate “scheduler not running” vs “scheduler running but waiting” is to force a heartbeat now:

openclaw system heartbeat enable
openclaw system event --text "heartbeat test (force run)" --mode now --json
openclaw system heartbeat last --json

What should happen:

  • system event --mode now should trigger an immediate heartbeat tick (for each agent that has heartbeat configured).
  • heartbeat last should then show a new ts (close to “now”), and if it’s still OK it’ll likely still be ok-token.

If that doesn’t update the timestamp, then heartbeats are being suppressed/skipped before they even run. Two common culprits to check next:

  1. Visibility config disables the run entirely (if all three are false, OpenClaw skips the model call):

openclaw config get channels.defaults.heartbeat
openclaw config get channels.discord.heartbeat

  2. Heartbeat is running on a Gateway process you’re not actually using (your gateway status --deep shows the service as “stopped” even though something is listening). If the “real” long-running gateway isn’t the one with your current config, heartbeats won’t tick where you expect.

Run the forced test above and paste the JSON outputs — that will tell us immediately which branch we’re in.

heavy quest
#

openclaw system event --text "heartbeat test (force run)" --mode now --json
{
"ok": true
}

openclaw system heartbeat last --json
{
"ts": 1773463509940,
"status": "ok-token",
"reason": "wake",
"durationMs": 18055,
"silent": true,
"indicatorType": "ok"
}

That one ran nicely

indigo sandalBOT
#

Good — that proves the whole heartbeat pipeline works (model call, session wiring, etc). You just forced a run and got:

  • reason: "wake" (manual trigger)
  • a fresh ts (so it definitely ran now)

So what’s left is only: why the automatic reason: "interval" tick didn’t happen after the update.

Two very likely explanations:

A) The Gateway simply wasn’t running continuously after the update

Heartbeat doesn’t “catch up” for downtime — it only runs while the Gateway is up. If the update/restart left the gateway down for a while, you’ll see exactly what you saw: the last recorded heartbeat is pre-update until you manually wake it.

(Your earlier gateway status --deep showing the service as stopped while something is listening is consistent with “gateway is running, but not under the service you think it is”, which often leads to accidental downtime.)
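
As a rough illustration of the "no catch-up" point: assuming ticks would have fallen at fixed 92-minute steps after the last recorded run (this thread does not confirm how OpenClaw actually schedules them), the gap between the two ts values pasted above covers four intervals that never ran. `expected_ticks` is a hypothetical helper, not part of OpenClaw.

```python
def expected_ticks(last_ts_ms: int, later_ts_ms: int, every_minutes: int) -> int:
    """Count how many whole intervals fit between two millisecond timestamps."""
    interval_ms = every_minutes * 60 * 1000
    return (later_ts_ms - last_ts_ms) // interval_ms


last_interval_run = 1773439008075  # ts of the last reason="interval" run
forced_wake_run = 1773463509940    # ts of the forced reason="wake" run

print(expected_ticks(last_interval_run, forced_wake_run, 92))  # 4
```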

B) You haven’t hit the next interval yet since the wake

Now that you woke it at ts=… 04:45 UTC, the next automatic tick should be roughly 92 minutes later. On the next tick, openclaw system heartbeat last --json should show:

  • a newer ts
  • reason: "interval"

If you want to verify quickly without waiting 92m, temporarily set every: "2m" (just for testing), restart the gateway, and see if reason: "interval" starts appearing.

Quick “prove interval scheduling works” check

Set a reminder to run this around +95 minutes from your wake:

openclaw system heartbeat last --json

If ts still doesn’t move and you never see reason:"interval", then we’re in “scheduler tick isn’t firing” territory and the next step is to check automation/scheduler logs:

openclaw cron status
openclaw logs --follow

If you paste openclaw cron status output, I can tell you if the shared scheduler is paused/disabled or healthy.

heavy quest
#

btw, I'm still getting this "gateway connect failed: Error: gateway closed (1000):" about half the time I run an openclaw command that needs to connect to the instance, with this in the logs:

04:50:19 [ws] handshake timeout conn=d00eda42-1dfb-409d-897e-a4e0daf1f25a remote=127.0.0.1
04:50:20 [ws] closed before connect conn=d00eda42-1dfb-409d-897e-a4e0daf1f25a remote=127.0.0.1 fwd=n/a origin=n/a host=127.0.0.1:18789 ua=n/a code=1000 reason=n/a

Simply repeating the command works. Any idea what that's about? I never had that issue before the update either.

openclaw cron status

OpenClaw 2026.3.12 (6472949) — Open source means you can see exactly how I judge your config.

│
◇
{
"enabled": true,
"storePath": "/root/.openclaw/cron/jobs.json",
"jobs": 1,
"nextWakeAtMs": null
}

indigo sandalBOT
#

The forced wake proves the heartbeat runner itself is healthy. Now that you woke it at 04:45, an interval heartbeat won’t be due again until ~+92 minutes from that timestamp. So the “it never ran after update” can still be explained by simple downtime during/after the update (heartbeats don’t catch up).

About the intermittent gateway closed (1000) / handshake timeout

That log line:

  • handshake timeout … closed before connect … remote=127.0.0.1 … code=1000

means the Gateway accepted a TCP connection but didn’t receive a valid WebSocket handshake / connect payload before the timeout. If retrying immediately works, the two most common real-world causes are:

  1. You’re sometimes running the CLI under a different user/home/config
    Your cron status output shows:
"storePath": "/root/.openclaw/cron/jobs.json"

which strongly suggests that particular command was effectively running with HOME=/root (root user context). If some commands run as your normal user and some as root, you’ll get “works / fails / works” behavior because you’re bouncing between different auth/config/session stores.

Quick sanity check next time it happens: run openclaw status and look at the Config: / paths it reports (is it under /root/... or /home/<you>/...?).

  2. Two gateway processes (or a flapping/stale service) touching the same port
    Your gateway status --deep already warned about an out-of-date systemd user service embedding a token. Even if you think it’s unused, it’s worth ensuring there’s only one gateway instance that can bind :18789.
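
Since the explanation above separates "TCP connection accepted" from "WebSocket handshake completed", it can help to test the TCP layer on its own. The Python sketch below does only that, using the standard library. `tcp_accepts` is a hypothetical helper, it does not speak the Gateway's WebSocket protocol, and a True result says nothing about auth or the handshake itself.

```python
import socket


def tcp_accepts(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:18789 is the Gateway target shown in the CLI output earlier.
    print(tcp_accepts("127.0.0.1", 18789))
```

If this prints True every time while the CLI still fails intermittently, the problem is more likely in the handshake or auth step than in the listener itself.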

If you want one concrete next step that will tell us which bucket you’re in, paste:

openclaw status --all

It redacts secrets, and it’ll show which config/home/store paths are in play (and whether the gateway/service picture is inconsistent).

Cron status: nextWakeAtMs: null

That can be totally normal if your only cron job is disabled or has no future schedule. To confirm, run:

openclaw cron list --all --json