#Gateway response slow/failing before and after upgrading to 2026.4.25
That sounds like two things may be stacked:
- Model/provider overload from Gemini preview.
- Gateway/channel delay: Telegram messages are not appearing in the Gateway UI for ~3 minutes. That happens before the model responds, so I wouldn't treat this as only a model problem.
Please run/paste this safe diagnostic bundle:
openclaw --version
which openclaw
openclaw status --all
openclaw gateway status --deep
openclaw channels status --probe
openclaw models status
Then tail logs while sending one Telegram message:
openclaw logs --follow
Look for lines mentioning:
- telegram polling stalls/timeouts
- queued for ...ms
- provider errors like overloaded, resource exhausted, 429, timeout
- event-loop delay / liveness warnings
Also run:
openclaw doctor
openclaw gateway restart
One nuance: if you manually selected a model with /model, that session selection is strict; it may fail visibly instead of walking your configured fallback chain. For debugging, /new is worth trying, but the 3-minute UI delay still points at Gateway/Telegram polling/queue/event-loop issues.
Docs for the relevant runbook: https://docs.openclaw.ai/help/faq and https://docs.openclaw.ai/gateway/troubleshooting 🦞
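A minimal sketch for collecting the diagnostic bundle above into one file, so nothing gets lost when pasting. It only wraps the commands already listed in the message; the helper names `run_and_label` and `collect_bundle` and the default output path are assumptions made for illustration, not part of OpenClaw.

```bash
#!/usr/bin/env bash
# Print a command as a header, then its output; keep going if it fails.
run_and_label() {
  printf '$ %s\n' "$*"
  "$@" 2>&1 || printf '(exit code %s)\n' "$?"
}

# Run the diagnostic commands from the message above and save everything.
# The default output path is a hypothetical choice.
collect_bundle() {
  local out="${1:-/tmp/openclaw-bundle.txt}"
  {
    run_and_label openclaw --version
    run_and_label which openclaw
    run_and_label openclaw status --all
    run_and_label openclaw gateway status --deep
    run_and_label openclaw channels status --probe
    run_and_label openclaw models status
  } > "$out"
  printf 'wrote %s\n' "$out"
}
```

Calling `collect_bundle` with no argument would write to the assumed default path. Skim the file for tokens or usernames before sharing it.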
I pasted the diagnostic bundle.
I then ran tail logs and sent a message. But I got the message "is the gateway even running"
I then ran openclaw doctor, doctor --fix and re-started the gateway
Each of the openclaw processes took a long time compared to previously
I don't see the diagnostic bundle in this thread. Discord may have eaten it if it was too long.
The new clue is important: if even openclaw ... CLI commands are slow, this is probably below Telegram/model now: gateway/service startup, machine resource pressure, or OpenClaw hanging while probing the gateway.
Please paste these as code blocks:
openclaw --version
which openclaw
time openclaw gateway status --deep
time openclaw status --all
And the latest logs without --follow:
tail -n 200 "$(ls -t /tmp/openclaw/openclaw-*.log | head -1)"
Also this read-only system snapshot:
ps -eo pid,ppid,etime,%cpu,%mem,cmd | grep -E 'openclaw|node' | grep -v grep
free -h
df -h
If gateway status says RPC/connectivity is down or "is the gateway even running", don't keep testing Telegram yet. First we need the gateway healthy. Docs runbook: https://docs.openclaw.ai/gateway/troubleshooting
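For the "machine resource pressure" possibility, a small sketch that boils the memory part of the snapshot down to one number. It assumes a Linux host with `/proc/meminfo`; the function name `mem_available_pct` is made up for this example, and no particular threshold is implied.

```bash
# Print MemAvailable as a whole-number percentage of MemTotal.
# Reads /proc/meminfo by default (Linux only); a path can be passed for testing.
mem_available_pct() {
  awk '/^MemTotal:/ {total=$2} /^MemAvailable:/ {avail=$2}
       END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}
```

A low percentage alongside slow CLI commands would point toward the machine rather than Telegram or the model.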
What do you mean by "paste these as code blocks"
In Discord, wrap the output with triple backticks so formatting stays readable.
Like this:
```bash
openclaw --version
which openclaw
time openclaw gateway status --deep
```
Then after you run it, paste the output like:
```text
<paste terminal output here>
```
The important part is the three backticks before and after:
your output here
'''🦞 OpenClaw 2026.4.25 (aa36ee6) - I'm the assistant your terminal demanded, not the one your sleep schedule requested.
Service: systemd (enabled)
File logs: /tmp/openclaw/openclaw-2026-04-28.log
Command: /usr/bin/node /home/XXXXX/.npm-global/lib/node_modules/openclaw/dist/index.js gateway --port 18789
Service file: ~/.config/systemd/user/openclaw-gateway.service
Service env: OPENCLAW_GATEWAY_PORT=18789
Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json
Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.
Runtime: running (pid 387527, state active, sub running, last exit 0, reason 0)
Connectivity probe: ok
Capability: admin-capable
Listening: 127.0.0.1:18789
Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting
real 0m4.915s
user 0m5.535s
sys 0m0.967s'''
That worked, thanks. Tiny formatting note: Discord code blocks use backticks ``` not apostrophes ''', but I can read this fine.
Good news: the gateway is running and reachable:
- Runtime: running
- Connectivity probe: ok
- listening on 127.0.0.1:18789
real 0m4.915s is slow-ish for gateway status, but not "hung". Next we need to separate Telegram polling delay from model/queue delay.
Please paste these next:
openclaw channels status --probe
openclaw models status
openclaw status --all
And since your log path is shown, after sending one Telegram test message, paste:
tail -n 300 /tmp/openclaw/openclaw-2026-04-28.log
In that log, the most useful lines will mention telegram, poll, queued, overloaded, timeout, 429, or event-loop delay.
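To compare timings like the `real 0m4.915s` figure across several runs, here is a hedged sketch of a wall-clock timer. The name `elapsed_ms` is an assumption for this example, and it relies on GNU `date` supporting `%N`, which holds on typical Linux systems.

```bash
# Run a command, discard its output, and print the wall-clock time in ms.
# Uses GNU date's %N (nanoseconds), so this assumes Linux/coreutils.
elapsed_ms() {
  local start end
  start=$(date +%s%N)
  "$@" >/dev/null 2>&1
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}
```

For example, `elapsed_ms openclaw gateway status --deep` run a few times in a row would show whether the slowness is steady or only happens around a Telegram message.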
```
Checking channel status (probe)…penclaw models status
Gateway reachable.
- Telegram default: enabled, configured, running, connected, in:13m ago, mode:polling, bot:@XXXXXXXX, token:config, works
Tip: status --deep adds gateway health probes to status output (requires a reachable gateway).
```
That output is useful: Telegram itself probes as working, but in:13m ago means the gateway hasn't seen an inbound Telegram message for 13 minutes. So now I'd look at the polling loop / duplicate bot consumer, not the model first.
Also looks like two commands got mashed together here:
Checking channel status (probe)…penclaw models status
Run/paste these one at a time, waiting for the shell prompt between each:
openclaw models status
openclaw status --all
Then this filtered log view:
grep -iE 'telegram|poll|409|conflict|getUpdates|timeout|stall|queued|overload|429|event-loop|liveness' /tmp/openclaw/openclaw-2026-04-28.log | tail -n 200
And check for duplicate gateway/processes using the same Telegram bot token:
ps -eo pid,ppid,etime,%cpu,%mem,cmd | grep -E 'openclaw|node' | grep -v grep
If the logs show 409 Conflict / terminated by other getUpdates request, that means another OpenClaw/gateway/bot process is polling the same Telegram bot token. That would explain Telegram delays/non-responsiveness pretty neatly.
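A quick way to turn that check into a single number, sketched under the assumption that the gateway log records the Telegram error text. "terminated by other getUpdates request" is the wording the Telegram Bot API returns with HTTP 409; the function name `count_poll_conflicts` is hypothetical.

```bash
# Count log lines that look like a Telegram long-polling conflict.
# Note: a bare "409" can also match unrelated numbers, so treat a
# non-zero count as a hint to read those lines, not as proof.
count_poll_conflicts() {
  grep -ciE '409|conflict|terminated by other getUpdates' "$1"
}
```

Usage would be `count_poll_conflicts /tmp/openclaw/openclaw-2026-04-28.log`. A result of 0 makes a duplicate poller less likely.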
```
OpenClaw 2026.4.25 (aa36ee6) - I autocomplete your thoughts, just slower and with more API calls.
Gateway reachable.
- Telegram default: enabled, configured, running, connected, in:17m ago, mode:polling, bot:@XXXX, token:config, works
Tip: status --deep adds gateway health probes to status output (requires a reachable gateway).
```
That's still the channel status output, not models status.
Key clue: in:17m ago is increasing. So OpenClaw thinks Telegram polling is "works", but it has not actually received a Telegram inbound message recently.
Let's skip models for a sec and check the Telegram logs. Please copy/paste just this one line into terminal:
grep -iE 'telegram|poll|409|conflict|getUpdates|timeout|stall' /tmp/openclaw/openclaw-2026-04-28.log | tail -n 200
Then paste the output here.
After that, run this one:
ps -eo pid,ppid,etime,%cpu,%mem,cmd | grep -E 'openclaw|node' | grep -v grep
If we see 409 Conflict / getUpdates, it likely means another process is polling the same Telegram bot token.
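To track the `in:` age between runs without pasting the whole status output, a small sketch that extracts just that field. It assumes the field always looks like `in:<number><unit> ago`, as in the two outputs above; other formats would print nothing. The name `inbound_age` is made up here.

```bash
# Extract the "in:<age> ago" field from channel status text on stdin.
inbound_age() {
  grep -oE 'in:[0-9]+[a-z]+ ago' | head -n 1
}
```

For example, `openclaw channels status --probe | inbound_age`. The value only means something when compared against the time the last Telegram message was actually sent.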
I haven't sent a telegram message for 12mins
Ah, fair. Then in:17m ago isn't suspicious by itself. My bad, I over-weighted that signal.
Let's do a clean timed test:
- Start this in one terminal:
openclaw logs --follow
- In Telegram, send a simple message like:
ping test
- Watch how long it takes for any Telegram/inbound line to appear in logs or Gateway UI.
If it's delayed again, stop the log tail and paste the relevant chunk. You can also use this after the test:
grep -iE 'telegram|inbound|poll|queued|overload|429|timeout|event-loop|liveness|error' /tmp/openclaw/openclaw-2026-04-28.log | tail -n 250
Also still worth checking processes once:
ps -eo pid,ppid,etime,%cpu,%mem,cmd | grep -E 'openclaw|node' | grep -v grep
That'll tell us if it's "Telegram update not arriving" vs "arrives quickly but waits in queue/model."
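To put numbers on the timed test, note three clock times by hand: when the message was sent, when the first inbound line showed up, and when the reply arrived. The sketch below computes the gaps; `delta_seconds` is a hypothetical helper and assumes GNU `date` with times typed as HH:MM:SS on the same day.

```bash
# Print the number of seconds from time $1 to time $2 (both HH:MM:SS, same day).
# Uses GNU date -d; -u avoids daylight-saving surprises.
delta_seconds() {
  local a b
  a=$(date -u -d "$1" +%s) || return 1
  b=$(date -u -d "$2" +%s) || return 1
  echo $(( b - a ))
}
```

A large gap between "sent" and "first inbound line" points at polling or the gateway; a large gap between "first inbound line" and "reply" points at the queue or model.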
I followed the log command and it opened the logs. Then I sent the Telegram message and got this:
```
Gateway target: ws://127.0.0.1:XXXXX
Source: local loopback
Config: /home/XXXXX/.openclaw/openclaw.json
Bind: loopback
Hint: run `openclaw doctor`.
```
That's a stronger signal: the gateway is either crashing/restarting, or becoming unreachable right when Telegram input hits it. Let's stop changing things for a moment and capture the crash evidence.
Please run these read-only commands:
openclaw gateway status --deep
systemctl --user status openclaw-gateway --no-pager -l
journalctl --user -u openclaw-gateway -n 200 --no-pager
And this log tail around the failure:
tail -n 300 /tmp/openclaw/openclaw-2026-04-28.log
Also, don't keep running doctor --fix repeatedly yet. If the gateway is crash-looping, repair attempts can add noise. The thing we need now is the first error/stack trace around the moment you sent the Telegram message.
```
Service: systemd (enabled)
File logs: /tmp/openclaw/openclaw-2026-04-28.log
Command: /usr/bin/node /home/x/.npm-global/lib/node_modules/openclaw/dist/index.js gateway --port x
Service file: ~/.config/systemd/user/openclaw-gateway.service
Service env: OPENCLAW_GATEWAY_PORT=x
Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json
Gateway: bind=loopback (127.0.0.1), port=x(service args)
Probe target: ws://127.0.0.1:x
Dashboard: http://127.0.0.1:x/
Probe note: Loopback-only gateway; only local clients can connect.
Runtime: running (pid 387527, state active, sub running, last exit 0, reason 0)
Connectivity probe: ok
Capability: admin-capable
Listening: 127.0.0.1:x
Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting
```
That status says the gateway is currently healthy and still on the same PID, so it probably didn't hard-crash. More likely it became temporarily unreachable / event-loop blocked / stalled.
The status output only tells us "now"; we need the logs from the exact failure window.
Please run and paste these next, especially the first one:
tail -n 300 /tmp/openclaw/openclaw-2026-04-28.log
Then:
journalctl --user -u openclaw-gateway -n 200 --no-pager
And quick question: did the "Gateway not reachable" message appear in Telegram, or in the terminal running openclaw logs --follow? That changes what path is failing.
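The "same PID" reasoning can be checked mechanically. This sketch pulls the pid out of the `Runtime:` line so it can be recorded before and after a test; it assumes the line keeps the format shown in the status output above, and `gateway_pid` is a name invented for the example.

```bash
# Pull the pid out of the "Runtime:" line of gateway status output on stdin.
gateway_pid() {
  sed -n 's/^Runtime: .*(pid \([0-9][0-9]*\),.*/\1/p' | head -n 1
}
```

For example, `openclaw gateway status --deep | gateway_pid` before and after sending the Telegram message: an unchanged pid suggests a stall, a changed pid suggests a restart.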
On last question. It was in terminal
On two commands above, the output is too large to paste here
Got it. Since it appeared in the terminal, that means the openclaw logs --follow client temporarily couldn't reach the gateway. The gateway may still be running, but something is stalling/interrupting the local RPC path.
Don't paste the huge logs. Please run this smaller filtered version:
grep -iE 'gateway not reachable|websocket|econn|refused|closed|disconnect|telegram|poll|queued|overload|429|timeout|event-loop|liveness|fatal|uncaught|error' /tmp/openclaw/openclaw-2026-04-28.log | tail -n 150
And this smaller journal slice:
journalctl --user -u openclaw-gateway --since "20 minutes ago" --no-pager | tail -n 150
If that's still too big, save it to a file and attach it here:
{
echo "=== gateway filtered log ==="
grep -iE 'gateway not reachable|websocket|econn|refused|closed|disconnect|telegram|poll|queued|overload|429|timeout|event-loop|liveness|fatal|uncaught|error' /tmp/openclaw/openclaw-2026-04-28.log | tail -n 150
echo
echo "=== systemd journal ==="
journalctl --user -u openclaw-gateway --since "20 minutes ago" --no-pager | tail -n 150
echo
echo "=== processes ==="
ps -eo pid,ppid,etime,%cpu,%mem,cmd | grep -E 'openclaw|node' | grep -v grep
} > /tmp/openclaw-debug.txt
Then attach /tmp/openclaw-debug.txt here. Quick skim/redact anything private first, just in case.
I can't attach a file and the log is too long
No worries, let's make the output tiny.
Run these exactly, one at a time:
grep -iE 'fatal|uncaught|event-loop|liveness|gateway not reachable|ECONN|refused|telegram|poll|overload|429|timeout' /tmp/openclaw/openclaw-2026-04-28.log | tail -n 40 | cut -c1-240
journalctl --user -u openclaw-gateway --since "30 minutes ago" --no-pager | grep -iE 'fatal|uncaught|error|killed|restart|exit|memory|oom|timeout' | tail -n 40 | cut -c1-240
ps -eo pid,ppid,etime,%cpu,%mem,cmd | grep -E 'openclaw|node' | grep -v grep
Also: for the live Telegram test, don't use openclaw logs --follow anymore since that itself depends on gateway RPC. Use raw file tail instead:
tail -f /tmp/openclaw/openclaw-2026-04-28.log
Then send Telegram ping test and watch whether lines appear immediately.
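To judge "immediately" without guessing, the raw tail can be piped through a filter that stamps each line with the time it was seen. This is a sketch only; `stamp_lines` is a hypothetical helper, and the stamp is the local arrival time in the terminal, not whatever timestamp the log line may carry itself.

```bash
# Prefix every line read from stdin with the local wall-clock time.
stamp_lines() {
  local line
  while IFS= read -r line; do
    printf '%s %s\n' "$(date +%H:%M:%S)" "$line"
  done
}
```

Usage: `tail -f /tmp/openclaw/openclaw-2026-04-28.log | stamp_lines`, then compare the first new stamp against the time the Telegram message was sent.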
There is a 2000 character limit here. I can't send the output
No problem, let's force it under the limit.
Run this one command and paste the output. It should be short:
LOG=/tmp/openclaw/openclaw-2026-04-28.log
echo "== counts =="
for p in "fatal|uncaught|exception" "event-loop|liveness" "telegram|poll|getUpdates|409|conflict" "overload|429|resource exhausted" "timeout|ECONN|refused"; do
  printf "%-32s " "$p"
  grep -iE "$p" "$LOG" | wc -l
done
echo "== last important lines =="
grep -iE 'fatal|uncaught|exception|event-loop|liveness|telegram|poll|getUpdates|409|conflict|overload|429|resource exhausted|timeout|ECONN|refused' "$LOG" | tail -n 8 | cut -c1-180
echo "== openclaw processes =="
ps -eo pid,etime,%cpu,%mem,cmd | grep -E 'openclaw|node' | grep -v grep | cut -c1-180
If that's still too long, paste only the == counts == section first.
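Another way around the 2000 character limit is to save the output to a file and split it into pieces that each fit in one message. A minimal sketch, assuming GNU `split` (for `-C`) is available; the function name `chunk_for_paste`, the 1900-byte budget, and the default prefix are choices made for this example.

```bash
# Split a text file into numbered pieces of at most 1900 bytes each,
# breaking at line boundaries where possible (GNU split -C). The margin
# below 2000 leaves room for the code-fence lines around each paste.
chunk_for_paste() {
  local src="$1" prefix="${2:-/tmp/openclaw-chunk-}"
  split -C 1900 -d "$src" "$prefix"
  ls "$prefix"*
}
```

Each listed file can then be shown with `cat` and pasted as its own message. Bytes are counted rather than characters, which errs on the safe side for non-ASCII output.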
```
{"0":"{\"subsystem\":\"agent/embedded\"}","1":{"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"952fa47d-926b-4005-97a0
{"0":"{\"subsystem\":\"agent/embedded\"}","1":{"event":"auth_profile_failure_state_updated","tags":["error_handling","auth_profiles","cooldown"],"runId":"952fa47d-926b-4005-97a0-11
{"0":"{\"subsystem\":\"agent/embedded\"}","1":{"event":"embedded_run_failover_decision","tags":["error_handling","failover","assistant","fallback_model"],"runId":"952fa47d-926b-400
{"0":"{\"subsystem\":\"diagnostic\"}","1":"lane task error: lane=main durationMs=80156 error=\"FailoverError: The AI service is temporarily overloaded. Please try again in a moment
{"0":"{\"subsystem\":\"diagnostic\"}","1":"lane task error: lane=session:agent:main:telegram:direct:7006368633 durationMs=80163 error=\"FailoverError: The AI service is temporarily
{"0":"{\"subsystem\":\"model-fallback/decision\"}","1":{"event":"model_fallback_decision","tags":["error_handling","model_fallback","candidate_failed"],"runId":"952fa47d-926b-4005-
{"0":"{\"subsystem\":\"model-fallback/decision\"}","1":{"event":"model_fallback_decision","tags":["error_handling","model_fallback","candidate_succeeded"],"runId":"952fa47d-926b-40
```
Perfect, this finally narrowed it down.
Those lines mean Telegram/Gateway are not the main issue anymore. The run is reaching the agent, then spending ~80s in the model/failover path and failing with:
FailoverError: The AI service is temporarily overloaded
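The ~80s figure comes from the `durationMs` values in the `lane task error` lines. A short sketch for listing those durations from the log, assuming the lines keep the `durationMs=<number>` form seen in the paste; `lane_error_durations_ms` is a hypothetical name.

```bash
# Print the durationMs value from every "lane task error" line in a log file.
lane_error_durations_ms() {
  grep -F 'lane task error' "$1" | grep -oE 'durationMs=[0-9]+' | cut -d= -f2
}
```

Running it against `/tmp/openclaw/openclaw-2026-04-28.log` after each test message would show whether the time spent in the model/failover path changes once the model is switched.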
So I'd do this now:
- In Telegram, send standalone:
/stop
- Switch away from the Gemini preview primary for now. Use a stable model you already have auth for: