#Slow after move from local Windows to VPS


sick isle
#

Roger moved from Windows to VPS (Hetzner Helsinki). Ping to api.deepseek.com from VPS is ~1ms. But simple responses like '2+2' take 23 seconds. On Windows it was much faster. What causes this overhead and how do we reduce it? Model is deepseek-chat, think/verbose/reasoning all off

real raftBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

rare shaleBOT
#

23s for 2+2 is not normal. Also: ping isn’t the right signal here — api.deepseek.com may answer ICMP close to Hetzner, while HTTPS/API/model routing can still be slow.

First, please share:

openclaw --version
openclaw status --all

Then isolate where the 23s is:

curl -sS -o /dev/null \
  -w 'dns=%{time_namelookup} connect=%{time_connect} tls=%{time_appconnect} ttfb=%{time_starttransfer} total=%{time_total}\n' \
  https://api.deepseek.com/v1/models \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY"
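The `-w` values curl prints are cumulative from the start of the request, so the cost of each phase is the difference between neighbouring values. A minimal sketch, assuming the output format of the command above; `phase_deltas` is a hypothetical helper name, not part of curl or OpenClaw, and the sample line is only illustrative input:

```shell
# Hypothetical helper: turn curl's cumulative -w timings into per-phase deltas.
# Input (stdin): one line like "dns=... connect=... tls=... ttfb=... total=..."
phase_deltas() {
  awk '{
    # Split each key=value pair; "+ 0" forces a numeric value (a trailing "s" is ignored).
    for (i = 1; i <= NF; i++) { split($i, kv, "="); t[kv[1]] = kv[2] + 0 }
    printf "dns=%.3f tcp=%.3f tls=%.3f server=%.3f download=%.3f\n",
      t["dns"], t["connect"] - t["dns"], t["tls"] - t["connect"],
      t["ttfb"] - t["tls"], t["total"] - t["ttfb"]
  }'
}

echo 'dns=0.003 connect=0.005 tls=0.067 ttfb=0.355 total=0.355' | phase_deltas
```

A large `server` value next to small `dns`, `tcp` and `tls` values points at the API side rather than the network path.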

If that’s fast, OpenClaw overhead is likely local: huge existing session/context, queue/busy gateway, slow disk/swap on the VPS, auth/model catalog lookup, or a config/fallback path before it reaches DeepSeek.

Quick reductions to try:

/new

Then ask 2+2 again in a fresh session.

Also run while reproducing:

openclaw logs --follow
openclaw gateway stability --json
sick isle
#

curl to api.deepseek.com: dns=0.003s connect=0.005s tls=0.067s ttfb=0.355s total=0.355s — API is fast. So the 23s delay is OpenClaw-side overhead. How do we diagnose and reduce it?

rare shaleBOT
#

Yep — that curl mostly proves network/TLS/auth to DeepSeek is fine, but /v1/models is not the same as a chat completion. Split it like this:

openclaw --version
openclaw models status

Then benchmark 3 layers:

# 1) Direct DeepSeek chat completion
time curl -sS https://api.deepseek.com/v1/chat/completions \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek-chat","messages":[{"role":"user","content":"2+2? Reply with just 4."}],"max_tokens":8,"stream":false}'
# 2) OpenClaw provider/auth path, no Gateway/full agent context
time openclaw infer model run --local \
  --model deepseek/deepseek-chat \
  --prompt "2+2? Reply with just 4." \
  --json

# 3) Gateway model routing, still raw prompt
time openclaw infer model run --gateway \
  --model deepseek/deepseek-chat \
  --prompt "2+2? Reply with just 4." \
  --json

Interpretation:

  • curl slow → DeepSeek chat completion latency, not OpenClaw.
  • local infer slow → provider/auth/model path overhead.
  • gateway infer slow but local fast → Gateway/routing/runtime overhead.
  • both infer tests fast but Discord chat slow → full agent path: session history, tool schemas, memory/context assembly, queue wait, Discord delivery, or VPS resource pressure.
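The interpretation rules above amount to a first-match decision list. A rough sketch, assuming the four timings have already been measured in seconds; `classify_latency` and the 3-second threshold are illustrative choices, not OpenClaw features:

```shell
# Hypothetical helper: apply the rules above in order and print the first slow layer.
# Args: direct_curl_s local_infer_s gateway_infer_s chat_reply_s [threshold_s]
classify_latency() {
  awk -v c="$1" -v l="$2" -v g="$3" -v a="$4" -v th="${5:-3}" 'BEGIN {
    if (c > th)      print "deepseek-chat-completion"
    else if (l > th) print "provider-auth-model-path"
    else if (g > th) print "gateway-routing-runtime"
    else if (a > th) print "full-agent-path"
    else             print "no-slow-layer"
  }'
}

# Made-up example timings: fast curl and local infer, slow gateway infer and chat.
classify_latency 0.8 1.0 11.5 23
```

The order matters: a slow lower layer makes every layer above it slow too, so the first layer that crosses the threshold is the one to investigate.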

For the full agent path, try a clean session:

/new

Then ask 2+2 again. If that fixes it, the old session/context is the overhead.

While reproducing, collect:

openclaw status --all
openclaw health --json
openclaw gateway stability --json
openclaw logs --follow

If you need more timing detail for one repro:

openclaw config get logging.level
openclaw config set logging.level debug
openclaw gateway restart
# reproduce once
openclaw logs --follow

Then restore the old log level.


Things that commonly reduce this:

  • Start fresh with /new or compact/reset long sessions.
  • Check /status for active/queued runs; queue waits can look like model latency.
  • If queueing is the issue, stop long cron/heartbeat/subagent work or tune concurrency carefully.
  • Check VPS pressure: free -h, vmstat 1 10, CPU steal/swap/disk stalls.
  • If raw infer is fast but full agent is slow, reduce unnecessary context/tools or use a faster model/runtime for chatty simple replies.
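For the VPS pressure check, a quick number to look at is how much of total memory is still available. A small sketch that reads `/proc/meminfo`-style text (Linux only); `mem_available_pct` is a hypothetical helper name:

```shell
# Hypothetical helper: print MemAvailable as a whole-number percentage of MemTotal.
# Reads /proc/meminfo-formatted text on stdin.
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%.0f\n", avail * 100 / total }'
}

# On the VPS itself:
if [ -r /proc/meminfo ]; then
  mem_available_pct < /proc/meminfo
fi
```

A low percentage together with swap activity in `vmstat 1 10` suggests memory pressure; a high percentage rules it out as the main cause.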

Docs: https://docs.openclaw.ai/cli/infer https://docs.openclaw.ai/concepts/queue https://docs.openclaw.ai/logging

sick isle
#

Could the slowness be caused by the large workspace we copied over? We transferred the entire Windows workspace including Python packages, cache files, old logs etc. Could OpenClaw be loading all of that as context on every message?

rare shaleBOT
#

Yes, but not by blindly loading the entire workspace every message.

OpenClaw normally injects only bounded workspace/bootstrap context like AGENTS.md, SOUL.md, TOOLS.md, MEMORY.md, HEARTBEAT.md, skills summary, and session history. It should not read every Python package/cache/log file as prompt context.

Where the copied workspace can hurt:

  • huge AGENTS.md / MEMORY.md / TOOLS.md / memory/*.md
  • a leftover BOOTSTRAP.md causing repeated startup/bootstrap behavior
  • a giant/old session transcript
  • memory/search/indexing config pointed at broad workspace paths
  • slow VPS disk/swap from tons of files/cache
  • custom context engine plugin indexing more than expected

Best checks:

openclaw config get agents.defaults.workspace
openclaw config get plugins.slots.contextEngine
openclaw status --all

In chat, run:

/context list
/status

/context list is the key one — it shows what is actually being injected into the system prompt.

Then inspect workspace size/readable context files:

cd "$(openclaw config get agents.defaults.workspace --plain 2>/dev/null || echo ~/.openclaw/workspace)"
du -sh .
du -sh * .[^.]* 2>/dev/null | sort -h | tail -30
ls -lh AGENTS.md SOUL.md TOOLS.md IDENTITY.md USER.md MEMORY.md HEARTBEAT.md BOOTSTRAP.md 2>/dev/null
find memory -maxdepth 2 -type f -printf '%s %p\n' 2>/dev/null | sort -n | tail -20
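To spot an oversized bootstrap file without reading the `ls` output by eye, something like the following can help. It is a sketch only: `flag_large_context_files` is a hypothetical helper, and the 20000-byte default is an arbitrary illustration, not an OpenClaw limit.

```shell
# Hypothetical helper: list bootstrap files in a workspace that exceed a byte limit.
# Args: workspace_dir [limit_bytes]
flag_large_context_files() {
  dir="$1"
  limit="${2:-20000}"
  for f in AGENTS.md SOUL.md TOOLS.md IDENTITY.md USER.md MEMORY.md HEARTBEAT.md BOOTSTRAP.md; do
    [ -f "$dir/$f" ] || continue
    size=$(( $(wc -c < "$dir/$f") ))   # arithmetic expansion strips any padding from wc
    if [ "$size" -gt "$limit" ]; then
      echo "$f $size"
    fi
  done
}

# Demo on a throwaway directory so the sketch runs anywhere.
demo=$(mktemp -d)
head -c 30000 /dev/zero > "$demo/MEMORY.md"
printf 'short\n' > "$demo/AGENTS.md"
flag_large_context_files "$demo"
rm -rf "$demo"
```

On the real workspace, pass the directory reported by `openclaw config get agents.defaults.workspace` as the first argument.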

Quick reversible isolation test: create/use a fresh tiny workspace with only minimal AGENTS.md/memory files, restart, and test 2+2. If latency drops, the copied workspace/context is the culprit. If it stays 23s, look at session history, queueing, or VPS pressure instead.

Also try:

/new

If /new fixes it, the problem is more likely old session context than raw workspace size.

sick isle
#

Direct DeepSeek API call: 0.787s total. But full agent reply takes 23s. Memory: 3.7GB total, 1.5GB used, 2.2GB available — no memory pressure. The 22+ seconds is all OpenClaw overhead. What's causing it?

rare shaleBOT
#

I’d be careful with the conclusion “all OpenClaw overhead” — the likely culprit is OpenClaw sending a much larger full-agent prompt to DeepSeek, not local CPU/memory.

Your direct curl is tiny: no system prompt, no tools, no session history, no workspace bootstrap. A full OpenClaw turn can include system prompt + tool schemas + session transcript + bootstrap/memory. DeepSeek then spends time on prefill before first token. That looks like 20s of “OpenClaw overhead”, but it’s often the model processing a large request.

Fastest proof:

time openclaw infer model run --gateway \
  --model deepseek/deepseek-chat \
  --prompt "2+2? Reply with just 4." \
  --json

If that’s fast, Gateway/provider routing is fine and the slow part is the full agent context/tool surface.

Then in chat run:

/status
/context list

Look for huge context usage or large injected sections.

To reduce it, try these in order:

/new

If that fixes it, old session history was the issue.

Then:

openclaw config set agents.defaults.contextInjection continuation-skip
openclaw gateway restart

That skips re-injecting workspace bootstrap files on safe continuation turns.

If still slow, temporarily test a lean tool surface:

openclaw config set tools.profile minimal
openclaw gateway restart

That intentionally disables most tools, so use it as a diagnostic. If latency drops hard, tool schemas are a big part of the prompt.

Also check for accidental huge prompt files:

ls -lh AGENTS.md SOUL.md TOOLS.md MEMORY.md HEARTBEAT.md BOOTSTRAP.md 2>/dev/null

The copied Python packages/cache/logs should not be loaded wholesale, but huge AGENTS.md/memory/session history/tool schemas absolutely can explain this.

sick isle
#

Gateway infer took 11.5s total. Direct curl to DeepSeek: 0.787s. So ~10s is pure OpenClaw/Gateway overhead on a simple prompt with no agent context. What's causing that overhead?

rare shaleBOT
#

That result changes the diagnosis: --gateway raw model run does not load workspace/session/tools, so this is not the copied workspace or full-agent context.

Likely buckets now:

  1. CLI startup/respawn overhead in the timed openclaw infer --gateway command
  2. Gateway config/auth/model resolution before the DeepSeek call
  3. Streaming mismatch — OpenClaw uses streaming; your curl may be stream:false
  4. Slow shell env import (env.shellEnv): suspicious because its defaults/timeouts can be around 15s

Run these:

openclaw --version

time openclaw health --json
time openclaw infer model run --local --model deepseek/deepseek-chat --prompt "2+2? Reply just 4." --json
time openclaw infer model run --gateway --model deepseek/deepseek-chat --prompt "2+2? Reply just 4." --json

openclaw gateway stability --type model.call.completed --limit 5 --json
openclaw gateway stability --type run.completed --limit 5 --json

openclaw config get env.shellEnv
time "$SHELL" -l -c 'env >/dev/null'

Interpretation:

  • health --json slow → CLI/RPC startup overhead.
  • local slow too → OpenClaw provider/auth/model resolution overhead.
  • local fast, gateway slow → Gateway runtime/config/auth path.
  • model.call.completed.durationMs ~11s → the delay is inside the streamed model call, so compare with direct curl using stream:true.
  • run.completed ~11s but model.call.completed ~0.8s → pre/post Gateway overhead.
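The last two rules compare two durations from the stability events. A small sketch of that arithmetic, assuming the two `durationMs` values have been copied out by hand (the JSON shape of the stability output is not shown in this thread, so nothing is parsed here); `split_run_time` is a hypothetical name:

```shell
# Hypothetical helper: split a run's duration into model-call time and everything else.
# Args: run_completed_ms model_call_completed_ms
split_run_time() {
  run_ms="$1"
  model_ms="$2"
  overhead_ms=$(( run_ms - model_ms ))
  if [ "$model_ms" -ge "$overhead_ms" ]; then
    verdict="model-call"
  else
    verdict="gateway-pre-post"
  fi
  echo "model=${model_ms}ms overhead=${overhead_ms}ms dominant=${verdict}"
}

# Illustrative values only.
split_run_time 11000 800
split_run_time 11000 10500
```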

If env.shellEnv is enabled or login shell is slow, fix that first:

printf 'DEEPSEEK_API_KEY=%s\n' "$DEEPSEEK_API_KEY" >> ~/.openclaw/.env
openclaw config set env.shellEnv.enabled false --strict-json
openclaw gateway restart

For small VPS CLI startup, also enable the compile cache from the VPS docs:
https://docs.openclaw.ai/vps

sick isle
#

openclaw health --json reports durationMs: 1219ms but wall time is 7.986s — so ~6.7s is pure CLI/Docker startup overhead. Also: both WhatsApp and Discord show running: false, connected: false. Is the gateway actually running? How do we fix CLI startup time on VPS Docker?

rare shaleBOT
#

Yes — if openclaw health --json returns durationMs, the Gateway RPC answered. The durationMs: 1219ms is Gateway-side probe time; the extra ~6.7s wall time is your CLI/container startup path.
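With the numbers reported here, the split is 7986 ms wall time minus 1219 ms probe time, which leaves about 6767 ms of startup. A sketch for measuring that split on any command that prints a `durationMs` field; `cli_overhead_ms` and `fake_health` are hypothetical names, and `date +%s%N` assumes GNU date (Linux):

```shell
# Hypothetical helper: wall-clock time of a command minus the durationMs it reports.
# Usage: cli_overhead_ms <command> [args...]
cli_overhead_ms() {
  start=$(date +%s%N)                      # nanoseconds since epoch (GNU date)
  out=$("$@")
  end=$(date +%s%N)
  wall_ms=$(( (end - start) / 1000000 ))
  # Pull the first integer that follows "durationMs" in the command's output.
  reported_ms=$(printf '%s\n' "$out" \
    | sed -n 's/.*"durationMs"[^0-9]*\([0-9][0-9]*\).*/\1/p' | head -n 1)
  reported_ms=${reported_ms:-0}
  echo "wall=${wall_ms}ms reported=${reported_ms}ms overhead=$(( wall_ms - reported_ms ))ms"
}

# Stand-in for a slow-starting CLI so the sketch runs without OpenClaw installed.
fake_health() { sleep 0.3; echo '{"ok":true,"durationMs":100}'; }
cli_overhead_ms fake_health
```

On the VPS the real call would be `cli_overhead_ms openclaw health --json`, or the `docker compose exec` form of the same command.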

WhatsApp running:false connected:false and Discord running:false connected:false mean the channel transports are not running/connected. That’s separate from the Gateway being alive.

For Docker, avoid timing commands through:

docker compose run --rm openclaw-cli ...

That creates a fresh one-off container every time. Use these instead:

docker compose ps
curl -fsS http://127.0.0.1:18789/healthz
curl -fsS http://127.0.0.1:18789/readyz
docker compose logs -f --tail=200 openclaw-gateway

For CLI commands, run inside the already-running gateway container:

docker compose exec openclaw-gateway node dist/index.js health --json
docker compose exec openclaw-gateway node dist/index.js gateway status --require-rpc
docker compose exec openclaw-gateway node dist/index.js channels status --probe

If repeated CLI startup is still slow, add the VPS tuning env to the gateway container / service:

NODE_COMPILE_CACHE=/var/tmp/openclaw-compile-cache
sick isle
#

Roger is connected on WhatsApp but not responding to new messages. Logs show stuck session a6a0e117 in state=processing for 243s. How do I clear/kill the stuck session?

rare shaleBOT
#

Use /stop first — it’s the supported fast-abort path and clears that session’s run/queue.

From the stuck WhatsApp chat, send exactly:

/stop

If chat delivery is wedged, abort via Gateway RPC. First find the session key for a6a0e117:

docker compose exec openclaw-gateway node dist/index.js sessions --json \
  | jq -r '.sessions[] | select((.sessionId // "") | startswith("a6a0e117")) | .key'

Then abort it:

docker compose exec openclaw-gateway node dist/index.js gateway call chat.abort \
  --params '{"sessionKey":"PASTE_SESSION_KEY_HERE"}' \
  --json

Expected result should include something like:

{"ok":true,"aborted":true}

Then verify:

docker compose exec openclaw-gateway node dist/index.js gateway stability --type run.completed --limit 10
docker compose exec openclaw-gateway node dist/index.js channels status --probe

If chat.abort can’t find/clear it, restart the gateway container as the hard reset:

docker compose restart openclaw-gateway

Don’t hand-edit sessions.json while the gateway is running; that won’t kill the in-memory active run and can make state weirder. Also please grab openclaw --version when you can — stuck processing behavior is very version-sensitive.

sick isle
#

Roger came back on his own after stuck session cleared. But WhatsApp keeps getting 408 timeouts and disconnecting. How do we make the connection more stable?

rare shaleBOT
#

408 on WhatsApp Web usually means the WhatsApp/Baileys socket timed out, not DeepSeek or the agent. It should reconnect, but if it keeps happening we should make reconnects more forgiving and check network/proxy/conflict causes.

For Roger/Docker, I’d do this as one pass:

# 1) confirm current state + exact close reasons
docker compose exec openclaw-gateway node dist/index.js channels status --probe
docker compose logs --tail=300 openclaw-gateway | grep -Ei 'whatsapp|web-reconnect|web-heartbeat|408|timeout|conflict|logged'

# 2) make WhatsApp reconnect more patient
docker compose exec openclaw-gateway node dist/index.js config set web.heartbeatSeconds 60 --strict-json
docker compose exec openclaw-gateway node dist/index.js config set web.reconnect.initialMs 2000 --strict-json
docker compose exec openclaw-gateway node dist/index.js config set web.reconnect.maxMs 120000 --strict-json
docker compose exec openclaw-gateway node dist/index.js config set web.reconnect.factor 1.4 --strict-json
docker compose exec openclaw-gateway node dist/index.js config set web.reconnect.jitter 0.25 --strict-json
docker compose exec openclaw-gateway node dist/index.js config set web.reconnect.maxAttempts 0 --strict-json

# 3) restart the gateway container
docker compose restart openclaw-gateway

# 4) watch reconnect behavior
docker compose logs -f --tail=200 openclaw-gateway
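To get a feel for what the step 2 values mean, here is the delay schedule they would produce under a plain exponential backoff. This is an assumption: the thread does not confirm the exact formula OpenClaw uses, and jitter is left out. `backoff_schedule` is a hypothetical helper.

```shell
# Hypothetical helper: print delay per attempt for delay = initialMs * factor^(n-1), capped at maxMs.
# Args: initialMs factor maxMs attempts
backoff_schedule() {
  awk -v init="$1" -v f="$2" -v max="$3" -v n="$4" 'BEGIN {
    d = init
    for (i = 1; i <= n; i++) {
      # %.0f rounds, which avoids off-by-one results from floating-point error.
      printf "attempt=%d delayMs=%.0f\n", i, (d > max ? max : d)
      d *= f
    }
  }'
}

backoff_schedule 2000 1.4 120000 5
```

Under that assumption the first retries come after roughly 2.0s, 2.8s, 3.9s, 5.5s and 7.7s, and later retries settle at the 120s cap.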

Also check these likely causes:

  • Duplicate WhatsApp Web sessions/conflict: open WhatsApp → Linked devices → remove old/stale OpenClaw/Windows sessions, then relink if needed.
  • Proxy/env weirdness: if HTTP_PROXY, HTTPS_PROXY, or NO_PROXY are set in Docker, make sure they’re intentional. A bad proxy/NO_PROXY can cause 408/TLS disconnects.
  • Host networking: Hetzner ↔ WhatsApp may be less stable than Hetzner ↔ DeepSeek. DeepSeek ping doesn’t prove WhatsApp Web stability.
  • Version: please grab openclaw --version; WhatsApp reconnect behavior is version-sensitive.

If it keeps looping after that, run:

docker compose exec openclaw-gateway node dist/index.js gateway diagnostics export

and file a GitHub issue with the diagnostics zip + the repeated 408 log lines.

sick isle
#

Roger answered but took 2+ minutes. Stuck session a6a0e117 in state=processing keeps appearing. How do I clear stuck sessions, and why does processing take so long with DeepSeek V4 Flash?

rare shaleBOT
#

The stuck state=processing means that session lane is still considered active, so new WhatsApp messages queue behind it. Clear it first, then diagnose why it keeps happening.

Use one block:

# 0) Please keep this version in the thread
docker compose exec openclaw-gateway node dist/index.js --version

# 1) Find the full session key for short session id a6a0e117
SESSION_KEY="$(
  docker compose exec -T openclaw-gateway node dist/index.js sessions --json \
  | jq -r '.sessions[] | select((.sessionId // "") | startswith("a6a0e117")) | .key' \
  | head -1
)"
echo "SESSION_KEY=$SESSION_KEY"
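# 2) Abort the stuck active run for that session (skipped if no key was found)
if [ -n "$SESSION_KEY" ]; then
  docker compose exec -T openclaw-gateway node dist/index.js gateway call chat.abort \
    --params "{\"sessionKey\":\"$SESSION_KEY\"}" \
    --json
else
  echo "No session key matched a6a0e117; skipping chat.abort" >&2
fi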

# 3) If it did not abort, hard-reset in-memory run state
docker compose restart openclaw-gateway

# 4) After it happens again, compare model-call time vs total run time
docker compose exec -T openclaw-gateway node dist/index.js gateway stability --type model.call.completed --limit 10 --json
docker compose exec -T openclaw-gateway node dist/index.js gateway stability --type run.completed --limit 10 --json

Don’t edit sessions.json while the gateway is running — that won’t kill the in-memory active run.

Why it takes so long:

  • If model.call.completed.durationMs is also ~2 minutes, DeepSeek is the slow part. DeepSeek V4 Flash is thinking-capable/huge-context; even with thinking disabled it may be much slower than the tiny direct curl because OpenClaw sends a real agent prompt.
  • If model.call.completed is fast but run.completed is ~2 minutes, it’s Gateway/session/queue/delivery overhead.
  • If this is latency-sensitive, switch Roger back to the non-thinking V3.2 surface:
/new deepseek/deepseek-chat

or set the default:

docker compose exec -T openclaw-gateway node dist/index.js models set deepseek/deepseek-chat
docker compose restart openclaw-gateway
sick isle
#

Response time improved from 2min to ~34s after restart. But gateway stability --type run.completed still shows count:0 even after messages are processed. Why isn't stability logging runs? And what's causing the 34s latency — is it the agent prompt size/tools?

rare shaleBOT
#

run.completed count:0 usually means one of these:

  1. diagnostics.enabled is off
  2. you’re querying a different Gateway/container than the one processing WhatsApp
  3. the reply path is not emitting run.completed correctly → possible bug, worth an issue if confirmed

For the 34s: yes, most likely full agent prompt size + tools + DeepSeek prefill, unless model-call timing proves otherwise. A tiny curl is not comparable to the full OpenClaw agent request.

Run this single diagnostic block after one slow WhatsApp reply:

# confirm version + same running container
docker compose ps
docker compose exec -T openclaw-gateway node dist/index.js --version
docker compose exec -T openclaw-gateway node dist/index.js gateway status --require-rpc
# diagnostics/stability sanity
docker compose exec -T openclaw-gateway node dist/index.js config get diagnostics.enabled
docker compose exec -T openclaw-gateway node dist/index.js gateway stability --limit 50 --json
docker compose exec -T openclaw-gateway node dist/index.js gateway stability --type model.call.completed --limit 20 --json
docker compose exec -T openclaw-gateway node dist/index.js gateway stability --type run.completed --limit 20 --json

# prompt/tool size from the actual chat session
# send these in WhatsApp as standalone messages:
/status
/context detail
/usage full

How to read it:

  • model.call.completed.durationMs ~34s → DeepSeek is spending time on the actual model request, likely prompt/tool prefill.
  • model.call.completed fast but reply is 34s → Gateway/channel/session overhead.
  • no model.call.completed and no run.completed → diagnostics are disabled, wrong gateway, or a telemetry bug.

To reduce latency quickly, test a lean tool surface:

docker compose exec -T openclaw-gateway node dist/index.js config set tools.profile minimal
docker compose restart openclaw-gateway

If that cuts latency a lot, tool schemas were the big prompt-cost. Then re-add only the tools Roger actually needs instead of running the broad coding profile.

sick isle
#

tools.profile minimal caused Roger to stop responding entirely. Reverted to messaging, back to ~23s. What's the correct way to reduce tool/prompt size without breaking WhatsApp responses?

rare shaleBOT
#

minimal was too aggressive: it only leaves session_status. If Roger’s WhatsApp flow/config needs the message tool for delivery, the model can answer internally but nothing gets sent. So don’t use global minimal as-is.

Use a tiny messaging profile instead: keep message + session_status, remove the heavier session tools.

Try this:

docker compose exec -T openclaw-gateway node dist/index.js config set tools.profile minimal
docker compose exec -T openclaw-gateway node dist/index.js config set tools.alsoAllow '["message"]' --strict-json
docker compose restart openclaw-gateway

# verify what the agent actually sees
# send in WhatsApp:
/tools compact
/context detail
/status

If tools.alsoAllow is rejected on Roger’s version, use this fallback:

docker compose exec -T openclaw-gateway node dist/index.js config set tools.profile messaging
docker compose exec -T openclaw-gateway node dist/index.js config set tools.deny '["sessions_list","sessions_history","sessions_send"]' --strict-json
docker compose restart openclaw-gateway

Why this works:

  • message is the safe thing to keep for WhatsApp delivery.
  • sessions_list/history/send add schema size and are usually unnecessary for simple chat.
  • coding/unset profiles are much heavier because they expose fs/runtime/web/memory/media/etc.

Then compare latency. If it’s still ~23s, the remaining cost is probably not tool schemas alone — check /context detail for bootstrap/session size and gateway stability --type model.call.completed for actual model-call duration.

sick isle
#

Roger keeps getting stuck after every restart. Old session agent:main:main with 22k context keeps blocking new messages. /new doesn't clear the stuck in-memory run. How do we permanently clear/reset this session?

rare shaleBOT
#

/new is not enough here because it’s a normal chat command — if agent:main:main is already stuck in processing, /new can just queue behind the stuck run.

Do an offline reset of that session: stop Gateway, archive the agent:main:main session row/transcript, switch DMs away from shared main, then restart.

# Stop the live gateway first so no in-memory run can rewrite sessions.json
docker compose stop openclaw-gateway

# Future WhatsApp DMs should not reuse the shared agent:main:main session
docker compose run --rm --no-deps --entrypoint node openclaw-gateway \
  dist/index.js config set session.dmScope per-channel-peer

# Archive the old stuck main session instead of deleting it
docker compose run --rm -T --no-deps --entrypoint node openclaw-gateway - <<'NODE'
const fs = require("fs");
const path = require("path");

const state =
  process.env.OPENCLAW_STATE_DIR ||
  path.join(process.env.OPENCLAW_HOME || process.env.HOME || "/home/node", ".openclaw");

const sessionsDir = path.join(state, "agents", "main", "sessions");
const storePath = path.join(sessionsDir, "sessions.json");
const key = "agent:main:main";
const ts = new Date().toISOString().replace(/[:.]/g, "-");
const archive = path.join(state, "session-archive", `reset-main-${ts}`);

if (!fs.existsSync(storePath)) {
  console.log("No sessions.json found:", storePath);
  process.exit(0);
}

const store = JSON.parse(fs.readFileSync(storePath, "utf8"));
const entry = store[key];

// Back up the store before changing anything
fs.mkdirSync(archive, { recursive: true });
fs.copyFileSync(storePath, path.join(archive, "sessions.json.before"));

if (!entry) {
  console.log("No agent:main:main entry found. Backed up sessions.json only.");
  process.exit(0);
}

const sessionId = entry.sessionId;
console.log("Archiving", key, "sessionId=", sessionId);

delete store[key];

// Write to a temp file and rename so sessions.json is never left half-written
const tmp = `${storePath}.tmp`;
fs.writeFileSync(tmp, JSON.stringify(store, null, 2) + "\n");
fs.renameSync(tmp, storePath);

// Move the session's transcript files into the archive
if (sessionId) {
  for (const name of fs.readdirSync(sessionsDir)) {
    if (name.startsWith(sessionId)) {
      fs.renameSync(path.join(sessionsDir, name), path.join(archive, name));
      console.log("Moved", name);
    }
  }
}

console.log("Archived to", archive);
NODE

# Make sure container user can still write state
chown -R 1000:1000 /root/.openclaw

# Start clean
docker compose up -d openclaw-gateway

# Verify
docker compose exec -T openclaw-gateway node dist/index.js sessions --json | grep -E 'agent:main:main|whatsapp' || true
docker compose logs -f --tail=100 openclaw-gateway

Why this should help: