#Phone/voice setup on VPS


terse breach
#

How do I optimize my claw for voice calls?

abstract nicheBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

steel heronBOT
#

Assuming you mean phone-call voice on a VPS:

  1. Run the built-in checks first

    openclaw --version
    openclaw voicecall setup
    openclaw voicecall latency --last 200
    

    The latency command is the best way to see whether the bottleneck is listening/STT, model response, or playback.

  2. Use realtime for lowest-latency conversations

    • realtime.enabled = full-duplex live voice.
    • streaming.enabled = transcription stream + normal agent loop.
    • Don’t enable both; pick one.
  3. Keep the spoken agent short
    In your realtime instructions, add something like:

#

Speak briefly. Ask one question at a time. Use deeper tools only when necessary.

  4. Use a fast voice model + “consult” only when needed
    For realtime calls, let the realtime provider handle normal back-and-forth, and only call the regular OpenClaw agent for deeper tool use/reasoning. That keeps calls from feeling sluggish.

  5. Use stable public webhook exposure
    Prefer a stable domain / Tailscale Funnel over changing ngrok URLs. Set publicUrl and webhookSecurity.allowedHosts, then rerun:

    openclaw voicecall setup

  6. Tune silence/endpointing carefully
    Defaults are around 800ms. Lower values can feel snappier but may cut people off; higher values feel slower but safer.
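
The silence/endpointing tradeoff above can be made concrete with a small sketch. It assumes a hypothetical list of mid-sentence pause lengths (illustrative values, not measurements) and counts how many of them a given silence threshold would treat as the end of the caller's turn. The function name and data are assumptions for illustration and are not an OpenClaw API.

```python
# Hypothetical mid-sentence pauses in milliseconds (illustrative values only).
PAUSES_MS = [300, 450, 700, 900, 1200]


def premature_cutoffs(pauses_ms, silence_threshold_ms):
    """Count pauses long enough to be mistaken for the end of a turn."""
    return sum(1 for pause in pauses_ms if pause >= silence_threshold_ms)


# Compare a low, a default-like, and a high threshold.
for threshold in (500, 800, 1100):
    cut = premature_cutoffs(PAUSES_MS, threshold)
    print(f"threshold={threshold}ms -> {cut} of {len(PAUSES_MS)} pauses cut off")
```

A lower threshold cuts off more pauses, while a higher one adds roughly that much waiting before each reply, which is why tuning after listening tests is the safer order.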

Docs:

#

If you mean Discord voice channels instead of phone calls, the optimization path is different: check ffmpeg/ffprobe, /vc status, DAVE decrypt logs, and Discord voice config.

terse breach
#

Before we set up voice on Hetzner VPS, I want to optimize Roger for maximum human-like dialog — everything else is secondary.

The goal: a conversation that feels like a real human. Natural flow, understanding, insight, no scripted feel. Responds immediately when natural, intentional pauses when it makes sense.

I'm willing to compromise on all other OpenClaw functionality to achieve this.

What is the optimal configuration regarding:

  • Identity/persona setup
  • Which LLM model gives the best conversational quality here
  • Heartbeat/loading — how much does Roger load per call, and what minimizes latency
  • ElevenLabs voice — which settings give the most natural speech (stability, similarity, style etc)
  • Twilio — anything that can be optimized for low latency
  • Pauses and response timing — can OpenClaw control natural pauses and speaking rhythm

And a specific question: In OpenClaw there appears to be both a Google voice integration and an OpenClaw Voice Call plugin — are these two separate paths, and which do you recommend for this use case on VPS?

steel heronBOT
#

First: please paste openclaw --version before you apply any exact config — voice-call changed quickly, so version matters.

My take: optimize for realtime voice, not “LLM + STT + ElevenLabs TTS”.

For maximum human-like dialog on a Hetzner VPS, I’d use:

Twilio phone number → OpenClaw Voice Call plugin → realtime voice provider → OpenClaw consult only when needed.

The Google voice integration and Voice Call plugin are not competing paths:

  • Voice Call plugin = telephony layer: Twilio/Telnyx/Plivo, inbound/outbound calls, webhooks, call state.
  • Google/OpenAI realtime voice = voice brain/audio engine used inside Voice Call.
  • Talk mode / Google browser voice = different local/UI/node path, not the normal PSTN phone-call path.

So for VPS phone calls: use Voice Call + realtime provider. Google can be the realtime provider behind it. OpenClaw docs describe Voice Call as the telephony plugin and Google as a realtime voice provider for backend bridges like Voice Call/Meet. (docs.openclaw.ai)

Recommended shape:

#
{
  agents: {
    list: [
      {
        id: "roger-voice",
        workspace: "~/.openclaw/workspace-roger-voice"
      }
    ],
    defaults: {
      heartbeat: {
        every: "0m"
      }
    }
  },
  plugins: {
    entries: {
      "voice-call": {
        enabled: true,
        config: {
          provider: "twilio",
          fromNumber: "+YOUR_TWILIO_NUMBER",
          publicUrl: "https://voice.yourdomain.com/voice/webhook",
          webhookSecurity: {
            allowedHosts: ["voice.yourdomain.com"]
          },

          agentId: "roger-voice",

          realtime: {
            enabled: true,
            provider: "google",
            instructions: "You are Roger's live phone voice. Sound like a thoughtful human, not an assistant. Keep replies short. Do not monologue. If the caller pauses mid-thought, stay quiet. If a question needs tools or memory, briefly say you're checking, then call the consult tool. Never narrate system behavior.",
            toolPolicy: "safe-read-only",
            providers: {
              google: {
                activityHandling: "start-of-activity-interrupts",
                turnCoverage: "only-activity"
              }
            }
          },

          streaming: { enabled: false }
        }
      }
    }
  }
}
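
Since realtime and streaming should not both be enabled, a pre-flight check on the voice-call config block may catch mistakes before a restart. This is a minimal sketch that assumes the config block has already been loaded into a Python dict with the same keys as the shape above; `voice_mode` is a hypothetical helper, not part of OpenClaw.

```python
def voice_mode(voice_call_config):
    """Return "realtime", "streaming", or "neither"; raise if both are enabled."""
    realtime = voice_call_config.get("realtime", {}).get("enabled", False)
    streaming = voice_call_config.get("streaming", {}).get("enabled", False)
    if realtime and streaming:
        raise ValueError("realtime and streaming are both enabled; pick one")
    if realtime:
        return "realtime"
    if streaming:
        return "streaming"
    return "neither"


# Mirrors the relevant keys of the recommended shape.
config = {
    "provider": "twilio",
    "realtime": {"enabled": True, "provider": "google"},
    "streaming": {"enabled": False},
}
print(voice_mode(config))
```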

For identity/persona: make a dedicated roger-voice workspace. Keep SOUL.md, IDENTITY.md, and USER.md tiny. Long persona files increase prompt/context load; OpenClaw injects workspace files like AGENTS.md, SOUL.md, IDENTITY.md, USER.md, etc. into context, with per-file and total bootstrap caps. (docs.openclaw.ai)

For LLM choice: don’t route normal conversation through a text LLM if “human-like” is priority. Use a native realtime voice model/provider for the live loop, and let it call OpenClaw only for “deeper brain” moments. For the freshest model pick, ask in #models — I don’t want to hardcode stale model advice here. The key architecture choice is realtime voice first, normal OpenClaw agent second.

For heartbeat/loading latency:

  • Heartbeat does not need to run for calls.
  • Disable it for the voice agent or set skipWhenBusy.
  • Keep HEARTBEAT.md empty/tiny.
  • Use /context list to see what Roger is loading.
  • Use openclaw voicecall latency --last 200 after tests.

OpenClaw heartbeat can use lightContext and isolatedSession, but for a voice-first agent I’d just disable heartbeats unless you need proactive behavior. (docs.openclaw.ai)

#

For ElevenLabs: only use ElevenLabs if the custom voice matters more than end-to-end flow. It can sound gorgeous, but STT → LLM → TTS usually feels less alive than realtime voice. If you do use it, start around:

voiceSettings: {
  stability: 0.5,
  similarityBoost: 0.8,
  style: 0.2,
  useSpeakerBoost: true,
  speed: 0.98
}

Lower stability gives more emotional variation; higher can become monotone. style and useSpeakerBoost can improve character/similarity but may add latency. ElevenLabs also notes Flash/Turbo-style models trade quality for latency, while more expressive models are higher quality but slower. (elevenlabs.io)

For Twilio latency:

  • Use bidirectional Media Streams / <Connect><Stream> for realtime AI calls.
  • Put the VPS close to the caller/Twilio region where possible.

  • Use a stable public HTTPS/WSS endpoint, not free ngrok.
  • Avoid extra reverse-proxy buffering.
  • Twilio media back to the call is audio/x-mulaw at 8000 Hz, base64 payloads; OpenClaw handles the bridge, but that explains why telephony will never sound like studio audio. (twilio.com)
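
The telephony audio format in the last bullet implies a small, fixed bitrate, which a short calculation makes visible. The sketch below assumes 20 ms frames purely for illustration (the frame length is an assumption, not something stated above) and uses placeholder bytes instead of real audio.

```python
import base64

SAMPLE_RATE_HZ = 8000   # from the bullet above
BYTES_PER_SAMPLE = 1    # mu-law stores one 8-bit value per sample
FRAME_MS = 20           # assumed frame length, for illustration only


def frame_bytes(frame_ms=FRAME_MS):
    """Raw mu-law bytes needed for one frame of the given length."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * frame_ms // 1000


raw = bytes(frame_bytes())         # placeholder bytes, not real audio
payload = base64.b64encode(raw)    # payloads travel base64-encoded
bits_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * 8

print(len(raw), len(payload), bits_per_second)
```

At 64 kbit/s of 8-bit companded audio, there is simply much less signal to work with than in studio recordings, whatever the voice model does upstream.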

For pauses/rhythm: OpenClaw can influence this, but not perfectly “conduct” it. The realtime provider’s VAD decides when a turn is complete. Google/OpenAI realtime APIs both support automatic voice activity detection; OpenClaw exposes knobs like silence duration / sensitivity depending on provider. Start with defaults, then tune only after listening tests. (ai.google.dev)

Bottom line: Voice Call plugin + realtime provider + tiny persona + consult only when needed. That’s the path I’d use for “feels like a real person.”

terse breach
#

Does Google realtime voice support Danish? If not, is ElevenLabs the only viable path for Danish language calls, and how does that change the recommended architecture?

steel heronBOT
#

Short answer: yes, Google realtime voice supports Danish now. Google’s current Live API docs list Danish (da) among supported languages and say native-audio models can switch languages naturally. They also say native-audio models don’t let you explicitly set a language code, so steer the language via instructions instead. (ai.google.dev)

So I’d not switch to ElevenLabs-first yet. Keep the recommended architecture:

Voice Call plugin + Twilio + realtime.provider: "google"

Add Danish-specific realtime instructions:

realtime: {
  enabled: true,
  provider: "google",
  instructions: "Du taler dansk som en naturlig, varm og kortfattet person. Svar på dansk, medmindre brugeren beder om noget andet. Hold pauser naturligt. Afbryd ikke brugeren midt i en tanke."
}
#

Caveat: Google’s Vertex Live docs still show a narrower configurable language_code list that does not include Danish, but that mainly matters for non-native/configured-language paths. For this use case, prefer the native realtime audio path and prompt it to Danish. (docs.cloud.google.com)

If Google’s Danish voice quality feels off in testing, then fallback architecture becomes:

Twilio → Voice Call streaming STT → OpenClaw/text LLM → ElevenLabs Danish TTS

ElevenLabs definitely supports Danish for TTS, including newer multilingual models, and their STT supports 90+ languages; OpenClaw’s ElevenLabs docs also mention Voice Call streaming STT via Scribe v2 Realtime. (help.elevenlabs.io)

The tradeoff: the ElevenLabs path may sound prettier, but Google realtime will usually feel more conversational because it avoids the separate STT → LLM → TTS loop. My recommendation: test Google Danish first, then switch only if the Danish accent/prosody is unacceptable.

terse breach
#

OpenClaw runs in Docker on the VPS. openclaw --version not found on host. Found /var/lib/docker/rootfs/overlayfs/.../usr/local/bin/openclaw inside container. How do I get the version and run openclaw commands — should I exec into the container, or is there a docker compose command?

steel heronBOT
#

Use Docker Compose from the directory that has docker-compose.yml. Don’t run the binary out of /var/lib/docker/... — that’s Docker’s internal overlay storage.

Try this first:

cd /path/to/openclaw   # folder with docker-compose.yml
docker compose ps
docker compose exec openclaw-gateway openclaw --version

If openclaw isn’t on PATH inside the container, use the documented fallback:

docker compose exec openclaw-gateway node dist/index.js --version
docker compose exec openclaw-gateway node dist/index.js status
docker compose exec openclaw-gateway node dist/index.js voicecall setup
#

Some Compose setups also define an openclaw-cli helper service. If yours does:

docker compose run --rm openclaw-cli --version
docker compose run --rm openclaw-cli status
docker compose run --rm openclaw-cli voicecall setup

For logs:

docker compose logs -f openclaw-gateway

So: prefer docker compose exec openclaw-gateway ... for your Hetzner setup unless your compose file includes openclaw-cli.
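
Because the commands above all share the same `docker compose exec` prefix, a tiny helper can build the full command line. This is a minimal sketch, assuming the service is named openclaw-gateway as above; `openclaw_cmd` is a hypothetical helper name. It only builds and prints the command and does not run Docker.

```python
import shlex


def openclaw_cmd(*args, entrypoint=("openclaw",), service="openclaw-gateway"):
    """Build the argv for running an OpenClaw CLI command inside the container."""
    return ["docker", "compose", "exec", service, *entrypoint, *args]


# Default form, when openclaw is on PATH inside the container.
print(shlex.join(openclaw_cmd("--version")))

# Fallback form, calling the entry point through node.
fallback = openclaw_cmd("voicecall", "setup", entrypoint=("node", "dist/index.js"))
print(shlex.join(fallback))
```

The resulting list can be passed to `subprocess.run` from the directory that holds docker-compose.yml.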

Docs: https://docs.openclaw.ai/install/docker and https://docs.openclaw.ai/install/hetzner

terse breach
#

OpenClaw 2026.4.26

steel heronBOT
#

Perfect — 2026.4.26 is recent enough for the realtime Voice Call path we’ve been discussing.

From the folder with docker-compose.yml, run:

docker compose exec openclaw-gateway node dist/index.js status --all
docker compose exec openclaw-gateway node dist/index.js plugins list
docker compose exec openclaw-gateway node dist/index.js voicecall setup

If voicecall says command not found, install/enable the plugin:

docker compose exec openclaw-gateway node dist/index.js plugins install @openclaw/voice-call
docker compose restart openclaw-gateway
#

Then rerun:

docker compose exec openclaw-gateway node dist/index.js voicecall setup

For your next debugging paste, the most useful outputs are:

docker compose ps
docker compose exec openclaw-gateway node dist/index.js plugins list
docker compose exec openclaw-gateway node dist/index.js voicecall setup
docker compose logs --tail=100 openclaw-gateway

Don’t paste raw .env or full config with secrets. status --all is designed to be safer/redacted.

terse breach
#

is there a newer version of claw I should install?

steel heronBOT
#

No — 2026.4.26 is currently latest. I checked npm directly:

latest: 2026.4.26
beta: 2026.4.26

So I’d not update right now. Continue with voice setup on that version.

For Docker later, don’t run openclaw update inside the container. Update the image/deploy instead:

git pull
docker compose build
docker compose up -d
#

If you’re using the prebuilt image tag:

docker compose pull
docker compose up -d

terse breach
#

docker compose exec openclaw-gateway node dist/index.js status --all
docker compose exec openclaw-gateway node dist/index.js plugins list
docker compose exec openclaw-gateway node dist/index.js voicecall setup

● Running all three commands now.

● Bash(docker compose exec openclaw-gateway node dist/index.js status --all 2>&1)
⎿  OpenClaw status --all

 Overview
 … +34 lines (ctrl+o to expand)

● Bash(docker compose exec openclaw-gateway node dist/index.js plugins list 2>&1)
⎿  Plugins (68/115 enabled)
Source roots:
stock: /app/dist/extensions
… +228 lines (ctrl+o to expand)

● Bash(docker compose exec openclaw-gateway node dist/index.js voicecall setup 2>&1)
⎿  Error: Exit code 1
error: unknown command 'voicecall'

● Here's a summary of each command's result:

status --all — Gateway is running (v2026.4.26, Node 24.14.0). Key issues:

  • Gateway connection has missing scope: operator.read — health check can't auth
  • Port 18789 in use (by the gateway itself — expected)
  • No channels configured
  • 1 agent (main) active, 2 sessions, but bootstrap file is ABSENT
  • Tailscale exposure is off

plugins list — 68 of 115 plugins enabled. Output was large (44KB); full list saved. Notable: active-memory is disabled,
providers like Alibaba and Amazon Bedrock are enabled.

voicecall setup — Failed: unknown command 'voicecall'. This command doesn't exist in the installed version. You may be
thinking of a different command name — you can check available commands with:

docker compose exec openclaw-gateway node dist/index.js --help

The most actionable issue is the operator.read scope missing on the gateway auth token, which is blocking gateway health
checks.

✻ Cogitated for 3m 4s

steel heronBOT
#

That unknown command 'voicecall' means the voice-call plugin command didn’t load — not that 2026.4.26 lacks it.

Run this next from the Compose dir:

docker compose exec openclaw-gateway node dist/index.js plugins inspect voice-call

If it exists but is disabled:

docker compose exec openclaw-gateway node dist/index.js plugins enable voice-call
docker compose restart openclaw-gateway
docker compose exec openclaw-gateway node dist/index.js voicecall setup

If inspect voice-call says missing/not found, install it:

#
docker compose exec openclaw-gateway node dist/index.js plugins install @openclaw/voice-call
docker compose restart openclaw-gateway
docker compose exec openclaw-gateway node dist/index.js voicecall setup

If it’s still unknown after restart, check whether your config has a restrictive plugins.allow; if so, voice-call must be included there.

The operator.read scope issue is separate — annoying for health/status, but it’s not why voicecall is unknown. Also: no channels configured / Tailscale off / bootstrap absent aren’t blockers for phone voice.

Docs: https://docs.openclaw.ai/plugins/voice-call and https://docs.openclaw.ai/cli/plugins

terse breach
#

Inspecting the voice-call plugin first.

● Bash(docker compose exec openclaw-gateway node dist/index.js plugins inspect voice-call 2>&1)
⎿  @openclaw/voice-call
id: voice-call
OpenClaw voice-call plugin
… +11 lines (ctrl+o to expand)

● It's present and bundled — just disabled by default. Enabling it and running setup.

● Bash(docker compose exec openclaw-gateway node dist/index.js plugins enable voice-call 2>&1)
⎿  Config overwrite: /home/node/.openclaw/openclaw.json (sha256 965778f3abb29c03a4c4ddc1dc0fde72d2be9293991994f4813401
f01202c5fb -> f9b41b0d7fe2f3fbb0c3019459a6788ef26a4196739a52bbd564810831909e73, backup=/home/node/.openclaw/opencla
w.json.bak)
Enabled plugin "voice-call". Restart the gateway to apply.

● Bash(docker compose restart openclaw-gateway 2>&1)
⎿   Container openclaw-openclaw-gateway-1 Restarting
Container openclaw-openclaw-gateway-1 Started

● Bash(docker compose exec openclaw-gateway node dist/index.js voicecall setup 2>&1)
⎿  Voice Call setup: OK
OK plugin-enabled: Voice Call plugin is enabled
OK provider: Provider configured: mock
… +3 lines (ctrl+o to expand)

● All green. Voice-call was bundled but disabled by default — enabling it was all it needed.

Setup checks out:

  • Plugin enabled
  • Provider: mock (configured with credentials complete)
  • Webhook: not needed for mock provider
  • Mode: notify/conversation using normal TTS/STT flow

If you want to switch from mock to a real provider (e.g. Twilio), run voicecall setup --provider twilio and follow the
prompts to supply credentials.

✻ Worked for 3m 10s

steel heronBOT
#

Nice — that confirms the plugin side is fixed. 🎉

Small correction: voicecall setup --provider twilio is not a thing in 2026.4.26. voicecall setup only checks the current config.

Right now you’re still on provider: mock, so it proves the plugin loads, but it won’t make real calls yet. Next step is configuring Twilio + realtime Google.

High-level:

  1. Put secrets in Docker .env / env vars:

    TWILIO_ACCOUNT_SID=...
    TWILIO_AUTH_TOKEN=...
    TWILIO_FROM_NUMBER=+...
    GEMINI_API_KEY=...
    
  2. Recreate container so env changes load:

    docker compose up -d --force-recreate openclaw-gateway

  3. Set voice-call config:
    docker compose exec openclaw-gateway node dist/index.js config set --batch-json '[
      {"path":"plugins.entries.voice-call.config.provider","value":"twilio"},
      {"path":"plugins.entries.voice-call.config.fromNumber","value":"+YOUR_TWILIO_NUMBER"},
      {"path":"plugins.entries.voice-call.config.publicUrl","value":"https://voice.example.com/voice/webhook"},
      {"path":"plugins.entries.voice-call.config.webhookSecurity.allowedHosts","value":["voice.example.com"]},
      {"path":"plugins.entries.voice-call.config.streaming.enabled","value":false},
      {"path":"plugins.entries.voice-call.config.realtime.enabled","value":true},
      {"path":"plugins.entries.voice-call.config.realtime.provider","value":"google"},
      {"path":"plugins.entries.voice-call.config.realtime.instructions","value":"Du taler dansk som en naturlig, varm og kortfattet person. Svar på dansk, medmindre brugeren beder om noget andet. Afbryd ikke brugeren midt i en tanke. Hvis noget kræver værktøjer eller hukommelse, sig kort at du tjekker det."}
    ]'

  4. Restart + verify:
    docker compose restart openclaw-gateway
    docker compose exec openclaw-gateway node dist/index.js voicecall setup
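
Hand-writing the batch JSON inside shell quotes is easy to get wrong, especially with Danish characters in the instructions. The sketch below generates the array from a nested dict instead. It assumes `--batch-json` accepts exactly the list of path/value objects shown in the command above; `to_batch_ops` is a hypothetical helper, not an OpenClaw function.

```python
import json

BASE = "plugins.entries.voice-call.config"


def to_batch_ops(prefix, settings):
    """Flatten a nested dict into a list of {"path", "value"} operations."""
    ops = []
    for key, value in settings.items():
        path = f"{prefix}.{key}"
        if isinstance(value, dict):
            ops.extend(to_batch_ops(path, value))  # descend into sub-sections
        else:
            ops.append({"path": path, "value": value})  # lists stay whole
    return ops


settings = {
    "provider": "twilio",
    "webhookSecurity": {"allowedHosts": ["voice.example.com"]},
    "streaming": {"enabled": False},
    "realtime": {"enabled": True, "provider": "google"},
}
ops = to_batch_ops(BASE, settings)
# ensure_ascii=False keeps characters such as "å" readable in the output.
print(json.dumps(ops, ensure_ascii=False, indent=2))
```

The printed array can then be pasted as the single-quoted argument to the config command.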
    

Important Docker/VPS bit: Twilio needs a public HTTPS/WSS webhook. If you reverse-proxy from nginx/Caddy on the host, make sure the voice-call webhook port/path is reachable and WebSocket upgrades work for the realtime stream too.

Docs: https://docs.openclaw.ai/plugins/voice-call and https://docs.openclaw.ai/install/docker

terse breach
#

Does Google Gemini realtime voice (Live API) work on the free tier in Google AI Studio, or does it require billing? And for this use case (maximum human-like Danish phone conversation), is there any quality difference between using AI Studio vs Vertex AI?

steel heronBOT
#

Yes — Gemini Live can run on the Google AI Studio / Gemini Developer API free tier for testing. Google lists gemini-2.5-flash-native-audio-preview-12-2025 with free-tier input/output and describes it as Live API native audio optimized for better pacing, voice naturalness, verbosity, and mood. (ai.google.dev)

But for Roger’s use case, I’d still enable billing before real phone use:

  • free tier has stricter/variable rate limits;
  • preview/live models can have more restrictive limits;
  • free-tier prompts/responses may be used to improve Google products, while paid tier says they are not. (ai.google.dev)

On AI Studio vs Vertex/Gemini Enterprise Agent Platform: I would not expect a meaningful voice-quality difference if you’re hitting the same model with the same config. The differences are mostly product/ops: API key vs service account, no-cost usage vs cloud billing/credits, quotas/regions, enterprise support/SLA, compliance, IAM, VPC, dedicated capacity, etc. (docs.cloud.google.com)

For maximum human-like Danish phone conversation, I’d do:

  1. Start with AI Studio / Gemini Developer API key in OpenClaw:
    GEMINI_API_KEY=...
    
  2. Use:
   realtime: {
     enabled: true,
     provider: "google",
     providers: {
       google: {
         model: "gemini-2.5-flash-native-audio-preview-12-2025"
       }
     }
   }
  3. Test Danish quality on real Twilio calls.
  4. Only move to Vertex/Gemini Enterprise if you need enterprise controls, billing governance, compliance, or dedicated capacity, not because you expect better Danish voice quality.

So: AI Studio free tier is viable for proof-of-concept. Paid AI Studio/Gemini Developer API is my recommendation for real usage. Vertex is an ops/compliance upgrade, not the first quality upgrade.

terse breach
#

For real phone usage with gemini-2.5-flash-native-audio-preview-12-2025 via the Live API — roughly how much would $10 USD cover in terms of call minutes? Is $10 enough to get started, or should I put in more?

steel heronBOT
#

For Gemini model cost only, $10 is absolutely enough to get started.

Google lists gemini-2.5-flash-native-audio-preview-12-2025 at $3 / 1M audio input tokens and $12 / 1M audio output tokens on paid tier. Gemini audio is roughly 32 tokens/sec = 1,920 tokens/minute. (ai.google.dev)

Rough math:

| Scenario                                         | Model cost/min | $10 covers |
| ------------------------------------------------ | -------------- | ---------- |
| Caller audio all minute, Roger speaks ~⅓ of time | ~$0.013/min    | ~750 min   |
| Caller audio all minute, Roger speaks ~½ of time | ~$0.017/min    | ~580 min   |
| Worst-ish: input + Roger speaking 100% of time   | ~$0.029/min    | ~350 min   |
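
The table's figures can be reproduced from the prices and token rate quoted above. This is a rough sketch: it assumes the caller's audio is billed as input for the whole minute and Roger's speech as output for the given fraction, and it ignores any text tokens, Twilio charges, or pricing changes.

```python
INPUT_USD_PER_M_TOKENS = 3.0    # audio input price quoted above
OUTPUT_USD_PER_M_TOKENS = 12.0  # audio output price quoted above
TOKENS_PER_MINUTE = 32 * 60     # roughly 32 audio tokens per second


def cost_per_minute(speak_fraction):
    """Model cost in USD for one call minute, given Roger's share of speaking."""
    input_cost = TOKENS_PER_MINUTE * INPUT_USD_PER_M_TOKENS / 1_000_000
    output_cost = TOKENS_PER_MINUTE * speak_fraction * OUTPUT_USD_PER_M_TOKENS / 1_000_000
    return input_cost + output_cost


def minutes_for(budget_usd, speak_fraction):
    """Call minutes a budget covers at the given speaking share."""
    return budget_usd / cost_per_minute(speak_fraction)


for label, fraction in (("1/3", 1 / 3), ("1/2", 0.5), ("all", 1.0)):
    print(f"Roger speaks {label}: ${cost_per_minute(fraction):.3f}/min, "
          f"$10 covers about {minutes_for(10, fraction):.0f} min")
```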

So for testing: put in $10 and you’re fine.

For real phone usage, budget extra for Twilio minutes + phone number. Twilio pricing varies by country, call direction, number type, and account pricing; Twilio exposes account-specific per-number/per-country pricing via their Voice Pricing API. (twilio.com)

#

My practical recommendation:

• $10 Gemini credit: enough for setup, tuning, lots of test calls.
• $25–50 total budget: more comfortable once Twilio is included and you start doing real Danish phone tests.
• Don’t overfund yet; measure actual call logs first.

terse breach
#

For the public HTTPS webhook URL — we don't have a domain set up yet. Is Cloudflare Tunnel a good option for this, or do you recommend setting up a proper domain? What's the fastest/most stable path for production use?

Also: once voice is working, we want to set up a dedicated roger-voice workspace with optimized persona (tiny SOUL.md, IDENTITY.md as you described earlier). Does the domain/tunnel choice affect that setup at all?

steel heronBOT
#

Cloudflare Tunnel is a good option if you attach it to a stable hostname like:

https://voice.yourdomain.com/voice/webhook

I would not use a random/temporary tunnel URL for production. Twilio needs a stable webhook URL, and changing it later is annoying.

My recommendation:

  1. Fastest stable production path: buy/use a cheap domain → put it on Cloudflare → create a named Cloudflare Tunnel → route voice.yourdomain.com to the OpenClaw voice webhook.
  2. Also good: Caddy/Nginx directly on the VPS with Let’s Encrypt.
  3. Avoid for production: free/random ngrok or temporary Cloudflare quick tunnels.

Make sure Cloudflare Access / login protection is off for this route — Twilio must be able to call it directly. Rely on Twilio signature verification + OpenClaw config instead.

OpenClaw config shape:

#
{
  plugins: {
    entries: {
      "voice-call": {
        config: {
          publicUrl: "https://voice.yourdomain.com/voice/webhook",
          webhookSecurity: {
            allowedHosts: ["voice.yourdomain.com"]
          }
        }
      }
    }
  }
}
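
A mismatch between publicUrl and allowedHosts is an easy mistake when switching from a temporary tunnel to a stable hostname. The sketch below is a local sanity check only; it assumes the two values are copied out of the config by hand and says nothing about how OpenClaw itself validates them. `public_url_problems` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def public_url_problems(public_url, allowed_hosts):
    """Return a list of human-readable problems; empty means the pair looks consistent."""
    problems = []
    parsed = urlparse(public_url)
    if parsed.scheme != "https":
        problems.append("publicUrl should use https")
    if parsed.hostname not in allowed_hosts:
        problems.append(f"host {parsed.hostname!r} is not in allowedHosts")
    return problems


print(public_url_problems(
    "https://voice.yourdomain.com/voice/webhook",
    ["voice.yourdomain.com"],
))
```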
#

Also confirm the tunnel/reverse proxy supports WebSocket upgrades, since realtime voice uses streaming.

The domain/tunnel choice does not affect the dedicated roger-voice workspace. That’s separate:

{
  agents: {
    list: [
      {
        id: "roger-voice",
        workspace: "~/.openclaw/workspace-roger-voice"
      }
    ]
  },
  plugins: {
    entries: {
      "voice-call": {
        config: {
          agentId: "roger-voice"
        }
      }
    }
  }
}

So: set up stable public HTTPS first, then tune roger-voice persona/workspace independently.

terse breach
#

Called +4552517979. Got "we are sorry an application error has occurred". What should we check first — Twilio logs or OpenClaw voice-call logs?

steel heronBOT
#

Check Twilio Debugger / call logs first. That message usually means Twilio tried your webhook and got an error, timeout, bad TwiML, TLS issue, or couldn’t reach it.

In parallel, tail OpenClaw logs while making one fresh test call:

docker compose logs -f --tail=200 openclaw-gateway

Then check Voice Call setup:

docker compose exec openclaw-gateway node dist/index.js voicecall setup

In Twilio, look for:

  • HTTP status from your webhook request
  • timeout / connection refused / TLS cert errors
  • signature validation failures
  • invalid TwiML
  • exact webhook URL Twilio requested

In OpenClaw logs, look for:

  • [voice-call]
  • inbound webhook received / rejected
  • missing provider config
  • public URL / allowed host mismatch
  • realtime provider errors

Fastest diagnosis question: does OpenClaw log anything at the exact time of the call?

  • No OpenClaw log entry → domain/tunnel/proxy/Twilio webhook routing problem.
  • OpenClaw receives it but rejects/errors → OpenClaw voice-call config or Twilio signature/publicUrl mismatch.
  • OpenClaw accepts then errors after stream starts → realtime Google/Twilio Media Stream issue.
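
The three cases above form a small decision tree, sketched here as a function. The two boolean inputs are observations you make by hand from the logs during one test call; `triage` and its return strings are illustrative, not OpenClaw output.

```python
def triage(openclaw_logged, webhook_accepted=False):
    """Map what the logs show at call time to the area worth checking first."""
    if not openclaw_logged:
        # Nothing reached OpenClaw at all.
        return "routing: check domain, tunnel, proxy, and the Twilio webhook URL"
    if not webhook_accepted:
        # The request arrived but was rejected or errored.
        return "config: check voice-call config, Twilio signature, and publicUrl"
    # The webhook was accepted and the failure came after the stream started.
    return "stream: check the realtime Google provider and the Twilio Media Stream"


print(triage(openclaw_logged=False))
print(triage(openclaw_logged=True, webhook_accepted=False))
print(triage(openclaw_logged=True, webhook_accepted=True))
```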