#TTS/STT/STS
I’m checking the exact docs paths so I don’t give you made-up keys.
You’re setting up 2 different features:
- Discord voice channel / call = channels.discord.voice
- Telegram/Discord voice-note replies = tools.media.audio + messages.tts
Also, do not install @openclaw/voice-call for this. That plugin is for phone calls (Twilio/Telnyx/Plivo), not Discord voice channels.
Step 1 only: install audio tooling on your WSL2 gateway host.
sudo apt update
sudo apt install -y ffmpeg
ffmpeg -version | head -1
ffprobe -version | head -1
Why first: Discord voice messages need ffmpeg and ffprobe on the gateway host.
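If you want a single check that both tools are on the PATH, here is a minimal sketch. check_tools is a helper name made up for this example; it is not an OpenClaw command.

```shell
# check_tools is a hypothetical helper (not part of OpenClaw).
# It prints one line per tool and returns 1 if any tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Both binaries come from the ffmpeg apt package installed above.
check_tools ffmpeg ffprobe || echo "re-run: sudo apt install -y ffmpeg"
```

If either line says missing, re-run the apt install before going further.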
Docs:
- https://docs.openclaw.ai/channels/discord#voice-messages
- https://docs.openclaw.ai/tools/tts
- https://docs.openclaw.ai/help/faq
Reply with done or paste the error, then I’ll give you Step 2, the exact OpenAI + ElevenLabs config.
Done
Nice. Step 2: put your API keys in ~/.openclaw/.env on the WSL machine running the Gateway.
mkdir -p ~/.openclaw
nano ~/.openclaw/.env
Paste this:
OPENAI_API_KEY=your_openai_key_here
ELEVENLABS_API_KEY=your_elevenlabs_key_here
Notes:
- ELEVENLABS_API_KEY can also be XI_API_KEY, but I’d use ELEVENLABS_API_KEY.
- No quotes, one key per line.
Why this way: if OpenClaw runs as a service, shell env vars often do not carry over, but ~/.openclaw/.env does.
Reply done and I’ll give you Step 3, the exact config block.
Hold up. I'm new. Put a dummy key and also show super step by step, and also the API settings on ElevenLabs, like which option to turn on. And can I choose any voice?
Let me sanity-check the ElevenLabs side so I don’t send you on a UI scavenger hunt.
Yep, let’s slow it way down.
Answer first: yes, you can use any ElevenLabs voice that exists in your account/library and that your API key can access.
For your first setup, I strongly recommend a premade voice from ElevenLabs, not a cloned/custom one. Less friction.
Step 2, beginner version
Goal for this step only: collect 3 things
- OpenAI API key
- ElevenLabs API key
- One ElevenLabs Voice ID
Do not edit OpenClaw config yet.
2A. Get your OpenAI API key
- Open: https://platform.openai.com/api-keys
- Click Create new secret key
- Give it a name like: openclaw-tts
- Copy it somewhere safe
It will look roughly like this:
OPENAI_API_KEY=sk-proj-REPLACE_THIS_WITH_YOUR_REAL_KEY
2B. Get your ElevenLabs API key
- Log in to ElevenLabs
- Go to the dashboard/account area
- Open API Keys
- Click Create API Key
- Name it something like: openclaw-voice
- Copy the key
It will look roughly like this:
ELEVENLABS_API_KEY=sk_1234567890_REPLACE_ME
What to turn on in ElevenLabs?
For basic OpenClaw TTS, you usually only need:
- an API key
- a voice
You do not need to set up:
- Agents
- Conversational AI
- Webhooks
- Phone
- STT inside ElevenLabs
If ElevenLabs asks about scopes/restrictions, basic text-to-speech access is what matters.
2C. Pick a voice in ElevenLabs
- Open Voices / My Voices / Voice Library
- Pick a premade voice you like
- Find the three dots or More actions
- Click Copy voice ID
It will look something like:
ELEVENLABS_VOICE_ID=EXAVITQu4vr4xnSDxMaL
That sample ID is just an example, not “the” required voice.
Dummy example only
Do not use these exact values:
OPENAI_API_KEY=sk-proj-REPLACE_THIS_WITH_YOUR_REAL_KEY
ELEVENLABS_API_KEY=sk_1234567890_REPLACE_ME
ELEVENLABS_VOICE_ID=EXAVITQu4vr4xnSDxMaL
Quick answer to “can I choose any voices?”
Yes, if:
- the voice is in your ElevenLabs account/library
- your key can access it
For first setup, choose one premade voice.
Reply with done once you have those 3 values, and I’ll give you Step 3: exactly where to paste them in WSL with a safe beginner config.
done collecting.
The voice is jqcCZkN6Knx8BJ5TBdYR
Perfect. The voice ID is not secret, so jqcCZkN6Knx8BJ5TBdYR is fine to use later.
Do not paste your API keys here.
Step 3
Put your 2 API keys into the OpenClaw env file on WSL.
3.1 Open WSL terminal
3.2 Run this
mkdir -p ~/.openclaw
nano ~/.openclaw/.env
3.3 Paste this into the file
OPENAI_API_KEY=sk-proj-PASTE_YOUR_REAL_OPENAI_KEY_HERE
ELEVENLABS_API_KEY=sk_XXXX_PASTE_YOUR_REAL_ELEVENLABS_KEY_HERE
3.4 Save the file
In nano:
- Press Ctrl+O
- Press Enter
- Press Ctrl+X
3.5 Lock down the file
chmod 600 ~/.openclaw/.env
That’s it for this step.
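To double-check the file without ever printing the secrets, something like this sketch could work. check_env_file is a hypothetical helper, ENV_FILE just defaults to the path used above, and stat -c assumes GNU coreutils (true on Ubuntu under WSL).

```shell
# ENV_FILE defaults to the path used in this thread.
# check_env_file is a hypothetical helper, not an OpenClaw command.
# `stat -c` assumes GNU coreutils (Ubuntu/WSL).
ENV_FILE="${ENV_FILE:-$HOME/.openclaw/.env}"

check_env_file() {
  file="$1"
  if [ ! -f "$file" ]; then
    echo "not found: $file"
    return 1
  fi
  # File mode; expected to be 600 after the chmod above.
  echo "mode: $(stat -c '%a' "$file")"
  # Print only the variable names, never the values.
  sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/key: \1/p' "$file"
}

check_env_file "$ENV_FILE" || true
```

You should see mode: 600 plus one key: line for each of the two variables. The values themselves never show up, so the output is safe to paste.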
Docs: https://docs.openclaw.ai/help/faq#env-vars-and-env-loading
Reply done, and I’ll give you Step 4: the exact openclaw.json config for:
- Discord voice channel voice
- Telegram/Discord voice-note replies
- using your ElevenLabs voice ID jqcCZkN6Knx8BJ5TBdYR
I’m here 🦐
Step 4
Goal: write the basic voice config, without touching your API keys again.
Paste these in WSL, one by one:
openclaw config set messages.tts.auto "inbound"
openclaw config set messages.tts.provider "elevenlabs"
openclaw config set messages.tts.providers '{
"elevenlabs": {
"voiceId": "jqcCZkN6Knx8BJ5TBdYR",
"modelId": "eleven_multilingual_v2"
},
"openai": {
"model": "gpt-4o-mini-tts",
"voice": "alloy"
}
}' --strict-json --merge
openclaw config set tools.media.audio.enabled true --strict-json
openclaw config set tools.media.audio.models '[
{
"provider": "openai",
"model": "gpt-4o-transcribe"
}
]' --strict-json
openclaw config set channels.discord.commands.native true --strict-json
openclaw config set channels.discord.voice.enabled true --strict-json
openclaw config set channels.discord.voice.tts.provider "elevenlabs"
openclaw config set channels.discord.voice.tts.providers '{
"elevenlabs": {
"voiceId": "jqcCZkN6Knx8BJ5TBdYR",
"modelId": "eleven_multilingual_v2"
},
"openai": {
"model": "gpt-4o-mini-tts",
"voice": "alloy"
}
}' --strict-json --merge
openclaw config validate
What this does
- Telegram / voice-note replies: reply in audio only after you send audio
- Discord voice channel / call: enables /vc join
- Speech-to-text: uses OpenAI gpt-4o-transcribe
- Voice: uses your ElevenLabs voice jqcCZkN6Knx8BJ5TBdYR
Important note
This is the clean baseline.
- Telegram should do native voice-note style nicely.
- Discord voice channels should speak with this voice.
- Discord text-chat audio replies may first show up as an audio attachment instead of the fancy waveform-style voice message. We can handle that after the base test.
Reply with either:
- “validate passed”
- or paste the exact error output
Then I’ll give you Step 5.
done somehow, got some error too
No worries, you’re close.
Step 5 only: run this in WSL and paste the full output:
openclaw config validate
- If it says valid, reply: validate passed
- If it shows errors, paste the whole error block
- Don’t paste API keys, but this command normally won’t show them
Why this step: we need to separate config syntax problems from actual voice/runtime problems before testing.
openclaw config validate 🦞 OpenClaw 2026.4.22 (00bd2cf) — Your .zshrc wishes it could do what I do. Config valid: ~/.openclaw/openclaw.json
Nice, that’s the big hurdle. 🎉 Your config is valid.
Step 6
Restart the gateway so it picks up the new ~/.openclaw/.env API keys:
openclaw gateway restart
Then paste the output, or if it’s clean just reply:
restart done
Why this step: config hot-reloads, but new env keys usually need a gateway restart.
restart done
Nice. Step 7: test a one-off TTS reply in Discord.
In this thread, try:
/voice audio Hello from OpenClaw
If /voice doesn’t show up yet, use the text command instead:
/tts audio Hello from OpenClaw
Expected result
You should get back an audio reply.
Reply with just one of these:
- worked
- or paste the exact error
Talk voice is not configured.
Missing: talk.provider and talk.providers.<provider>.
Set it on the gateway, then retry.
That error is actually helpful, we hit the wrong voice feature. Let me verify the command mapping so I don’t send you in circles.
Ah, good catch, that means Discord routed you into Talk mode, not reply TTS.
So for now, ignore /voice.
Step 7, retry
Use this exact text command in chat:
/tts audio Hello from OpenClaw
Important
- Do not use /voice for this test
- /voice hit the Talk feature
- We’re testing message TTS replies first
Expected result
You should get back an audio reply using your configured voice.
Reply with one of:
- worked
- or paste the exact error
If this works, I’ll do the Discord call/voice channel step next.
Got it. Then the Discord command routing is probably the problem, not the TTS config itself.
I gave you the wrong test surface earlier, sorry.
Step 8
Run this in WSL, then paste the full output:
openclaw infer tts convert --text "hello from openclaw" --output /tmp/hello.mp3 --json
What this does
- tests TTS directly on the gateway
- bypasses Discord slash/text command weirdness
- tells us whether ElevenLabs/OpenAI + your config are actually working
Reply with:
- the full JSON output, or
- convert worked
openclaw infer tts convert --text "hello from openclaw" --output /tmp/hello.mp3 --json 11:05:16 [plugins] 1 plugin(s) failed to initialize (validation: xai). Run 'openclaw plugins list' for details. { "ok": true, "capability": "tts.convert", "transport": "local", "provider": "openai", "attempts": [ { "provider": "elevenlabs", "outcome": "failed", "reasonCode": "provider_error", "latencyMs": 2167, "error": "elevenlabs: ElevenLabs API error (401): This request exceeds your API key (Miko) quota of 10. You have 0 credits remaining, while 5 credits are required for this request. [code=quota_exceeded]" }, { "provider": "openai", "outcome": "success", "reasonCode": "success", "latencyMs": 4403 } ], "outputs": [ { "path": "/tmp/hello.mp3", "format": "mp3", "voiceCompatible": false
Nice, this tells us a lot.
What happened
- Your TTS setup works
- ElevenLabs failed because your account has no credits
- OpenClaw fell back to OpenAI successfully
So the problem is not config anymore. It’s just this:
ElevenLabs ... quota_exceeded ... 0 credits remaining
The xai warning is unrelated for this task, we can ignore it for now.
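If the JSON is too noisy to read, this sketch pulls out only the error text from each failed attempt. tts_errors is a hypothetical helper, and it does a plain-text match rather than real JSON parsing, so an error message containing escaped quotes would be cut short.

```shell
# tts_errors is a hypothetical helper: it reads the JSON printed by
# `openclaw infer tts convert ... --json` on stdin and prints the text
# of every "error" field. Plain-text match only, not a JSON parser.
tts_errors() {
  grep -o '"error": *"[^"]*"' | sed 's/^"error": *"//; s/"$//'
}

# Only runs where the openclaw CLI is installed (same command as Step 8).
if command -v openclaw >/dev/null 2>&1; then
  openclaw infer tts convert --text "hello from openclaw" \
    --output /tmp/hello.mp3 --json | tts_errors
fi
```

No output means no attempt reported an error.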
Next step, choose 1
I recommend Option A for now so you can keep moving.
Option A, recommended
Use OpenAI as primary for now, since it already works.
Option B
Add ElevenLabs credits, then keep ElevenLabs as primary.
Reply with:
- A = switch primary voice to OpenAI now
- B = keep ElevenLabs, I’ll give you the next test after you add credits
If this solved part of it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it.
I got balance. Total: 23,736 credits. Remaining: 23,642.
Perfect, then this is not an account-balance issue.
The error is saying your specific API key named “Miko” has its own quota cap of 10 credits, and that key is out, even though your account still has credits.
Step 9
In ElevenLabs → API Keys:
- find the key named Miko
- either raise/remove its credit limit
- or create a new API key just for OpenClaw with a bigger quota / no tiny cap
Do not paste the key here.
When that’s done, reply done, and I’ll give you the next single step to update WSL and retest.
Oh yeah i set 10 credit limit on the API key. I removed it now. What to do now?
Nice, that should do it.
Step 10
Run the same WSL test again:
openclaw infer tts convert --text "hello from openclaw" --output /tmp/hello.mp3 --json
What we want to see
Ideally this time the JSON shows:
"provider": "elevenlabs"
or at least an attempts entry where ElevenLabs is success.
Paste the output again, or just say:
- elevenlabs worked
- or paste the error if it still fails
No gateway restart should be needed here since you changed the quota on the provider side.
openclaw infer tts convert --text "hello from openclaw" --output /tmp/hello.mp3 --json 11:09:03 [plugins] 1 plugin(s) failed to initialize (validation: xai). Run 'openclaw plugins list' for details. { "ok": true, "capability": "tts.convert", "transport": "local", "provider": "elevenlabs", "attempts": [ { "provider": "elevenlabs", "outcome": "success", "reasonCode": "success", "latencyMs": 2321 } ], "outputs": [ { "path": "/tmp/hello.mp3", "format": "mp3_44100_128", "voiceCompatible": false } ] }
Nice, ElevenLabs is working end to end now. 🎉
Next, pick one:
1. Test voice replies
Send the bot a real voice message (not a slash command).
- Telegram: should reply as a native voice note
- Discord text chat: will likely reply as audio, but not always the fancy waveform voice-message UI
2. Test Discord voice channel / call
We’ll do the /vc join step next.
I recommend 1 first.
Reply with 1 or 2.
Telegram is working. No reply on Discord.
Also, the voice is very different from the one I selected.
Good, Telegram working means the core TTS pipeline is healthy.
The different voice part is very likely because Discord can use a separate override from normal reply TTS:
- normal reply TTS: messages.tts
- Discord voice playback: channels.discord.voice.tts
Step 11
Run this one command in WSL and paste the output:
printf 'messages.tts\n'; openclaw config get messages.tts --json; printf '\n\nchannels.discord.voice.tts\n'; openclaw config get channels.discord.voice.tts --json
Why this step:
- if those two blocks point at different providers or voice IDs, that explains the voice mismatch immediately
- then we can tackle the Discord no-reply issue separately
Alright, the issue -
Telegram sends a reply in voice, but the voice is different.
Discord doesn't send a reply.
Discord /vc works and she speaks in the voice I selected, but the reply is very delayed, and it seems she only replies when I address her by name, e.g. "Miko, can you hear me?" or "Miko, give me your details".
That’s actually a clear signal.
- Discord /vc only answering when you say “Miko” means mention-gating is still on
- Discord /vc delay is partly expected, because Discord voice currently uses short recorded segments + batch transcription, not full realtime streaming
- Telegram sounding different is probably a separate TTS config/prefs issue, we’ll fix that after Discord
Next step, Discord first
Run this in WSL and paste the output:
openclaw config get channels.discord.guilds --json
I want to see the current requireMention setting so I can give you the exact next command to make voice chat answer without needing “Miko …”.
openclaw config get channels.discord.guilds --json { "1495157727146545194": { "requireMention": false, "users": [ "1302991150763999302" ] }, "1396404029680128001": { "requireMention": true, "users": [ "1302991150763999302" ] } }
Found it.
This Discord server is 1456350064065904867, and it is not in your channels.discord.guilds config right now.
That’s why it still behaves like mention-gated here.
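You can confirm that yourself with a rough check like the one below. guild_listed is a hypothetical helper that only looks for the server ID as a key in the printed JSON; it does not parse the JSON properly.

```shell
# guild_listed is a hypothetical helper: it reads the JSON from
# `openclaw config get channels.discord.guilds --json` on stdin and
# succeeds only if the given server ID appears as an object key.
guild_listed() {
  grep -q "\"$1\" *:"
}

GUILD_ID="1456350064065904867"  # the server ID mentioned in this thread

# Only runs where the openclaw CLI is installed.
if command -v openclaw >/dev/null 2>&1; then
  if openclaw config get channels.discord.guilds --json | guild_listed "$GUILD_ID"; then
    echo "guild $GUILD_ID is listed"
  else
    echo "guild $GUILD_ID is NOT listed"
  fi
fi
```

User IDs inside the users arrays are not followed by a colon, so they do not trigger a false match.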
Next step
Run this in WSL:
openclaw config set channels.discord.guilds '{"1456350064065904867":{"requireMention":false,"users":["1302991150763999302"]}}' --strict-json --merge
No restart should be needed.
Then test Discord voice again without saying “Miko” and reply:
- done
- or paste the error
The delay part is still somewhat normal on Discord voice, by the way. OpenClaw uses short recorded segments there, not full realtime streaming.
done
Nice.
Next step
Reconnect the Discord voice session, then test without saying “Miko”.
- Run:
/vc leave
- Then:
/vc join
- Then say something simple without the bot name, like:
what’s the weather in Tokyo
Reply with:
- responded
- or still needs Miko
If it responds, we’ll fix the Telegram voice mismatch next.
She's not replying either way. Seems like high latency. We'll come back to this later. First, she's sending voice replies in Telegram but in a different voice. How do I set it to the same ElevenLabs voice?
Yep, let’s park Discord for a minute.
For Telegram, the reply voice comes from messages.tts, not from channels.discord.voice.tts.
If it sounds wrong, the two usual causes are:
- Telegram reply fell back to OpenAI
- Telegram reply used ElevenLabs, but not the exact voice/model you expect
Next step
After you trigger one Telegram voice reply, run this in WSL:
openclaw infer tts status --json
Then paste the output.
That will tell us whether Telegram’s last reply was:
- really ElevenLabs
- or a fallback
- and what provider is currently active
Don’t paste keys, this command shouldn’t show them.
Pasting in two separate parts because of the text limit.
openclaw infer tts status --json { "transport": "gateway", "enabled": true, "auto": "inbound", "provider": "elevenlabs", "fallbackProvider": "openai", "fallbackProviders": [ "openai", "microsoft" ], "prefsPath": "/home/wrakeeb/.openclaw/settings/tts.json", "providerStates": [ { "id": "elevenlabs", "label": "ElevenLabs", "configured": true }, { "id": "google", "label": "Google", "configured": false },
{ "id": "microsoft", "label": "Microsoft", "configured": true }, { "id": "minimax", "label": "MiniMax", "configured": false }, { "id": "openai", "label": "OpenAI", "configured": true }, { "id": "vydra", "label": "Vydra", "configured": false
Good, that snippet already helps.
It means Telegram is currently using ElevenLabs as the active TTS provider, not OpenAI.
So the mismatch is probably one of these:
- wrong voiceId
- different modelId
- ElevenLabs voice settings changing the sound
Next step
Run this in WSL and paste the output:
openclaw config get messages.tts.providers.elevenlabs --json
If you see an apiKey field for any reason, replace just that value with REDACTED before pasting.
That will tell us exactly which voiceId/modelId/settings Telegram is using.
That’s still the rest of the tts status output, not the ElevenLabs config yet.
Next step
Run this exact command in WSL and paste the output:
printf 'voiceId: '; openclaw config get messages.tts.providers.elevenlabs.voiceId; printf '\nmodelId: '; openclaw config get messages.tts.providers.elevenlabs.modelId; printf '\nvoiceSettings:\n'; openclaw config get messages.tts.providers.elevenlabs.voiceSettings --json
This is safer because it avoids dumping the API key.
I’m looking for:
- voiceId
- modelId
- voiceSettings
Then I can tell you exactly why Telegram sounds different.
Remember, /vc join on Discord uses the correct voice ID, so I don't want to mess with that.
Yep, agreed. We should not touch channels.discord.voice.tts right now.
/vc join uses the Discord voice override.
Telegram voice replies use messages.tts.
So this next step is read-only and only checks the Telegram side.
Safe next step
Run this in WSL:
printf 'messages.tts.voiceId: '; openclaw config get messages.tts.providers.elevenlabs.voiceId; printf '\nmessages.tts.modelId: '; openclaw config get messages.tts.providers.elevenlabs.modelId; printf '\nmessages.tts.voiceSettings:\n'; openclaw config get messages.tts.providers.elevenlabs.voiceSettings --json
This does not change anything.
It does not touch /vc join.
Paste the output, and then I’ll tell you the exact single change, if any, for Telegram only.
printf 'messages.tts.voiceId: '; openclaw config get messages.tts.providers.elevenlabs.voiceId; printf '\nmessages.tts.modelId: '; openclaw config get messages.tts.providers.elevenlabs.modelId; printf '\nmessages.tts.voiceSettings:\n'; openclaw config get messages.tts.providers.elevenlabs.voiceSettings --json messages.tts.voiceId: 🦞 OpenClaw 2026.4.22 (00bd2cf) — Welcome to the command line: where dreams compile and confidence segfaults. Config path not found: messages.tts.providers.elevenlabs.voiceId messages.tts.modelId: 🦞 OpenClaw 2026.4.22 (00bd2cf) — Automation with claws: minimal fuss, maximal pinch. Config path not found: messages.tts.providers.elevenlabs.modelId messages.tts.voiceSettings: Config path not found: messages.tts.providers.elevenlabs.voiceSettings
Yep, and I’m not touching /vc here.
I gave you the wrong path shape earlier, sorry. Your config is probably stored in a different TTS shape than I assumed.
Safe next step, read-only
Run this in WSL:
printf 'TELEGRAM / messages.tts\n'; openclaw config get messages.tts --json; printf '\n\nDISCORD VC / channels.discord.voice.tts\n'; openclaw config get channels.discord.voice.tts --json
If any apiKey shows up, replace just that value with REDACTED before pasting.
Why this step:
- messages.tts = Telegram voice replies
- channels.discord.voice.tts = /vc join voice
- then I can tell you the exact single Telegram-only change to match the Discord voice, without messing up /vc
printf 'TELEGRAM / messages.tts\n'; openclaw config get messages.tts --json; printf '\n\nDISCORD VC / channels.discord.voice.tts\n'; openclaw config get channels.discord.voice.tts --json TELEGRAM / messages.tts { "auto": "inbound", "provider": "elevenlabs" } DISCORD VC / channels.discord.voice.tts { "provider": "elevenlabs", "providers": { "elevenlabs": { "apiKey": { "source": "__OPENCLAW_REDACTED__", "provider": "__OPENCLAW_REDACTED__", "id": "__OPENCLAW_REDACTED__" }, "voiceId": "ocZQ262SsZb9RIxcQBOj" }, "openai": { "voice": "alloy" } } }
Found it.
Why Telegram sounds different:
messages.tts has no ElevenLabs voiceId set, while Discord /vc does:
- Telegram replies: messages.tts → only has provider: "elevenlabs"
- Discord /vc: channels.discord.voice.tts.providers.elevenlabs.voiceId = "ocZQ262SsZb9RIxcQBOj"
So Telegram is using ElevenLabs, but not pinned to the same voice.
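For future checks, a small sketch like this compares the two voice IDs directly. voice_id_of is a hypothetical helper that grabs the first voiceId it finds in the printed JSON by text matching; it never prints the apiKey block.

```shell
# voice_id_of is a hypothetical helper: it reads config JSON on stdin
# and prints the first "voiceId" value found (empty if there is none).
voice_id_of() {
  grep -o '"voiceId": *"[^"]*"' | head -n 1 | sed 's/^"voiceId": *"//; s/"$//'
}

# Only runs where the openclaw CLI is installed; both commands are read-only.
if command -v openclaw >/dev/null 2>&1; then
  reply_voice=$(openclaw config get messages.tts --json | voice_id_of)
  vc_voice=$(openclaw config get channels.discord.voice.tts --json | voice_id_of)
  if [ "$reply_voice" = "$vc_voice" ]; then
    echo "same voice: $vc_voice"
  else
    echo "mismatch: reply='$reply_voice' vc='$vc_voice'"
  fi
fi
```

With the output above, this would report a mismatch with an empty reply voice, which is exactly the gap.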
Next step
Run this in WSL:
openclaw config set messages.tts.providers '{"elevenlabs":{"voiceId":"ocZQ262SsZb9RIxcQBOj"}}' --strict-json --merge
This is Telegram/reply-TTS only. It does not change /vc.
Then send a Telegram voice message again and reply with:
- same now
- or still different
openclaw config set messages.tts.providers '{"elevenlabs":{"voiceId":"ocZQ262SsZb9RIxcQBOj"}}' --strict-json --merge 🦞 OpenClaw 2026.4.22 (00bd2cf) — I speak fluent bash, mild sarcasm, and aggressive tab-completion energy. Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 d0e89381e3361833dabdc6aaa61f76e6aa8fd8d85ebc83b7b469b2623926008f -> fee0425810310a9edd5ed50cd7542c40365811088f68a7aecaf86b9ed39ffdd8, backup=/home/wrakeeb/.openclaw/openclaw.json.bak) Updated messages.tts.providers. Restart the gateway to apply.
Perfect. That changed the Telegram reply voice config only, not /vc.
Next step
Run this in WSL:
openclaw gateway restart
Then send one Telegram voice message to the bot again and tell me:
- same now
- or still different
Yeap it's working. Let's change the voice ID.
New ID: jqcCZkN6Knx8BJ5TBdYR
Nice. Let’s do this without touching /vc yet.
Step 1
Change the Telegram / reply TTS voice only:
openclaw config set messages.tts.providers '{"elevenlabs":{"voiceId":"jqcCZkN6Knx8BJ5TBdYR"}}' --strict-json --merge
This updates:
- Telegram voice replies
- other normal reply-TTS paths
It does not change:
- Discord /vc join voice
Run that and reply done. Then I’ll give you the next single step.
Works!. Now let's set this on Discord too
Nice. Now let’s change the Discord /vc voice only.
Step 1
Run this in WSL:
openclaw config set channels.discord.voice.tts.providers.elevenlabs.voiceId "jqcCZkN6Knx8BJ5TBdYR"
This changes:
- Discord voice channel / /vc join voice
It does not change:
- Telegram reply voice
- your API key setup
Reply done after that, and I’ll give you the next step to make the live voice session use it.
Alright, the voice is set. We got 2 issues right now.
- No voice reply in Discord DM. Telegram works.
- The latency is very high on Discord /voice. I followed the log and she generated the reply, but she only speaks it after a long time, and only for 1/2 sec.
Yep, I think these are two different buckets:
- Discord DM “voice reply”
If you mean the same native voice-note bubble as Telegram, that’s the catch. Current TTS docs only promise native voice-message delivery for Telegram / Feishu / Matrix / WhatsApp. Discord voice messages are a separate path, and Discord also does not allow text + voice message in the same payload. So Telegram working and Discord DM not matching it is, unfortunately, expected on 2026.4.22.
- Discord /vc latency
This part is real. Docs say Discord voice currently uses short recorded segments + batch transcription, not full realtime streaming, so it will lag more.
Next step, low-risk speedup
Run this in WSL:
openclaw config set tools.media.audio.models '[{"provider":"openai","model":"gpt-4o-mini-transcribe"}]' --strict-json
This should make Discord voice input faster. It does not change your ElevenLabs voice.
After that:
- /vc leave
- /vc join
Then test again and tell me:
- latency better
- same latency
- still only 1/2 sec
openclaw config set tools.media.audio.models '[{"provider":"openai","model":"gpt-4o-mini-transcribe"}]' --strict-json 🦞 OpenClaw 2026.4.22 (00bd2cf) — If you're lost, run doctor; if you're brave, run prod; if you're wise, run tests. Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 6ac66f1ad5906cfa7338add7d83f257b7c2cf1c03b34ed04f2767fd1d07fe6eb -> 12ed5f2d63f3d7d541351c135a81fc2b371b0462005ec48f12a5ad9cadbd89be, backup=/home/wrakeeb/.openclaw/openclaw.json.bak) Updated tools.media.audio.models. Restart the gateway to apply.
So, no way to do voice reply in Discord DM?
Short answer: not the same way Telegram does it, no.
On 2026.4.22:
- Telegram: built-in auto-TTS can come back as a native voice note
- Discord DM: OpenClaw can send Discord voice messages, but that is a separate path
- The built-in auto reply TTS is not wired like Telegram voice-note replies for Discord
Important Discord constraint from docs:
- a Discord voice message needs local audio file + asVoice=true
- and Discord does not allow text + voice message in the same payload
So:
- native Discord voice-note style auto replies: not out of the box
- regular audio attachment replies: yes
- explicit Discord voice-message send path: yes, but separate/custom
Next step
Your faster STT change has not applied yet. Do this now:
openclaw gateway restart
Then:
- /vc leave
- /vc join
- test again
Reply with:
- latency better
- same
- or still 1/2 sec
Yeah, then I think we’ve hit the current Discord voice-channel limit, not a bad STT model choice.
On 2026.4.22, Discord /vc is still short recorded segments + batch transcription, not true realtime duplex voice, so there is no documented low-latency toggle I can point you to here.
The “only 1/2 sec plays” part is not normal though, so that’s the part worth debugging.
Next step
In WSL, run:
openclaw logs --follow
Then do one short /vc test in Discord, wait for the bad reply, stop the log with Ctrl+C, and paste the relevant lines here.
I’m looking for anything mentioning:
- discord voice
- tts
Yep, then the honest answer is:
Discord /vc is currently not low-latency on 2026.4.22.
The docs explicitly say Discord voice records short segments and uses batch transcription, not realtime streaming, so there is no supported config knob to make it feel fast like a live call.
So there are really 2 separate things now:
- High latency: expected limitation today
- Only 1/2 sec of audio plays: that part is not expected
Next step
If you want to debug the cut-off playback part, run this in WSL:
openclaw logs --follow
Then do one /vc test, stop the log with Ctrl+C, and paste the lines around:
- discord voice
- tts
- ffmpeg
- opus
- DAVE
If your main goal is just low-latency Discord voice, I’d call that a current product gap and ask you to file it here: https://github.com/openclaw/openclaw/issues
6:24:38+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [[tts:I’m Miko — calm, sharp, playful, and a little nerdy.]] 06:25:03+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} Yep — sounds like the voice path is glitching and dropping the transcript text. 06:25:23+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} Sort crop? What do you mean? 06:25:30+00:00 warn media-understanding {"subsystem":"media-understanding"} audio: failed (0/1) reason=Audio transcription response missing text 06:25:30+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: processing failed: Audio transcription response missing text 06:25:49+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [[tts:Yeah, I can hear you 😌]] 06:26:01+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ node.list 2509ms conn=6158e9d7…a1d5 id=9ae9edb5…383e 06:26:09+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [[tts:Yeah, I can hear you 😌]] 06:26:26+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [[tts:Yeah, I can hear you 😌]] 06:26:29+00:00 warn media-understanding {"subsystem":"media-understanding"} audio: failed (0/1) reason=Audio transcription response missing text 06:26:29+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: processing failed: Audio transcription response missing text
I see the actual failure now. Let me verify whether that gpt-4o-mini-transcribe change is the culprit before I send you the next step.
Thanks, this log is actually useful.
The failure is not ElevenLabs. It is this line:
discord voice: processing failed: Audio transcription response missing text
So the break is happening on the Discord voice STT step, before TTS playback.
I also do not see DAVE, opus, or ffmpeg errors here, which is good.
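If you save the log to a file, a sketch like this counts how often the STT step came back empty. count_stt_failures and the LOG_FILE path are both made up for this example, and it assumes the saved log has one entry per line.

```shell
# count_stt_failures is a hypothetical helper: it reads log text on stdin
# and prints how many lines carry the discord/voice empty-transcript failure.
count_stt_failures() {
  grep -c 'discord voice: processing failed: Audio transcription response missing text' || true
}

# LOG_FILE is a hypothetical path where you saved the log output.
LOG_FILE="${LOG_FILE:-/tmp/openclaw-vc.log}"
if [ -f "$LOG_FILE" ]; then
  echo "empty-transcript failures: $(count_stt_failures < "$LOG_FILE")"
fi
```

It matches only the discord/voice line, so the paired media-understanding warning for the same failure is not counted twice.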
Next step
Let’s undo my last “speed” change and go back to the more reliable STT model:
openclaw config set tools.media.audio.models '[{"provider":"openai","model":"gpt-4o-transcribe"}]' --strict-json
Reply done after that.
This probably will not make Discord low-latency, because /vc is still batch-based today, but it should help with the chopped / failed replies.
Yep, same failure.
Those logs still show the STT side returning empty text:
Audio transcription response missing text
So the next step is still to revert Discord audio transcription back to the more reliable model.
Next step
Run this in WSL:
openclaw config set tools.media.audio.models '[{"provider":"openai","model":"gpt-4o-transcribe"}]' --strict-json
Then reply done.
Why: docs for audio say gpt-4o-mini-transcribe is the default fast option, but gpt-4o-transcribe is the higher accuracy option, and right now we need reliability more than speed.
openclaw config set tools.media.audio.models '[{"provider":"openai","model":"gpt-4o-transcribe"}]' --strict-json 🦞 OpenClaw 2026.4.22 (00bd2cf) — Ah, the fruit tree company! 🍎 Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 ade1ec39a0eddbb92408af1044a61312357a91eb3b271600e22fecede44c88f2 -> 4d7616bef7bbaaadf35c7eeb3989b241be4a9a91cdaac8ff856b5fe996a69fa3, backup=/home/wrakeeb/.openclaw/openclaw.json.bak) Updated tools.media.audio.models. Restart the gateway to apply.
Good, that change is queued but not live yet.
Next step
Run this in WSL:
openclaw gateway restart
Then reply done.
After that, we’ll do one fresh /vc test and see if the cut-off / empty-transcript issue is gone.
Docs: https://docs.openclaw.ai/help/faq
done
Nice. That restart applied the more reliable STT model.
Next step
Do one clean /vc test:
- /vc leave
- /vc join
- say one short sentence, like: hello can you hear me
Then reply with just one of these:
- full reply
- delayed but full reply
- delayed and cut off
- no reply
If it’s still bad, we’ll debug the next layer.
No reply
Log tail truncated (increase --max-bytes). 06:45:49+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [[tts:Yes, I can hear you.]] 06:46:08+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [[tts:Yes, I can hear you.]] 06:46:27+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [[tts:Yes, I can hear you, Ra Kib.]] 06:46:49+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [[tts:One two three.]] 06:47:18+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [[tts:Voice.]] 06:47:34+00:00 warn plugins {"subsystem":"plugins"} [plugins] 1 plugin(s) failed to initialize (validation: xai). Run 'openclaw plugins list' for details.
Good, this narrows it down a lot.
Those [[tts:...]] lines mean OpenClaw did:
- hear/transcribe you
- generate a reply
- format it for speech
So the problem is now the Discord voice playback/session layer, not the reply model.
Also, that xai plugin warning is unrelated here.
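If you want to pull just the spoken replies out of a saved log yourself, a rough sketch like this should do it. The helper name `tts_replies` is made up for illustration, and it assumes replies are logged in the same `[[tts:...]]` shape as the lines you pasted:

```shell
# Hypothetical helper: print only the spoken-reply text from log text on stdin.
# Assumes replies appear as [[tts:...]], as in the log pasted in this thread.
tts_replies() {
  grep -o '\[\[tts:[^]]*]]' | sed -e 's/^\[\[tts://' -e 's/]]$//'
}
```

Pipe any saved log text into it, for example `tts_replies < saved-log.txt` (the file name is just a placeholder). If it prints replies, the transcribe and reply stages ran for those turns.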
Next step
In Discord, run:
/vc status
Then paste the result here.
That should tell us whether the bot still thinks it is connected and speaking in the current voice session.
Docs: https://docs.openclaw.ai/channels/discord#voice-channels
GPT gave me this:
Do this in Ubuntu/WSL terminal.
- Backup first
mkdir -p ~/Backups
openclaw backup create --output ~/Backups/openclaw-before-voice-fix.tar.gz --verify
- Update + repair
openclaw update
openclaw doctor --repair --yes
openclaw config validate
Your setup uses normal openclaw commands on WSL2, not old Docker oc commands.
- Check voice/TTS status in Discord
In Discord, type:
/voice status
OpenClaw uses /voice on Discord because Discord already owns /tts.
- Change STT away from GPT-4o
Use ElevenLabs streaming STT instead. OpenClaw docs say Voice Call streaming STT uses:
"streaming.provider": "elevenlabs"
with scribe_v2_realtime.
Open config:
nano ~/.openclaw/openclaw.json
Find the voice/call/streaming section and set it like this idea:
"streaming": {
  "provider": "elevenlabs"
}
Save: CTRL + O → Enter → CTRL + X
- Make ElevenLabs TTS faster
In the same config, make sure ElevenLabs is set under messages.tts:
"messages": {
  "tts": {
    "auto": "always",
    "provider": "elevenlabs",
    "providers": {
      "elevenlabs": {
        "apiKey": "${ELEVENLABS_API_KEY}",
        "voiceId": "YOUR_VOICE_ID",
        "modelId": "eleven_multilingual_v2"
      }
    }
  }
}
Docs confirm ElevenLabs TTS lives under messages.tts, and the default model is eleven_multilingual_v2.
- Add ElevenLabs key to WSL
nano ~/.bashrc
Add at bottom:
export ELEVENLABS_API_KEY="your_key_here"
Then:
source ~/.bashrc
- Force short voice replies
Use a short voice prompt/rule. In your bot/persona/system instruction, add:
For Discord voice calls, reply in 1 short sentence unless the user asks for details.
This fixes the 1–2 sec cutoff problem because shorter audio is easier to stream.
- Validate + restart
openclaw config validate
openclaw gateway restart
Your saved command list confirms openclaw config validate, openclaw doctor, and openclaw gateway restart are the right commands.
- Test
In Discord VC say:
Say only: testing voice
Expected result: reply should come within seconds, not 1–2 minutes.
That GPT advice is mixing a few different OpenClaw features together. I’m going to verify the exact docs first so I don’t send you down the wrong path again.
A bunch of that GPT advice is mixing the wrong OpenClaw features together. I would not follow it as-is.
What’s wrong in that list
- /voice status: wrong command for Discord voice channels. /voice (native /talkvoice) is for Talk voice config, not Discord VC control.
- plugins.entries.voice-call.config.streaming...: wrong feature. That is for the separate @openclaw/voice-call telephony plugin, not Discord /vc.
- Blind doctor --repair --yes: too broad for this issue.
- Putting ELEVENLABS_API_KEY in ~/.bashrc: unnecessary if your current TTS is already working. Service env on WSL can be different anyway.
- messages.tts.auto: "always": not the fix for Discord /vc latency/playback.
What was okay
- openclaw backup create is real
- openclaw config validate is real
- openclaw gateway restart is real
The correct mental model
You have 3 separate surfaces here:
- Discord voice channels → /vc join|leave|status + channels.discord.voice
- Talk voice config → /talkvoice (not your issue here)
- Voice Call plugin / telephony → plugins.entries.voice-call... (also not your issue here)
Right now your logs show OpenClaw is generating TTS text fine, but Discord VC is not playing it back reliably.
Next step
In Discord, run:
/vc status
Then paste the result here.
GPT reply Part 1:
Krill is right here. My previous answer mixed Discord VC, Talk Voice, and telephony voice-call plugin settings, so don’t apply those config changes blindly. I’ll reset this into the correct Discord /vc troubleshooting path.
The main correction: for Discord VC, use channels.discord.voice and /vc, not plugins.entries.voice-call. For transcript/STT, OpenClaw docs show Discord voice-channel segments use tools.media.audio, so that is the safe place to change your “GPT-4o transcript” setup.
You should follow Krill’s correction, not my previous mixed setup. For Discord VC, the correct surface is:
Discord VC = /vc + channels.discord.voice + tools.media.audio for transcription
OpenClaw docs say Discord voice channels use /vc join|leave|status, require native commands, channels.discord.voice, and Connect + Speak permission. They also say voice.tts overrides messages.tts only for voice playback.
Step-by-step full fix
1. Backup first
Run this in Ubuntu/WSL:
mkdir -p ~/Backups
openclaw backup create --output ~/Backups/openclaw-before-discord-vc-fix-$(date +%F_%H-%M-%S).tar.gz --verify
Your current setup uses normal openclaw commands on WSL2, not the old Docker oc shortcut.
2. Check if OpenClaw config is healthy
openclaw --version
openclaw config validate
openclaw gateway status
openclaw status --deep
Your saved command list confirms openclaw config validate, openclaw gateway status, and openclaw status --deep are valid checks.
3. Check Discord VC status correctly
In Discord, run:
/vc status
Do not use /voice status for this issue. /vc is the Discord voice-channel command.
If /vc does not show up, fix native commands first.
4. Make sure native Discord commands are enabled
Run:
openclaw config get commands.native
openclaw config get channels.discord.commands.native
If one of them is disabled, enable Discord native commands:
openclaw config set channels.discord.commands.native true --strict-json
openclaw config validate
openclaw gateway restart
Docs say Discord VC requires native commands through commands.native or channels.discord.commands.native.
5. Make sure Discord voice is enabled
Run:
openclaw config get channels.discord.voice --json
Then set the safe basics:
openclaw config set channels.discord.voice.enabled true --strict-json
openclaw config validate
openclaw gateway restart
Voice is enabled by default, but setting it explicitly is fine if your config was changed before.
6. Fix the transcript/STT side
Since you said Transcript is 4o, switch Discord voice transcription away from OpenAI GPT-4o transcribe and test ElevenLabs Scribe instead.
First dry-run:
openclaw config set tools.media.audio.enabled true --strict-json --dry-run
openclaw config set tools.media.audio.models '[{"provider":"elevenlabs","model":"scribe_v2"}]' --strict-json --dry-run
If both pass, apply:
openclaw config set tools.media.audio.enabled true --strict-json
openclaw config set tools.media.audio.models '[{"provider":"elevenlabs","model":"scribe_v2"}]' --strict-json
openclaw config validate
openclaw gateway restart
Why: OpenClaw’s OpenAI provider docs say gpt-4o-transcribe is used for inbound audio transcription, including Discord voice-channel segments. ElevenLabs docs show Scribe v2 under tools.media.audio for inbound audio/voice transcription.
Part 2
7. Fix VC playback/TTS separately
Check current VC-only TTS config:
openclaw config get channels.discord.voice.tts --json
For testing, use OpenAI TTS only for Discord VC playback first:
openclaw config set channels.discord.voice.tts '{"provider":"openai","openai":{"voice":"alloy"}}' --strict-json
openclaw config validate
openclaw gateway restart
Now join VC and test:
/vc leave
/vc join
Say:
Say only: test one two three
If OpenAI TTS plays fully but ElevenLabs cuts off, the problem is your ElevenLabs VC playback config / latency / voice model. If OpenAI also cuts off, the problem is more likely Discord VC playback, permissions, decrypt/audio receive issue, or OpenClaw runtime.
8. If OpenAI TTS works, switch VC back to ElevenLabs carefully
Use your actual ElevenLabs voice ID:
openclaw config set channels.discord.voice.tts '{"provider":"elevenlabs","elevenlabs":{"voiceId":"YOUR_VOICE_ID","modelId":"eleven_multilingual_v2"}}' --strict-json
openclaw config validate
openclaw gateway restart
OpenClaw docs confirm ElevenLabs TTS uses voiceId and modelId, and TTS config normally lives under messages.tts; Discord VC can override playback through channels.discord.voice.tts.
9. Check logs while testing
Open a second Ubuntu/WSL terminal and run:
openclaw logs --follow --local-time
Then join VC and talk. Watch for these words:
discord
vc
voice
tts
transcribe
audio
elevenlabs
openai
decrypt
DecryptionFailed
error
failed
OpenClaw docs say live log tailing is done with openclaw logs --follow, and logs can reveal Discord/voice/TTS failures.
10. If logs show decrypt errors
If you see this repeatedly:
DecryptionFailed(UnencryptedWhenPassthroughDisabled)
Try:
openclaw config set channels.discord.voice.daveEncryption true --strict-json
openclaw config set channels.discord.voice.decryptionFailureTolerance 24 --strict-json
openclaw config validate
openclaw gateway restart
Then:
/vc leave
/vc join
Docs mention Discord VC decrypt failures and say OpenClaw may auto-recover by leaving/rejoining after repeated failures; that exact DecryptionFailed(...) message may be an upstream @discordjs/voice receive bug.
Best fix order for your exact issue
Do it in this order:
/vc status
openclaw logs --follow --local-time
Change STT from OpenAI gpt-4o-transcribe to ElevenLabs scribe_v2
Test VC playback with OpenAI TTS
If OpenAI TTS works, switch VC playback back to ElevenLabs
If both TTS providers cut off, inspect decrypt/audio logs
Most likely cause in your case: Discord VC is receiving/transcribing slowly through GPT-4o transcript + ElevenLabs playback is either slow or being cut by VC playback, so we isolate STT and TTS separately instead of changing random voice-call plugin settings.
That revised GPT answer is much closer, but I would not do step 6 yet.
Why:
- we already know /vc exists
- we already know Discord voice is enabled
- we already know OpenClaw is generating TTS text
- so I do not want to churn STT providers again before checking the actual VC session state
So the next step is still the same, and it is read-only:
/vc status
Paste the exact result here.
That will tell us whether the bot thinks it is:
- connected
- in the right voice channel
- currently active/stuck
Docs: https://docs.openclaw.ai/channels/discord#voice-channels
/vc status
Unable to resolve guild for this command.
Part 2 is better, but I still would not apply it yet.
Two corrections:
- Do not switch STT providers again yet
  - we already proved OpenClaw is generating [[tts:...]] replies
  - so first we need to inspect the Discord VC session state
- That TTS JSON shape is stale
  - for channels.discord.voice.tts, current config uses providers
  - so this kind of shape is the modern one:
{
  "provider": "openai",
  "providers": {
    "openai": { "voice": "alloy" }
  }
}
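If typing that nested shape by hand feels error-prone, here is a small sketch that builds it for you. `vc_tts_json` is a made-up helper name, and it only assumes the provider / providers / voice layout described above; it does not check that the provider or voice actually exists:

```shell
# Hypothetical helper: print the nested TTS override shape described above.
# $1 = provider name, $2 = voice name. No validation of either value.
vc_tts_json() {
  printf '{"provider":"%s","providers":{"%s":{"voice":"%s"}}}\n' "$1" "$1" "$2"
}

vc_tts_json openai alloy
```

That prints a single-line JSON string you can eyeball before it goes anywhere near your config. Don't apply it yet; this is only to show the shape.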
So yes, GPT Part 2 is closer, but the next step is still the same.
Next step
In Discord, run:
/vc status
Then paste the exact output here.
Docs: https://docs.openclaw.ai/channels/discord#voice-channels
Ah, that’s a much better clue. I’m checking where that exact /vc status error comes from so I can tell you the right next step instead of guessing.
Perfect, that explains a lot.
That error means /vc is being invoked without guild context.
In practice, that usually means:
- you ran it in a DM, or
- Discord treated the current surface like it wasn’t a normal guild channel for this command
/vc is guild-only because voice channels live inside a server.
Next step
Run /vc status again, but do it in a regular text channel in the same Discord server as the voice channel, not in DM.
If you’re currently in a thread/forum post, use the parent channel or another normal server text channel just for this test.
Then paste the result.
/vc status
• The Finals (guild 1396404029680128001)
That’s useful. I want to confirm exactly what /vc status is showing before I tell you to bounce the session.
Nice, that tells us the bot does think it still has an active voice session in that server.
So this is no longer a guild-resolution problem. It looks more like a stale / stuck VC session.
Next step
In that same regular server text channel, run:
/vc leave
Then reply with just:
- left
- or the exact error if it gives one
Docs: https://docs.openclaw.ai/channels/discord#voice-channels
Good.
Next step
Now run:
/vc join
Wait for it to reconnect, then say one short sentence only in voice, like:
hello can you hear me
Then reply with just one:
- full reply
- delayed but full
- cut off
- no reply
no reply. waited 3 mins
Okay, that means the fresh rejoin didn’t fix it.
Next step
In WSL, run this:
openclaw logs --follow --local-time
Then do one short /vc test, wait about 20 seconds, press Ctrl+C, and paste the new lines from that test only.
I’m looking for anything with:
- discord/voice
- media-understanding
- tts
- DecryptionFailed
- failed
The last log you pasted was truncated and older, so I want the exact lines from this fresh no-reply attempt.
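If the live log scrolls too fast to pick those lines out by eye, you could filter it. This is only a sketch: `vc_filter` is a made-up name, the keywords are just the ones listed above, and matching is case-sensitive:

```shell
# Hypothetical filter: keep only log lines that mention the voice pipeline.
# Keywords are the ones listed above; anything else is dropped.
vc_filter() {
  grep -E 'discord/voice|media-understanding|tts|DecryptionFailed|failed'
}

# Usage, with the same logs command as above:
#   openclaw logs --follow --local-time | vc_filter
```

A filter like this can hide lines nobody thought to list, so if it shows nothing useful, paste the unfiltered lines from the test instead.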
14:05:28+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ config.set 234ms conn=22ae15c0…b8f9 id=5e3dce58…fc97
14:05:28+06:00 info gateway/reload {"subsystem":"gateway/reload"} config change detected; evaluating reload (agents.list, meta.lastTouchedAt)
14:05:28+06:00 info gateway/reload {"subsystem":"gateway/reload"} config hot reload applied (agents.list)
14:05:28+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ config.get 101ms conn=22ae15c0…b8f9 id=f24d01d6…4caa
14:05:31+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ commands.list 210ms conn=22ae15c0…b8f9 id=559b2255…90d5
14:05:31+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ chat.history 236ms conn=22ae15c0…b8f9 id=29b63f21…6735
14:05:31+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ models.list 217ms conn=22ae15c0…b8f9 id=5018f3dc…922c
14:05:54+06:00 info gateway/ws {"subsystem":"gateway/ws"} webchat disconnected code=1001 reason=n/a conn=22ae15c0-fb01-4e39-a737-372ce63db8f9
14:05:55+06:00 info gateway/ws {"subsystem":"gateway/ws"} webchat connected conn=a14db923-7afb-43e7-8f9d-98e9cae9f703 remote=127.0.0.1 client=openclaw-control-ui webchat v2026.4.22
14:05:56+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ commands.list 216ms conn=a14db923…f703 id=4cf6b1ba…df76
14:05:56+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ chat.history 249ms conn=a14db923…f703 id=eb4824a6…5c18
14:05:56+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ models.list 222ms conn=a14db923…f703 id=e1455955…38d0
14:05:56+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ node.list 261ms conn=a14db923…f703 id=7df83590…c5d6
14:05:56+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ device.pair.list 261ms conn=a14db923…f703 id=0118b656…549a
14:06:12+06:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [[tts:Yes, I can hear you 😌]]
14:06:21+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ node.list 1008ms conn=a14db923…f703 id=413f4080…81d0
14:06:25+06:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [[tts:Yeah, I'm here 😌]]
14:06:33+06:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ node.list 495ms conn=a14db923…f703 id=8a935376…f157
14:06:39+06:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} No reply from agent.
14:06:56+06:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} No reply from agent.
I see the shape now. I’m checking what OpenClaw means by No reply from agent. here, because that’s the first log line that really explains the silence.
I’m checking one more thing in the OpenClaw source: whether No reply from agent. in voice mode means “the model stayed silent” or “it generated something that Discord VC then dropped.” That changes the next fix a lot.
I just noticed one big clue: the active voice session is in a different guild than this support thread. I’m verifying the Discord guild policy/requireMention behavior before I tell you to change it.
That log changed the diagnosis.
No reply from agent. means this is not just audio playback anymore. OpenClaw had an active VC session, but for those turns it decided to send no outbound payload.
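To see how many turns were spoken versus silent in a saved log, a quick tally like this can help. It is a sketch only: `vc_turns` is a made-up name, and it assumes one log entry per line with the same `[[tts:...]]` and `No reply from agent.` wording as your paste:

```shell
# Hypothetical tally: count spoken replies vs. silent turns in log text on stdin.
# Assumes one log entry per line, worded like the log pasted in this thread.
vc_turns() {
  awk '
    /\[\[tts:/              { spoken++ }
    /No reply from agent\./ { silent++ }
    END { printf "spoken=%d silent=%d\n", spoken, silent }
  '
}
```

A non-zero silent count means those turns never produced anything to play, so the playback layer is not the place to look for them.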
And the biggest clue is this:
- support thread guild: 1456350064065904867
- active voice guild from /vc status: 1396404029680128001 (The Finals)
So the earlier requireMention: false we set for the support server does not help the actual voice server.
Docs say Discord voice sessions follow the same guild policy / mention rules as normal Discord commands:
https://docs.openclaw.ai/channels/discord#voice-channels
Next step
Run this in WSL:
printf 'groupPolicy: '; openclaw config get channels.discord.groupPolicy; printf '\nThe Finals guild entry:\n'; openclaw config get channels.discord.guilds.1396404029680128001 --json
If the second part says Config path not found, paste that too. That result is useful.
printf 'groupPolicy: '; openclaw config get channels.discord.groupPolicy; printf '\nThe Finals guild entry:\n'; openclaw config get channels.discord.guilds.1396404029680128001 --json
groupPolicy: 🦞 OpenClaw 2026.4.22 (00bd2cf) — Finally, a use for that always-on Mac Mini under your desk.
allowlist
The Finals guild entry:
{ "requireMention": true, "users": [ "1302991150763999302" ] }
Found it.
Your actual voice guild is set to:
- groupPolicy: allowlist
- requireMention: true
So in The Finals, the bot is allowed to see you, but it still waits for a mention before replying. In voice mode, that turns into the No reply from agent. behavior you saw.
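If you ever want to re-check a guild entry without reading the JSON by eye, something like this works as a quick text check. It is a sketch: `mention_rule` is a made-up name, and it uses plain grep rather than a real JSON parser, so it only looks for the literal `"requireMention": true` pair:

```shell
# Hypothetical check: read a guild entry (JSON text) on stdin and report
# whether it contains "requireMention": true. Text match only, not a parser.
mention_rule() {
  if grep -Eq '"requireMention"[[:space:]]*:[[:space:]]*true'; then
    echo "requireMention: true"
  else
    echo "requireMention: not true"
  fi
}

# Usage, with the same config command as above:
#   openclaw config get channels.discord.guilds.1396404029680128001 --json | mention_rule
```

Note that "not true" covers both an explicit false and a missing key, so it does not tell you what the default is.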
Next step
Run this in WSL:
openclaw config set channels.discord.guilds.1396404029680128001.requireMention false --strict-json
Then reply done.
openclaw config set channels.discord.guilds.1396404029680128001.requireMention false --strict-json
🦞 OpenClaw 2026.4.22 (00bd2cf) — I don't sleep, I just enter low-power mode and dream of clean diffs.
Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 aad42b66c73b7c2bc4b6d4b613d7f1ad7095256980aa8cbe6f7078e70c631f5b -> 1ee11c4d5b04d43fbba414f6515cd7e3b815a0266e3697702c3ae457bb97b4fc, backup=/home/wrakeeb/.openclaw/openclaw.json.bak)
Updated channels.discord.guilds.1396404029680128001.requireMention. Restart the gateway to apply.
Perfect.
Next step
Run this in WSL:
openclaw gateway restart
Then reply done.
After that we’ll do one fresh /vc test in The Finals and see if the silence is gone.