#Voice access/chat
Yep.
There are 3 different “voice” things in OpenClaw, and Discord VC is one of them:
- Discord voice channels: yes, OpenClaw can join a VC for live conversation via /vc join|leave|status
- Voice messages / spoken replies: also yes, via TTS
- Phone calls: separate feature, via the Voice Call plugin, not Discord VC
For Discord VC, you need:
- native commands enabled
- channels.discord.voice configured
- bot permissions for Connect + Speak
Docs:
If you want, send openclaw --version and I’ll help you wire up the exact Discord config.
I can connect her to the VC but she doesn't talk. How can I fully enable the VC, chat with her, and select the voice? I need a step-by-step process, one step at a time.
Yep, let’s do it one step at a time.
Step 1 only: please run this and paste the output here:
openclaw --version
If she can already join VC but not speak, we’re probably very close, but I don’t want to guess the wrong config for your version.
Send that first, then I’ll give you Step 2.
OpenClaw version: OpenClaw 2026.4.20 (115f05d)
OS: Windows 11
Installed on: WSL2 Ubuntu Linux
Nice, that version is recent enough.
Step 2 only: let’s check whether voice output is actually configured.
Please run these and paste the output:
openclaw config get channels.discord.voice
openclaw config get messages.tts
We’re checking the part that makes her speak, not just join. After you paste that, I’ll give you Step 3.
First, confirm for me: can she join VC and talk? Like, if I ask her something through my mic, can she hear it and reply with voice?
Yes, that’s the intended feature.
With Discord VC set up correctly, she can:
- join the voice channel
- hear / transcribe what you say into your mic
- generate a reply
- speak the reply back in the VC
So if she can join but not talk, that means the feature exists, but your setup is missing part of the voice output config.
If you want, I’ll keep going one step at a time and help you enable the speaking part next.
Alright, let's do it step by step
Cool.
Step 2 only: run these two commands and paste the output here:
openclaw config get channels.discord.voice
openclaw config get messages.tts
That tells us whether the voice reply part is configured, not just the VC join part.
Once you paste that, I’ll give you Step 3.
openclaw config get channels.discord.voice
openclaw config get messages.tts
🦞 OpenClaw 2026.4.20 (115f05d) — I autocomplete your thoughts—just slower and with more API calls.
Config path not found: channels.discord.voice
🦞 OpenClaw 2026.4.20 (115f05d) — Ah, the fruit tree company! 🍎
Config path not found: messages.tts
Perfect, that explains it.
channels.discord.voice and messages.tts are not configured yet, so she can join VC, but there’s nothing set up for voice output.
Step 3 only: run this:
openclaw config set channels.discord.voice.enabled true
Then paste the output here.
This step just creates the Discord voice config block cleanly.
After that, I’ll give you Step 4 to make her actually speak.
openclaw config set channels.discord.voice.enabled true 🦞 OpenClaw 2026.4.20 (115f05d) — Your terminal just grew claws—type something and let the bot pinch the busywork. Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 ecf7bb8e44697b979c62d2fbd0b2f9dc5e699f1873ff8a294c8c6788088fcc1e -> 3a0a9cfbdc5bff0442dc0888d95af53076b69e97cff9c8199cfe4a5f7b893088, backup=/home/wrakeeb/.openclaw/openclaw.json.bak) Updated channels.discord.voice.enabled. Restart the gateway to apply.
Nice, that worked.
Do not restart yet.
Step 4 only: pick the speech provider.
If you already have OpenAI available, run this:
openclaw config set channels.discord.voice.tts.provider openai
Then paste the output here.
If you are not using OpenAI, stop here and just tell me which one you have instead:
- ElevenLabs
- Microsoft
- MiniMax
Do not paste any API keys.
I'm using OpenRouter GPT 5.4 Mini. Does it support voice?
Not by itself.
Your chat model and voice output are two separate things in OpenClaw:
- OpenRouter GPT 5.4 Mini = the brain/text model
- TTS provider = the speaking voice in Discord VC
So yes, you can keep OpenRouter for chat, but you still need a separate voice provider for speech, usually:
- OpenAI (easy)
- ElevenLabs (best voices)
- Microsoft
- MiniMax
So the short answer is: OpenRouter can stay, but it does not replace the voice/TTS setup.
Next step: tell me which TTS provider you want to use for speaking:
- OpenAI
- ElevenLabs
- Microsoft
- MiniMax
Do not paste any API keys.
Let's go with the OpenAI
Perfect.
Step 4 only: run this:
openclaw config set channels.discord.voice.tts.provider openai
Then paste the output here.
Still don’t restart yet.
After this, I’ll give you Step 5 to choose the actual voice.
Let's go with ElevenLabs
No problem, ElevenLabs is a solid choice.
If you already ran the OpenAI command earlier, this will just overwrite it.
Step 4 only: run this:
openclaw config set channels.discord.voice.tts.provider elevenlabs
Then paste the output here.
Still don’t restart yet.
After that, I’ll give you Step 5 for the voice ID.
openclaw config set channels.discord.voice.tts.provider elevenlabs 🦞 OpenClaw 2026.4.20 (115f05d) — Finally, a use for that always-on Mac Mini under your desk. Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 3a0a9cfbdc5bff0442dc0888d95af53076b69e97cff9c8199cfe4a5f7b893088 -> 9c0b2ead8dce559e9af1aaa3457393f68372ed38a4dbd35021dbed2d4edc728e, backup=/home/wrakeeb/.openclaw/openclaw.json.bak) Updated channels.discord.voice.tts.provider. Restart the gateway to apply.
Nice.
Small correction: before choosing the voice ID, we need to wire in the ElevenLabs API key safely.
Step 5 only: if you already have ELEVENLABS_API_KEY set in your environment, run this:
openclaw config set channels.discord.voice.tts.providers.elevenlabs.apiKey --ref-provider default --ref-source env --ref-id ELEVENLABS_API_KEY
Then paste only the command output here, not the key.
If you do not already have ELEVENLABS_API_KEY set, just say:
I don’t have it set
and I’ll give you the next safe step.
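If you are not sure whether the variable is set, a small shell helper like the one below can check it without printing the key. This is only a sketch: env_key_status is a made-up helper name, not an OpenClaw command, and it relies on the standard printenv utility being available.

```shell
# env_key_status is a hypothetical helper (not part of OpenClaw).
# It reports whether an exported environment variable is non-empty,
# without ever echoing its value.
env_key_status() {
  if [ -n "$(printenv "$1")" ]; then
    echo "$1 is set"
  else
    echo "$1 is not set"
  fi
}

env_key_status ELEVENLABS_API_KEY
```

Only the variable name is printed, so the output is safe to paste into a chat.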
I've got the key. Give me the commands as an example with a dummy key.
Yep, like this, with a fake key example:
export ELEVENLABS_API_KEY="sk_dummy_1234567890abcdef"
openclaw config set channels.discord.voice.tts.providers.elevenlabs.apiKey --ref-provider default --ref-source env --ref-id ELEVENLABS_API_KEY
Replace the dummy value with your real key locally.
Do not paste the real key here.
After you run it, paste only the command output, and I’ll give you the next step.
export ELEVENLABS_API_KEY=HIDDEN openclaw config set channels.discord.voice.tts.providers.elevenlabs.apiKey --ref-provider default --ref-source env --ref-id ELEVENLABS_API_KEY :lobster: OpenClaw 2026.4.20 (115f05d) — Running on your hardware, reading your logs, judging nothing (mostly). Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 9c0b2ead8dce559e9af1aaa3457393f68372ed38a4dbd35021dbed2d4edc728e -> bb959f2c528b0335416fe8fc906a6650efcb8d6504a89f98eeb65849e7e3647d, backup=/home/wrakeeb/.openclaw/openclaw.json.bak)
export ELEVENLABS_API_KEY=HIDDEN openclaw config set channels.discord.voice.tts.providers.elevenlabs.apiKey --ref-provider default --ref-source env --ref-id ELEVENLABS_API_KEY :lobster: OpenClaw 2026.4.20 (115f05d) — Running on your hardware, reading your logs, judging nothing (mostly). Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 9c0b2ead8dce559e9af1aaa3457393f68372ed38a4dbd35021dbed2d4edc728e -> bb959f2c528b0335416fe8fc906a6650efcb8d6504a89f98eeb65849e7e3647d, backup=/home/wrakeeb/.openclaw/openclaw.json.bak)
That looks like the API key step again, so that part is fine 👍
Next step only: run the voice ID command:
openclaw config set channels.discord.voice.tts.providers.elevenlabs.voiceId "EXAVITQu4vr4xnSDxMaL"
Then paste the output here.
Still don’t restart yet. After that, I’ll give you the restart + test step.
wait, where to get the voice ID?
You can get it in the ElevenLabs dashboard:
- open Voices
- click the voice you want
- copy the Voice ID
But honestly, for the first test, you do not need to go hunting for one yet, just use this starter ID:
EXAVITQu4vr4xnSDxMaL
So next step only is still:
openclaw config set channels.discord.voice.tts.providers.elevenlabs.voiceId "EXAVITQu4vr4xnSDxMaL"
Then paste the output here.
openclaw config set channels.discord.voice.tts.providers.elevenlabs.voiceId "EXAVITQu4vr4xnSDxMaL" 🦞 OpenClaw 2026.4.20 (115f05d) — You had me at 'openclaw gateway start.' Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 bb959f2c528b0335416fe8fc906a6650efcb8d6504a89f98eeb65849e7e3647d -> 3169dcc5343aa6220de1328fec34f568172b2dcc9eaa0954c479d3a4e13ee0a8, backup=/home/wrakeeb/.openclaw/openclaw.json.bak) Updated channels.discord.voice.tts.providers.elevenlabs.voiceId. Restart the gateway to apply.
Nice, config is in place now.
Step 7 only: restart the gateway:
openclaw gateway restart
Then paste the output here.
After that, I’ll give you the VC test step.
openclaw gateway restart
🦞 OpenClaw 2026.4.20 (115f05d) — I'll do the boring stuff while you dramatically stare at the logs like it's cinema.
Restarted systemd service: openclaw-gateway.service
Gateway restart failed after 13s: service stayed stopped and port 18789 stayed free.
Service runtime: status=stopped, state=activating, lastExit=1
Gateway port 18789 status: free.
Gateway restart failed after 13s: service stayed stopped and health checks never came up.
Tip: openclaw gateway status --deep
Tip: openclaw doctor
Ah, I think we found it.
The restart likely failed because the Gateway runs as a systemd user service, and it does not inherit the export ELEVENLABS_API_KEY=... you set in your shell.
So:
Step 8 only: put your ElevenLabs key into this file:
~/.openclaw/.env
Add this line inside it:
ELEVENLABS_API_KEY=your_real_key_here
- if the file does not exist, create it
- do not paste the real key here
After you’ve done that, just reply:
done
Then I’ll give you the restart step again.
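For illustration, here is why an export in your terminal does not reach a process that was started separately. This is a generic shell sketch, not OpenClaw-specific: DEMO_KEY is a made-up variable, and env -i only roughly stands in for a service that does not inherit your shell's environment.

```shell
# DEMO_KEY is a hypothetical variable used only for this demonstration.
export DEMO_KEY="sk_dummy_1234567890abcdef"

# A child of this shell inherits the exported variable.
sh -c 'echo "child of this shell: ${DEMO_KEY:-<unset>}"'

# A process started with an empty environment (env -i) does not see it.
env -i /bin/sh -c 'echo "separate environment: ${DEMO_KEY:-<unset>}"'
```

The first line prints the dummy value and the second prints <unset>, which is the same kind of gap the .env file is meant to close.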
how? What to put first and where?
Yep, do this exactly:
Step 8 only:
- Run:
  nano ~/.openclaw/.env
- In that file, put this on the first line:
  ELEVENLABS_API_KEY=your_real_key_here
- Save it:
  - press Ctrl+O
  - press Enter
  - press Ctrl+X
If the file opens blank, that’s fine.
Do not paste the real key here.
After you save it, reply with:
done
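If you would rather not use an editor, the same edit can be sketched as a small shell function. set_env_key is a hypothetical helper name, not an OpenClaw command, and restricting the file to your user with chmod 600 is my own precaution rather than something OpenClaw is known to require.

```shell
# set_env_key is a hypothetical helper (not part of OpenClaw).
# It adds KEY=value to an env file only if that key is not already there.
set_env_key() {
  env_file="$1"
  key="$2"
  value="$3"

  mkdir -p "$(dirname "$env_file")"
  touch "$env_file"
  chmod 600 "$env_file"   # assumption: keep the key readable by your user only

  if grep -q "^${key}=" "$env_file"; then
    echo "$key already present in $env_file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$env_file"
    echo "added $key to $env_file"
  fi
}

# Example call with a placeholder value:
#   set_env_key "$HOME/.openclaw/.env" ELEVENLABS_API_KEY "your_real_key_here"
```

Keep in mind that a key typed as a command argument can end up in your shell history, so the nano route above is the more private one.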
done
Perfect.
Step 9 only: restart the gateway again:
openclaw gateway restart
Then paste the output here.
After that, I’ll give you the final VC test step.
openclaw gateway restart 🦞 OpenClaw 2026.4.20 (115f05d) — I'll do the boring stuff while you dramatically stare at the logs like it's cinema. Restarted systemd service: openclaw-gateway.service
Nice, that looks much better.
Step 10 only: test it in Discord VC
- If she is already in the voice channel, run:
  /vc leave
- Then run:
  /vc join
- After she joins, say this out loud in your mic:
  Hello, can you hear me and answer me with your voice?
Then tell me exactly what happens:
- joins but stays silent
- replies in text only
- replies with voice
- any error message
We’re very close now.
Silent
Got it. If she joins but stays silent, we need the logs now.
Step 11 only:
Run this in your terminal:
openclaw logs --follow
Then, while that is running:
- do /vc leave
- do /vc join
- say one short sentence into your mic
Then copy the last 30-50 log lines that appear and paste them here.
If the logs keep scrolling, press Ctrl+C first, then paste the relevant part.
We’re looking for anything mentioning:
- discord voice
- tts
- elevenlabs
- dave
- error
03:25:17+00:00 info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ chat.history 2281ms conn=70db1e16…b523 id=4f74518e…241b
03:25:17+00:00 info gateway/channels/telegram {"subsystem":"gateway/channels/telegram"} [default] starting provider (@Claw_Miko_by_Wrakeeb_Bot)
03:25:17+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} [default] starting provider (@Miko)
03:25:18+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} discord channels resolved: 1495157727146545194 (guild:Ra Kib's Workspace; aliases:guild:1495157727146545194), 1396404029680128001 (guild:BS Top Dawgs 🔗; aliases:guild:1396404029680128001)
03:25:18+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} discord channel users resolved: 1302991150763999302
03:25:20+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} discord client initialized as 1495620344759718029 (Miko); awaiting gateway readiness
Log tail truncated (increase --max-bytes).
Those logs are just the startup, not the voice test yet.
We need the logs from the exact moment you join VC and speak.
Step 12 only: run this:
openclaw logs --follow --max-bytes 200000
Then immediately:
- /vc leave
- /vc join
- say one short sentence into the mic
Now paste the log lines that appear right after that test.
If possible, grab the part that mentions any of:
- discord voice
- tts
- elevenlabs
- dave
- error
- warn
We’re looking for the real failure now, not the startup noise.
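To cut the startup noise, the log stream can be piped through grep so only lines with those keywords remain. This is a sketch: voice_log_filter is a made-up helper name, and it assumes openclaw logs writes plain text to standard output, which I have not verified for every version.

```shell
# voice_log_filter is a hypothetical helper (not part of OpenClaw).
# It keeps only lines that mention the keywords listed above,
# matching case-insensitively.
voice_log_filter() {
  grep -iE 'discord voice|tts|elevenlabs|dave|error|warn'
}

# Assumed usage, with the same command and flags as in the step above:
#   openclaw logs --follow --max-bytes 200000 | voice_log_filter
```

Press Ctrl+C to stop following once the test sentence has been spoken.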
03:33:02+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: receive error: The operation was aborted
03:33:06+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: receive error: The operation was aborted
03:33:07+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: receive error: The operation was aborted
03:33:08+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: receive error: The operation was aborted
03:33:14+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: receive error: The operation was aborted
03:33:18+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: receive error: The operation was aborted
03:33:20+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: receive error: The operation was aborted
03:33:24+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: receive error: The operation was aborted
I think I may have sent you down one slightly wrong test path.
In Discord servers, voice may still obey mention/name gating, so just saying “hello” might not trigger a reply.
Step 13 only: test again, but this time say her name out loud first.
For example:
Miko, can you hear me and answer me with your voice?
Then tell me what happens:
- still silent
- replies in voice
- replies in text
- different logs appear
If that works, the issue was likely mention gating, not TTS.
Still silent
Okay, then I think we’re past config now and into a Discord voice dependency issue.
Step 14 only: run these commands and paste the output:
ffmpeg -version
ffprobe -version
npm ls -g @discordjs/opus @snazzah/davey
We’re checking 4 common blockers now:
- ffmpeg
- ffprobe
- @discordjs/opus
- @snazzah/davey
If one of those is missing, that would explain the VC joining but staying silent.
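For the two binaries, a quick way to see what is missing is to ask the shell whether each command is on PATH. This is a minimal sketch: missing_tools is a made-up helper name, and it covers only the command-line tools, not the npm packages.

```shell
# missing_tools is a hypothetical helper (not part of OpenClaw).
# It prints the name of every given command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Prints nothing when both tools are installed.
missing_tools ffmpeg ffprobe
```

An empty result means both tools are present. Any name it prints still needs to be installed.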
Command 'ffprobe' not found, but can be installed with: sudo apt install ffmpeg
openclaw@2026.4.20
├─┬ @discordjs/voice@0.19.2
│ └── @snazzah/davey@0.1.11 deduped
└── @snazzah/davey@0.1.11
Yep, that’s a real blocker.
On Ubuntu/WSL, the ffmpeg package gives you both:
- ffmpeg
- ffprobe
Step 15 only: run this exactly:
sudo apt update && sudo apt install -y ffmpeg
Then paste the last few lines of the output here.
After that, I’ll give you the restart + retest step.
Good, that confirms @snazzah/davey is installed.
So the clearly missing piece right now is ffprobe, which comes with the ffmpeg package.
Step 15 only: run this:
sudo apt update && sudo apt install -y ffmpeg
Then paste the last few lines of the output here.
I also suspect @discordjs/opus may be missing, but let’s fix the confirmed blocker first.
Last few lines:
Setting up libsndfile1:amd64 (1.2.2-1ubuntu5.24.04.1) ...
Setting up liblilv-0-0:amd64 (0.24.22-1build1) ...
Setting up libopenmpt0t64:amd64 (0.7.3-1.1build3) ...
Setting up libswscale7:amd64 (7:6.1.1-3ubuntu5) ...
Setting up libpulse0:amd64 (1:16.1+dfsg1-2ubuntu10.1) ...
Setting up libavformat60:amd64 (7:6.1.1-3ubuntu5) ...
Setting up libsphinxbase3t64:amd64 (0.8+5prealpha+1-17build2) ...
Setting up libsdl2-2.0-0:amd64 (2.30.0+dfsg-1ubuntu3.1) ...
Setting up libpocketsphinx3:amd64 (0.8.0+real5prealpha+1-15ubuntu5) ...
Setting up libavfilter9:amd64 (7:6.1.1-3ubuntu5) ...
Setting up libavdevice60:amd64 (7:6.1.1-3ubuntu5) ...
Setting up ffmpeg (7:6.1.1-3ubuntu5) ...
Processing triggers for man-db (2.12.0-4build2) ...
Processing triggers for libc-bin (2.39-0ubuntu8.7) ...
Nice, ffmpeg is in now.
The next likely blocker is @discordjs/opus, because it did not show up in your global package list.
Since your OpenClaw global install showed up under npm ls -g, use npm here on purpose so it lands in the same global tree.
Step 16 only: run this:
npm install -g @discordjs/opus
Then paste the last few lines of the output here.
After that, we’ll do the final restart + VC retest.
Processing triggers for man-db (2.12.0-4build2) ...
Processing triggers for libc-bin (2.39-0ubuntu8.7) ...
wrakeeb@DESKTOPMRKJH:~$ npm install -g @discordjs/opus
npm warn deprecated inflight@1.0.6: This module is not supported, and leaks memory. Do not use it. Check out lru-cache if you want a good and tested way to coalesce async requests by a key value, which is much more comprehensive and powerful.
npm warn deprecated npmlog@5.0.1: This package is no longer supported.
npm warn deprecated rimraf@3.0.2: Rimraf versions prior to v4 are no longer supported
npm warn deprecated glob@7.2.3: Old versions of glob are not supported, and contain widely publicized security vulnerabilities, which have been fixed in the current version. Please update. Support for old versions may be purchased (at exorbitant rates) by contacting i@izs.me
npm warn deprecated are-we-there-yet@2.0.0: This package is no longer supported.
npm warn deprecated gauge@3.0.2: This package is no longer supported.
npm warn deprecated tar@6.2.1: Old versions of tar are not supported, and contain widely publicized security vulnerabilities, which have been fixed in the current version. Please update. Support for old versions may be purchased (at exorbitant rates) by contacting i@izs.me
added 59 packages in 30s
4 packages are looking for funding
run `npm fund` for details
Nice, that’s a good sign.
At this point you’ve fixed the two biggest likely blockers:
- ffmpeg / ffprobe
- @discordjs/opus
Step 17 only: restart the gateway again:
openclaw gateway restart
Then paste the output here.
After that, we’ll do the VC voice test one more time.
Restart done
Perfect.
Step 18 only: test VC again
- run:
  /vc leave
- then:
  /vc join
- once she joins, say clearly:
  Miko, can you hear me and answer with your voice?
Then tell me exactly what happens now:
- still silent
- replies in voice
- replies in text
- new error/logs appear
I’m hoping this is the one.
Still silent. Is it a voice model issue?
Possibly, yes, but more precisely:
- ElevenLabs = the speaking voice (output)
- Discord VC also needs something that can hear / transcribe your voice (input)
So the likely remaining issue is audio understanding / transcription, not the ElevenLabs voice itself.
Step 19 only: run these and paste the output:
openclaw config get agents.defaults.model.primary
openclaw config get tools.media.audio
That will tell us what model is trying to listen, and whether audio transcription is configured at all.
Yep, just these exact two commands:
openclaw config get agents.defaults.model.primary
openclaw config get tools.media.audio
Then paste whatever they print.
If one says Config path not found, paste that too, that still helps.
openclaw config get agents.defaults.model.primary
openclaw config get tools.media.audio
🦞 OpenClaw 2026.4.20 (115f05d) — Making 'I'll automate that later' happen now.
openrouter/openai/gpt-5.4-mini
🦞 OpenClaw 2026.4.20 (115f05d) — Runs on a Raspberry Pi. Dreams of a rack in Iceland.
Config path not found: tools.media.audio
Yep, this is now looking like a transcription/input issue.
What you have right now:
- chat model is coming through OpenRouter
- tools.media.audio is not configured
- ElevenLabs only covers the speaking/output side
So Discord VC still needs a way to hear and transcribe your mic.
Best next options
- OpenAI transcription: easiest and most reliable
- Local Whisper CLI: free, but more setup
My recommendation: use OpenAI for transcription.
Next step: reply with just one word:
OpenAI
or
Whisper
And I’ll give you the next single step.
OpenAI
Good choice.
Step 20 only: add your OpenAI key to the same file as the ElevenLabs key:
~/.openclaw/.env
Add this on a new line:
OPENAI_API_KEY=your_real_openai_key_here
So that file should now look like:
ELEVENLABS_API_KEY=your_real_elevenlabs_key_here
OPENAI_API_KEY=your_real_openai_key_here
Do not paste the real key here.
After you save it, reply with:
done
~/.openclaw/.env
-bash: /home/wrakeeb/.openclaw/.env: Permission denied
Ah, you accidentally tried to run the file.
You want to edit it instead.
Step 20 again, but correctly:
Run:
nano ~/.openclaw/.env
Then add this on a new line:
OPENAI_API_KEY=your_real_openai_key_here
Then save with:
- Ctrl+O
- Enter
- Ctrl+X
Do not paste the real key here.
After that, reply:
done
done
Perfect.
Step 21 only: run these two commands exactly:
openclaw config set tools.media.audio.enabled true
openclaw config set tools.media.audio.models '[{"provider":"openai","model":"gpt-4o-mini-transcribe"}]' --strict-json
Then paste the output here.
This is the part that should let her hear/transcribe your mic, not just speak back.
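Because the second command passes a JSON array on the command line, it can help to confirm the quoting survived before handing it to OpenClaw. The sketch below assumes python3 is installed (it usually is on Ubuntu) and uses its built-in json.tool module. is_valid_json is a made-up helper name, not an OpenClaw command.

```shell
# is_valid_json is a hypothetical helper (not part of OpenClaw).
# It succeeds only if its argument parses as JSON.
is_valid_json() {
  printf '%s' "$1" | python3 -m json.tool >/dev/null 2>&1
}

# Same value as in the command above, kept in single quotes so the shell
# leaves the inner double quotes alone.
models='[{"provider":"openai","model":"gpt-4o-mini-transcribe"}]'

if is_valid_json "$models"; then
  echo "valid JSON"
else
  echo "invalid JSON"
fi
```

This only checks the syntax. Whether OpenClaw accepts the provider and model names is a separate question that the config set output answers.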
openclaw config set tools.media.audio.enabled true openclaw config set tools.media.audio.models '[{"provider":"openai","model":"gpt-4o-mini-transcribe"}]' --strict-json\ :lobster: OpenClaw 2026.4.20 (115f05d) — If you can describe it, I can probably automate it—or at least make it funnier. Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 3169dcc5343aa6220de1328fec34f568172b2dcc9eaa0954c479d3a4e13ee0a8 -> 4e122ddf1c301d53ea03303423fe738d52ed5b6e191f131b364a481d2fa4cc92, backup=/home/wrakeeb/.openclaw/openclaw.json.bak) Updated tools.media.audio.enabled. Restart the gateway to apply.
Nice catch, the first command worked.
But the second one did not run, because there’s a trailing \ at the end.
Step 22 only: run this second command again, exactly as written here:
openclaw config set tools.media.audio.models '[{"provider":"openai","model":"gpt-4o-mini-transcribe"}]' --strict-json
Then paste the output here.
After that, we’ll do the final restart + VC retest again.
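For context on the trailing backslash: in the shell, a backslash at the very end of a line means "this command continues on the next line". The snippet below is a generic illustration of that rule, not a reproduction of exactly what happened in your terminal.

```shell
# A backslash at the very end of a line joins it with the next line,
# so the shell treats both lines as a single command.
echo one \
two
```

This prints "one two". At an interactive prompt, a trailing backslash leaves the shell waiting for the rest of the command instead of running it as typed.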
openclaw config set tools.media.audio.models '[{"provider":"openai","model":"gpt-4o-mini-transcribe"}]' --strict-json 🦞 OpenClaw 2026.4.20 (115f05d) — If it works, it's automation; if it breaks, it's a "learning opportunity." Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 4e122ddf1c301d53ea03303423fe738d52ed5b6e191f131b364a481d2fa4cc92 -> 049cccfe23f11064bfe2f2f6c1ecfe9f3abd19f0d36d5fcc644efea2af07e6d6, backup=/home/wrakeeb/.openclaw/openclaw.json.bak) Updated tools.media.audio.models. Restart the gateway to apply.
Perfect.
Step 23 only: restart the gateway again:
openclaw gateway restart
Then paste the output here.
After that, we’ll do the VC test one more time.
openclaw gateway restart 🦞 OpenClaw 2026.4.20 (115f05d) — Pairing codes exist because even bots believe in consent—and good security hygiene. Restarted systemd service: openclaw-gateway.service
Nice.
Step 24 only: test VC again
- run:
  /vc leave
- then:
  /vc join
- once she joins, say:
  Miko, can you hear me and answer with your voice?
Then tell me exactly what happens now:
- still silent
- replies in voice
- replies in text
- new logs/errors appear
We’ve now fixed config, output voice, dependencies, and transcription config, so this should be a much better test.
I can hear the voice, but it's so laggy and slow
Nice, that means the core setup is working now 🎉
At this point it’s latency tuning, not a broken setup.
A bit of lag is normal because the pipeline is:
your speech → transcription → LLM reply → TTS → Discord playback
Best next options
1. Switch TTS from ElevenLabs to OpenAI
   Fastest improvement, usually lower latency, slightly less “nice” voice
2. Use a faster chat model
   Helps overall response time
3. Keep ElevenLabs
   Best voice quality, but usually slower
My recommendation: do #1 first, switch the speaking voice to OpenAI TTS and keep OpenAI transcription too.
If you want, reply:
do #1
and I’ll walk you through it one step at a time.
I think TTS is OpenAI now.
I don’t think so.
From the steps we set earlier:
- voice output / TTS → elevenlabs
- audio transcription / hearing → openai
So OpenAI should be handling the hearing/transcribe part, not the speaking part.
Step 25 only: run this and paste the output:
openclaw config get channels.discord.voice.tts.provider
That will confirm which TTS provider is actually active right now.
openclaw config get channels.discord.voice.tts.provider 🦞 OpenClaw 2026.4.20 (115f05d) — Ah, the fruit tree company! 🍎 elevenlabs
But it's consuming the balance from OpenAI
Yep, that’s expected.
Right now your setup is:
- ElevenLabs = speaking voice
- OpenAI = hearing/transcription
So when you talk in VC, OpenAI balance is used for transcription, even though the spoken reply comes from ElevenLabs.
If you want to stop using OpenAI balance
You need to change the transcription/input side, not the TTS side.
Best options:
- Local Whisper: cheapest/free, more setup
- Another transcription provider like Groq/Deepgram/Google
- Keep OpenAI: easiest, but costs money
My recommendation: switch to Local Whisper if your goal is to save money.
If you want that, reply:
switch to Whisper
and I’ll walk you through it one step at a time.
Let it be like this. How can I change the voice, and switch the GPT model to 5.4 for faster responses?
Small correction, full 5.4 will usually be slower, not faster.
Right now, if speed is the goal, I’d keep the smaller 5.4 variant.
Two separate knobs:
Change the voice
Set a new ElevenLabs voice ID:
openclaw config set channels.discord.voice.tts.providers.elevenlabs.voiceId "YOUR_NEW_VOICE_ID"
Then restart:
openclaw gateway restart
Change the chat model
This is the setting:
openclaw config set agents.defaults.model.primary "openrouter/openai/gpt-5.4"
openclaw gateway restart
But again, that is likely slower than what you have now.
My recommendation
- if you want better voice → change the ElevenLabs voiceId
- if you want faster replies → do not switch to full 5.4
If you want, I can give you the next exact step for either:
- change voice
- change model
openclaw config set channels.discord.voice.tts.providers.elevenlabs.voiceId ocZQ262SsZb9RIxcQBOj
Like this?
Yep, that’s correct.
I’d just put the voice ID in quotes, like this:
openclaw config set channels.discord.voice.tts.providers.elevenlabs.voiceId "ocZQ262SsZb9RIxcQBOj"
Next step only: run that, then paste the output here.
After that, I’ll give you the restart step.
openclaw config set channels.discord.voice.tts.providers.elevenlabs.voiceId "ocZQ262SsZb9RIxcQBOj" 🦞 OpenClaw 2026.4.20 (115f05d) — I've survived more breaking changes than your last three relationships. Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 985a74e88c4d4e9db103f6f6bf3d8ae6c599693fe6a29350c1598303c9637a30 -> 69f8c60a4e402c8e880ad2278cf9b9783144a9d0a23345499cc1260d78a04b5b, backup=/home/wrakeeb/.openclaw/openclaw.json.bak) Updated channels.discord.voice.tts.providers.elevenlabs.voiceId. Restart the gateway to apply.
Perfect.
Next step only: restart the gateway:
openclaw gateway restart
Then test the voice in VC again.
If it sounds wrong or still uses the old voice, tell me what happened and we’ll keep tuning it.
Seems like it's still using the old voice, and it's also missing lines.
Got it. Let’s isolate the voice issue first, then the missing lines issue.
Next step only: run these and paste the output:
openclaw config get channels.discord.voice.tts.provider
openclaw config get channels.discord.voice.tts.providers.elevenlabs.voiceId
That will confirm whether OpenClaw is actually pointing at the new ElevenLabs voice ID right now.
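If you end up re-checking these values often, the comparison can be scripted. This is a rough sketch under two assumptions I have not verified: that openclaw config get prints the value as its last non-empty line of standard output (after the banner), and that the value itself is a single line. config_matches is a made-up helper name.

```shell
# config_matches is a hypothetical helper (not part of OpenClaw).
# It succeeds when the last non-empty line printed by
# "openclaw config get <path>" equals the expected value.
config_matches() {
  path="$1"
  expected="$2"
  actual="$(openclaw config get "$path" 2>/dev/null | awk 'NF { last = $0 } END { print last }')"
  [ "$actual" = "$expected" ]
}

# Example usage with the values from this thread:
#   config_matches channels.discord.voice.tts.provider elevenlabs && echo "provider OK"
#   config_matches channels.discord.voice.tts.providers.elevenlabs.voiceId "ocZQ262SsZb9RIxcQBOj" && echo "voice OK"
```

If the banner or output layout differs on your version, pasting the raw output as asked above is still the reliable route.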
openclaw config get channels.discord.voice.tts.provider
openclaw config get channels.discord.voice.tts.providers.elevenlabs.voiceId
🦞 OpenClaw 2026.4.20 (115f05d) — I can't fix your code taste, but I can fix your build and your backlog.
elevenlabs
🦞 OpenClaw 2026.4.20 (115f05d) — More integrations than your therapist's intake form.
ocZQ262SsZb9RIxcQBOj
The config looks correct.
So OpenClaw is now pointing at:
- provider: elevenlabs
- voiceId: ocZQ262SsZb9RIxcQBOj
That means the “old voice” problem is not the config anymore. Most likely:
- the new voice just sounds similar, or
- the live VC session needed a full refresh and still had cached behavior
The missing lines issue is a separate problem, and that one is often just interruption.
Next step only: test one short prompt, then stay completely silent until she finishes speaking.
Say something short like:
Miko, tell me one sentence about the moon.
Then do not speak at all until she is done.
Tell me whether:
- she finishes the whole sentence
- she still cuts off / misses lines
- the voice still sounds wrong
That will tell us whether the missing lines are caused by interruption or not.
The ElevenLabs balance and credits should be consumed too, right? I only see the GPT balance being used.
Yes, if ElevenLabs is really the active TTS provider, then ElevenLabs credits should be consumed for the spoken reply.
Right now the expected split is:
- OpenAI balance = transcription / hearing
- ElevenLabs balance = spoken voice output
So if you only see GPT/OpenAI balance moving, then one of these is happening:
- the reply is not actually using ElevenLabs
- ElevenLabs usage is delayed on their dashboard
- the spoken output is still coming from somewhere else
Next step only: let’s verify with logs.
Run:
openclaw logs --follow --max-bytes 200000
Then trigger one spoken reply in VC, and paste the lines that mention any of:
- tts
- elevenlabs
- openai
- discord voice
That will tell us which provider is actually being used for speech.
ool bits:
- It doesn’t make its own light; it reflects sunlight
- It takes about 27.3 days to orbit Earth
- Its gravity is much weaker than Earth’s
- That’s why astronauts could jump so high there
Tiny lunar vibe: quiet, dusty, and kinda dramatic at night 🌙
Log tail truncated (increase --max-bytes).
04:25:36+00:00 info gateway/channels/discord {"subsystem":"gateway/channels/discord"} The Moon is Earth’s natural satellite.
- It reflects sunlight, so it looks bright
- It drives ocean tides
- It goes around Earth in about 27 days
- Gravity there is weak, so jumping is wild
Small, dusty, iconic 🌙
04:25:41+00:00 warn media-understanding {"subsystem":"media-understanding"} audio: failed (0/1) reason=Audio transcription response missing text
04:25:41+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: processing failed: Audio transcription response missing text
04:26:21+00:00 warn media-understanding {"subsystem":"media-understanding"} audio: failed (0/1) reason=Audio transcription response missing text
04:26:21+00:00 warn discord/voice {"subsystem":"discord/voice"} discord voice: processing failed: Audio transcription response missing text
Yes, ElevenLabs should consume credits for the spoken reply.
But your logs show the more important issue right now is actually this:
Audio transcription response missing text
So the current problem is on the OpenAI transcription/input side, not the ElevenLabs speaking side.
That’s also why you’re mostly noticing OpenAI activity right now.
Next step only: let’s make the transcription side more reliable by switching from the mini transcription model to the full one:
openclaw config set tools.media.audio.models '[{"provider":"openai","model":"gpt-4o-transcribe"}]' --strict-json
Then paste the output here.
After that, I’ll give you the restart step.
openclaw config set tools.media.audio.models '[{"provider":"openai","model":"gpt-4o-transcribe"}]' --strict-json
🦞 OpenClaw 2026.4.20 (115f05d) — I'm not AI-powered, I'm AI-possessed. Big difference.
Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 69f8c60a4e402c8e880ad2278cf9b9783144a9d0a23345499cc1260d78a04b5b -> 71be7035f2d7cf2c8e496c57498c2c761ce4a400fa23d83d03abdd932bb8525e, backup=/home/wrakeeb/.openclaw/openclaw.json.bak)
Updated tools.media.audio.models. Restart the gateway to apply.
Perfect, that update worked.
Next step only: restart the gateway:
openclaw gateway restart
Then test VC again once more.
If it still acts weird after that, paste the new logs and we’ll keep drilling in.
It's the same voice every time, and the reply is slow and cuts off in the middle
I think we’ve crossed from config problem into current Discord VC rough-edge / bug territory.
Why I’m saying that:
- we confirmed the VC TTS provider is ElevenLabs
- we confirmed the voice ID is set
- we fixed the obvious dependency blockers
- it does speak now
- but it still feels slow, cuts off, and the voice change doesn’t sound like it’s really taking
So at this point, the best 2 paths are:
1. Practical fix for today, recommended
Switch Discord VC TTS from ElevenLabs → OpenAI
Why:
- faster replies
- easier to verify voice changes
- less likely to feel laggy/cut-off
2. Keep ElevenLabs and treat this as a bug
Since the config is correct but behavior is still off, I’d consider this worth a GitHub issue:
https://github.com/openclaw/openclaw/issues
My recommendation: do #1 if your goal is a usable VC experience today.
If you want, reply:
switch VC TTS to OpenAI
and I’ll give you the exact commands, one step at a time.
Alright, we fix today
Good. Fastest path is to switch Discord VC TTS from ElevenLabs → OpenAI.
Step 1 only: run this:
openclaw config set channels.discord.voice.tts.provider openai
Then paste the output here.
After that, I’ll give you the voice selection step.
openclaw config set channels.discord.voice.tts.provider openai 🦞 OpenClaw 2026.4.21 (f788c88) — I can't fix your code taste, but I can fix your build and your backlog. Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 bfd3953d7697e4f1c32bccce8d5038a2d3aaf66519be56a199d59ba8e073bcfe -> a8e194dce13bbd8780c56e7ee3d9958e24666aac6735108fd85a9df265aa93de, backup=/home/wrakeeb/.openclaw/openclaw.json.bak) Updated channels.discord.voice.tts.provider. Restart the gateway to apply.
Nice.
Step 2 only: set the OpenAI voice. Use alloy first, it’s a good default:
openclaw config set channels.discord.voice.tts.providers.openai.voice "alloy"
Then paste the output here.
After that, I’ll give you the restart + retest step.
openclaw config set channels.discord.voice.tts.providers.openai.voice "alloy" 🦞 OpenClaw 2026.4.21 (f788c88) — I speak fluent bash, mild sarcasm, and aggressive tab-completion energy. Config overwrite: /home/wrakeeb/.openclaw/openclaw.json (sha256 a8e194dce13bbd8780c56e7ee3d9958e24666aac6735108fd85a9df265aa93de -> 8aa1d935c3a7b7add6d246c6de1d3651b53d9eec2a6140064e219328c4548e78, backup=/home/wrakeeb/.openclaw/openclaw.json.bak) Updated channels.discord.voice.tts.providers.openai.voice. Restart the gateway to apply.