I’m trying to enable automatic transcription of Discord voice notes in OpenClaw using local Whisper.
What is already confirmed:
- Local Whisper works correctly on the real .ogg files saved by OpenClaw.
- The voice files are successfully downloaded to disk.
- The base Discord listener does receive the voice note as a real event.
- The voice note arrives with this shape:
  - type=0
  - flags=8192
  - attachment: audio/ogg
  - filename=voice-message.ogg
  - waveform present
  - durationSecs present
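For reference, flags=8192 is bit 13 (1 << 13), which the Discord API documents as the IS_VOICE_MESSAGE message flag. Below is a minimal sketch of a predicate that recognizes the shape listed above. The `MessageLike` and `AttachmentLike` interfaces and the `isVoiceNote` name are hypothetical, made up for illustration; the real OpenClaw / Carbon types may use different field names.

```typescript
// Bit 13 of the message flags marks a voice message in the Discord API.
const IS_VOICE_MESSAGE = 1 << 13; // 8192

// Hypothetical minimal shapes (assumption); not the actual OpenClaw / Carbon types.
interface AttachmentLike {
  filename: string;
  contentType?: string;
  waveform?: string;
  durationSecs?: number;
}

interface MessageLike {
  type: number;
  flags?: number;
  attachments: AttachmentLike[];
}

// Returns true when the voice-message flag is set and at least one
// attachment is audio that carries a waveform.
function isVoiceNote(msg: MessageLike): boolean {
  const flagged = ((msg.flags ?? 0) & IS_VOICE_MESSAGE) !== 0;
  const hasAudio = msg.attachments.some(
    (a) => (a.contentType ?? "").startsWith("audio/") && a.waveform !== undefined,
  );
  return flagged && hasAudio;
}

// Sample mirroring the observed event; the waveform and duration values are placeholders.
const sample: MessageLike = {
  type: 0,
  flags: 8192,
  attachments: [
    {
      filename: "voice-message.ogg",
      contentType: "audio/ogg",
      waveform: "AAAA",
      durationSecs: 3.2,
    },
  ],
};

console.log(isVoiceNote(sample)); // true
```

A check like this could be logged at each layer to confirm that the same event is being followed through the runtime.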
The problem:
- That event does not continue into the normal handler.
- It does not reach preflightDiscordMessage.
- It does not reach processDiscordMessage.
- As a result, it never enters the native media/audio pipeline, and no transcript is posted back.
We already added patches/logging across several layers of the compiled runtime, and the drop seems to happen between DiscordMessageListener.handle(...) and the real handler / debounce / preflight stage.
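To make the kind of stage-by-stage logging described above concrete, here is a rough sketch of a tracing wrapper that records when an event enters a stage and whether it comes back out. Everything in it is hypothetical: `TraceEvent`, `Stage`, `traced`, and `runStages` are illustrative names, the stage bodies are stand-ins, and the simulated drop in the debounce stand-in is only there to show what the log looks like. It makes no claim about where OpenClaw actually drops the event.

```typescript
// Hypothetical event and stage types (assumption); the real runtime differs.
interface TraceEvent {
  id: string;
  flags: number;
}

// A stage returns the event to pass it on, or undefined to drop it.
type Stage = (event: TraceEvent) => Promise<TraceEvent | undefined>;

// Wraps a stage so that entry, exit, and drops are appended to a log.
function traced(name: string, stage: Stage, log: string[]): Stage {
  return async (event) => {
    log.push(`enter ${name} id=${event.id} flags=${event.flags}`);
    const out = await stage(event);
    log.push(`${out === undefined ? "drop" : "exit"} ${name} id=${event.id}`);
    return out;
  };
}

// Runs the stages in order and stops at the first one that drops the event.
async function runStages(
  event: TraceEvent,
  stages: Stage[],
): Promise<TraceEvent | undefined> {
  let current: TraceEvent | undefined = event;
  for (const stage of stages) {
    if (current === undefined) break;
    current = await stage(current);
  }
  return current;
}

// Stand-in stages; the drop on flags=8192 is simulated for the demo only.
const log: string[] = [];
const stages: Stage[] = [
  traced("listener", async (e) => e, log),
  traced("debounce", async (e) => (e.flags === 8192 ? undefined : e), log),
  traced("preflight", async (e) => e, log),
];

runStages({ id: "demo", flags: 8192 }, stages).then(() => {
  console.log(log.join("\n"));
});
```

In this simulation the log ends with a "drop debounce" line and never shows "enter preflight", which is the pattern to look for in the real runtime.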
Specific questions for help:
1. Are Discord voice notes (flags=8192) routed through a different path in OpenClaw / Carbon?
2. Is there a filter or special listener type that prevents those messages from reaching the normal MESSAGE_CREATE flow?
3. What is the correct place to hook them and force them into the same pipeline as a normal attachment message?