Hey team! We're experiencing some static/distortion in our Voice Agent audio when using Twilio and could use some help diagnosing.
Setup:
Twilio media streams → Voice Agent API (wss://agent.deepgram.com/v1/agent/converse)
Listen: flux-general-en
Speak: aura-2-thalia-en
Think: OpenAI gpt-4.1
Config (matching the docs):
{ "type": "Settings", "audio": { "input": { "encoding": "mulaw", "sample_rate": 8000 }, "output": { "encoding": "mulaw", "sample_rate": 8000, "container": "none" } }}
What we've verified:
✅ SettingsApplied received after sending config
✅ Using container: "none" (to avoid WAV header issues)
✅ mulaw @ 8kHz (matching Twilio's format)
✅ Audio buffered until SettingsApplied before forwarding
✅ Raw bytes forwarded directly (no transcoding)
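For anyone reproducing this, here is a minimal sketch of the checks we'd expect the forwarding step to pass. It is not our production code. It assumes agent audio arrives as raw binary frames and is sent back to Twilio as base64 inside a "media" event. The names `looks_like_wav`, `to_twilio_media_messages`, and `stream_sid` are hypothetical, and the 160-byte (20 ms) frame size is just an illustrative choice, not something Twilio requires.

```python
import base64
import json

FRAME_BYTES = 160  # 20 ms of 8 kHz mu-law at 1 byte per sample (illustrative choice)


def looks_like_wav(chunk: bytes) -> bool:
    """Return True if the chunk starts with a RIFF/WAVE header.

    Header bytes played as mu-law samples would be heard as a click or noise,
    so this should never be True when container is "none".
    """
    return len(chunk) >= 12 and chunk[:4] == b"RIFF" and chunk[8:12] == b"WAVE"


def to_twilio_media_messages(audio: bytes, stream_sid: str) -> list[str]:
    """Split raw mu-law bytes into frames and wrap each in a Twilio "media" message."""
    messages = []
    for start in range(0, len(audio), FRAME_BYTES):
        frame = audio[start:start + FRAME_BYTES]
        messages.append(json.dumps({
            "event": "media",
            "streamSid": stream_sid,  # hypothetical variable holding the call's stream SID
            "media": {"payload": base64.b64encode(frame).decode("ascii")},
        }))
    return messages
```

Logging `looks_like_wav` on every binary frame, and confirming that text (JSON) frames from the agent socket are never passed into `to_twilio_media_messages`, would rule out two common sources of noise in the forwarded stream.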
Issue:
Subtle but noticeable static/distortion in TTS output throughout the call (not just at the start).
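To narrow down whether the static is already present in the bytes received from the agent or is introduced on the Twilio leg, the raw output can be captured and decoded offline. The sketch below assumes the captured bytes are 8 kHz G.711 mu-law and converts them to a 16-bit PCM WAV using only the standard library. The function names are hypothetical.

```python
import io
import struct
import wave


def ulaw_byte_to_pcm16(b: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~b & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_to_wav(ulaw: bytes, out, sample_rate: int = 8000) -> None:
    """Write mu-law bytes to `out` (a path or binary file object) as mono 16-bit PCM WAV."""
    pcm = struct.pack(f"<{len(ulaw)}h", *(ulaw_byte_to_pcm16(b) for b in ulaw))
    with wave.open(out, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
```

If the decoded WAV of the agent's raw output sounds clean, the artifact is more likely in the forwarding path. If it already contains the static, the capture would be a useful recording to share alongside the request ID.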
Sample for debugging:
Request ID: 75c60da2-a3f9-40c0-bf2d-1ca96b8451b8
Timestamp: 2025-12-30 21:55:05 UTC
Questions:
Anything in SettingsApplied we should check to confirm audio config was applied?
Any known issues with flux-general-en or aura-2-thalia-en causing artifacts?
Can someone check the request ID above for anything unusual?
Happy to provide recordings or additional logs. Thanks!