#no output config for voice agent python sdk

thin magnet BOT

Hi @eager crane. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

You are correct that in the Deepgram Python SDK v5, AgentV1SettingsMessage (and the related settings objects) requires you to specify both input and output audio configurations for the Voice Agent API. The output configuration is not optional and should be included in your settings message.

Here’s how you should structure your settings using the SDK, including both input and output audio settings:

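# Note: this module path may differ between SDK major versions;
# verify it against the version you have installed.
from deepgram.clients.agent.v1.websocket.options import SettingsOptions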

options = SettingsOptions()
# Audio input configuration
options.audio.input.encoding = "linear16"
options.audio.input.sample_rate = 44100
# Audio output configuration
options.audio.output.encoding = "linear16"
options.audio.output.sample_rate = 44100
options.audio.output.container = "wav"  # or "none", depending on your needs
# ... other agent settings

Or, if you are constructing the message directly:

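# Assumes AgentV1SettingsMessage, AgentV1Audio, AgentV1AudioInput, and
# AgentV1AudioOutput are already imported from the Deepgram SDK.
settings = AgentV1SettingsMessage(
    audio=AgentV1Audio(
        # Format of the audio you send to the agent
        input=AgentV1AudioInput(encoding="linear16", sample_rate=44100),
        # Format of the TTS audio the agent sends back
        output=AgentV1AudioOutput(encoding="linear16", sample_rate=44100, container="wav"),
    ),
    # ... other agent settings
)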

You must provide both input and output audio settings. The output configuration defines how the agent synthesizes and returns audio (TTS) to you. If you omit it, the API does not know how to format the TTS audio, and your request may fail or fall back to defaults that do not suit your application.

This is confirmed in the documentation, which shows both input and output audio settings as required parts of the Settings message for the Voice Agent API V1 (see "Configure the Voice Agent" and "Voice Agent Settings").
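If it helps to check your configuration before sending it, here is a minimal sketch that builds the audio part of a settings payload as a plain dictionary and reports which audio section is missing. It does not use the SDK. The helper names build_settings and missing_audio_sections are hypothetical, and the "type": "Settings" field is an assumption about the wire format that you should confirm against the Voice Agent documentation.

```python
import json


def build_settings(input_rate: int = 44100, output_rate: int = 44100) -> dict:
    """Build a settings payload that carries both audio.input and audio.output."""
    return {
        "type": "Settings",  # assumed message type; confirm in the API docs
        "audio": {
            "input": {"encoding": "linear16", "sample_rate": input_rate},
            "output": {
                "encoding": "linear16",
                "sample_rate": output_rate,
                "container": "wav",  # or "none", depending on your needs
            },
        },
        # ... other agent settings
    }


def missing_audio_sections(settings: dict) -> list:
    """Return the names of the audio sections that are absent or empty."""
    audio = settings.get("audio") or {}
    return [name for name in ("input", "output") if not audio.get(name)]


settings = build_settings()
missing = missing_audio_sections(settings)
if missing:
    raise ValueError(f"Missing audio sections: {missing}")
print(json.dumps(settings, indent=2))
```

Running a check like this before opening the connection makes a missing output block show up as a local error rather than as an unexpected response from the API.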

If you are not seeing an output field in your SDK object, ensure you are using the correct version of the SDK and the correct class imports. The official documentation and code samples show that both input and output audio settings are supported and required in the current API and SDK versions (see "Build a Basic Voice Agent" and "Putting it all together").
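To confirm which SDK version is installed, a small sketch like the following may help. It uses only the Python standard library; installed_version is a hypothetical helper name, and deepgram-sdk is the distribution name on PyPI.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if the package is not installed.
print(installed_version("deepgram-sdk"))
```

If the reported major version is not the one the examples were written for, class names and import paths may not match.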

If you have a specific code snippet or error, please share it for more targeted help!

Note: you can ask me a follow-up question by @-mentioning me again :speech_balloon: