#Missing "model-uuid" prop when using new aura-2 voice in python sdk


inner ruin
#

This is in the context of a pipecat agent app. I have tried updating the python sdk version to latest but it is still giving errors about the model-uuid prop being missing in the REST response.

Here's the relevant line in the sdk: https://github.com/deepgram/deepgram-python-sdk/blob/main/deepgram/clients/speak/v1/rest/client.py#L187

GitHub

Official Python SDK for Deepgram. Contribute to deepgram/deepgram-python-sdk development by creating an account on GitHub.

still tundraBOT
#

Hi @inner ruin. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

It appears you're encountering an issue with the model-uuid property when using the new Aura-2 voice with the Deepgram Python SDK. This issue might be related to changes in the API response structure or SDK version compatibility.

Based on the knowledge sources, there are a few potential causes and solutions:

  1. SDK Version: You mentioned updating to the latest Python SDK version, which is good. The current recommended version is v3.7.6 or later. Older versions (like 2.12.0) may have compatibility issues with newer features.

  2. API Endpoint Changes: There was a transition from beta to production endpoints for the TTS API. If your code is still using the beta endpoint, you should update it to use the production endpoint:

# Change from
DEEPGRAM_URL = f"https://api.beta.deepgram.com/v1/speak?model={self.MODEL_NAME}&performance=some&encoding=linear16&sample_rate=24000"

# To
DEEPGRAM_URL = f"https://api.deepgram.com/v1/speak?model={self.MODEL_NAME}&performance=some&encoding=linear16&sample_rate=24000"

  3. Response Headers: According to the documentation, the model UUID is returned in the response headers as dg-model-uuid, not in the response body. The SDK might be looking for this information in the wrong place.

    From the Text-to-Speech documentation, the response headers include:

    • dg-model-uuid: The unique identifier of the model that processed the request
    • dg-model-name: The name of the model used to process the request

If you're still experiencing issues after checking these points, you might want to:

#
  1. Check if you're using the correct model name format for Aura voices (e.g., aura-asteria-en)
  2. Consider opening an issue on the Deepgram GitHub repository with details about the specific error you're encountering

The line you referenced in the SDK is looking for the model_uuid in the response headers, which aligns with the documentation, so the issue might be with how the headers are being processed or a change in the API response format.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@inner ruin Kapa.ai is still learning and improving, please let me know how I did by reacting below.

inner ruin
#

The library is supposed to handle those dg- prefixed header values and put them in the result, so I think there is a bug.

graceful geyser
#

I'm experiencing this as well fwiw

graceful geyser
#

Did some digging. I found that the python sdk expects model-uuid and char-count to be in the result from the post request here:
https://github.com/deepgram/deepgram-python-sdk/blob/main/deepgram/clients/speak/v1/rest/client.py#L184

The result dict is set here:
https://github.com/deepgram/deepgram-python-sdk/blob/main/deepgram/clients/common/v1/abstract_sync_rest.py#L290
file_result seems to expect dg-model-uuid and dg-char-count in the response header, but these are not returned by the deepgram api in the post request

Going to do a bit more digging to see what the response looks like, but a quick fix for me was to make this modification in client.py:

        resp = SpeakRESTResponse(
            content_type=str(result["content-type"]),
            request_id=str(result["request-id"]),
            model_uuid=str(result["model-uuid"]) if "model-uuid" in result else "",
            model_name=str(result["model-name"]),
            characters=int(str(result["char-count"])) if "char-count" in result else 0,
            transfer_encoding=str(result["transfer-encoding"]),
            date=str(result["date"]),
            stream=cast(io.BytesIO, result["stream"]),
            stream_memory=cast(io.BytesIO, result["stream"]),
        )

Just adding some if statements around the missing fields, if you're curious @inner ruin. Obv not a long-term solution though
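
The same defensive pattern can be sketched outside the SDK with `dict.get` and defaults. This is only an illustration: `SpeakMetadata` and `build_metadata` are hypothetical names, not part of deepgram-python-sdk, and the keys assume the `result` dict looks like the one used in the snippet above.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class SpeakMetadata:
    """Hypothetical stand-in for the SDK response object (not SpeakRESTResponse)."""

    content_type: str
    request_id: str
    model_name: str
    model_uuid: str = ""
    characters: int = 0


def build_metadata(result: Mapping[str, str]) -> SpeakMetadata:
    """Build metadata, falling back to defaults when a key is absent."""
    return SpeakMetadata(
        content_type=str(result.get("content-type", "")),
        request_id=str(result.get("request-id", "")),
        model_name=str(result.get("model-name", "")),
        # Aura 2 responses in this thread carry no model-uuid, so default to "".
        model_uuid=str(result.get("model-uuid", "")),
        # char-count may be missing as well; default to 0 instead of raising KeyError.
        characters=int(result.get("char-count", 0)),
    )
```

Like the patch above, this only hides the missing fields. It does not recover the character count that Aura 2 reports under a different name.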

#

For the deepgram guys reading this. Aura 1 models return these headers from the TTS post request:

< HTTP/2 200 
< content-type: audio/mpeg
< dg-model-name: aura-asteria-en
< dg-model-uuid: ecb76e9d-f2db-4127-8060-79b05590d22f
< dg-char-count: 32
< vary: origin, access-control-request-method, access-control-request-headers
< vary: accept-encoding
< access-control-allow-credentials: true
< access-control-expose-headers: dg-model-name,dg-model-uuid,dg-char-count,dg-request-id,dg-error
< dg-request-id: 53ff6fd3-6502-49db-b205-a6c3c4e7005f
< date: Tue, 15 Apr 2025 18:20:01 GMT

Specifically this includes model-uuid and char-count

Aura 2 models return far fewer headers:

< HTTP/2 200 
< content-type: audio/mpeg
< x-dg-request-id: 6f01ad8b-0f80-4686-98aa-e0aaf205a48c
< x-dg-model-name: aura-2-thalia-en
< x-dg-characters: 32
< date: Tue, 15 Apr 2025 18:15:10 GMT

Notably, model-uuid is missing and it seems char-count is replaced with characters
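
To make the difference concrete, here is a rough sketch of normalizing both header styles into one set of keys. It assumes only what the two dumps above show: a `dg-` prefix for Aura 1, an `x-dg-` prefix for Aura 2, and `characters` standing in for `char-count`. `normalize_tts_headers` is a made-up helper, not something the SDK provides.

```python
from typing import Dict, Mapping


def normalize_tts_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Strip the dg-/x-dg- prefix and map Aura 2 names onto the Aura 1 names."""
    normalized: Dict[str, str] = {}
    for name, value in headers.items():
        key = name.lower()
        for prefix in ("x-dg-", "dg-"):
            if key.startswith(prefix):
                key = key[len(prefix):]
                break
        # Aura 2 reports the character count as "characters" in the dump above.
        if key == "characters":
            key = "char-count"
        normalized[key] = value
    return normalized


aura2 = normalize_tts_headers({
    "content-type": "audio/mpeg",
    "x-dg-request-id": "6f01ad8b-0f80-4686-98aa-e0aaf205a48c",
    "x-dg-model-name": "aura-2-thalia-en",
    "x-dg-characters": "32",
})
```

With the Aura 2 headers from this thread, `aura2` ends up with `request-id`, `model-name` and `char-count`, and still no `model-uuid`, so callers have to treat that field as optional.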

You should probably make the response headers consistent across aura 1 and aura 2. If you can't for some reason, then the python sdk should handle the response for aura 1 and aura 2 models differently

For aura 2 models this works:

        return_vals = [
            "content-type",
            "request-id",
            "model-name",
            "characters",
            "transfer-encoding",
            "date",
        ]
...
        resp = SpeakRESTResponse(
            content_type=str(result["content-type"]),
            request_id=str(result["request-id"]),
            model_name=str(result["model-name"]),
            characters=int(str(result["characters"])),
            transfer_encoding=str(result["transfer-encoding"]),
            date=str(result["date"]),
            stream=cast(io.BytesIO, result["stream"]),
            stream_memory=cast(io.BytesIO, result["stream"]),
        )
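
If the SDK had to handle the two model families differently, one option might be to pick the expected keys from the model name. This is a sketch under assumptions: `expected_return_vals` and `COMMON_KEYS` are hypothetical names, and the `aura-2-` prefix check is based only on the single `aura-2-thalia-en` example in this thread.

```python
from typing import List

# Keys both header dumps in this thread have in common, plus the two
# standard HTTP headers the SDK snippet already reads.
COMMON_KEYS = ["content-type", "request-id", "model-name", "transfer-encoding", "date"]


def expected_return_vals(model: str) -> List[str]:
    """Return the result keys to expect for a given model name (assumed mapping)."""
    if model.startswith("aura-2-"):
        # Aura 2: no model-uuid, and the count arrives as "characters".
        return COMMON_KEYS + ["characters"]
    # Aura 1: model-uuid and char-count are present.
    return COMMON_KEYS + ["model-uuid", "char-count"]
```

Branching on the model name is brittle if naming changes, so checking which keys are actually present in the response is probably the safer choice.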