Quick error for cartesia python mp3 | Deepgram | Page 1

celest flameBOT Mar 19, 2025, 8:45 PM

Hi @safe yew. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

fickle ospreyBOT Mar 19, 2025, 8:45 PM

Hey there! This channel allows you to ask Kapa.ai for assistance. But it looks like you haven't connected your Discord profile to your Deepgram account yet. You can do so at https://community.deepgram.com - by doing so, you'll be able to ask questions in #1115960287183511643 if you need human assistance!

celest flameBOT Mar 19, 2025, 8:45 PM

I need more specific information about the error you're experiencing with Deepgram in Python when working with MP3 files. Based on the knowledge sources, there are several common issues that could occur:

Timeout errors when processing larger audio files
SSL/connection errors when making API requests
MIME type specification issues
File encoding problems

If you're getting an "INTERNAL_ERROR" message or a timeout when processing MP3 files, you might need to:

Increase the timeout length for larger files as mentioned in this discussion
Use the correct MIME type (audio/mpeg instead of audio/mp3) as noted in this discussion
Check your file encoding and ensure the audio file isn't corrupted
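As a rough illustration of the MIME type point above, a small helper could normalize the nonstandard audio/mp3 label to audio/mpeg before a request is built. The helper name and alias table are hypothetical and not part of any SDK.

```python
# Hypothetical helper, not part of the Deepgram SDK.
# Maps commonly seen but nonstandard labels to the registered MIME type.
_MIME_ALIASES = {
    "audio/mp3": "audio/mpeg",
}


def normalize_mime(mime_type: str) -> str:
    """Return the registered MIME type for a known alias, else the cleaned input."""
    cleaned = mime_type.strip().lower()
    return _MIME_ALIASES.get(cleaned, cleaned)
```

The returned value is what you would pass as the content type when sending MP3 audio.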

Could you please provide more details about the specific error message you're seeing? This would help me provide a more targeted solution for your Deepgram Python implementation.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

@safe yew Kapa.ai is still learning and improving, please let me know how I did by reacting below.

safe yew Mar 19, 2025, 8:45 PM

@celest flame
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 917, in full_dispatch_request
2025-03-19T20:43:10.224309508Z rv = self.dispatch_request()
2025-03-19T20:43:10.224310499Z File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 902, in dispatch_request
2025-03-19T20:43:10.224311572Z return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]
2025-03-19T20:43:10.224312743Z File "/usr/local/lib/python3.10/site-packages/flask_sock/__init__.py", line 62, in websocket_route
2025-03-19T20:43:10.224313816Z f(ws, *args, **kwargs)
2025-03-19T20:43:10.224314787Z File "/app/main.py", line 52, in landing_message
2025-03-19T20:43:10.224315813Z audio_base64 = base64.b64encode(generate_audio(full_response)).decode('utf-8')
2025-03-19T20:43:10.224316819Z File "/app/fonctions/tts.py", line 7, in generate_audio
2025-03-19T20:43:10.224317836Z return client.tts.bytes(model_id="sonic-2",
2025-03-19T20:43:10.224318887Z File "/usr/local/lib/python3.10/site-packages/cartesia/tts.py", line 53, in bytes
2025-03-19T20:43:10.224319979Z request_body = _construct_tts_request(
2025-03-19T20:43:10.224321001Z File "/usr/local/lib/python3.10/site-packages/cartesia/utils/tts.py", line 52, in _construct_tts_request
2025-03-19T20:43:10.224322074Z "encoding": output_format["encoding"],
2025-03-19T20:43:10.224323167Z KeyError: 'encoding'

It seems that this call:

return client.tts.bytes(model_id="sonic-2",
                        transcript=transcript,
                        voice_id='0418348a-0ca2-4e90-9986-800fb8b3bbc0',
                        language='fr',
                        output_format={
                            "container": "mp3",
                            "bit_rate": 96000,
                            "sample_rate": 22050,
                        })

needs an encoding parameter, but according to the API documentation, tts bytes doesn't take encoding for mp3:

mp3 (object, Required)
    container: "mp3", Required
    bit_rate: integer, Required. The bit rate of the audio in bits per second. Supported bit rates are 32000, 64000, 96000, 128000, 192000.
    sample_rate: integer, Required
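For reference, the three fields quoted above can be checked locally before calling the API. This is only a sketch based on the documentation excerpt in this message; the function name is hypothetical and the client library may enforce different rules.

```python
# Supported values copied from the documentation excerpt quoted above.
SUPPORTED_MP3_BIT_RATES = {32000, 64000, 96000, 128000, 192000}


def check_mp3_output_format(output_format: dict) -> list:
    """Return a list of problems; an empty list means the dict matches the quoted docs."""
    problems = []
    # All three fields are marked Required in the excerpt.
    for key in ("container", "bit_rate", "sample_rate"):
        if key not in output_format:
            problems.append(f"missing required field: {key}")
    if "container" in output_format and output_format["container"] != "mp3":
        problems.append("container must be 'mp3'")
    bit_rate = output_format.get("bit_rate")
    if bit_rate is not None and bit_rate not in SUPPORTED_MP3_BIT_RATES:
        problems.append(f"unsupported bit_rate: {bit_rate}")
    return problems
```

The output_format from the call above passes this check, which suggests the KeyError comes from the client library rather than from the documented request shape.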

celest flameBOT Mar 19, 2025, 8:45 PM

Let me check the knowledge base for you again. One moment! :books:

Based on the error message you're encountering, it appears you're having an issue with the Deepgram Text-to-Speech API when using the Cartesia client library. The error occurs because the output_format dictionary you're providing is missing the required encoding parameter.

Looking at the knowledge sources, I can see that when working with Deepgram's Text-to-Speech API, you need to specify the encoding parameter even when using MP3 as the container format. This is evident from several code examples and discussions about the Deepgram API.

For example, in the Text-to-Speech Conversion Using Deepgram API documentation, the SpeakOptions structure includes both encoding and container parameters.

To fix your issue, you should modify your code to include the encoding parameter in the output_format dictionary:

return client.tts.bytes(model_id="sonic-2",
                        transcript=transcript,
                        voice_id='0418348a-0ca2-4e90-9986-800fb8b3bbc0',
                        language='fr',
                        output_format={
                            "container": "mp3",
                            "encoding": "mp3",  # Add this line
                            "bit_rate": 96000,
                            "sample_rate": 22050,
                        })

The error occurs in the _construct_tts_request function which is trying to access the "encoding" key that doesn't exist in your output_format dictionary.
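To make the failure mode concrete, here is a minimal sketch that reproduces the same KeyError with a plain dictionary and lists which keys are absent. The function build_request_fragment is a stand-in that models only the failing lookup from the traceback, and the list of required keys is an assumption rather than the library's actual contract.

```python
def build_request_fragment(output_format: dict) -> dict:
    # Stand-in for the library code in the traceback: only the failing
    # dictionary lookup is modeled here.
    return {"encoding": output_format["encoding"]}


def missing_keys(output_format: dict,
                 required=("container", "encoding", "sample_rate")) -> list:
    # The required tuple is an assumption made for illustration.
    return [key for key in required if key not in output_format]


fmt = {"container": "mp3", "bit_rate": 96000, "sample_rate": 22050}

try:
    build_request_fragment(fmt)
    error_key = None
except KeyError as exc:
    # A KeyError carries the missing key as its first argument.
    error_key = exc.args[0]

absent = missing_keys(fmt)
```

Running a check like missing_keys before the API call turns the traceback into a readable message about which field the installed client version expects.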

If this doesn't resolve the issue, you might need to check the specific requirements of the Cartesia client library you're using, as it might have different parameter expectations than the standard Deepgram SDK.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
