#Quick error for cartesia python mp3
Hey there! This channel allows you to ask Kapa.ai for assistance. But it looks like you haven't connected your Discord profile to your Deepgram account yet. You can do so at https://community.deepgram.com - by doing so, you'll be able to ask questions in #1115960287183511643 if you need human assistance!
I need more specific information about the error you're experiencing with Deepgram in Python when working with MP3 files. Based on the knowledge sources, there are several common issues that could occur:
- Timeout errors when processing larger audio files
- SSL/connection errors when making API requests
- MIME type specification issues
- File encoding problems
If you're getting an "INTERNAL_ERROR" message or a timeout when processing MP3 files, you might need to:
- Increase the timeout length for larger files as mentioned in this discussion
- Use the correct MIME type (audio/mpeg instead of audio/mp3) as noted in this discussion
- Check your file encoding and ensure the audio file isn't corrupted
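On the MIME type point, a minimal sketch of looking up the registered type for an .mp3 file with Python's standard library instead of typing it by hand. The helper name guess_audio_mime and the file name clip.mp3 are made up for illustration; only the mimetypes module itself is real.

```python
import mimetypes

# Use a fresh MimeTypes instance so the lookup relies on Python's built-in table.
_MIME_TABLE = mimetypes.MimeTypes()


def guess_audio_mime(path: str) -> str:
    """Return the MIME type associated with the file extension of `path`."""
    mime, _encoding = _MIME_TABLE.guess_type(path)
    if mime is None:
        raise ValueError(f"unknown file type: {path}")
    return mime


# "clip.mp3" is a placeholder file name; the file does not need to exist.
print(guess_audio_mime("clip.mp3"))
```

The lookup is based only on the extension, so it says nothing about whether the file's contents are valid MP3 data.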
Could you please provide more details about the specific error message you're seeing? This would help me provide a more targeted solution for your Deepgram Python implementation.
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@safe yew Kapa.ai is still learning and improving, please let me know how I did by reacting below.
@celest flame
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 917, in full_dispatch_request
2025-03-19T20:43:10.224309508Z rv = self.dispatch_request()
2025-03-19T20:43:10.224310499Z File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 902, in dispatch_request
2025-03-19T20:43:10.224311572Z return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]
2025-03-19T20:43:10.224312743Z File "/usr/local/lib/python3.10/site-packages/flask_sock/__init__.py", line 62, in websocket_route
2025-03-19T20:43:10.224313816Z f(ws, *args, **kwargs)
2025-03-19T20:43:10.224314787Z File "/app/main.py", line 52, in landing_message
2025-03-19T20:43:10.224315813Z audio_base64 = base64.b64encode(generate_audio(full_response)).decode('utf-8')
2025-03-19T20:43:10.224316819Z File "/app/fonctions/tts.py", line 7, in generate_audio
2025-03-19T20:43:10.224317836Z return client.tts.bytes(model_id="sonic-2",
2025-03-19T20:43:10.224318887Z File "/usr/local/lib/python3.10/site-packages/cartesia/tts.py", line 53, in bytes
2025-03-19T20:43:10.224319979Z request_body = _construct_tts_request(
2025-03-19T20:43:10.224321001Z File "/usr/local/lib/python3.10/site-packages/cartesia/utils/tts.py", line 52, in _construct_tts_request
2025-03-19T20:43:10.224322074Z "encoding": output_format["encoding"],
2025-03-19T20:43:10.224323167Z KeyError: 'encoding'
It seems that this call needs an encoding parameter:
return client.tts.bytes(model_id="sonic-2",
                        transcript=transcript,
                        voice_id='0418348a-0ca2-4e90-9986-800fb8b3bbc0',
                        language='fr',
                        output_format={
                            "container": "mp3",
                            "bit_rate": 96000,
                            "sample_rate": 22050,
                        })
But in the API documentation, the mp3 output format for TTS bytes doesn't take encoding:
mp3 (object, required), 3 properties:
- container: "mp3" (required)
- bit_rate: integer (required). The bit rate of the audio in bits per second. Supported bit rates are 32000, 64000, 96000, 128000, 192000.
- sample_rate: integer (required)
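To double-check a dictionary against the fields quoted above before sending it, something like the following could work. It is only a sketch built from the pasted documentation excerpt: the function name check_mp3_output_format is hypothetical, and it does not call or imitate any Cartesia code.

```python
# Bit rates listed in the documentation excerpt quoted above.
SUPPORTED_MP3_BIT_RATES = {32000, 64000, 96000, 128000, 192000}


def check_mp3_output_format(output_format: dict) -> list[str]:
    """Return a list of problems; an empty list means the quoted fields look fine."""
    problems = []
    if output_format.get("container") != "mp3":
        problems.append('container must be "mp3"')
    bit_rate = output_format.get("bit_rate")
    if bit_rate not in SUPPORTED_MP3_BIT_RATES:
        problems.append(f"unsupported bit_rate: {bit_rate!r}")
    sample_rate = output_format.get("sample_rate")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(sample_rate, int) or isinstance(sample_rate, bool):
        problems.append("sample_rate must be an integer")
    return problems


print(check_mp3_output_format(
    {"container": "mp3", "bit_rate": 96000, "sample_rate": 22050}
))
```

By this check the dictionary from the call above matches the documented fields, which suggests the KeyError comes from the installed client library rather than from the values themselves.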
Let me check the knowledge base for you again. One moment! :books:
Based on the traceback you posted, the KeyError is raised inside the Cartesia Python client library (cartesia/utils/tts.py), before any request is sent: _construct_tts_request reads output_format["encoding"] directly, and the output_format dictionary you're providing has no encoding key.
Looking at the knowledge sources, I can see that when working with Deepgram's Text-to-Speech API, you need to specify the encoding parameter even when using MP3 as the container format. This is evident from several code examples and discussions about the Deepgram API.
For example, in the Text-to-Speech Conversion Using Deepgram API documentation, the SpeakOptions structure includes both encoding and container parameters.
To fix your issue, you should modify your code to include the encoding parameter in the output_format dictionary:
return client.tts.bytes(model_id="sonic-2",
                        transcript=transcript,
                        voice_id='0418348a-0ca2-4e90-9986-800fb8b3bbc0',
                        language='fr',
                        output_format={
                            "container": "mp3",
                            "encoding": "mp3",  # Add this line
                            "bit_rate": 96000,
                            "sample_rate": 22050,
                        })
The error occurs in the _construct_tts_request function which is trying to access the "encoding" key that doesn't exist in your output_format dictionary.
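The failure mode can be reproduced without the library. The sketch below is an assumption-laden stand-in: construct_request_like_traceback is a hypothetical function that only mirrors the one line visible in the traceback ("encoding": output_format["encoding"]), and the other keys it reads, as well as the missing_keys helper, are illustrative guesses rather than the real cartesia implementation.

```python
def construct_request_like_traceback(output_format: dict) -> dict:
    # Hypothetical stand-in, NOT the real cartesia code. Only the "encoding"
    # lookup is taken from the traceback; the other keys are assumptions.
    return {
        "container": output_format["container"],
        "encoding": output_format["encoding"],
        "sample_rate": output_format["sample_rate"],
    }


def missing_keys(output_format: dict,
                 required=("container", "encoding", "sample_rate")) -> list[str]:
    """List the required keys that are absent from output_format."""
    return [key for key in required if key not in output_format]


# Same dictionary as in the failing call from this thread.
mp3_format = {"container": "mp3", "bit_rate": 96000, "sample_rate": 22050}

try:
    construct_request_like_traceback(mp3_format)
except KeyError as exc:
    print("KeyError:", exc)

# A pre-flight check like this turns the crash into a readable message.
print("missing:", missing_keys(mp3_format))
```

Note that this only shows why the client raises; whether the service accepts "encoding": "mp3" alongside the mp3 container is not confirmed anywhere in this thread, so it is worth testing that value against the API before relying on it.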