#Deepgram-Python-SDK only returning partial transcripts


lunar aspen
#

Many of the transcripts returned by the Deepgram API contain only a small portion of the expected text. For example, 45-minute files are returning just 15 minutes of transcription, so the majority is missing. I have confirmed programmatically that the files sent via the API are in fact ~45 minutes long. However, in the UI logs, the requests show only ~15 minutes of audio. I'm not sure what is happening on the Deepgram side that causes only a small portion of the audio file to be read. This is heavily impacting our business, occurring about 30% of the time, so it is an urgent issue for us. Example request ID: b3dc923a-70bf-4119-89d8-9aa84e44105b.

Using deepgram-sdk==2.4.0
Code attached.

somber berryBOT
#

Thanks for asking your question. Please be sure to reply with as much detail as possible so we can assist you efficiently.

#

It looks like we're missing some important information to help debug your issue. Would you mind providing us with the following details in a reply?

  • The Deepgram product you are using (e.g., Speech to Text, Agent API)

lunar aspen
#

Deepgram Nova2, speech to text

honest valve
#

Hi @lunar aspen, do you have any other example request IDs where this happened? We saw your post here and are looking into it.

lunar aspen
#

Hey @honest valve, thanks for looking into it. Here are a few others: 'ba6f931c-70a2-4dae-868b-d2a76ea82b8c', 'df44794f-e0e4-4c8c-9535-04502c7c8f2d', '45d888bb-615e-4a90-9fc8-be33f46ff6d4'

honest valve
#

@lunar aspen thanks, I looked into all of these and I cannot reproduce the behavior. If you rerun the same request, do you get the full transcript back?

lunar aspen
#

It's happening intermittently. Here are the logs from one of the runs this affected today:

2024-11-19 21:47:35 INFO Downloading S3 mp3 file locally to pass to transcribe function from front_end/mp3_files/e52f92d7b8b347c592545c06dc114518.mp3
2024-11-19 21:47:36 INFO Kicking off transcribe function
2024-11-19 21:47:36 INFO Connecting to Deepgram API
2024-11-19 21:47:36 INFO LOCAL Audio duration: 55.60 minutes for ./downloads/e52f92d7b8b347c592545c06dc114518podcast_s3_downloaded.mp3
2024-11-19 21:47:36 INFO Connected. Passing local mp3 podcast file - ./downloads/e52f92d7b8b347c592545c06dc114518podcast_s3_downloaded.mp3 - to API
2024-11-19 21:47:37 INFO API response from Deepgram returned
2024-11-19 21:47:37 INFO {'metadata': {'transaction_key': 'deprecated', 'request_id': '04e3a98d-0289-430d-be02-ee2f85d72771'...
2024-11-19 21:48:07 INFO Grabbing full transcription and word index from Deepgram API
2024-11-19 21:48:07 INFO Posting to slack
2024-11-19 21:48:07 INFO Message: ❌ ERROR: transcript from Deepgram less than 4000 characters
2024-11-19 21:48:07 INFO <Response [200]>

This request returned only the first few sentences of a 55-minute file (46 words).

I just kicked it off again and it returned the full transcript (10,729 words) as expected. Logs below:

Downloading S3 mp3 file locally to pass to transcribe function from front_end/mp3_files/e52f92d7b8b347c592545c06dc114518.mp3
Kicking off transcribe function
Connecting to Deepgram API
LOCAL Audio duration: 55.60 minutes for ./downloads/e52f92d7b8b347c592545c06dc114518podcast_s3_downloaded.mp3
Connected. Passing local mp3 podcast file - ./downloads/e52f92d7b8b347c592545c06dc114518podcast_s3_downloaded.mp3 - to API
API response from Deepgram returned
{'metadata': {'transaction_key': 'deprecated', 'request_id': '5543dcb0-f4d0-4ad8-a9a0-940d30f49b53'...
Grabbing full transcription and word index from Deepgram API

#

Some thoughts:

  • Is the Python code we're using to pass the files to your API best practice?
  • We could do a hacky fix where we retry the request if the transcribed duration differs from the audio duration on our end by more than some threshold. Not ideal, since this doesn't scale well.
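
For anyone wanting to try the retry idea above, here is a rough sketch. It assumes the response is a dict whose `metadata` contains a `duration` in seconds (as in the response logged above), and that `transcribe` is your own zero-argument function that sends the file and returns that dict. The names `looks_truncated`, `transcribe_with_retry`, and the 5% tolerance are made up for illustration, not part of the Deepgram SDK.

```python
import time


def looks_truncated(local_seconds, reported_seconds, tolerance=0.05):
    """Return True if the API saw noticeably less audio than we have locally."""
    return reported_seconds < local_seconds * (1 - tolerance)


def transcribe_with_retry(transcribe, local_seconds, max_attempts=3, delay_seconds=2.0):
    """Call `transcribe()` until the reported duration roughly matches the local one.

    `transcribe` is a hypothetical callable that returns the response as a dict
    with response["metadata"]["duration"] in seconds.
    """
    for attempt in range(1, max_attempts + 1):
        response = transcribe()
        reported = response["metadata"]["duration"]
        if not looks_truncated(local_seconds, reported):
            return response
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before trying again
    raise RuntimeError(
        f"Transcript still truncated after {max_attempts} attempts "
        f"(local {local_seconds:.0f}s, last reported {reported:.0f}s)"
    )
```

This only masks the symptom; it does not explain why the shorter audio was received in the first place.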

honest valve
#

Your python code looks good. One surprising finding is that Deepgram received different audio data for the two requests that you showed logs for: 04e3a98d-0289-430d-be02-ee2f85d72771 and 5543dcb0-f4d0-4ad8-a9a0-940d30f49b53.

What is the SHA256 sum for the file ./downloads/e52f92d7b8b347c592545c06dc114518podcast_s3_downloaded.mp3?

For the 04e request, Deepgram received an audio file with the SHA256 sum cb010671fb30a9c11e41d66460082d0d9884b3d1b30681c8a8e2e9766ccf459b.
For the 554 request, Deepgram received an audio file with the SHA256 sum 653b24fd0a56312711138ebb233f0edd4f09dd35508602e5f1bc16bbf77a0164.
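
One way to compute the SHA256 sum of the local file for comparison is sketched below, using only Python's standard `hashlib`. The function name `sha256_of_file` and the chunk size are illustrative choices.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so large mp3 files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Logging this value right before each upload would show whether the local file itself differs between runs or whether the data changes in transit.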

honest valve
#

Hi @lunar aspen , in addition to the above, please ensure the content-length header is being set. You can check by inspecting the request headers. You'll need to dig into the SDK code to do this -- we'd do it for you but we cannot reproduce the behavior.

The relevant information should be somewhere around here: https://github.com/deepgram/deepgram-python-sdk/blob/2.4.0/deepgram/_utils.py#L80

Our logs show your requests are not sending the content-length header, and that is likely to be the culprit. The Deepgram SDK has been updated significantly since 2.4.0 so this bug is very likely to be fixed in later versions if you can upgrade.
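
To illustrate what setting the header explicitly looks like outside the SDK, here is a minimal standard-library sketch. It reads the whole file into memory, checks the byte count against the size on disk, and builds (but does not send) a POST request with `Content-Length` set. The function name `build_transcription_request`, the `audio/mpeg` content type, and the `model=nova-2` query parameter are assumptions for this example, not what SDK 2.4.0 does internally.

```python
import os
import urllib.request

# Assumed endpoint and model parameter for this sketch.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen?model=nova-2"


def build_transcription_request(path, api_key):
    """Build a POST request whose Content-Length matches the file's byte count."""
    with open(path, "rb") as f:
        body = f.read()

    # Guard against uploading a file that is still being written or downloaded.
    if len(body) != os.path.getsize(path):
        raise IOError(f"Read {len(body)} bytes but {path} is {os.path.getsize(path)} bytes")

    headers = {
        "Authorization": f"Token {api_key}",
        "Content-Type": "audio/mpeg",
        "Content-Length": str(len(body)),
    }
    return urllib.request.Request(
        DEEPGRAM_LISTEN_URL, data=body, headers=headers, method="POST"
    )
```

Passing the body as bytes rather than as a file object or generator gives the HTTP client a known length up front, which is the property the header check above is looking for.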

lunar aspen
#

ahh gotcha! thanks for the suggestion, i'll update the sdk and start passing it!

lunar aspen
#

Posting here for others who may run into this issue, as it took me a little while to figure out how to pass content-length.

This was solved (so far no more issues, fingers crossed) by updating to Deepgram SDK 3.7.7 and replacing my original code with the attached snippet.

#

The imports:
from deepgram import (
    DeepgramClient,
    PrerecordedOptions,
)