#Blank Transcript problem


torpid lichen Jan 30, 2026, 1:16 AM

Hi! Our systems (which use the prerecorded audio STT API) got 2 transcripts back today that had 0 paragraphs, 0 utterances, and an empty transcript. Both audio files play fine and have clear voices. I'm not sure you can see the request, but the request id is 1be11706-e666-48c4-b38d-3df543bd801a. Also, here is the JSON we got back from your webhook:

{"metadata":{"transaction_key":"deprecated","request_id":"1be11706-e666-48c4-b38d-3df543bd801a","sha256":"807dfa72dc419b133177f39bd79cde58c310514fc95b5dc5d66961046f059594","created":"2026-01-29T17:30:05.870Z","duration":32994.133,"channels":1,"models":["2187e11a-3532-4498-b076-81fa530bdd49"],"model_info":{"2187e11a-3532-4498-b076-81fa530bdd49":{"name":"general-nova-3","version":"2025-07-31.0","arch":"nova-3"}}},"results":{"channels":[{"alternatives":[{"transcript":"","confidence":0.0,"words":[],"paragraphs":{"transcript":"\n","paragraphs":[]}}]}],"utterances":[]}}

any ideas what's going on here?

uneven nimbus BOT Jan 30, 2026, 1:16 AM

Hi @torpid lichen. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

There’s no mention of your specific request ID or nova-3 in the knowledge sources, so I can’t see exactly what happened with that call. I can only summarize known causes of “empty transcript, confidence 0” responses and how Deepgram support typically investigates them.

From prior cases, an empty transcript with a 200 status and non‑zero duration generally falls into a few buckets:

Language / model mismatch
- If the audio language doesn’t match the language parameter (or the model’s default), Deepgram can return an empty transcript. This is explicitly documented for Spanish audio transcribed with the default English model, which produced exactly the same pattern: transcript: "", confidence: 0, words: []. [Spanish empty]
- Fix in that case was to set language=es or enable detect_language=true. [Spanish empty]
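
As a rough sketch of that fix, the two variants can be written as query strings on https://api.deepgram.com/v1/listen. The helper name build_listen_url is hypothetical, and whether detect_language is available for your nova-3 setup is an assumption to check against the current docs; only the model, language, and detect_language parameter names come from the cases above.

```python
from urllib.parse import urlencode

LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(model, language=None, detect_language=False):
    """Build a /v1/listen URL with either an explicit language or detection.

    Hypothetical helper for illustration. Refusing both options at once is
    a choice made here to keep the experiment unambiguous, not an API rule.
    """
    if language and detect_language:
        raise ValueError("pick either language or detect_language")
    params = {"model": model}
    if language:
        params["language"] = language
    if detect_language:
        params["detect_language"] = "true"
    return f"{LISTEN_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Variant 1: state the spoken language explicitly.
    print(build_listen_url("nova-3", language="es"))
    # Variant 2: let the service detect the language.
    print(build_listen_url("nova-3", detect_language=True))
```

Running the same file through both variants and comparing the transcripts is a quick way to rule this category in or out.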
Audio format / encoding issues
- For streaming, incorrect sample_rate or encoding is a common cause of empty transcripts. [RTP empty; Twilio mulaw]
- For prerecorded, sending base64 instead of raw audio, or a wrong/odd MIME/content-type (e.g., S3 serving application/zip) has caused failures or unusable audio. [Unsupported media; Twilio mulaw]
- In one prerecorded nova‑2 case, removing an explicit mimetype field fixed blank transcripts. [Blank transcripts nova-2]

Acoustic conditions / incomprehensible audio
- Deepgram has confirmed that if audio is very noisy, far‑field, or otherwise “incomprehensible,” nova‑2 can return an empty transcript even though the file “plays fine.” [Nova-2 Spanish long]
- In that example, a far‑field Spanish lecture with lots of background noise produced empty output with nova‑2, and the enhanced model was recommended instead. [Nova-2 Spanish long]
Very long / chunked usage patterns
- There is no documented hard duration limit other than a 2 GB file size cap, but long or oddly chunked usage has produced empty results in some nova‑2 scenarios (e.g., long Spanish recordings; very sparse “2–10 frames per minute” usage). [Nova-2 Spanish long; 8‑minute issue]
- In the 8‑minute case, the guidance was to send more continuous audio rather than very sparse frames. [8‑minute issue]
Model / feature interactions or transient issues
- There have been past bugs where certain combinations (large stereo files + nova + diarize + smart_format + detect_language) caused missing paragraphs or partial results; Deepgram fixed those after investigation. [Paragraphs missing]
- Another user reported nova‑2 prerecorded suddenly returning blank transcripts on previously working code; support suggested removing the explicit mimetype and re‑trying. [Blank transcripts nova-2]

What you can do next

Because I can’t see internal logs for 1be11706-e666-48c4-b38d-3df543bd801a, the best path is to treat this like a support/debug case:

Verify request parameters for that job
- Exact API URL and query params (model, language/detect_language, any diarization/paragraphs flags, etc.).
- How you’re sending the audio (raw bytes vs URL; any Content-Type or mimetype you set).
Check for the common pitfalls above
- Confirm the spoken language vs language/detect_language. [Spanish empty]
- Confirm the file is standard PCM/MP3/M4A/etc. and not base64‑wrapped or served with a non‑audio content-type. [Unsupported media; Twilio mulaw]
- If you’re explicitly setting a mimetype in the SDK, try a run without it. [Blank transcripts nova-2]
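
One way to check the file itself, as a minimal sketch: look at the first bytes and see whether they match a common audio container or look like base64 text. The function name sniff_audio and its return labels are made up for illustration, the signature list is not exhaustive, and nothing here is part of the Deepgram API.

```python
import string

# Bytes that can appear in base64 text (including padding and line breaks).
_B64_ALPHABET = set((string.ascii_letters + string.digits + "+/=\r\n").encode())


def sniff_audio(head: bytes) -> str:
    """Guess the container from the first bytes of a file (best effort)."""
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    if head[:4] == b"fLaC":
        return "flac"
    if head[:4] == b"OggS":
        return "ogg"
    if head[4:8] == b"ftyp":
        return "mp4/m4a"
    # ID3 tag, or an MPEG audio frame sync (ADTS AAC shares this pattern).
    if head[:3] == b"ID3" or (
        len(head) > 1 and head[0] == 0xFF and head[1] & 0xE0 == 0xE0
    ):
        return "mpeg-audio"
    # Only base64 characters: the audio was probably encoded before sending.
    if head and all(b in _B64_ALPHABET for b in head):
        return "looks like base64 text"
    return "unknown"


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        print(sniff_audio(f.read(64)))
```

If this reports base64 text or unknown for the bytes you actually send, that points at the encoding/content-type category rather than the model.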
Reproduce via Playground or a minimal cURL
- Upload the same file to the Deepgram Playground with the same model/language options to see if it also returns empty. This is exactly what support suggested in a similar “not fully transcribed” case. [Pre-recorded not fully]
- If Playground also returns empty, it strongly suggests either an audio‑content or model/language issue rather than your integration.
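
If you prefer a script over cURL, here is a minimal reproduction sketch using only the Python standard library. It assumes a synchronous call (no callback/webhook), raw audio bytes in the request body, and the usual Authorization: Token header; the function names are hypothetical, and a multi-hour file may need a longer timeout than the one shown.

```python
import json
import urllib.request


def build_request(url, api_key, audio, content_type):
    """Build a POST request carrying raw audio bytes (not base64).

    content_type is required on purpose: if it is omitted, urllib sends
    application/x-www-form-urlencoded for any request that has a body.
    """
    headers = {
        "Authorization": f"Token {api_key}",
        "Content-Type": content_type,
    }
    return urllib.request.Request(url, data=audio, headers=headers, method="POST")


def first_transcript(response):
    """Return the first channel's top transcript, or "" if it is missing."""
    channels = response.get("results", {}).get("channels", [])
    if not channels or not channels[0].get("alternatives"):
        return ""
    return channels[0]["alternatives"][0].get("transcript", "")


def transcribe_file(path, url, api_key, content_type):
    """Send one local file and return the parsed JSON response."""
    with open(path, "rb") as f:
        request = build_request(url, api_key, f.read(), content_type)
    with urllib.request.urlopen(request, timeout=600) as resp:
        return json.load(resp)
```

Calling transcribe_file with the same URL and parameters your production job used, then checking first_transcript on the result, tells you whether the blank output reproduces outside your pipeline.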

Contact Deepgram Support with full context
Deepgram’s own guidance for issues like this is to provide: [Support info]
- The request_id (you already have it).
- The audio file itself.
- The full JSON response (you have it).
- The code or exact HTTP request you used (including headers and query params).
That’s usually enough for them to check internal logs and confirm whether it’s:
- A language/model mismatch,
- An audio/encoding/content-type problem,
- Or a model‑side issue that needs an engineering fix.
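
To make that report easier to assemble, a small sketch like the following can pull the relevant fields out of the webhook JSON. The field names match the response you posted; the function name summarize_response and the shape of the summary are illustrative assumptions.

```python
def summarize_response(payload):
    """Collect the fields worth quoting in a support ticket."""
    meta = payload.get("metadata", {})
    results = payload.get("results", {})
    # Top alternative of each channel that has one.
    alts = [
        channel["alternatives"][0]
        for channel in results.get("channels", [])
        if channel.get("alternatives")
    ]
    duration = meta.get("duration", 0.0)
    return {
        "request_id": meta.get("request_id"),
        "models": [
            f"{info.get('name')} {info.get('version')}"
            for info in meta.get("model_info", {}).values()
        ],
        "duration_seconds": duration,
        "duration_hours": round(duration / 3600, 2),
        "word_count": sum(len(alt.get("words", [])) for alt in alts),
        "utterance_count": len(results.get("utterances") or []),
        # True when no channel produced any non-whitespace text.
        "blank": all(not alt.get("transcript", "").strip() for alt in alts),
    }
```

A non-zero duration together with blank set to True and a word_count of 0 is the pattern described in this thread.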

Given that:

- Your duration is ~32,994 seconds (~9.17 hours), which is much longer than the 30–60 minute examples in other threads, and
- Prior long‑audio nova‑2 Spanish cases have produced empty transcripts due to acoustic conditions, not hard limits, [Nova-2 Spanish long]

it’s plausible this is either:

- A long‑audio edge case with nova‑3, or
- A language/acoustics mismatch that nova‑3 is treating as non‑speech.

But the knowledge sources don’t mention nova‑3 or a 9‑hour prerecorded limit, so I can’t say that with certainty.

If you share:

- The language you requested,
- The actual spoken language,
- How you’re calling /v1/listen (URL + params),

I can help you narrow down which of the known categories this most likely falls into, but final confirmation will require Deepgram Support to inspect that specific request.
