#Troubleshooting transcription frames

sweet garnet
#

Hey Deepgram team 👋
I’m testing a new app which streams mic audio through a Render-hosted relay to wss://api.deepgram.com/v1/listen?model=nova-3&language=en-AU&interim_results=true&smart_format=true&punctuate=true. The relay signs each browser connection, forwards WebM/Opus chunks (MediaRecorder, 48 kHz mono) and logs show ~130 KB of audio per session:

[relay:0462cf29-e1f1-4000-8b79-b98272ffbc89] client connected (model=nova-3, tier=)
[relay:0462cf29-e1f1-4000-8b79-b98272ffbc89] Deepgram socket open
[relay:0462cf29-e1f1-4000-8b79-b98272ffbc89] received first audio chunk (4290 bytes)
[relay:0462cf29-e1f1-4000-8b79-b98272ffbc89] Deepgram closed code=1000 reason= bytesForwarded=133118

However, the browser websocket never receives any transcription frames—only the initial {"type":"ready"}—so nothing renders in the UI. In the Deepgram dashboard the same sessions show request IDs 546f37ed-15c5-46df-9e6a-a0550990cfda and 91aa5a45-614a-4fc1-b596-a570e29cfbf9, status 101 OK, but “Audio duration 00:00:00”.

Could you confirm that nova-3 streaming is fully enabled for my project key (masked_id) and check why transcripts aren’t being emitted even though audio bytes are flowing? Let me know if you need more logs or headers—happy to provide them. Thanks!

peak stagBOT
#

Hi @sweet garnet. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

Based on the knowledge sources, when you see "Audio duration 00:00:00" with a 101 OK status, this indicates that the WebSocket connection was successfully established, but Deepgram did not receive any audio data that it could decode and process.

Here are the most likely causes and troubleshooting steps:

Audio Format Issue

The primary issue is likely that you're sending WebM/Opus containerized audio, but you haven't specified the correct encoding parameters. According to the troubleshooting documentation, when Deepgram cannot decode the audio payload, you may see:

  • No transcription results
  • Zero audio duration
  • WebSocket closes with code 1008 and DATA-0000 error (payload cannot be decoded as audio)

Required Parameters

For containerized audio formats like WebM/Opus, you typically need to specify:

  • encoding parameter (e.g., opus or the appropriate codec)
  • sample_rate parameter (48000 in your case)

Your current URL doesn't include these parameters. Try updating your connection URL to:

wss://api.deepgram.com/v1/listen?model=nova-3&language=en-AU&interim_results=true&smart_format=true&punctuate=true&encoding=opus&sample_rate=48000

Additional Troubleshooting Steps

As mentioned in the troubleshooting guide:

  1. Verify audio format: Write the audio data to a file to confirm it contains valid audio that can be played back
  2. Check encoding parameters: Ensure the encoding and sample_rate parameters match your actual audio format
  3. Validate raw vs containerized: Confirm whether you're sending raw audio or containerized audio (WebM is containerized)

According to a community discussion, when there's no transcript despite a successful connection, "it is either because there was no speech, or, more likely, because the audio format is not set correctly."

The fact that your relay is forwarding ~130 KB of data but Deepgram shows 0 duration strongly suggests the audio format parameters are missing or incorrect, preventing Deepgram from decoding the audio stream.
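
The verification steps listed above can be sketched in Python. This is a minimal sketch under stated assumptions, not Deepgram tooling: `sniff_container` and `dump_chunks` are hypothetical helper names, and the magic-byte table covers only a few common containers. A WebM (Matroska) stream begins with the EBML header bytes `1A 45 DF A3`, so if the first forwarded chunk does not start with them, the relay is probably not sending the stream from its beginning.

```python
from pathlib import Path
from typing import Iterable

# Leading magic bytes of a few common audio containers (not exhaustive).
MAGIC = {
    b"\x1a\x45\xdf\xa3": "webm/matroska (EBML)",
    b"OggS": "ogg",
    b"RIFF": "wav (RIFF)",
}


def sniff_container(first_chunk: bytes) -> str:
    """Guess the container from the first bytes of the first chunk."""
    for magic, name in MAGIC.items():
        if first_chunk.startswith(magic):
            return name
    return "unknown (raw audio or a mid-stream chunk)"


def dump_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Concatenate chunks in arrival order into one file; return bytes written."""
    data = b"".join(chunks)
    Path(path).write_bytes(data)
    return len(data)
```

Calling `dump_chunks` on the chunks the relay forwarded and opening the resulting file in a media player covers step 1. `sniff_container` on the first chunk covers step 3.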

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

hearty pecanBOT
#

Hi @sweet garnet, thank you for your detailed explanation of the issue you're experiencing with the nova-3 model. It seems you've already received some insights from kapa.ai. If you haven't done so already, I recommend checking the audio encoding and ensuring that the audio is being sent correctly. If you continue to face issues, please provide additional logs or headers as needed to assist in troubleshooting further. Let us know how it goes!

sweet garnet
#

@peak stag following up on a nova-3 realtime relay issue. The WebSocket opens and streams WebM/Opus audio (MediaRecorder, 48 kHz mono) but we never receive transcription frames back. Details:

Project key: ed5ec41e-699d-4be7-8a9e-a2d317e0b4e5
Request ID: 632e3c7a-6af4-4671-b114-710a0a26eba1
Endpoint: wss://api.deepgram.com/v1/listen?model=nova-3&language=en-AU&interim_results=true&smart_format=true&punctuate=true&encoding=opus&sample_rate=48000

Relay logs for that session:
[relay:46417af7-7b99-4b19-9160-d01255c09080] client connected (model=nova-3, tier=)
[relay:46417af7-7b99-4b19-9160-d01255c09080] Deepgram socket open
[relay:46417af7-7b99-4b19-9160-d01255c09080] received first audio chunk (4290 bytes)
[relay:46417af7-7b99-4b19-9160-d01255c09080] client closed code=1005 reason= totalBytes=90192
[relay:46417af7-7b99-4b19-9160-d01255c09080] Deepgram closed code=1005 reason= bytesForwarded=90192

Browser devtools only show the initial {"type":"ready"} from Deepgram, then our client sends {"type":"CloseStream"} because the upstream connection closes. Dashboard shows each request as 101 OK with “Audio duration 00:00:00” even though ~90 KB of Opus audio was forwarded.

Could you confirm that nova-3 streaming is fully enabled for this project and investigate why Deepgram is closing the socket with code 1005 without emitting transcripts? If the AI can’t dig into the request logs, please escalate this to a human engineer. Happy to provide any additional headers or packet captures you need. Thanks!

peak stagBOT
#

Let me check the knowledge base for you again. One moment! :books:

#

Based on the knowledge sources and your updated information, the issue persists even after adding the encoding=opus and sample_rate=48000 parameters. Here's what the evidence suggests:

WebSocket Close Code 1005

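Close code 1005 is significant. In the WebSocket specification (RFC 6455) it means "No Status Received": the connection closed without a status code in the close frame. It is a reserved value that the local WebSocket library reports and that is never sent on the wire, so it carries no explanation from Deepgram. (Code 1006 is the one that indicates an abnormal closure with no close frame at all.) This is different from the typical Deepgram error codes documented in the troubleshooting guide.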
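
The close codes that appear in this thread can be looked up with a small table. This is a minimal sketch: the meanings come from RFC 6455 section 7.4.1, `describe_close` is a hypothetical helper name, and 1006 is included only for contrast with 1005.

```python
# Meanings from RFC 6455 section 7.4.1: the codes seen in this thread,
# plus 1006 for contrast. 1005 and 1006 are reserved: a WebSocket library
# reports them locally, and they are never sent in a close frame.
CLOSE_CODES = {
    1000: "normal closure",
    1005: "no status received (the close carried no status code)",
    1006: "abnormal closure (connection dropped without a close frame)",
    1008: "policy violation",
}


def describe_close(code: int) -> str:
    """Return a short description for a WebSocket close code."""
    return CLOSE_CODES.get(code, "unrecognized or application-specific code")
```

Logging `describe_close(code)` next to the numeric code in the relay makes it easier to tell a clean close (1000) from one that arrived without a status (1005).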

Audio Duration of 0 Seconds

According to a community discussion, when you see "an audio duration of 0 with a 101 response code means that you successfully opened a websocket connection to Deepgram, but we did not receive any audio through the connection."

This is particularly concerning because your relay logs show bytesForwarded=90192, indicating that data is being sent to Deepgram, but Deepgram's logs show 0 audio duration.
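
As a rough sanity check on that mismatch, the forwarded byte count can be converted to an approximate duration. This is a back-of-the-envelope sketch: `estimated_seconds` is a hypothetical helper, and the 32 kbps bitrate is an assumption, since the actual MediaRecorder bitrate is not reported in this thread and depends on the browser and its settings.

```python
def estimated_seconds(total_bytes: int, bitrate_bps: int) -> float:
    """Approximate duration of a constant-bitrate stream, ignoring container overhead."""
    return total_bytes * 8 / bitrate_bps


# 90192 is the bytesForwarded value from the relay log above.
# 32_000 bps is an assumed bitrate, not a value from this thread.
print(round(estimated_seconds(90192, 32_000), 1))  # prints 22.5
```

Under that assumption the relay forwarded on the order of tens of seconds of audio, which is hard to reconcile with a reported duration of 00:00:00 unless the payload was not decodable.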

Possible Causes

Based on the troubleshooting documentation, the most likely issues are:

  1. Audio format mismatch: Even with encoding=opus specified, there may be an issue with how the WebM container is being handled. According to a discussion about containerized audio, "the encoding parameter is only required for raw audio and needs to be one of the supported options. Since the audio coming through is containerized (WebM), the decoding information does not need to be included, as Deepgram's system will parse the container information."
  2. Corrupt or unsupported audio data: A similar case showed that when no results are returned despite a successful connection, Deepgram logged "an error on our side that we received corrupt or unsupported audio data in that stream, causing us to close the stream."

Recommended Actions

  1. Try removing the encoding parameters: Since you're sending containerized WebM/Opus audio, try connecting without encoding=opus&sample_rate=48000 and let Deepgram parse the container information automatically.

  2. Verify audio integrity: Write the audio chunks to a file and confirm they can be played back correctly before sending to Deepgram.

  3. Contact Deepgram Support: Given that this requires investigation of server-side logs for request ID 632e3c7a-6af4-4671-b114-710a0a26eba1, you should contact Deepgram Support directly with this request ID so they can examine what's happening on their end when your audio arrives.
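
The first recommended action can be sketched as a small URL builder. This is a minimal sketch, assuming the query parameters quoted earlier in this thread: `build_listen_url` and `RAW_ONLY` are hypothetical names, and the rule that `encoding` and `sample_rate` apply only to raw audio is taken from the discussion quoted above rather than verified here.

```python
from urllib.parse import urlencode

BASE_URL = "wss://api.deepgram.com/v1/listen"

# Query parameters taken from the endpoint quoted earlier in this thread.
PARAMS = {
    "model": "nova-3",
    "language": "en-AU",
    "interim_results": "true",
    "smart_format": "true",
    "punctuate": "true",
    "encoding": "opus",
    "sample_rate": "48000",
}

# Parameters that, per the quoted discussion, describe raw audio only.
RAW_ONLY = {"encoding", "sample_rate"}


def build_listen_url(params: dict[str, str], containerized: bool) -> str:
    """Build the listen URL, dropping raw-audio parameters for containerized input."""
    if containerized:
        params = {k: v for k, v in params.items() if k not in RAW_ONLY}
    return f"{BASE_URL}?{urlencode(params)}"
```

With `containerized=True` the result matches the original endpoint from the first message, without `encoding=opus&sample_rate=48000`.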


Note: you can ask me a follow up question by @ mentioning me again :speech_balloon: