#streaming


dusky ice

is it better to send a short phrase to Deepgram at a time, or many small pieces of a few words each, for true continuous streaming?

young ginkgoBOT

Hi @dusky ice. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:


Based on the knowledge sources provided, it's generally better to stream audio to Deepgram continuously than to send very short, separate phrases of a few words. Here's why:

  1. Context: Deepgram uses context to improve accuracy. According to an official answer in one discussion, sending very short snippets of audio (less than 5 seconds) can result in missing context, which may lower accuracy.

  2. Optimal chunk size: The official answer in another discussion recommends sending between 20ms and 250ms of data at once for streaming. This refers to the size of each message sent over the stream, not the length of the utterance, so a continuous sequence of chunks within this range is preferred.

  3. Transcription timing: As mentioned in another official answer, final results are streamed on average once every 3-5 seconds. This indicates that Deepgram's models work best with longer stretches of continuous audio to provide more accurate transcriptions.

  4. Avoiding empty transcriptions: In a discussion about streaming WAV files, it was noted that sending very small, separate WAV files resulted in empty transcriptions for all but the first file. This further supports the idea of sending larger, continuous chunks.

  5. Persistent connections: While not strictly necessary, using a persistent connection for a conversation can be beneficial. The official answer in this discussion suggests that a stream should last at least the length of one full turn in a conversation.
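To make the chunk-size point concrete, here is a minimal sketch of splitting raw audio into pieces that fall inside the 20ms to 250ms range. It assumes uncompressed 16-bit PCM; the helper names `chunk_size_bytes` and `iter_chunks` are hypothetical and are not part of any Deepgram SDK. How each chunk is actually sent over the connection is left out.

```python
from typing import Iterator


def chunk_size_bytes(sample_rate_hz: int, channels: int,
                     bytes_per_sample: int, chunk_ms: int) -> int:
    """Return the number of bytes covering chunk_ms of raw PCM audio."""
    # Assumption: the 20-250 ms range quoted above is the target window.
    if not 20 <= chunk_ms <= 250:
        raise ValueError("chunk_ms should be between 20 and 250")
    frames = sample_rate_hz * chunk_ms // 1000
    return frames * channels * bytes_per_sample


def iter_chunks(pcm: bytes, size: int) -> Iterator[bytes]:
    """Yield consecutive slices of pcm, each at most size bytes long."""
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


# Example: 16 kHz, mono, 16-bit PCM, 100 ms per chunk.
size = chunk_size_bytes(16000, 1, 2, 100)
one_second = bytes(16000 * 2)  # one second of silence
chunks = list(iter_chunks(one_second, size))
```

With these assumed settings each chunk is 3200 bytes, and one second of audio becomes 10 chunks that would be sent back to back on the same open stream.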


In summary, for true continuous streaming, it's better to keep one stream open and send continuous audio in small, regular chunks than to send many separate snippets of a few words. This approach provides better context for the transcription model and aligns with Deepgram's recommended practices for streaming audio.
