#Doubled words in transcript


grizzled kernel
#

There seems to have been a change today or yesterday, and the transcript quality from nova-3-general has degraded. This may also affect other models; I didn't check. The output consistently contains doubled words. I'm using realtime STT over WebSocket with 16 kHz PCM audio via the Node.js SDK.

e.g. I say 'Tell me why the sky is blue', the transcript I receive is 'Tell tell me why why this sky is blue blue.'

Request ID: dd0797b6-74fa-4f52-a074-69dfd502e3d2

glass muralBOT
#

Thanks for asking your question. Please be sure to reply with as much detail as possible so the community can assist you efficiently.

#

It looks like we're missing some important information to help debug your issue. Would you mind providing us with the following details in a reply?

  • The programming language you are working in (e.g. JavaScript, Python).
tight caveBOT
# grizzled kernel There seems to have been a change today or yesterday and the transcript quality ...

Hi, siued! It sounds like you're experiencing some unexpected behavior with Deepgram's real-time transcription service. Here are a few steps to help address and investigate the issue:

  1. Verify the Audio Source: Double-check the audio input to ensure that the issue is not caused by any device or software on your end. This can sometimes happen if the audio stream is being duplicated before it reaches the Deepgram service.

  2. Test with Different Audio Files: If possible, try using different audio files or sources to see if the problem persists across various inputs.

  3. Check API Integration: Review your WebSocket integration code to see if there might be any unintended manipulations of the transcription results.

  4. Reach Out to Support: Since this seems like it could be a broader issue, I recommend reaching out to Deepgram's support team with the specifics, including your Request ID dd0797b6-74fa-4f52-a074-69dfd502e3d2. They can investigate if there are any current known issues or changes affecting transcription quality.

Feel free to let me know if you need guidance on any of these steps or if there's anything else you'd like to explore!

grizzled kernel
#

I'm using the Node.js SDK.

grizzled kernel
#

This is the get-some-help channel. I specifically did not ask my question in the ask-ai channel, because I believe that this is a bug in Deepgram which cannot be addressed by AI. The AI suggested I reach out to support, which is exactly what I'm doing. Why am I getting an AI response?

wise oysterBOT
#

:wave: Our AI Bot tried to answer this question and wasn't all that helpful, sorry about that. :sweat_smile:

I looked up your request ID dd0797b6-74fa-4f52-a074-69dfd502e3d2 and didn't see anything helpful in our logs.

Between yesterday and today, did you upgrade the version of the JS SDK? Did you change any code on your end that might have caused the issue you are seeing?

I'll also share this with our PM of STT to see if any model changes were deployed in the past 48 hours.

wise oysterBOT
#

I took a look at the audio, and this could be a bad or corrupted audio issue (words are duplicated or echoing oddly in the audio itself). I would recommend first checking your audio, encoding, sample rate, etc. locally and ensuring they match what you're specifying in your API calls.
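
One way to check the audio locally is to wrap the raw PCM you are about to send in a WAV header and play it back in any audio player. The sketch below is only an illustration: `pcmToWav` is a hypothetical helper, and it assumes mono, 16-bit little-endian samples at 16 kHz, which matches the linear16 setup described in this thread but may not match your actual capture.

```javascript
// Sketch: wrap raw linear16 PCM in a minimal 44-byte WAV header for playback.
// Assumptions (not confirmed in the thread): mono, 16-bit little-endian samples.
// `pcmToWav` is a hypothetical helper, not part of any SDK.
function pcmToWav(pcm, sampleRate = 16000, channels = 1) {
  const bytesPerSample = 2; // linear16 means 16 bits per sample
  const blockAlign = channels * bytesPerSample;
  const header = Buffer.alloc(44);

  header.write("RIFF", 0, "ascii");
  header.writeUInt32LE(36 + pcm.length, 4); // total file size minus 8 bytes
  header.write("WAVE", 8, "ascii");

  header.write("fmt ", 12, "ascii");
  header.writeUInt32LE(16, 16); // size of the fmt chunk for plain PCM
  header.writeUInt16LE(1, 20); // audio format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(sampleRate * blockAlign, 28); // bytes per second
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bytesPerSample * 8, 34); // bits per sample

  header.write("data", 36, "ascii");
  header.writeUInt32LE(pcm.length, 40); // number of audio bytes that follow

  return Buffer.concat([header, pcm]);
}
```

Collecting the chunks into an array, joining them with `Buffer.concat`, and writing the result of `pcmToWav` to disk with `fs.writeFileSync` gives a file you can listen to. If it plays at the wrong speed or pitch, the declared sample rate probably does not match the real one.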

linear16 at 16 kHz should be compatible. Word duplication like "Tell tell me why why…" is often caused by duplicated audio buffers being sent. You should also take a look at the exact stream logic in your Node.js pipeline:
• Make sure no audio chunks are being pushed twice (an easy bug with streams and buffers)
• Confirm each chunk is new, non-overlapping, and correctly timed
• Log or save one full buffer and inspect it. If you play it back and hear doubling, the issue is upstream of Deepgram

This is unlikely to be caused by the STT model itself, especially if it’s consistently reproducible across utterances.
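
The "pushed twice" check can be roughly automated by comparing each outgoing chunk against the last few chunks. This is a minimal sketch, not a definitive diagnostic: `createDuplicateDetector` and its `windowSize` parameter are hypothetical names, and it only catches byte-identical repeats, not partially overlapping chunks.

```javascript
// Sketch: flag audio chunks that are byte-identical to a recently seen chunk.
// `createDuplicateDetector` is a hypothetical helper, not part of any SDK.
function createDuplicateDetector(windowSize = 8) {
  const recent = []; // the last `windowSize` chunks, oldest first
  let index = 0; // running count of chunks checked so far

  return function check(chunk) {
    const position = index++;
    // Buffer#equals compares contents byte by byte.
    const duplicate = recent.some((prev) => prev.equals(chunk));

    // Store a copy: some audio sources reuse the same underlying buffer.
    recent.push(Buffer.from(chunk));
    if (recent.length > windowSize) recent.shift();

    return { position, duplicate };
  };
}
```

Calling the returned `check` function on every chunk just before it is sent, and logging whenever `duplicate` is true, should show whether the same buffer is going out more than once. Chunks of pure digital silence are also byte-identical, so a hit during a pause is not necessarily a bug.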

grizzled kernel
#

I did realize that the audio issue seems to stem from my end; that's why I closed this thread. Sorry about that.