# Bluetooth routing and quality

dim vortex

Hey team! We're building a real-time voice coaching app using Pipecat + Daily WebRTC + Deepgram
nova-3 streaming STT.

Problem: When users connect via Bluetooth headsets, their mic audio is captured over BT SCO (8 kHz
CVSD, which is narrowband, or 16 kHz mSBC, which is wideband). The audio arrives at Deepgram as 16 kHz
linear16 (WebRTC resamples it where needed), but the actual frequency content tops out at roughly
4 kHz (CVSD) or 8 kHz (mSBC). We're seeing significantly degraded WER, with garbled transcriptions
like "exercises to redeem my short of aim" instead of "exercises to reduce my shortness of breath."
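
To tell which case a given call falls into, a rough check like the one below could estimate how much energy sits above 4 kHz in one frame of the 16 kHz stream. This is only a sketch: the function names, the 512-sample frame, and the 4 kHz cutoff are our own illustrative choices, not part of Pipecat, Daily, or Deepgram.

```python
import math
import struct


def band_energy_ratio(pcm: bytes, sample_rate: int = 16000,
                      cutoff_hz: float = 4000.0, frame: int = 512) -> float:
    """Fraction of spectral energy at or above cutoff_hz in the first
    `frame` samples of 16-bit little-endian mono PCM (naive DFT)."""
    n = min(frame, len(pcm) // 2)
    if n < 2:
        return 0.0
    samples = struct.unpack(f"<{n}h", pcm[: 2 * n])
    # Hann window to limit spectral leakage between bins.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    low = high = 0.0
    for k in range(1, n // 2):  # skip DC, stop below Nyquist
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        power = re * re + im * im
        if k * sample_rate / n >= cutoff_hz:
            high += power
        else:
            low += power
    total = low + high
    return high / total if total else 0.0


def tone(freq_hz: float, sample_rate: int = 16000, n: int = 512,
         amplitude: int = 10000) -> bytes:
    """Synthetic sine tone as 16-bit LE PCM, used only to exercise the check."""
    values = (int(amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate))
              for i in range(n))
    return struct.pack(f"<{n}h", *values)
```

A ratio that stays near zero across many speech frames would suggest the source was 8 kHz CVSD; the threshold to act on is something we would have to tune on real calls.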

Questions:

  1. For narrow-band BT SCO audio upsampled to 16kHz, should we use nova-2-phonecall instead of
    nova-3?
  2. Does nova-3 have any built-in robustness to narrow-band/upsampled audio, or is it strictly
    optimized for wideband?
  3. Should we pass sample_rate: 8000 when we know the source is BT SCO, even though the stream
    we send is actually sampled at 16 kHz?
  4. Any other recommendations for handling degraded BT mic input?
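
For context on question 3, our understanding (an assumption, not something we have confirmed with Deepgram) is that for raw linear16 the declared sample rate only tells the decoder how to interpret the bytes. The hypothetical helper below shows the arithmetic: declaring 8000 for data sampled at 16 kHz would make the same bytes look twice as long, i.e. slowed down.

```python
def apparent_duration_s(num_bytes: int, declared_rate: int,
                        channels: int = 1, bytes_per_sample: int = 2) -> float:
    """Duration a decoder would assume for raw PCM at the declared rate."""
    return num_bytes / (declared_rate * channels * bytes_per_sample)


# One real second of 16 kHz mono linear16 is 16000 samples * 2 bytes.
ONE_SECOND_16K = 16000 * 2

as_16k = apparent_duration_s(ONE_SECOND_16K, 16000)  # matches reality
as_8k = apparent_duration_s(ONE_SECOND_16K, 8000)    # stretched to double length
```

If that reading is right, sending 8000 would only be safe after actually downsampling the audio to 8 kHz.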

Usage params: model=nova-3, encoding=linear16, sample_rate=16000, channels=1, interim_results=true,
smart_format=true, endpointing=500 (ms). Also, if there are any tips on Bluetooth packages/libs, please let us know. Our backend is in Python.
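
For reference, this is roughly how those parameters end up on the streaming URL. It is a minimal sketch using only the standard library; `build_listen_url` is our own helper name, and in practice Pipecat's Deepgram service builds the connection for us.

```python
from urllib.parse import urlencode

# Query parameters exactly as listed above; endpointing is in milliseconds.
PARAMS = {
    "model": "nova-3",
    "encoding": "linear16",
    "sample_rate": 16000,
    "channels": 1,
    "interim_results": "true",
    "smart_format": "true",
    "endpointing": 500,
}


def build_listen_url(params: dict, base: str = "wss://api.deepgram.com/v1/listen") -> str:
    """Append URL-encoded query parameters to the streaming endpoint."""
    return f"{base}?{urlencode(params)}"


LISTEN_URL = build_listen_url(PARAMS)
```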

indigo abyssBOT

Thanks for asking your question. Please be sure to reply with as much detail as possible so the community can assist you efficiently.
-# If you haven't done so, ensure your Discord and GitHub profiles are linked to Deepgram so you can earn points to redeem on cool stuff just by being active!