# What is the best audio file type to use for real-time browser transcription for medical?
Hi @round wraith. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
For real-time browser transcription in a medical context, the best audio format depends on your browser environment and Deepgram's compatibility.
Real-time Browser Transcription
For real-time browser transcription, the following formats work well with Deepgram:
- Chrome/Android: audio/webm with opus codec
- Firefox: audio/ogg with opus codec
- Safari/iOS: audio/mp4
These formats come from a GitHub Discussions thread about streaming issues on iOS, which notes that different browsers capture microphone audio in different formats.
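As a rough sketch, a page could pick among these formats at runtime instead of sniffing the browser. `pickRecorderMimeType` and `CANDIDATE_MIME_TYPES` are hypothetical names and not part of any Deepgram SDK, and the preference order is an assumption. The support check is passed in as a callback so the function also runs outside a browser; in a page it would typically be the standard `MediaRecorder.isTypeSupported`.

```typescript
// Candidate container/codec pairs from the list above.
// The order is an assumption: the first supported entry wins.
const CANDIDATE_MIME_TYPES: readonly string[] = [
  "audio/webm;codecs=opus", // Chrome / Android
  "audio/ogg;codecs=opus", // Firefox
  "audio/mp4", // Safari / iOS
];

// Returns the first candidate that `isSupported` accepts, or undefined if none match.
// In a browser, pass (t) => MediaRecorder.isTypeSupported(t).
function pickRecorderMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = CANDIDATE_MIME_TYPES,
): string | undefined {
  for (let i = 0; i < candidates.length; i++) {
    const candidate = candidates[i];
    if (candidate !== undefined && isSupported(candidate)) {
      return candidate;
    }
  }
  return undefined;
}
```

In a page, the result would then be passed as `new MediaRecorder(stream, { mimeType })`. If nothing matches, omitting `mimeType` lets the browser fall back to its default.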
When streaming audio to Deepgram, make sure you're sending continuous data and never send empty data objects, as this can cause Deepgram to disconnect.
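One way to guard against empty payloads is to check the chunk size before forwarding it. This is only a sketch: `forwardChunk`, `SizedChunk`, and `ChunkSink` are hypothetical names, and the socket is reduced to a minimal `send` interface so the logic can be exercised without a real connection. A browser `Blob` has a `size` property, and a browser `WebSocket` should satisfy `ChunkSink<Blob>`.

```typescript
// Anything with a byte size; a browser Blob fits this shape.
interface SizedChunk {
  readonly size: number;
}

// The only thing needed from the socket (hypothetical minimal interface).
interface ChunkSink<T> {
  send(data: T): void;
}

// Forwards a recorded chunk only when it contains data.
// Returns true if the chunk was sent, false if it was skipped as empty.
function forwardChunk<T extends SizedChunk>(sink: ChunkSink<T>, chunk: T): boolean {
  if (chunk.size === 0) {
    return false; // skip empty payloads instead of sending them
  }
  sink.send(chunk);
  return true;
}
```

With a `MediaRecorder`, this could be wired up as `recorder.ondataavailable = (e) => forwardChunk(socket, e.data)`.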
Storage in S3 for Deepgram STT
Deepgram supports over 40 audio and video formats (per the Deepgram Pricing FAQ), so you have several good options for audio stored in S3, including:
- WAV
- MP3
- MP4
- FLAC
- Opus
- WebM
- M4A
For the highest quality transcription, WAV files are often recommended. When extracting audio from video files (such as MP4), it is considered best practice to convert to WAV before sending it to Deepgram (per GitHub Discussions).
If you're dealing with raw PCM data, convert it to a non-raw format such as WAV before sending it to Deepgram for batch transcription (per GitHub Discussions).
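Converting raw PCM to WAV mostly means prepending a header that describes the samples. The sketch below writes the canonical 44-byte RIFF/WAVE header for uncompressed PCM. `pcmToWav` is a hypothetical helper, and it assumes the input is interleaved little-endian integer PCM whose sample rate, channel count, and bit depth you already know.

```typescript
// Wraps raw little-endian PCM bytes in a canonical 44-byte WAV (RIFF) header.
function pcmToWav(
  pcm: Uint8Array,
  sampleRate: number,
  channels: number,
  bitsPerSample: number,
): Uint8Array {
  const headerSize = 44;
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign; // bytes per second
  const out = new Uint8Array(headerSize + pcm.byteLength);
  const view = new DataView(out.buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcm.byteLength, true); // total file size minus 8
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, pcm.byteLength, true); // size of the sample data
  out.set(pcm, headerSize); // samples follow the header unchanged
  return out;
}
```

For example, `pcmToWav(samples, 48000, 1, 16)` would wrap 16-bit mono audio captured at 48 kHz; the resulting bytes can be uploaded to S3 as a `.wav` object.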
For medical applications where accuracy is critical, I recommend storing audio as WAV (PCM, 16-bit, 44.1 kHz or 48 kHz). PCM WAV is uncompressed, so no audio detail is lost in storage, and it is widely supported by Deepgram. That gives the transcription model the cleanest input, which is particularly important for medical terminology.