Based on the knowledge sources, Deepgram’s endpointing feature sends a finalized transcript (speech_final: true) whenever it detects a sufficiently long pause in speech, whether the transcript is a complete sentence or only a partial utterance. Endpointing marks a likely endpoint in speech, not necessarily a complete or “good” transcript:
Endpointing is not intended to return complete utterances or to be a reliable indicator of a complete thought or sentence. It is intended to hint to low-latency NLP engines that intermediate processing may be useful. For utterance segmentation, you’ll need to implement a simple timing heuristic appropriate to your domain.
Using Endpointing (speech_final)
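The timing heuristic mentioned in the quote above is left to the application. As a rough sketch only, one way to group finalized fragments into utterances is to start a new utterance whenever the silence between two fragments exceeds a threshold. The class name `UtteranceSegmenter`, its methods, and the 1000 ms threshold are all hypothetical and not part of Deepgram's SDK; the sketch assumes you can derive each fragment's start and end time in milliseconds (for example from the `start` and `duration` fields of a result message, which are in seconds).

```javascript
// Hypothetical helper (not part of any SDK): groups finalized transcript
// fragments into utterances based on the silence gap between them.
class UtteranceSegmenter {
  constructor(gapMs = 1000) {
    this.gapMs = gapMs;     // silence that closes an utterance (assumed value)
    this.parts = [];        // fragments of the utterance being built
    this.lastEndMs = null;  // audio time at which the last fragment ended
  }

  // Add one finalized fragment. Returns the previous utterance as a string
  // if the gap before this fragment closed it, otherwise null.
  push(text, startMs, endMs) {
    const trimmed = text.trim();
    if (trimmed === "") return null; // ignore fragments with no words

    let completed = null;
    if (this.parts.length > 0 && startMs - this.lastEndMs >= this.gapMs) {
      completed = this.parts.join(" ");
      this.parts = [];
    }
    this.parts.push(trimmed);
    this.lastEndMs = endMs;
    return completed;
  }

  // Return whatever is buffered (e.g. when the stream closes), or null.
  flush() {
    const out = this.parts.length > 0 ? this.parts.join(" ") : null;
    this.parts = [];
    this.lastEndMs = null;
    return out;
  }
}

// Usage with hand-written timings (illustrative, not captured output).
const segmenter = new UtteranceSegmenter(1000);
segmenter.push("I would like", 0, 900);
segmenter.push("to book a flight", 1600, 2800); // 700 ms gap: same utterance
const done = segmenter.push("Thanks", 4500, 4900); // 1700 ms gap: new utterance
console.log(done);              // "I would like to book a flight"
console.log(segmenter.flush()); // "Thanks"
```

The right threshold depends on the domain: dictation tends to tolerate longer pauses than a quick back-and-forth voice agent, so treat the value as something to tune rather than a recommendation.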
Configuration Example:
To always receive a transcript after each detected endpoint (pause), you can use the following configuration:
const connection = deepgram.listen.live({
  language: 'en',
  encoding: 'mulaw',
  model: 'nova-2',
  sample_rate: 8000,
  interim_results: false,
  endpointing: 700,
});
- Set interim_results: false to only receive finalized transcripts.
- Set endpointing to your desired pause duration in milliseconds (e.g., 700 ms).
- Deepgram will send a transcript with speech_final: true after each detected endpoint, regardless of transcript completeness.
Note:
If you want to receive a transcript after every endpoint, you do not need to enable any special parameter beyond endpointing. Deepgram will send the transcript at each endpoint by default, even if the transcript is incomplete or mid-sentence.
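To illustrate what acting on each endpoint might look like, here is a minimal sketch that reads the transcript text and the speech_final flag from a single result message. The function name `readEndpointResult` and the sample message are hypothetical; the field names (`channel.alternatives[0].transcript`, `is_final`, `speech_final`) are assumed to match the streaming response format, so check them against the messages you actually receive.

```javascript
// Hypothetical helper: extracts the text and endpoint flags from one
// streaming result message. Missing fields fall back to safe defaults.
function readEndpointResult(message) {
  const transcript = message?.channel?.alternatives?.[0]?.transcript ?? "";
  return {
    transcript,
    isFinal: message?.is_final === true,
    endpointDetected: message?.speech_final === true,
    isEmpty: transcript.trim() === "", // a pause can yield an empty transcript
  };
}

// Hand-written sample message for illustration (not captured output).
const sample = {
  type: "Results",
  is_final: true,
  speech_final: true,
  channel: { alternatives: [{ transcript: "I'd like to" }] },
};

const result = readEndpointResult(sample);
if (result.endpointDetected && !result.isEmpty) {
  // The text may be mid-sentence; endpointing only signals a pause.
  console.log(`endpoint reached: "${result.transcript}"`);
}
```

In a real application a function like this would be called from the live connection's transcript event handler, and the mid-sentence case shows why downstream logic should not assume each endpoint carries a complete thought.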