#High latency with endpoint detection
Thanks for asking your question. Please reply with as much detail as possible so we can assist you efficiently, such as:
- The request_id, if you have a question about a transcription response.
- The options you used or the api.deepgram.com URL you sent your request to, including parameters.
- Any code snippets you can include.
- Any audio you can include, or if you can't share it here please email it to us at [email protected] and provide a link to this thread.
👋 @crude sequoia Reducing the latency from 1.3 seconds to around 500ms with endpoint detection is possible. By adjusting the endpointing parameter in the request to the Deepgram API, you can control when an endpoint is identified.
The default value for endpointing is 10 milliseconds. A lower value finalizes transcripts sooner, while a higher value tolerates longer pauses between utterances before an endpoint is identified.
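As a rough illustration of setting the parameter, here is a minimal Python sketch that builds a streaming request URL with endpointing in the query string. The api.deepgram.com host and the endpointing parameter come from this thread. The /v1/listen path, the interim_results parameter, and the helper name build_listen_url are assumptions for illustration, so check them against the doc linked in this thread before relying on them.

```python
from urllib.parse import urlencode

# Host is taken from the thread; the /v1/listen path is an assumption.
DEEPGRAM_LISTEN_URL = "wss://api.deepgram.com/v1/listen"


def build_listen_url(endpointing_ms=10, interim_results=True):
    """Build a streaming URL with an explicit endpointing value (hypothetical helper).

    endpointing_ms: endpointing value in milliseconds (10 is the default
    mentioned in the thread).
    interim_results: assumed query parameter for receiving partial transcripts.
    """
    params = {
        "endpointing": endpointing_ms,
        # Query-string booleans are sent as lowercase text.
        "interim_results": str(interim_results).lower(),
    }
    return f"{DEEPGRAM_LISTEN_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_listen_url(endpointing_ms=500))
```

Keeping the value in one helper makes it easy to try a few settings and measure the latency you actually observe for each.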
However, it's important to note that endpointing is not intended to return complete utterances or be a reliable indicator of a complete thought or sentence. It is meant to hint to low-latency NLP engines that intermediate processing may be useful.
For utterance segmentation, you'll need to implement a timing heuristic appropriate to your specific use case.
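One possible shape for such a timing heuristic, as a sketch only: group finalized transcript segments into utterances whenever the silence between two segments exceeds a threshold. The Segment type, the segment_utterances function, and the 0.8 second default are all hypothetical; in practice you would fill the segments from the start and end timestamps in your transcription results and tune the gap for your use case.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A finalized piece of transcript with timestamps in seconds (hypothetical type)."""
    text: str
    start: float
    end: float


def segment_utterances(segments, max_gap=0.8):
    """Group segments into utterances, splitting where the silence exceeds max_gap."""
    utterances = []
    current = []
    last_end = None
    for seg in segments:
        if not seg.text:
            # Skip empty results so they do not affect the gap measurement.
            continue
        if current and seg.start - last_end > max_gap:
            # The pause was long enough: close the current utterance.
            utterances.append(" ".join(current))
            current = []
        current.append(seg.text)
        last_end = seg.end
    if current:
        utterances.append(" ".join(current))
    return utterances


if __name__ == "__main__":
    demo = [
        Segment("hello", 0.0, 0.4),
        Segment("there", 0.5, 0.9),
        Segment("how are you", 2.0, 2.8),
    ]
    print(segment_utterances(demo))
```

A smaller max_gap responds faster but splits more often in the middle of a sentence, so the right value depends on how your speakers pause.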
To achieve lower latency, you may also need to consider other factors such as the features you are using and the size of the audio chunks you are sending.
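On the chunk size point, the arithmetic can be sketched like this. The example assumes raw 16-bit mono PCM at 16 kHz, which is an assumption about your audio rather than something stated in this thread, and chunk_size_bytes is a hypothetical helper name.

```python
def chunk_size_bytes(sample_rate_hz, chunk_ms, bytes_per_sample=2, channels=1):
    """Return the number of bytes of raw PCM audio covering chunk_ms milliseconds."""
    return sample_rate_hz * bytes_per_sample * channels * chunk_ms // 1000


if __name__ == "__main__":
    # Assumed format: 16 kHz, 16-bit (2 bytes per sample), mono.
    for ms in (20, 100, 250):
        print(ms, "ms ->", chunk_size_bytes(16000, ms), "bytes")
```

Each chunk has to be fully captured before it can be sent, so a chunk of N milliseconds adds up to roughly N milliseconds of buffering delay on the client side before transcription even starts.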
Check out this doc for more information: https://developers.deepgram.com/docs/understand-endpointing-interim-results#controlling-endpointing
@fierce bramble do you have any low-latency NLP libraries you could recommend for us to investigate?