#is it possible to do context best formatting with streaming stt?
Hi @waxen egret. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
Yes, you can get context-based (“best effort”) formatting with Deepgram’s streaming STT, but there are some details and limits.
1. Smart Formatting in Streaming
Deepgram’s smart_format works for both pre-recorded and streaming STT. When enabled, it applies context-aware formatting such as punctuation, paragraphs, and (especially for English) things like dates, times, currency, phone numbers, emails, and URLs. [Smart format; Streaming features]
For streaming, you just add:
smart_format=true
For example, with the JS SDK:
const connection = deepgram.listen.live({
  model: "nova-3",
  language: "en-US",
  smart_format: true,
});
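If you connect over a raw WebSocket instead of the SDK, the same options go into the query string of the streaming endpoint. A minimal sketch, assuming the standard wss://api.deepgram.com/v1/listen endpoint; buildListenUrl is a hypothetical helper written for this example, not part of any Deepgram SDK:

```javascript
// Base endpoint for live transcription over WebSocket (assumed here).
const LISTEN_URL = "wss://api.deepgram.com/v1/listen";

// Hypothetical helper: turns an options object into a streaming URL
// by adding each option as a query parameter.
function buildListenUrl(options) {
  const url = new URL(LISTEN_URL);
  for (const [key, value] of Object.entries(options)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

const streamingUrl = buildListenUrl({
  model: "nova-3",
  language: "en-US",
  smart_format: true,
});

console.log(streamingUrl);
// wss://api.deepgram.com/v1/listen?model=nova-3&language=en-US&smart_format=true
```

The API key is sent separately when the socket is opened, so it is deliberately left out of the URL here.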
2. Context-Aware Finalization Behavior
With streaming plus smart formatting, Deepgram waits briefly to gather enough context around entities (like numbers or dates) before finalizing and formatting them. It finalizes when either:
- the speaker continues on to non-entity speech, or
- about 3 seconds of silence have passed,
and then formats based on the available context. [Smart format]
You can override this behavior with:
no_delay=true, which forces immediate finalization but often skips formatting. [No delay]
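Because formatting is applied when a segment is finalized, a client usually treats interim results as provisional and keeps only the finalized text. A rough sketch of that filtering, assuming each parsed streaming message carries a type of "Results", an is_final flag, and the text under channel.alternatives[0].transcript; collectFinalTranscripts and the sample messages are made up for illustration:

```javascript
// Hypothetical helper: keep only finalized transcript segments
// from a list of already-parsed streaming messages.
// Assumed shape: { type, is_final, channel: { alternatives: [{ transcript }] } }
function collectFinalTranscripts(messages) {
  const finals = [];
  for (const msg of messages) {
    // Skip non-transcript messages and interim (not yet finalized) results.
    if (msg.type !== "Results" || !msg.is_final) continue;
    const text = msg.channel?.alternatives?.[0]?.transcript ?? "";
    if (text.length > 0) finals.push(text);
  }
  return finals;
}

// Invented sample messages, not real API output.
const sampleMessages = [
  {
    type: "Results",
    is_final: false,
    channel: { alternatives: [{ transcript: "call me at six zero eight" }] },
  },
  {
    type: "Results",
    is_final: true,
    channel: { alternatives: [{ transcript: "Call me at (608) 415-2097." }] },
  },
  { type: "Metadata" },
];

const finalSegments = collectFinalTranscripts(sampleMessages);
console.log(finalSegments);
```

With the default behavior, the finalized segment is the one that benefits from the extra context; with no_delay=true you would expect final segments sooner but with less formatting applied.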
3. Voice Agent STT Context Formatting
If you’re using the Voice Agent API with Deepgram as the listen provider, there’s also:
agent.listen.provider.smart_format: true
This is specifically to improve transcript readability for UI or downstream processing in a live agent setting. [Voice agent smart_format]
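As a sketch of where that flag sits, the settings object might be built like this. Only the agent.listen.provider.smart_format path comes from the answer above; the "Settings" message type, the "deepgram" provider identifier, and the buildAgentSettings helper are assumptions for illustration, so check them against the Voice Agent docs before relying on them:

```javascript
// Hypothetical helper: builds a Voice Agent settings object with
// smart formatting enabled on the listen (STT) provider.
function buildAgentSettings({ model = "nova-3", smartFormat = true } = {}) {
  return {
    type: "Settings", // assumed message type
    agent: {
      listen: {
        provider: {
          type: "deepgram", // assumed provider identifier
          model,
          smart_format: smartFormat, // the flag discussed above
        },
      },
    },
  };
}

const agentSettings = buildAgentSettings();
console.log(JSON.stringify(agentSettings, null, 2));
```

A real settings message would also need the other agent sections (for example the think and speak providers), which are omitted here.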
4. Self-Hosted / NER-Based Formatting (Advanced)
For self-hosted Nova-3 or Nova-2, you can enable NER-based formatting (Named Entity Recognition), which uses an entity detector plus non-formatted models to apply human-like formatting (e.g., turning “six zero eight…” into (608) 415-2097). [Self-hosted NER]
That is also context-aware, but it requires:
- Specific self-hosted models (entity-detector + non-formatted STT)
- Feature flags like streaming_ner = true in your config
This is not exposed as a separate cloud parameter beyond smart_format; it’s part of how newer models handle formatting.
So in summary:
- Yes, Deepgram streaming STT supports context-based formatting via smart_format=true.
- Formatting decisions are context-sensitive and tied to streaming finalization timing.
- You can trade off latency vs. formatting quality using the default behavior vs. no_delay=true.