I'm trying to set up Deepgram transcription in my project.
Request ID: f83352fa-1bfe-4ac9-baeb-775817efcf97
I already have my own VAD setup, so I'm trying to use that. My flow is roughly:
- Open the websocket
- Send some audio buffers
- Wait for my VAD silence event
- Send the {type: "Finalize"} message
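To make the flow above concrete, here is a minimal sketch of the frames it produces. The module and function names are hypothetical, and the `{:binary, _}` / `{:text, _}` tuple shape assumes a WebSockex-style client (the `handle_frame/2` callback in the log suggests that, but it is an assumption). As far as I know, control messages such as Finalize need to go out as text frames, while audio goes out as binary frames.

```elixir
defmodule FinalizeFlowSketch do
  # Hypothetical helper module; the names are illustrative only.

  # Control message asking Deepgram to flush whatever audio it has buffered.
  @finalize_json ~s({"type":"Finalize"})

  # Raw PCM audio is sent as a binary frame.
  def audio_frame(pcm) when is_binary(pcm), do: {:binary, pcm}

  # Control messages are JSON and are sent as a text frame.
  def finalize_frame, do: {:text, @finalize_json}

  # Frames for one utterance: every audio buffer, then a single Finalize.
  def utterance_frames(buffers) when is_list(buffers) do
    Enum.map(buffers, &audio_frame/1) ++ [finalize_frame()]
  end
end
```

With this shape, the VAD silence event only has to trigger sending `FinalizeFlowSketch.finalize_frame()` after the last audio buffer.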
I only get one message back from Deepgram. It indicates is_final, but the transcript is empty:
[debug] Received WebSocket message from Deepgram: "{\"type\":\"Results\",\"channel_index\":[0,1],\"duration\":0.7,\"start\":0.0,\"is_final\":true,\"channel\":{\"alternatives\":[{\"transcript\":\"At\",\"confidence\":0.38793945,\"words\":[{\"word\":\"at\",\"start\":0.32,\"end\":0.7,\"confidence\":0.38793945,\"punctuated_word\":\"At\"}]}]},\"metadata\":{\"request_id\":\"f83352fa-1bfe-4ac9-baeb-775817efcf97\",\"model_info\":{\"name\":\"general-nova-3\",\"version\":\"2025-01-09.0\",\"arch\":\"nova-3\"},\"model_uuid\":\"bf05427e-a1f1-4ced-a976-38b2f3533d8d\"},\"from_finalize\":true}", metadata: line=104 pid=<0.710.0> file=lib/smartvox/speech_to_text/stt_deepgram_streaming_client.ex domain=elixir application=smartvox mfa=Smartvox.SpeechToText.DeepgramStreamingClient.handle_frame/2
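For reference, a rough sketch of how a message like the one above could be inspected once it has been decoded from JSON. It assumes the payload is already a map with string keys (the JSON library is not shown), and the module name and return tuples are hypothetical. The field names come from the logged message.

```elixir
defmodule ResultsSketch do
  # Hypothetical helper; expects a map already decoded from JSON (string keys).

  # Final result: pull out the first alternative's transcript and note
  # whether it was produced in response to a Finalize message.
  def classify(%{"type" => "Results", "is_final" => true} = msg) do
    transcript =
      msg
      |> get_in(["channel", "alternatives"])
      |> List.wrap()
      |> List.first(%{})
      |> Map.get("transcript", "")

    source = if Map.get(msg, "from_finalize", false), do: :finalize, else: :other
    {:final, source, transcript}
  end

  # Non-final Results messages (only seen when interim results are enabled).
  def classify(%{"type" => "Results"}), do: :interim

  # Anything else is ignored in this sketch.
  def classify(_other), do: :ignored
end
```

Applied to the logged message, the `transcript` field of the first alternative reads `"At"` and `from_finalize` is `true`.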
These are the params I pass when opening the socket. Am I configuring this incorrectly?
params = %{
  encoding: "linear16",
  sample_rate: Keyword.get(opts, :sample_rate, 16000),
  channels: 1,
  model: "nova-3",
  language: Keyword.get(opts, :language, "en"),
  interim_results: false,
  smart_format: true,
  endpointing: false,
  utterances: false # Utterance detection disabled
}
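Deepgram's streaming endpoint takes these options as URL query parameters rather than as a request body, so one way to double-check what is actually sent is to build the URL explicitly. This is only a sketch: the module name is hypothetical, and it uses a keyword list instead of a map so the parameter order is predictable.

```elixir
defmodule ListenUrlSketch do
  # Hypothetical helper; builds the streaming URL from a list of options.
  @base "wss://api.deepgram.com/v1/listen"

  # URI.encode_query/1 converts atom keys, integers, and booleans
  # to their string form, so `endpointing: false` becomes "endpointing=false".
  def build(params) do
    @base <> "?" <> URI.encode_query(params)
  end
end
```

For example, `ListenUrlSketch.build(encoding: "linear16", sample_rate: 16000, endpointing: false)` yields a URL ending in `?encoding=linear16&sample_rate=16000&endpointing=false`, which can be logged and compared against what the client really connects to.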