I'm using Deepgram STT with a Next.js frontend and Twilio for my AI agent. Everything works fine over Twilio, but when I run the same code with my frontend, the final utterance transcripts slow down over time: they arrive quickly at first, then gradually later, and after 3-4 conversations no final utterances are generated at all. Interim transcripts keep working fine. The backend runs on Cloudflare (managed with Wrangler) and handles the full conversation, while the frontend only sends audio data for STT.
My Deepgram STT configuration is below:
const deepgram: any = deepgramClient.listen.live({
  model: 'nova-2-general',
  language: languageCode,
  smart_format: true,
  encoding: sttVoiceConfig[voiceCallStates.phoneNumberProviderService].encoding,
  sample_rate: sttVoiceConfig[voiceCallStates.phoneNumberProviderService].sampleRate,
  channels: 1,
  multichannel: false,
  no_delay: true,
  interim_results: true,
  endpointing: 300,
  utterance_end_ms: 1000,
  // sentiment: true,
});
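
For reference, here is a minimal sketch of how the per-provider lookup feeding `encoding` and `sample_rate` could be structured. It is not the actual project code: the `ProviderService` type, the `web` entry and its values, and the helpers `buildLiveOptions` and `expectedBytesPerSecond` are assumptions made for illustration. Only the Twilio values (8 kHz mu-law, which is what Twilio Media Streams deliver) and the option names copied from the configuration above are taken as given.

```typescript
// Hypothetical provider keys; the real values of
// voiceCallStates.phoneNumberProviderService are not shown in the post.
type ProviderService = "twilio" | "web";

interface SttVoiceSettings {
  encoding: "mulaw" | "linear16";
  sampleRate: number;
}

const sttVoiceConfig: Record<ProviderService, SttVoiceSettings> = {
  // Twilio Media Streams send 8 kHz mu-law audio.
  twilio: { encoding: "mulaw", sampleRate: 8000 },
  // Assumption: the frontend sends raw 16-bit PCM at 16 kHz.
  // The real frontend format may differ and must match what is declared here.
  web: { encoding: "linear16", sampleRate: 16000 },
};

// Builds the same option object as the configuration above for one provider.
function buildLiveOptions(provider: ProviderService, languageCode: string) {
  const { encoding, sampleRate } = sttVoiceConfig[provider];
  return {
    model: "nova-2-general",
    language: languageCode,
    smart_format: true,
    encoding,
    sample_rate: sampleRate,
    channels: 1,
    multichannel: false,
    no_delay: true,
    interim_results: true,
    endpointing: 300,
    utterance_end_ms: 1000,
  };
}

// Bytes per second of mono audio implied by the declared settings:
// mu-law uses 1 byte per sample, linear16 uses 2 bytes per sample.
function expectedBytesPerSecond(settings: SttVoiceSettings): number {
  const bytesPerSample = settings.encoding === "linear16" ? 2 : 1;
  return settings.sampleRate * bytesPerSample;
}
```

Comparing the number of bytes per second the frontend actually sends against `expectedBytesPerSecond` for its entry is one way to check whether the audio matches the declared encoding and sample rate. That may be worth ruling out here, since only the frontend path shows the slowdown.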