Hi! I finally got STT working on an ESP32, coding in C++, but the issue I am running into now is that any .wav file longer than 4 seconds either times out after 12 seconds or returns nothing at all. On the other hand, anything in the <3 seconds range returns a great transcription in less than a second. Not sure what to do. I can attach code if you want; just wondering if this is a common issue!
#STT returns nothing if .wav is any longer than 4 seconds.
Thanks for asking your question. Please be sure to reply with as much detail as possible so we can assist you efficiently.
It looks like we're missing some important information to help debug your issue. Would you mind providing us with the following details in a reply?
- The programming language you are working in (e.g. JavaScript, Python).
- A request ID that triggered your error or issue.
The 12-second timeout suggests you are sending a WAV file over a websocket, which is not the typical use for the websocket. Have you tried using the REST API for submitting pre-recorded audio?
https://developers.deepgram.com/reference/listen-file
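For reference, here is a rough sketch of what a pre-recorded request could look like on the wire, assuming you send it yourself over a TLS socket rather than through an SDK. The helper name `buildListenRequestHead` and its parameters are made up for illustration, and `Content-Type: audio/wav` assumes the file is a plain WAV; the raw file bytes would follow this head as the request body.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of any SDK): builds the HTTP/1.1 request
// head for uploading a WAV file to the pre-recorded /v1/listen endpoint.
// apiKey and wavSizeBytes are placeholders supplied by the caller.
std::string buildListenRequestHead(const std::string& apiKey,
                                   std::size_t wavSizeBytes) {
    std::string head;
    head += "POST /v1/listen HTTP/1.1\r\n";
    head += "Host: api.deepgram.com\r\n";
    head += "Authorization: Token " + apiKey + "\r\n";
    head += "Content-Type: audio/wav\r\n";  // assumes a plain WAV upload
    head += "Content-Length: " + std::to_string(wavSizeBytes) + "\r\n";
    head += "Connection: close\r\n";
    head += "\r\n";  // blank line ends the head; WAV bytes come next
    return head;
}
```

On an ESP32 you would probably write this head and then the WAV bytes to a TLS client connected to api.deepgram.com on port 443 (for example `WiFiClientSecure`), then read the JSON response. Treat that part as an assumption about your setup.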
Yeah, that's exactly what I am doing haha. Okay, sounds good. One quick question though: how fast is the REST API compared to the websocket approach? I was getting near-instant responses from websockets when the recording was short enough. Also, can I follow up if I have similar issues with the REST API?