#Speech to text realtime streaming websocket issue in new version 3.1.1


vague streamBOT
#

Thanks for asking your question. Please be sure to reply with as much detail as possible so we can assist you efficiently, such as:

  • Provide the request_id if you've a question about a transcription response.
  • The options you used or the api.deepgram.com URL you sent your request to, including parameters.
  • Any code snippets you can include.
  • Any audio you can include, or if you can't share it here please email it to us at [email protected] and provide a link to this thread.
tulip pagoda
#

request_id is 6c8198cf-de94-405f-9833-b0f4bc7dfb8c

tulip pagoda
#

UPDATE

I found the issue regarding the error above; the utterance_end_ms parameter value cannot be <1000.
The documentation mentions that it's not recommended:

documentation

In practice, you should set the value of utterance_end_ms to be 1000 ms or higher. Deepgram's Interim Results are typically sent every 1 second, so using a value of less than 1 second will not offer any benefits.

However, a utterance_end_ms value <1000 yields a 400 Bad Request, and I think Deepgram's error handling could be more informative.

I had to do some digging to get the raw response:

HTTP/1.1 400 Bad Request
status_code: 400, reason: Bad Request, headers:
content-type: application/json
dg-error: Invalid 'utterance_end_ms' value of '999'.
content-length: 145
dg-request-id: 587b3227-f6ca-4818-bb4d-028182adf4ea
...
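
For anyone else digging for the same detail: a minimal sketch of pulling the useful part out of a response like the one above. The header names dg-error and dg-request-id come from that raw response; the helper name describe_deepgram_error is made up for illustration and is not part of the Deepgram SDK.

```python
def describe_deepgram_error(status_code, headers):
    """Build a readable message from an HTTP status and its response headers.

    Hypothetical helper: only the header names are taken from the thread.
    """
    # HTTP header names are case-insensitive, so normalize before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    detail = lowered.get("dg-error", "no dg-error header present")
    request_id = lowered.get("dg-request-id", "unknown")
    return f"HTTP {status_code}: {detail} (request id: {request_id})"


headers = {
    "content-type": "application/json",
    "dg-error": "Invalid 'utterance_end_ms' value of '999'.",
    "dg-request-id": "587b3227-f6ca-4818-bb4d-028182adf4ea",
}
print(describe_deepgram_error(400, headers))
```

Logging the dg-request-id alongside the error text also makes it easier to quote the request in a support thread.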

After adjusting utterance_end_ms to 1000 and changing self.dg_connection = self.deepgram_client.listen.live.v("1") to self.dg_connection = self.deepgram_client.listen.asynclive.v("1"), I'm now getting this error:

deepgram/clients/live/v1/async_client.py:219: RuntimeWarning: coroutine 'AsyncLiveClient._emit' was never awaited
  self._emit(LiveTranscriptionEvents.Error, error)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Task exception was never retrieved
future: <Task finished name='Task-16' coro=<AsyncLiveClient._start() done, defined at deepgram/clients/live/v1/async_client.py:97>
Traceback (most recent call last):
  File "deepgram/clients/live/v1/async_client.py", line 114, in _start
    await self._emit(
  File "deepgram/clients/live/v1/async_client.py", line 95, in _emit
    await handler(self, *args, **kwargs)
TypeError: object NoneType can't be used in 'await' expression

median bane
#

Yup, we will error if you attempt to do <1000 or >5000 (I believe). However our SDKs also support on-prem, which can arguably make those values whatever their hardware is happy to support (low or high).

Docs and a better error back from the API would be ideal, and I will feed this back to the team.
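
Until the API returns a clearer error, a client-side check can fail early with a readable message. This is only a sketch: the 1000 lower bound is what the API rejected above, the 5000 upper bound is the unverified "I believe" figure from this message, and check_utterance_end_ms with its enforce_hosted_limits flag is a hypothetical helper, not an SDK function. The flag exists because on-prem deployments may accept other values.

```python
MIN_UTTERANCE_END_MS = 1000  # values below this returned 400 in this thread
MAX_UTTERANCE_END_MS = 5000  # unverified upper bound mentioned in this thread


def check_utterance_end_ms(value, enforce_hosted_limits=True):
    """Return value unchanged, or raise ValueError with a descriptive message."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"utterance_end_ms must be an integer, got {value!r}")
    in_range = MIN_UTTERANCE_END_MS <= value <= MAX_UTTERANCE_END_MS
    if enforce_hosted_limits and not in_range:
        raise ValueError(
            f"utterance_end_ms={value} is outside the assumed hosted range "
            f"{MIN_UTTERANCE_END_MS}-{MAX_UTTERANCE_END_MS} ms"
        )
    return value


print(check_utterance_end_ms(1000))
try:
    check_utterance_end_ms(999)
except ValueError as exc:
    print(exc)
```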

tulip pagoda
#

Thanks @median bane for the quick response.

#

Would you be able to help me with the latter issue?


signal kettle
#

I haven't touched Python in a long time, but I think it should be...
self.dg_connection = await self.deepgram_client.listen.asynclive.v("1")

#

@tulip pagoda

median bane
#

I think that might be right ☝️ I will ask @lime condor to have a quick look when he logs in today

#

(I'm not a Python nerd)

tulip pagoda
#

Awesome, thanks. Let me try that now and I'll update you 🙂

#

hmmm nope that did not work:

    self.dg_connection = await self.deepgram_client.listen.asynclive.v("1")
TypeError: object AsyncLiveClient can't be used in 'await' expression

lime condor
#

I think this was also posted as a GitHub issue and was answered (maybe?)

tulip pagoda
#

Yes, I'm the same person!

#

However, the latter issue was not addressed.

lime condor
#

there was a bug in 3.0.0 where you needed an async def for the on_message handler

tulip pagoda
#

Would you mind helping me out with the latter issue:

After adjusting utterance_end_ms to 1000 and changing self.dg_connection = self.deepgram_client.listen.live.v("1") to self.dg_connection = self.deepgram_client.listen.asynclive.v("1"), I'm now getting this error:

deepgram/clients/live/v1/async_client.py:219: RuntimeWarning: coroutine 'AsyncLiveClient._emit' was never awaited
self._emit(LiveTranscriptionEvents.Error, error)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Task exception was never retrieved
future: <Task finished name='Task-16' coro=<AsyncLiveClient._start() done, defined at deepgram/clients/live/v1/async_client.py:97> exception=TypeError("Logger._log() got an unexpected keyword argument 'error'")>
Traceback (most recent call last):
File "deepgram/clients/live/v1/async_client.py", line 114, in _start
await self._emit(
File "deepgram/clients/live/v1/async_client.py", line 95, in _emit
await handler(self, *args, **kwargs)
TypeError: object NoneType can't be used in 'await' expression

lime condor
#

for all the on_ hooks really

tulip pagoda
#

I am using 3.1.1

tulip pagoda
#
class TranscriptionService:
    def __init__(self, on_transcript_callback, on_interrupt_callback, audio_playing_callback, is_beginning_of_call):
        self.deepgram_client = DeepgramClient(api_key=DEEPGRAM_API_KEY, config=DeepgramClientOptions(verbose=20))

    async def start_transcription(self):
        self.dg_connection = self.deepgram_client.listen.asynclive.v("1")

        def on_message(_, result, **kwargs):
            print(f"RESULT: {result}")

        def on_utterance_end(_, utterance_end):
            print(f"[UTTERANCE_END]: \n\n{utterance_end}\n\n")

        def on_metadata(_, metadata):
            if metadata is None:
                return
            print(f"\n\nMetadata: {metadata}\n\n")

        def on_error(_, error):
            if error is None:
                return
            print(f"\n\nError: {error}\n\n")

        self.dg_connection.on(LiveTranscriptionEvents.Transcript, on_message)
        self.dg_connection.on(LiveTranscriptionEvents.Metadata, on_metadata)
        self.dg_connection.on(LiveTranscriptionEvents.Error, on_error)
        self.dg_connection.on(LiveTranscriptionEvents.UtteranceEnd, on_utterance_end)

        options = LiveOptions(
            model="nova-2",
            interim_results=False,
            language="en-US",
            encoding="mulaw",
            sample_rate="8000",
            punctuate=True,
            # endpointing=400,
            # utterance_end_ms=1000
        )

        await self.dg_connection.start(options)

    async def send_audio_data(self, data):
        if not self.dg_connection:
            print("Deepgram live transcription object is 'None'; cannot send data.")
            return  # Early exit if the transcription isn't started
        await self.dg_connection.send(data)

    async def stop_transcription(self):
        if self.dg_connection:
            await self.dg_connection.finish()
            self.dg_connection = None
#

Let me try adding async to the event functions

#

Ok nice that worked!
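
For anyone landing here later, a small self-contained sketch of why adding async fixes the "NoneType can't be used in 'await' expression" error. TinyEmitter is a made-up stand-in that only mimics the `await handler(self, *args, **kwargs)` line from the traceback; it is not the SDK's AsyncLiveClient.

```python
import asyncio


class TinyEmitter:
    """Hypothetical stand-in for a client that awaits its event handlers."""

    def __init__(self):
        self._handlers = []

    def on(self, handler):
        self._handlers.append(handler)

    async def emit(self, **kwargs):
        for handler in self._handlers:
            # Same shape as the failing line in the traceback: the return
            # value of the handler call is awaited.
            await handler(self, **kwargs)


def sync_handler(_, result):
    print(f"sync handler got: {result}")  # returns None, which is not awaitable


async def async_handler(_, result):
    print(f"async handler got: {result}")  # the call returns a coroutine


async def try_handler(handler):
    emitter = TinyEmitter()
    emitter.on(handler)
    try:
        await emitter.emit(result="hello")
        return "ok"
    except TypeError as exc:
        return f"TypeError: {exc}"


print(asyncio.run(try_handler(sync_handler)))
print(asyncio.run(try_handler(async_handler)))
```

The sync handler still runs and prints, but its None return value is what gets awaited, hence the TypeError. The same reasoning explains the earlier failure when awaiting the result of asynclive.v("1"): that call returns a plain object, not an awaitable.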

signal kettle
#

Glad to hear it.

median bane
#

superb 👏

lime condor
#

got it working?

tulip pagoda
#

yes it worked 🙂

lime condor
#

on async, keepalive isn't implemented yet, which means you currently need to send the KeepAlive messages yourself

tulip pagoda
#

here's the working implementation:

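from deepgram import DeepgramClient, DeepgramClientOptions, LiveOptions, LiveTranscriptionEvents

# DEEPGRAM_API_KEY is assumed to be defined elsewhere (e.g. loaded from the environment).

class TranscriptionService:
    def __init__(self, on_transcript_callback, on_interrupt_callback, audio_playing_callback, is_beginning_of_call):
        self.deepgram_client = DeepgramClient(api_key=DEEPGRAM_API_KEY, config=DeepgramClientOptions(verbose=20))
        self.dg_connection = None  # created in start_transcription()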

    async def start_transcription(self):
        self.dg_connection = self.deepgram_client.listen.asynclive.v("1")

        async def on_message(_, result, **kwargs):
            print(f"RESULT: {result}")

        async def on_utterance_end(_, utterance_end):
            print(f"[UTTERANCE_END]: \n\n{utterance_end}\n\n")

        async def on_metadata(_, metadata):
            if metadata is None:
                return
            print(f"\n\nMetadata: {metadata}\n\n")

        async def on_error(_, error):
            if error is None:
                return
            print(f"\n\nError: {error}\n\n")

        self.dg_connection.on(LiveTranscriptionEvents.Transcript, on_message)
        self.dg_connection.on(LiveTranscriptionEvents.Metadata, on_metadata)
        self.dg_connection.on(LiveTranscriptionEvents.Error, on_error)
        self.dg_connection.on(LiveTranscriptionEvents.UtteranceEnd, on_utterance_end)

        options = LiveOptions(
            model="nova-2",
            interim_results=False,
            language="en-US",
            encoding="mulaw",
            sample_rate="8000",
            punctuate=True,
            # endpointing=400,
            # utterance_end_ms=1000
        )

        await self.dg_connection.start(options)

    async def send_audio_data(self, data):
        if not self.dg_connection:
            print("Deepgram live transcription object is 'None'; cannot send data.")
            return  # Early exit if the transcription isn't started
        await self.dg_connection.send(data)

    async def stop_transcription(self):
        if self.dg_connection:
            await self.dg_connection.finish()
            self.dg_connection = None
lime condor
#

yea, yourself as in you need to implement this

tulip pagoda
#

Cool but where would I need to pass { "type": "KeepAlive" }?

lime condor
#

using send()

tulip pagoda
#

ahh I see

#

thanks

median bane
#

It should be sent as a string!

tulip pagoda
#

It doesn't seem that it takes options params:

    async def send(self, data):
        """
        Sends data over the WebSocket connection.
        """
        self.logger.spam("AsyncLiveClient.send ENTER")
        self.logger.spam("data: %s", data)

        if self._socket:
            await self._socket.send(data)
            self.logger.spam("data sent")

        self.logger.spam("AsyncLiveClient.send LEAVE")
lime condor
#

data can be sent as a string

tulip pagoda
#

this is in deepgram/clients/live/v1/async_client.py

lime condor
#

that's how the underlying protocol knows to write it as a text message vs binary
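
A quick sketch of that distinction, assuming the socket behaves like the Python websockets library, where a str is sent as a text frame and a bytes-like object as a binary frame. The frame_kind helper is purely illustrative and not part of any SDK.

```python
import json


def frame_kind(payload):
    """Report which WebSocket frame type a payload would map to.

    Illustrative only; mirrors the str -> text, bytes -> binary convention.
    """
    if isinstance(payload, str):
        return "text"
    if isinstance(payload, (bytes, bytearray, memoryview)):
        return "binary"
    raise TypeError(f"unsupported payload type: {type(payload).__name__}")


keepalive = json.dumps({"type": "KeepAlive"})  # a plain str
audio_chunk = b"\xff\xff\xff\xff"  # placeholder bytes, not real audio

print(keepalive, "->", frame_kind(keepalive))
print(audio_chunk, "->", frame_kind(audio_chunk))
```

So the same send() call can carry both: audio goes in as bytes, the keepalive goes in as a str.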

tulip pagoda
#

OK, but what should the format be, since I'm sending the audio media payload as data:

await transcription_service.send_audio_data(chunk)
    async def send_audio_data(self, data):
        if not self.dg_connection:
            print("Deepgram live transcription object is 'None'; cannot send data.")
            return  # Early exit if the transcription isn't started
        await self.dg_connection.send(data)
lime condor
#

a string

#

just send the keepalive as a string message

#

no encoding, no nothing

tulip pagoda
#

Do I need to send it in addition to the audio chunk? Sorry for all these questions; I'm confused as to how to send it while I'm sending audio chunks in realtime.

lime condor
#

you obviously can't send the message at the same time

#

send(audio data)
send(audio data)
send(audio data)
send(keepalive)
send(audio data)
send(audio data)
send(audio data)
send(keepalive)
send(audio data)
send(audio data)

tulip pagoda
#

i see!

so exactly like this:
send('{ "type": "KeepAlive" }')

lime condor
#

in between sending the audio data, you need to send the keepalive
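
A rough sketch of that interleaving, under a few assumptions: RecordingConnection is a made-up stand-in for dg_connection so the example runs on its own, stream_with_keepalive is a hypothetical helper, and the 5 second interval is a guess that should be checked against the Deepgram KeepAlive docs.

```python
import asyncio
import json
import time

KEEPALIVE_MESSAGE = json.dumps({"type": "KeepAlive"})  # sent as a str
KEEPALIVE_INTERVAL_S = 5.0  # assumed interval; check the Deepgram docs


class RecordingConnection:
    """Hypothetical stand-in for dg_connection; records what is sent."""

    def __init__(self):
        self.sent = []

    async def send(self, data):
        self.sent.append(data)


async def stream_with_keepalive(connection, chunks, interval_s=KEEPALIVE_INTERVAL_S):
    """Send audio chunks in order, adding a KeepAlive whenever one is due."""
    last_keepalive = time.monotonic()
    for chunk in chunks:
        await connection.send(chunk)  # audio goes out as bytes
        if time.monotonic() - last_keepalive >= interval_s:
            await connection.send(KEEPALIVE_MESSAGE)  # keepalive goes out as text
            last_keepalive = time.monotonic()


conn = RecordingConnection()
# interval_s=0.0 forces a keepalive after every chunk so the pattern is visible.
asyncio.run(stream_with_keepalive(conn, [b"a", b"b", b"c"], interval_s=0.0))
print(conn.sent)
```

With a realistic interval, most iterations send only audio and the keepalive goes out occasionally, matching the send(audio data) / send(keepalive) pattern above.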

tulip pagoda
#

?

lime condor
#

yes

#

you are overthinking this a lot

tulip pagoda
#

where { "type": "KeepAlive" } is a string

lime condor
#

ok, I have other things I need to do. I think you will figure this out eventually

tulip pagoda
#

I am, although the documentation isn't very clear. Thanks a lot.

#

Sure; I am trying to get this right before we launch to production as we are moving to a Growth plan with Deepgram

lime condor
#

it's all good

#

if you have any suggestions on how to improve the docs, let me know

tulip pagoda
#

Yes, I have a few suggestions; I'll write them up for you.

lime condor
#

you can also use the sync/threaded version as a guide since keepalive is implemented there. we do have an issue to implement it in async, but just haven't gotten around to it yet

tulip pagoda
#

got you thanks for the suggestion

wide shuttle
#

Can someone help

lime condor
#

hi @wide shuttle this is a super old thread. can you start a new request in #1115960287183511643 and include all details for what you are trying to do. minimally the SDK version, the deepgram options being used, and information about the audio stream (where is it coming from and what is the encoding of that audio)