STT custom integration doesn't get audio right

#

Reformatting the message so it's easier to read:

#

Hi, I'm trying to integrate the OpenAI Whisper API (the hosted one, not the self-hosted version) into the Assist pipeline as a custom integration. The code is pretty simple:

async def async_process_audio_stream(
    self, metadata: SpeechMetadata, stream
) -> SpeechResult:
    audio_data = b""
    async for chunk in stream:
        audio_data += chunk
    # ... send the audio to openai API

#

Whether I send audio_data directly or a file object wrapping it (io.BytesIO(audio_data)), the API doesn't recognize the audio as valid.

BUT if I send a "static" audio file instead, it works. Steps below:

set the stt part of the pipeline to local whisper
activate the debug_recording
call the assistant
get the generated wav file and copy it to config
hardcode sending this file in my script with open('audio.wav', 'rb')
switch to my script in voice assistant settings and call the assistant
it works!

So this leads me to believe there are three possible explanations:

io.BytesIO(audio_data) is not strictly equivalent to open(audio_file, 'rb'), although I was under the impression that it is (my knowledge of Python is very limited)
the code that gathers the stream (see above) isn't right
the stream isn't passed to async_process_audio_stream at all (but I'm able to output bytes from there, so most likely not this)
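
One difference worth checking for the first explanation: open('audio.wav', 'rb') yields bytes that begin with a WAV header, whereas io.BytesIO(audio_data) contains only whatever bytes arrived in the stream. If the stream carries raw PCM samples with no header (an assumption, not something confirmed above), the API would have nothing to identify the format from. Below is a minimal sketch that wraps such bytes in a WAV container using the standard wave module. The helper name pcm_to_wav and the 16 kHz, 16-bit, mono defaults are assumptions for illustration; the real values should be taken from metadata.

```python
import io
import wave


def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 16000,  # assumed default, check metadata
    sample_width: int = 2,     # bytes per sample (16-bit), assumed
    channels: int = 1,         # mono, assumed
) -> io.BytesIO:
    """Wrap raw PCM bytes in an in-memory WAV file (hypothetical helper)."""
    buffer = io.BytesIO()
    # wave writes the RIFF/WAVE header and patches the sizes on close;
    # it does not close a file object it did not open itself.
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    buffer.seek(0)
    # Some HTTP clients infer the upload format from a file name.
    buffer.name = "audio.wav"
    return buffer
```

The returned object can then be passed wherever open('audio.wav', 'rb') was used in the hardcoded test. If the stream already contains a WAV header, this wrapper is not needed and would produce an invalid file.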

I would like to rule out this last point, but I don't seem to have permission to write a file anywhere on the filesystem from the script. Can someone point me in the right direction, please? Thanks
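
The collected bytes can also be inspected without writing a file, by logging a short summary of them. This is a rough sketch: describe_audio is a hypothetical helper, and it only tells a RIFF/WAVE header apart from everything else.

```python
import logging

_LOGGER = logging.getLogger(__name__)


def describe_audio(data: bytes) -> str:
    """Summarize collected audio bytes for a log line (hypothetical helper)."""
    # A WAV file starts with b"RIFF", 4 size bytes, then b"WAVE".
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        kind = "WAV container"
    else:
        kind = "no WAV header (possibly raw PCM)"
    return f"{len(data)} bytes, {kind}, first bytes: {data[:12].hex()}"


# Example call, e.g. right after the loop that gathers the stream:
_LOGGER.debug("STT audio: %s", describe_audio(b"\x00\x01" * 8))
```

Running the same helper on the contents of the debug_recording WAV file would show whether the two byte sequences start the same way.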

#

Also, I assume it's expected that my custom script doesn't output a WAV file when debug_recording is on? Does it need dedicated code to do so?
