#Whisper API doesn't work for audio recorded on Safari.


sacred basin
#

Audio recorded on Safari does not work with Whisper. This is also documented here: https://community.openai.com/t/whisper-api-cannot-read-files-correctly/93420/13

Does anyone have a solution? The current fixes all seem to involve loading ffmpeg, which seems less than ideal.

I am having similar issues on Firefox, but I have not seen similar reports elsewhere, so that might just be an issue on my end.

OpenAI API Community Forum

I’m also facing the same issue when using the MediaRecorder in Safari with MP4s.

still tide
#

Hi David, sorry to chime in - I don't have a solution for your problem, but maybe you can help me out with mine, as I am getting a "Bad request" on my calls - how do you send the audio file over via API? Thanks!

gaunt mantle
#

seems to be broken for me too. according to the api "file=<path to audio>" but I am getting "1 validation error for Request\nbody -> file\n Expected UploadFile, received: <class 'str'> (type=value_error)" so do I send the file's body or the file path?

gaunt mantle
#

I just managed to post a file successfully. the file parameter is the file data.
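
For anyone hitting the same "Expected UploadFile" error, here is a minimal sketch of sending the file data (not a path) as multipart form data. It assumes Node 18+ with the built-in `fetch`, `FormData`, and `Blob`; the helper names `buildTranscriptionForm` and `transcribe` are hypothetical, and `apiKey` is whatever key you already use.

```javascript
// Hypothetical helper: wraps raw audio bytes in a multipart form.
// The "file" field carries the bytes themselves, not a file path.
function buildTranscriptionForm(audioBytes, filename, mimeType) {
  const form = new FormData();
  form.append("file", new Blob([audioBytes], { type: mimeType }), filename);
  form.append("model", "whisper-1");
  return form;
}

// Hypothetical helper: posts the form to the transcription endpoint.
// No Content-Type header is set by hand; fetch adds the multipart boundary itself.
async function transcribe(audioBytes, filename, mimeType, apiKey) {
  const response = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audioBytes, filename, mimeType),
  });
  if (!response.ok) {
    throw new Error(`Transcription failed: ${response.status} ${await response.text()}`);
  }
  const result = await response.json();
  return result.text;
}
```

The filename extension matters here, since the upload is just bytes plus a name.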

sacred basin
#

@still tide I'm using Node, and I'm sending the audio as form data in an axios POST request

#
  
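```
// Assumes the "form-data" npm package; audioStream is a readable stream of the recording
const formData = new FormData()
formData.append("file", audioStream)
formData.append("model", "whisper-1")
```
#

```
// The first line of this call was cut off in the log; the URL is the standard transcription endpoint
axios.post("https://api.openai.com/v1/audio/transcriptions", formData, {
      headers: {
        Authorization: `Bearer ${token}`,
        // No spaces around "=" in the boundary parameter
        "Content-Type": `multipart/form-data; boundary=${formData.getBoundary()}`,
      }
    })
```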
#

But worth noting that I was getting Bad Request on all my calls with Firefox and Safari audio - still haven't found a solution for that, so if you're trying it out, make sure to try it on Chrome

still tide
#

thank you David - I am actually recording on Firefox, I'll follow your advice and try it with Chrome before making changes to the backend
(I am also using Node)

sacred basin
#

Actually, after seeing your comment I noticed that the openai Node package now works for Whisper

#

their example works without edits

still tide
#

does it work for you on Chrome?

#

because I was getting a Bad Request with their npm package as well, but I'll now test it with Chrome

#

which mimeType are you using? Maybe that's a topic as well

#

wait a second

#

now it works on Chrome

#

ok, the game is on!

#

thanks already, let's figure this out

#

found a way to not have to save it to disk

sacred basin
#

I got timed out for my last message, but yeah! MP3, WebM, and WAV seem to work fine on Chrome. The link in my top message is active, with some people trying to solve it for Safari - there is a fix, but TBD on Firefox

still tide
#

const { Readable } = require('stream');
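
The rest of that snippet did not make it into the log, so this is only a sketch of the general idea: wrap the uploaded bytes in a `Readable` so nothing is written to disk. The helper name `bufferToStream` is hypothetical, and setting `path` on the stream is an assumption about how some upload libraries guess the filename, not something confirmed in this thread.

```javascript
const { Readable } = require("stream");

// Hypothetical helper: wraps an in-memory Buffer in a Readable stream.
function bufferToStream(buffer, filename) {
  // read() is a no-op because all data is pushed up front.
  const stream = new Readable({ read() {} });
  stream.push(buffer);
  stream.push(null); // signal end of data
  // Assumption: some multipart libraries read `path` to infer the filename.
  stream.path = filename;
  return stream;
}
```

Usage would look like `formData.append("file", bufferToStream(req.file.buffer, "clip.webm"))`, where `req.file.buffer` stands in for however your server receives the upload.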

sacred basin
#

Oh, I might try that - I've just been saving and flushing

still tide
#

Perhaps if the browsers have different internal webm formats, how about converting them on the backend - will check with ffmpeg
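
A rough sketch of what backend conversion could look like, assuming an `ffmpeg` binary is on the PATH. The helper names `buildFfmpegArgs` and `convertToMp3` are hypothetical, and the mono / 16 kHz settings are just an illustrative choice. Piping through stdin may not work for every input (for example, MP4 files that need seeking), in which case a temp file would be needed.

```javascript
const { spawn } = require("child_process");

// Hypothetical helper: ffmpeg arguments for "read stdin, write MP3 to stdout".
function buildFfmpegArgs() {
  return [
    "-hide_banner", "-loglevel", "error",
    "-i", "pipe:0",   // input from stdin, container auto-detected
    "-vn",            // drop any video track
    "-ac", "1",       // mono
    "-ar", "16000",   // 16 kHz sample rate
    "-f", "mp3",
    "pipe:1",         // output to stdout
  ];
}

// Hypothetical helper: converts an audio Buffer to MP3 without touching disk.
function convertToMp3(inputBuffer) {
  return new Promise((resolve, reject) => {
    const ffmpeg = spawn("ffmpeg", buildFfmpegArgs());
    const out = [];
    const err = [];
    ffmpeg.stdout.on("data", (chunk) => out.push(chunk));
    ffmpeg.stderr.on("data", (chunk) => err.push(chunk));
    ffmpeg.on("error", reject); // e.g. ffmpeg not installed
    ffmpeg.on("close", (code) => {
      if (code === 0) resolve(Buffer.concat(out));
      else reject(new Error(`ffmpeg exited with ${code}: ${Buffer.concat(err).toString()}`));
    });
    // Ignore EPIPE if ffmpeg exits before reading all input.
    ffmpeg.stdin.on("error", () => {});
    ffmpeg.stdin.end(inputBuffer);
  });
}
```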

sacred basin
#

yeah, someone did that in the support forum - it works but is slow, if I remember correctly

still tide
#

hmmm

#

not sure, but it looks like the files generated by Firefox are just in the ogg format (just opened the file in the text editor) - that might be an issue
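
Instead of opening the file in a text editor, the container can be checked from the first few bytes. This is a small sketch that only covers the formats mentioned in this thread; the function name `sniffAudioContainer` is hypothetical.

```javascript
// Hypothetical helper: guesses the container from well-known magic bytes.
// Accepts a Buffer or Uint8Array holding at least the start of the file.
function sniffAudioContainer(bytes) {
  if (bytes.length < 12) return "unknown";
  const ascii = (start, end) => String.fromCharCode(...bytes.subarray(start, end));

  if (ascii(0, 4) === "OggS") return "ogg";
  // EBML header, shared by WebM and Matroska
  if (bytes[0] === 0x1a && bytes[1] === 0x45 && bytes[2] === 0xdf && bytes[3] === 0xa3) return "webm";
  // ISO base media files (MP4/M4A) carry "ftyp" at offset 4
  if (ascii(4, 8) === "ftyp") return "mp4";
  if (ascii(0, 4) === "RIFF" && ascii(8, 12) === "WAVE") return "wav";
  // MP3: ID3 tag or an MPEG frame sync (11 set bits)
  if (ascii(0, 3) === "ID3" || (bytes[0] === 0xff && (bytes[1] & 0xe0) === 0xe0)) return "mp3";
  return "unknown";
}
```

Logging this on the server for each upload makes it easier to see which browser is sending which format.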

still tide
#

does this help?

sacred basin
#

supposedly, but so far it's just broken all my recording logic

still tide
#

unfortunately the browsers produce different formats, and OpenAI seems to be very strict about what they accept

#

or they don't accept ogg
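
Since each browser supports a different set of recording formats, one option is to ask the browser before recording. A minimal sketch: the candidate list and the name `pickMimeType` are assumptions for illustration, and which entry each browser ends up accepting is not something verified here. The support check is passed in as a function so the helper can also be exercised outside a browser.

```javascript
// Illustrative candidate list, in order of preference (an assumption, adjust as needed).
const CANDIDATE_MIME_TYPES = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
  "audio/ogg;codecs=opus",
];

// Hypothetical helper: returns the first candidate the support check accepts, or null.
function pickMimeType(isSupported, candidates = CANDIDATE_MIME_TYPES) {
  return candidates.find((type) => isSupported(type)) ?? null;
}

// Browser usage (MediaRecorder.isTypeSupported is the standard static check):
// const mimeType = pickMimeType((type) => MediaRecorder.isTypeSupported(type));
// const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);
```

Sending the chosen `mimeType` along with the upload lets the backend pick a matching file extension.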

still tide
#

I think I was able to make it work in FF with the following setting:

#

making some more tests - I can't try it on Safari Desktop yet

sacred basin
#

oh neat - is that using the polyfill media recorder?

sacred basin
#

Woohoo! That polyfill solution completely fixed my issues, it's now working on Safari, Firefox, and Chrome

#

One caveat - assigning onstop or ondataavailable doesn't seem to work - you have to add event listeners for them instead (which they loosely cover in their GitHub)
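
A sketch of the listener-based pattern, for reference. It only assumes the recorder object supports `addEventListener` with `dataavailable` and `stop` events plus a `mimeType` property, as the standard MediaRecorder does; the helper name `collectRecording` is hypothetical and not taken from the polyfill's docs.

```javascript
// Hypothetical helper: gathers recorded chunks via event listeners
// (not the onstop / ondataavailable properties) and hands back one Blob on stop.
function collectRecording(recorder, onDone) {
  const chunks = [];
  recorder.addEventListener("dataavailable", (event) => {
    chunks.push(event.data);
  });
  recorder.addEventListener("stop", () => {
    onDone(new Blob(chunks, { type: recorder.mimeType }));
  });
}

// Browser usage:
// const recorder = new MediaRecorder(stream);
// collectRecording(recorder, (blob) => uploadToServer(blob)); // uploadToServer is a placeholder
// recorder.start();
```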
