I'm not sure what I'm doing wrong here, but I can't get speech-to-text working. I have a Vue 3 app that records audio and posts it to my server as multipart/form-data. The server is essentially a Python API running in an Azure Function, which then posts the audio to the OpenAI transcription endpoint. No matter what I do, I always get a 400 response: Unrecognized file format. Supported formats: ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'oga', 'ogg', 'wav', 'webm']. The only difference between what I'm doing and the docs is that I pass the file as a byte stream instead of an opened file. The file is a .wav file. I've also tried with ogg.
@app.route(route="ProcessAudio", methods=["POST"])
def ProcessAudio(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processing an audio request.')
    file = req.files['audio']
    audio_data = file.stream.read()
    audio_file = BytesIO(audio_data)
    audio_file.seek(0)
    with open('temp2.wav', 'wb') as f:  # This file plays fine, it's not corrupted
        f.write(audio_data)
    try:
        transcript_response = client.audio.transcriptions.create(
            file=audio_file,
            model="whisper-1"
        )
        transcript = transcript_response['text']
        return func.HttpResponse(transcript, status_code=200)
    except Exception as e:
        logging.error(f"Error from OpenAI: {str(e)}")
        return func.HttpResponse(
            "Failed to process the audio transcription.",
            status_code=500
        )
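
One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: a bare BytesIO carries no file name, and the endpoint seems to work out the audio format from the name of the uploaded file. If that is what is happening here, giving the stream a name with a supported extension might be enough. The helper names `named_audio_stream` and `transcribe` and the default name `audio.wav` are hypothetical, and `response.text` assumes the v1 `openai` Python client, where the response is an object rather than a dict.

```python
from io import BytesIO


def named_audio_stream(audio_data: bytes, filename: str = "audio.wav") -> BytesIO:
    """Wrap raw audio bytes in a BytesIO that carries a file name.

    The name (and its extension) is an assumption about what the upload
    needs; it should match the real format of the recorded audio.
    """
    stream = BytesIO(audio_data)
    # BytesIO instances accept extra attributes, so a name can be attached.
    stream.name = filename
    return stream


def transcribe(client, audio_data: bytes, filename: str = "audio.wav") -> str:
    """Send named audio bytes to the transcription endpoint.

    `client` is assumed to be an already configured openai.OpenAI instance.
    """
    response = client.audio.transcriptions.create(
        file=named_audio_stream(audio_data, filename),
        model="whisper-1",
    )
    # Assumption: v1 client, where the transcript is exposed as an attribute.
    return response.text
```

Inside the function above this would be called as `transcribe(client, audio_data)` in place of the direct `create` call. Passing a `(filename, bytes)` tuple such as `file=("audio.wav", audio_data)` should be an equivalent way to supply the name without wrapping the bytes first.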