#Transcription endpoint - Unrecognized file format


wide crag

I'm not sure what I'm doing wrong here, but I can't get speech-to-text working. I have a Vue 3 app that records the audio and posts it to my server as multipart/form-data. My server is essentially a Python API running in an Azure Function, which then posts the audio to the OpenAI transcription endpoint. No matter what I do, I always get a 400 response: Unrecognized file format. Supported formats: ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'oga', 'ogg', 'wav', 'webm']. The only difference between my code and the docs is that I pass the file as a byte stream. The file is a .wav file; I've also tried ogg.

@app.route(route="ProcessAudio", methods=["POST"])
def ProcessAudio(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processing an audio request.')

    file = req.files['audio']  
    audio_data = file.stream.read() 
    audio_file = BytesIO(audio_data)
    audio_file.seek(0) 
    
    with open('temp2.wav', 'wb') as f: # This file plays fine, it's not corrupted
        f.write(audio_data)

    try:
        transcript_response = client.audio.transcriptions.create(
            file=audio_file,
            model="whisper-1"
        )
        transcript = transcript_response['text']

        return func.HttpResponse(transcript, status_code=200)

    except Exception as e:
        logging.error(f"Error from OpenAI: {str(e)}")
        return func.HttpResponse(
            "Failed to process the audio transcription.",
            status_code=500
        )
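
One thing that may be worth checking, though it is not confirmed anywhere in this thread: a bare BytesIO has no filename, and this error is often reported when the upload reaches the endpoint without a recognizable extension. The sketch below assumes that is the cause and attaches a name to the in-memory buffer before it is sent. The helper named_audio_buffer and the default name "audio.wav" are hypothetical, chosen only for illustration.

```python
from io import BytesIO


def named_audio_buffer(audio_data: bytes, filename: str = "audio.wav") -> BytesIO:
    """Wrap raw audio bytes in a BytesIO that carries a filename.

    Assumption: the extension in `filename` matches the real audio format
    and is what the upload needs in order to be recognized.
    """
    buffer = BytesIO(audio_data)
    buffer.name = filename  # hypothetical default; use the real extension
    return buffer


# Possible use inside the handler above (not executed here, needs `client`):
#
#     audio_file = named_audio_buffer(audio_data, "audio.wav")
#     transcript_response = client.audio.transcriptions.create(
#         file=audio_file,
#         model="whisper-1",
#     )
#     # With the v1 Python SDK the result is an object, so the text is
#     # probably read as an attribute rather than by subscript:
#     transcript = transcript_response.text
```

If the name alone does not help, the SDK's file parameter also appears to accept a (filename, bytes) tuple, for example file=("audio.wav", audio_data), which carries the same information without the wrapper.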