#Is Whisper able to transcribe an audio file where multiple languages are spoken?

finite marten

Is Whisper able to transcribe an audio file where multiple languages are spoken?

jagged vector BOT

Hi @finite marten. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

Whisper is designed for monolingual audio and does not handle code-switching (audio that switches between languages) well. If your file contains multiple languages, the recommended approach is to use a third-party tool to segment the audio into sentences and then run Whisper on each segment separately. This works best when the languages spoken are known and the language of each segment can be identified.

Here is an example that transcribes several audio files with the medium model:

whisper audio.flac audio.mp3 audio.wav --model medium

If the audio file contains non-English speech, you can specify the language using the --language option:

whisper japanese.wav --language Japanese

However, this approach assumes that the entire audio file is in the specified language.
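
If the language of a clip is not known in advance, Whisper's Python API can guess it from the first 30 seconds of audio. The following is a minimal sketch rather than an official recipe: it assumes the `openai-whisper` package is installed, and the helper names `most_likely_language` and `detect_clip_language` are hypothetical, introduced only for illustration.

```python
from typing import Dict


def most_likely_language(probs: Dict[str, float]) -> str:
    """Pick the language code with the highest probability."""
    return max(probs, key=probs.get)


def detect_clip_language(path: str, model_name: str = "medium") -> str:
    """Guess the language of one clip from its first 30 seconds (hypothetical helper)."""
    import whisper  # lazy import: only needed when a real clip is analysed

    model = whisper.load_model(model_name)
    # Whisper works on 30-second windows, so pad or trim the clip to that length.
    audio = whisper.pad_or_trim(whisper.load_audio(path))
    mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(model.device)
    _, probs = model.detect_language(mel)  # dict: language code -> probability
    return most_likely_language(probs)
```

Because only one 30-second window is inspected, a single guess like this says nothing about later parts of a recording that switches languages, which is why segmenting first matters.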

If the file mixes several languages, you will need an additional tool to split it into single-language segments first, and then transcribe each segment with the matching --language value.
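
The segment-then-transcribe idea could be sketched in Python roughly as follows. This is only a sketch under stated assumptions: the `openai-whisper` package is installed, some other tool has already split the recording into single-language clips, and the `Clip` type, the `whisper_transcriber` and `transcribe_clips` helpers, and any file names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Clip:
    """One single-language piece of the original recording (hypothetical type)."""
    path: str      # audio file produced by an external segmentation tool
    language: str  # language code passed to Whisper, e.g. "en" or "ja"


def whisper_transcriber(model_name: str = "medium") -> Callable[[str, str], str]:
    """Build a function that transcribes one file in one known language."""
    import whisper  # lazy import so the rest of the sketch runs without it

    model = whisper.load_model(model_name)

    def transcribe(path: str, language: str) -> str:
        # Passing `language` tells Whisper which language to decode for this clip.
        result = model.transcribe(path, language=language)
        return result["text"].strip()

    return transcribe


def transcribe_clips(
    clips: List[Clip], transcribe: Callable[[str, str], str]
) -> List[Tuple[str, str]]:
    """Transcribe each clip separately, keeping the original clip order."""
    return [(clip.language, transcribe(clip.path, clip.language)) for clip in clips]
```

With real files, something like `transcribe_clips([Clip("part1_en.wav", "en"), Clip("part2_ja.wav", "ja")], whisper_transcriber())` would return one `(language, text)` pair per clip. Taking the transcriber as an argument loads the model once and lets the loop be tried with a stand-in function.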

Source: Whisper GitHub Discussion #1160 (https://github.com/openai/whisper/discussions/1160)

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon: