#Speed differences in Whisper models?

patent sequoia
#

I am using lscr.io/linuxserver/faster-whisper:gpu and was wondering if there's a big speed difference between the tiny, base, small, and medium models. I am running a 5060 Ti 16GB (plus Ollama, OpenWebUI, etc.).
I am currently using tiny-int8.

#

Edit: not speed, but are the better results from the bigger models, with more parameters, a worthwhile trade-off for the larger size?

royal seal
#

There are large speed and accuracy differences.

elfin hearth
#

Depending on how much memory you are willing to allocate and the desired quality, the choice will be between the small and large-turbo versions.
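
To put rough numbers on the memory side of that choice, here is a minimal sketch that estimates the size of the weights alone for each model at a few precisions. The parameter counts are the approximate figures from the OpenAI Whisper README; treating the thread's "large-turbo" as the model listed there as `turbo` is an assumption. Real VRAM use will be higher (activations, CUDA context, any other services sharing the GPU), so read these as lower bounds, not measurements.

```python
# Approximate parameter counts in millions, as listed in the OpenAI Whisper README.
# "turbo" is assumed to be what the thread calls "large-turbo".
PARAMS_MILLIONS = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "medium": 769,
    "large": 1550,
    "turbo": 809,
}

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"int8": 1, "float16": 2, "float32": 4}


def weight_size_gb(model: str, precision: str) -> float:
    """Rough size of the weights alone, in GB (10^9 bytes)."""
    return PARAMS_MILLIONS[model] * 1e6 * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for model in PARAMS_MILLIONS:
        sizes = ", ".join(
            f"{precision}: {weight_size_gb(model, precision):.2f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{model:>6}  {sizes}")
```

By this estimate even the largest model's weights are only a few GB at float16, so on a 16GB card the practical limit is more likely to be whatever else is loaded alongside it.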

patent sequoia
#

OK, tiny seems too small. I'll try out small and adjust from there.

novel finch
#

With German I am forced to use the large model to get at least usable recognition. All models below it give bad results. I have it running on an RTX 3090. Speed with that is instant (which is actually a requirement for an Alexa-like voice experience).

uncut basin
#

Similar experience in Spanish. Anything below medium is quite bad. Medium is decent, but large and large-turbo are mandatory if your hardware can handle them. That said, if even a 3060 processes Whisper large-turbo voice commands in half a second, there is really no point in going lower. Whisper is very fast on GPUs.
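
For anyone who wants to check the speed difference on their own hardware, here is a minimal sketch of a timing harness using the `faster-whisper` Python package directly, rather than the linuxserver container. The file name `sample.wav`, the helper names, and the choice of `int8` with `beam_size=1` are assumptions for illustration; swap in a short recording in your own language and the models you care about.

```python
import time
from typing import Callable, Dict, Iterable


def time_transcription(transcribe: Callable[[], str], runs: int = 3) -> float:
    """Return the best wall-clock time in seconds over `runs` calls."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        transcribe()
        best = min(best, time.perf_counter() - start)
    return best


def make_runner(model_name: str, audio_path: str,
                device: str = "cuda", compute_type: str = "int8") -> Callable[[], str]:
    """Load one model and return a zero-argument function that transcribes the file."""
    # Imported lazily so the timing helper above works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_name, device=device, compute_type=compute_type)

    def run() -> str:
        segments, _info = model.transcribe(audio_path, beam_size=1)
        # `segments` is a generator: the actual work happens while iterating it.
        return " ".join(segment.text.strip() for segment in segments)

    return run


def compare_models(model_names: Iterable[str], audio_path: str) -> Dict[str, float]:
    """Time each model on the same audio file, after one warm-up call."""
    results = {}
    for name in model_names:
        runner = make_runner(name, audio_path)
        runner()  # warm-up so one-time initialization is not counted
        results[name] = time_transcription(runner)
    return results


if __name__ == "__main__":
    # "sample.wav" is a hypothetical short voice command recording.
    timings = compare_models(["tiny", "base", "small", "medium"], "sample.wav")
    for name, seconds in timings.items():
        print(f"{name:>6}: {seconds:.2f} s")
```

Timing alone says nothing about accuracy, so it is worth reading the transcripts each model returns for the same clip before settling on a size.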