#Whisper on an NVIDIA P40

feral bone
#

Hi,
I have a server (running Unraid) with an NVIDIA P40 in it. The GPU runs Ollama (in a Docker container) to support my HA setup. I would like to get everything local, but faster-whisper (running on CPU) is about 10x slower than Home Assistant Cloud. To improve on that, I would like to run it on my GPU. I have looked at a number of Docker options but have not found one that works with my setup. Does anyone have any suggestions?

sleek yoke
#
GitHub

This Docker image provides a convenient environment for running OpenAI Whisper, a powerful automatic speech recognition (ASR) system. - manzolo/openai-whisper-docker

Medium

Hi fellows, in this article I have talked about how to run the Whisper Large v3 Speech-to-Text(STT) model on a Docker container with GPU…

feral bone
#

Damn. I forgot to include one other part of this question: the solution needs to support the Wyoming protocol (in order to tie back into HA). Well, I at least assume that to be the case. Verifying that now.
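
For anyone checking the same assumption, a Wyoming server can be probed over plain TCP. The sketch below assumes the protocol's JSON-lines framing (a one-line JSON header, optionally followed by `data_length` bytes of extra JSON and `payload_length` bytes of binary payload) and that a `describe` event is answered with an `info` event. The host, port 10300, and helper names are illustrative assumptions, not something stated in this thread.

```python
import json
import socket


def build_describe_event() -> bytes:
    """Encode a Wyoming 'describe' event as a single JSON line."""
    return (json.dumps({"type": "describe"}) + "\n").encode("utf-8")


def parse_header(line: bytes) -> tuple[str, int, int]:
    """Return (event type, data_length, payload_length) from a header line."""
    header = json.loads(line.decode("utf-8"))
    return (
        header["type"],
        int(header.get("data_length") or 0),
        int(header.get("payload_length") or 0),
    )


def probe(host: str = "127.0.0.1", port: int = 10300, timeout: float = 5.0) -> dict:
    """Send 'describe' and return the reply's type and data (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_describe_event())
        with sock.makefile("rb") as stream:
            line = stream.readline()
            event_type, data_length, payload_length = parse_header(line)
            # Data may arrive inline in the header or as a separate JSON block.
            inline = json.loads(line).get("data") or {}
            extra = json.loads(stream.read(data_length)) if data_length else {}
            if payload_length:
                stream.read(payload_length)  # discard any binary payload
    return {"type": event_type, "data": {**inline, **extra}}
```

A reply of type `info` from `probe()` would suggest the container speaks Wyoming and can be added to HA through the Wyoming integration.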

#

@sleek yoke Thanks for the response, BTW.

feral bone
#

@edgy pike Thanks! I had actually tried the LinuxServer.io version of faster-whisper. With the "latest" image tag and tiny-int8, things work out of the box. However, if I try to use the "gpu" tag, I get "RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version", which I got the following feedback on: https://github.com/SYSTRAN/faster-whisper/issues/1229. I am OK with compiling code, but I'm not sure how I would do that for a container.

GitHub

Faster Whisper transcription with CTranslate2. Contribute to SYSTRAN/faster-whisper development by creating an account on GitHub.
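
That error generally means the host's NVIDIA driver supports an older CUDA version than the runtime bundled in the container. Below is a minimal sketch of that comparison, assuming the `CUDA Version: X.Y` field printed in the `nvidia-smi` header (the highest CUDA version the installed driver supports). The sample header text and the runtime versions are made-up illustrations, not values from this thread, and the check is deliberately conservative.

```python
import re

# Illustrative nvidia-smi header line (assumed values, not the poster's output).
SAMPLE_HEADER = (
    "| NVIDIA-SMI 535.104.05   Driver Version: 535.104.05   CUDA Version: 12.2 |"
)


def driver_cuda_version(nvidia_smi_output: str):
    """Extract the (major, minor) CUDA version the driver reports, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_supports(runtime_version: str, nvidia_smi_output: str) -> bool:
    """True if the driver's CUDA version is at least the container's runtime."""
    driver = driver_cuda_version(nvidia_smi_output)
    if driver is None:
        return False
    major, minor = (int(part) for part in runtime_version.split(".")[:2])
    return driver >= (major, minor)


if __name__ == "__main__":
    for runtime in ("11.8", "12.4"):
        print(runtime, driver_supports(runtime, SAMPLE_HEADER))
```

If the check fails for the image's CUDA runtime, the usual options are updating the host driver or choosing an image built against an older CUDA runtime.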