# Running Whisper Large Without GPU?

grand sorrel
#

Hi everyone! I’m planning to run the Whisper large model locally and I’m wondering — is a CUDA-enabled GPU strictly required for that version? Or is it possible to run it on CPU only? Any tips would be appreciated!

worn stirrup
#

If you can wait a minute or more you can use your CPU 😄
But seriously, the only way to truly find out how it behaves on your hardware is to test it.

dusty spire
#

I tried to run a larger Whisper model on CPU on my 20-core Xeon system... it was slow

worn stirrup
#

Did it utilize all cores/threads? I immediately switched to GPU once I saw how slow even small* was and didn't test much further.

dusty spire
#

yeah it spun everything up

#

even on the medium model it was pretty slow

dusty spire
#

Not accurate timing, but for science!
For the same voice command:

1650 Super GPU running large-v3-turbo
took less than 2 seconds for VPE to start responding

20-core Xeon (E5-2618L v4) running medium.en
took just over 21 seconds for VPE to start responding

I am prepared to accept that my CPU Whisper install might be unoptimised somehow, but even so it's a huge difference
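
For anyone who wants to repeat this kind of comparison on their own hardware, here is a minimal sketch of timing a single transcription with faster-whisper (the library mentioned further down the thread). It is not the setup behind the numbers above, which measure the whole voice pipeline rather than speech-to-text alone. The helper names `time_transcription` and `real_time_factor`, the default model, and the `int8` compute type are assumptions chosen for illustration.

```python
import time


def real_time_factor(elapsed_s: float, audio_s: float) -> float:
    """Processing time divided by audio length; below 1.0 is faster than real time."""
    if audio_s <= 0:
        raise ValueError("audio duration must be positive")
    return elapsed_s / audio_s


def time_transcription(audio_path: str, model_size: str = "medium.en",
                       device: str = "cpu", compute_type: str = "int8",
                       cpu_threads: int = 0):
    """Transcribe one file; return (text, elapsed seconds, audio seconds)."""
    # Imported lazily so real_time_factor() works without faster-whisper installed.
    from faster_whisper import WhisperModel

    # cpu_threads=0 keeps the library default; set it explicitly to test scaling.
    model = WhisperModel(model_size, device=device,
                         compute_type=compute_type, cpu_threads=cpu_threads)

    start = time.perf_counter()
    segments, info = model.transcribe(audio_path)
    # transcribe() returns a lazy generator, so the actual decoding
    # happens while the segments are consumed here.
    text = " ".join(segment.text.strip() for segment in segments)
    elapsed = time.perf_counter() - start
    return text, elapsed, info.duration
```

Calling `time_transcription("command.wav")` (a hypothetical file) once with `device="cpu"` and once with `device="cuda"` gives two elapsed times that can be passed to `real_time_factor` for a like-for-like comparison. Model loading is excluded from the timing on purpose, since a running voice assistant keeps the model in memory.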

grizzled sierra
#

Not exactly what you were looking for, but I thought I would add some info on my setup.

I purchased a Jetson Orin Nano Super some time ago and managed to get faster-whisper running with GPU acceleration, but only up to medium unfortunately, as there are memory issues with the large model.
From what I gather there might be ways around it, but I haven't had time to try them out.
It would be nice, since the Jetson Orin Nano is a fairly cheap entry-level AI dev kit that would suit this use case perfectly.

grand sorrel
#

Thanks for the insights, everyone! Based on what you've shared, it sounds like running the large model without a GPU is just not practical.
What kind of hardware with GPU would you recommend for running Whisper large smoothly?

worn stirrup
#

I'm currently using the large-v3 model and it only uses about 3.5G of VRAM. According to the GitHub page it needs 10G 🤔
I suppose it depends on the language/voice and other things. I think a 3060 12G has a good price-to-performance ratio.
Perhaps a bit overpowered just for Whisper, but it's unlikely it will stay at that. 3090 if you have the budget.
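
One possible explanation for the gap, as a rough back-of-the-envelope sketch: the memory taken by the weights alone depends on the numeric precision they are stored in. The parameter count below is the approximate figure the Whisper README lists for the large model. The helper name `weight_memory_gib` is hypothetical, and whether the setup above actually runs in half precision or int8 is an assumption, not something confirmed in this thread. Activations, decoding buffers, and runtime overhead are ignored.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

# Approximate parameter count of Whisper large (assumed to apply to large-v3).
LARGE_PARAMS = 1_550_000_000


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Memory for the model weights only, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(LARGE_PARAMS, dtype):.1f} GiB")
```

Under these assumptions the weights come to a little under 3 GiB in float16 and about twice that in float32, so a reading of around 3.5G is plausible for a half-precision or quantized backend, while a full-precision setup with extra headroom would land closer to the documented figure.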

grand sorrel
worn stirrup
#

No. It's kinda hard to put a full sized PCI(e) card in a Mini PC. Your picture in the first message shows it.

#

I'd buy the GPU used, by the way.

grand sorrel
# worn stirrup I'd buy the GPU used, by the way,

Just to make sure I understand — is your entire Home Assistant setup running on a full-sized PC?
I was actually looking for something smaller that could fit in a rack-mounted case or enclosure.

worn stirrup
#

I'm running HA as a Docker container in a Debian VM on a Proxmox VE node. The hardware is inside a full tower.
TL;DR: Yes.

grand sorrel
#

@worn stirrup I’m currently running HAOS (Generic x86-64) on a Dell Wyse 5060 with 4GB RAM (no AVX support), and I also have Home Assistant Voice installed.
I’d like to set up a voice assistant connected to ChatGPT (not running locally, but through the API).

What would you suggest I do or change in my setup to get this working well?
My budget is around $1k.

worn stirrup
#

I guess the cheapest way without changing anything would be a Nabu Casa subscription 😄
For hardware recommendations we'd need more information.

grand sorrel
worn stirrup
#

I was under the impression you can let it make your voice to text but I haven't tried. What exact goals are you trying to achieve with the hardware, and what is important to you? Power consumption, for example. Do you want to build it yourself or not? Stuff like that.

grand sorrel
# worn stirrup I was under the impression you can let it make your voice to text but I haven't ...

So ideally, I’d like a ready-to-use device — I’m not experienced with building hardware myself. But if that’s the only way, I’m open to learning how to build it 😄
Power consumption isn’t a concern for me.

My main goal is to run a voice assistant that works reliably in Polish, using ChatGPT via API.
I haven’t worked with Docker before, but if it’s a good solution here, I’m also open to trying it.

worn stirrup
#

There are PCs with a 4560G, for example. But basically you'd just go on eBay and look for tower PCs where a GPU can fit. The issue with prebuilts like that is that they might not have a powerful PSU or 8-pin power cables. HAOS can be easier, but you are not as flexible with what you can run. Building your own tower gives you the most flexibility and choice but is more expensive. I built mine.

#

An alternative would be a GPU that needs less power so you can power it via the PCI(e) slot. 75W I think.

#

Giving more detailed non-biased recommendations is kinda hard.

grand sorrel
worn stirrup
#

Find a case and motherboard you like and go from there. I chose to go the AM4 route.
I recommend you get a full sized ATX mainboard and a case that can house it. I regret going mATX.

grand sorrel
#

CPU: AMD Ryzen 5 5600X

Motherboard: ASUS Prime B550-PLUS (ATX, AM4)

GPU: NVIDIA RTX 3060 12GB

Would this be a good combo, or would you suggest any changes?

worn stirrup
#

I chose the 5700G for myself because it has an iGPU and, due to its monolithic design, lower idle power consumption. It has fewer lanes though, and no ECC support.
If that doesn't matter I'd pick the 5700X, as you get more cores/threads for a negligible price increase.