I've been trying to throw more power at Whisper so I can upgrade to a better model for more accuracy, whilst maintaining a quick response time, but I seem to be hitting a performance ceiling.
I'm running HA and the Whisper Add-On on a Proxmox machine with an i5-13500, and have 15 cores assigned to the VM. (Excessive, I know, but it's just for testing to see how much it will use.)
If I select a larger model than tiny or base, such as medium-int8, every command takes over three seconds to transcribe, yet CPU utilisation peaks at only ~30%. Is there anything more I can do?
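
To put the ~30% figure in terms of cores, and to check whether thread count is what limits the speed, a rough sketch like the one below could be run outside the add-on. It is only an illustration under assumptions: it assumes the `faster-whisper` Python package is installed, that the add-on's `medium-int8` option corresponds to the `medium` model with `compute_type="int8"`, and that `sample.wav` is a placeholder for a short recorded command. The helper names are made up for this example.

```python
import time


def busy_core_equivalent(vcpus: int, utilisation_percent: float) -> float:
    """Convert an overall CPU utilisation figure into 'fully busy core' equivalents."""
    return vcpus * utilisation_percent / 100


def time_transcription(audio_path: str, cpu_threads: int, model_name: str = "medium") -> float:
    """Return the seconds taken to transcribe audio_path with a given thread count.

    Model loading is excluded from the timing; only the transcription is measured.
    """
    # Imported lazily so the arithmetic helper above works without the package.
    from faster_whisper import WhisperModel

    # Assumption: "medium" + int8 approximates the add-on's medium-int8 option.
    model = WhisperModel(
        model_name, device="cpu", compute_type="int8", cpu_threads=cpu_threads
    )
    start = time.perf_counter()
    segments, _info = model.transcribe(audio_path, beam_size=1)
    # segments is a lazy generator; consuming it is what actually runs the decode.
    for _segment in segments:
        pass
    return time.perf_counter() - start
```

With the numbers above, `busy_core_equivalent(15, 30)` gives 4.5, i.e. roughly four to five of the 15 assigned cores are busy at peak. Calling `time_transcription("sample.wav", n)` for a few values of `n` (say 4, 8 and 15) would show whether extra threads reduce the time at all.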