# Do we really need a GPU for HA stuff?
Well, one caveat there, I guess: for general-purpose HA stuff, it can be done with something much smaller still, and that WILL run on a CPU easily. We don't even need DL for some of the tasks required for HA. For example:
https://github.com/rhasspy/wyoming-vosk
Using vosk as a system is allegedly sufficient for most HA stuff. Sure, for power users where you have 100,000s of sentences it may not be, but it's still possible to engineer solutions that run on cheaper hardware. And the cheaper it is, the better chance it has at competing with Alexa/Echo.
What I mean is, if you need speech transcription, like ASR, I'd agree you need a GPU, but for home automation, which seems to be the core of HA, you may not?
Thoughts?
Happy to be wrong
I'd be happy to be wrong too! It's not like we sell GPUs or are fanboys, etc.
You have good points generally. For example, we already have an approach for voice commands where the ESP-BOX supports an on-device speech recognition model that is highly accurate and performant for up to 400 pre-defined speech commands.
The issue here is that the HA community is more or less currently at the "turn a light on and off" stage of evolution and feature support. If that's all you need, approaches like what you describe (or our on-device model) are probably fine. However, the commercial stuff goes way beyond this, of course, and when you get to more complex sentence structure, grammar, etc. for things like shopping lists, calendar entries, general question answering, etc., you're back firmly in "GPU or not" territory.
In terms of being cheaper, as you have discovered with the 1050 Ti, GPUs aren't the expensive and power-hungry PIA they are for gaming, Linux desktop support, etc. We frequently get pushback from users who have had nightmare experiences with GPUs (especially on the Linux desktop), and they realize very, very quickly that GPUs in this role aren't the nightmare they were expecting based on that experience.
If you look at what a lot of people are doing in HA (putting an $80 USB microphone on a $100 Raspberry Pi), you can see the economics are actually in favor of the Willow approach. You can start from scratch, buy a used $200 desktop, throw a $100 GPU in it, and buy five Willow devices for less than $600 all-in. The voice satellite hardware alone for the approach I described above is nearly twice as much just for that hardware. Additionally, because GPUs idle at very low power, and BOX power utilization is measured in milliamps, you actually come out ahead in terms of power utilization as well. You can sell those Raspberry Pis/USB mics (or never buy them in the first place) and whatever you are using for HA currently (Pi, NUC, etc.) and come out ahead in terms of cost while having a substantially better experience. Share the GPU with friends and family with our soon-to-be-released "guild mode" and it's not even close.
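To make that arithmetic concrete, here's a rough Python sketch using only the prices quoted above. Two things are assumptions on my part: the Pi side is sized at five satellites so both setups cover the same number of rooms, and $600 is treated as a ceiling ("less than $600 all-in"), not an exact price. The function names are made up for illustration.

```python
# Prices quoted in the post above (USD).
USB_MIC = 80
RASPBERRY_PI = 100
USED_DESKTOP = 200
GPU = 100
WILLOW_TOTAL_CEILING = 600  # "less than $600 all-in": desktop + GPU + five devices

SATELLITES = 5  # assumption: same number of rooms in both setups


def pi_satellite_total(count: int) -> int:
    """Cost of `count` Pi + USB mic voice satellites, excluding any server."""
    return count * (USB_MIC + RASPBERRY_PI)


def willow_device_budget(count: int) -> float:
    """Most each Willow device can cost while staying at the $600 ceiling."""
    return (WILLOW_TOTAL_CEILING - USED_DESKTOP - GPU) / count


pi_total = pi_satellite_total(SATELLITES)
print(f"Pi satellites only:        ${pi_total}")
print(f"Willow all-in (ceiling):   ${WILLOW_TOTAL_CEILING}")
print(f"Implied per-device budget: ${willow_device_budget(SATELLITES):.0f}")
print(f"Pi satellites vs ceiling:  {pi_total / WILLOW_TOTAL_CEILING:.2f}x")
```

Against the $600 ceiling the satellite-only cost works out to 1.5x. The real ratio depends on the actual Willow total, which the post only bounds from above, so the further the total lands under $600, the closer it gets to "nearly twice".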