#Voice assistant mess
Does it make sense to run VA? Sure!
Does it make sense to run Whisper on a RPi? No!
Don't run a local speech-to-text engine on a Raspberry Pi. Either use something beefier, or the cloud STT from Nabu Casa.
What are the recommended specs for a home server to run Whisper locally, fast enough that it doesn't get annoying (2-3 seconds at most)?
Something with a GPU. Even an old one like a 1070 is much better than a CPU.
What about an external ESP32 solution like the Respeaker? I've seen a video about it. How could such a small device handle TTS and STT?
You're confusing things.
Respeaker Lite is just a microphone array. An ESP32-S3 turns it into a voice satellite capable of 3 things: detecting the wake word, sending the speech audio stream to HA, and playing the response audio back. It doesn't do STT or TTS on the device; there just aren't enough resources for that. The HA pipeline does that, with the help of Whisper/Piper or other solutions.
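To make the split concrete, here is a rough Python sketch of who does what. Every name in it (`Satellite`, `server_pipeline`, the `fake_*` stand-ins) is made up for illustration and is not an actual HA or ESPHome API; it only models the idea that the satellite moves audio around while the server does the heavy lifting.

```python
from typing import Callable

# Hypothetical names throughout: this models the division of labor only.


def server_pipeline(audio: bytes,
                    stt: Callable[[bytes], str],
                    handle: Callable[[str], str],
                    tts: Callable[[str], bytes]) -> bytes:
    """Runs on the HA host: speech -> text -> reply text -> reply audio."""
    text = stt(audio)      # speech-to-text, e.g. Whisper
    reply = handle(text)   # HA decides what to do and what to answer
    return tts(reply)      # text-to-speech, e.g. Piper


class Satellite:
    """Runs on the ESP32-S3: no STT or TTS, just audio in and audio out."""

    def __init__(self, send_to_server: Callable[[bytes], bytes]):
        self.send_to_server = send_to_server
        self.played: list[bytes] = []

    def on_wake_word(self, speech_audio: bytes) -> None:
        # After the wake word fires, ship the audio off and play what comes back.
        reply_audio = self.send_to_server(speech_audio)
        self.played.append(reply_audio)


# Stand-in engines so the sketch runs without any models installed.
def fake_stt(audio: bytes) -> str:
    return audio.decode()


def fake_handle(text: str) -> str:
    return f"OK: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode()


sat = Satellite(lambda a: server_pipeline(a, fake_stt, fake_handle, fake_tts))
sat.on_wake_word(b"turn on the light")
```

Swapping the stand-ins for real engines only changes the server side; the satellite part stays the same, which is why the little board gets away with so few resources.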
So what's the minimum advised platform to run HA with TTS and STT? I'm thinking about a second-hand NUC mini PC.
I'm talking about hardware requirements.
As tetele said, to be on the safe side, use a GPU.
Piper isn't that resource hungry, but Whisper needs fast processing. I run it on an 8th Gen, and it's not great (I have to use the small model).
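If you want to check whether a given box is "fast enough" before committing to it, a small timing helper like the one below may help. It is only a sketch: `transcribe` is a placeholder for whatever STT call you end up using (it is not a real Whisper function), and the 3-second default just mirrors the "2-3 seconds at most" target mentioned earlier in the thread.

```python
import time
from typing import Callable, Tuple


def measure_stt(transcribe: Callable[[bytes], str],
                audio: bytes,
                budget_s: float = 3.0) -> Tuple[str, float, bool]:
    """Time one transcription and report whether it fits the latency budget."""
    start = time.perf_counter()
    text = transcribe(audio)                  # placeholder for the real STT call
    elapsed = time.perf_counter() - start
    return text, elapsed, elapsed <= budget_s


# Stand-in transcriber so the sketch runs anywhere; replace with a real engine.
def dummy_transcribe(audio: bytes) -> str:
    return "turn on the light"


text, elapsed, within_budget = measure_stt(dummy_transcribe, b"\x00" * 16000)
print(f"{text!r} took {elapsed:.3f}s, within budget: {within_budget}")
```

Run it a few times with a recording of a typical command; the first call on a real engine often includes model loading, so the later runs are the more honest number.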