I have my assistant hooked up to openai and it works incredibly well. The only downside is that I need to type in my conversations on both desktop and Android. I see there is a STT option if you subscribe to Nabu Casa but I'm wondering if there are any local options or any other options that can also work until my Voice PE arrives?
#Is there a STT option for non home cloud users?
https://voice-pe.home-assistant.io/getting-started/#follow-the-wizard-steps has you covered
You need Whisper for STT and Piper for TTS
You can use Whisper as a local STT (Speech-To-Text), but you will require good hardware to get good results. Also, the outcome depends on the language you plan to use (English works pretty well, Spanish and German too, some other languages not so much).
The arrival of your VPE has nothing to do with your Nabu Casa subscription, so you can subscribe to NC even before your VPE arrives.
Apart from NC, there are other cloud-based STT solutions which work: Google, Azure (which is internally used by NC Cloud as well).
Lastly, on Android (using the companion app) you don't need STT, as Android does it for you IIRC. But I'm not 100% sure of this.
Thanks! I would prefer to not connect to NC as I like having everything offline (although minimal exposure to openai is fine until I get my LLM working). I have a 3090 so I think I should be able to run this, the main part of my question was how to get this working without NC being connected.
I don't have anything against NC other than that I would prefer to not add in a cloud factor but I'm happy to support HA if it means paying for a NC subscription.
I use Whisper myself, and performance isn't awesome - not terrible, but not awesome
I have a 3090 so I think I should be able to run this, the main part of my question was how to get this working without NC being connected.
you can ignore NC
AMD Ryzen 5 5560U and I see it take about 4 seconds to process a typical "easy" request, vs 2 seconds for Google Home
he did say he has a GPU, though
how about a STT -> openai -> TTS? openai is super quick!
I'm not sure how much the GPU will help - I don't know how much (if any) of Whisper is GPUable
it's VERY GPUable and it makes a ton of difference. you just need an instance which is built for CUDA cores. i haven't used one in a while now, but @surreal brook can surely suggest something
Ah, cool, that's good to know
Guess another hardware upgrade for HA is on my roadmap
Either that or a new home for Frigate
mind you, for Whisper you'd be doing well with an old 1070 @ 8GB VRAM or something like that
I'll try setting up Whisper and can report back to this thread how it goes.
Currently I have openai working via typing in conversations; surely there should be a way to just use SST on android to send to openai?
yup I use whisper on GPU with the large-v3-turbo model and my responses are around 200ms
Will whisper or piper help with that?
it's called STT. AFAIR, the Android companion app uses Android's built-in STT so you don't need anything else. But if I'm wrong, then Whisper is the one you're interested in
Piper turns text into audio (TTS=Text-To-Speech)
I don't see a speech option in Android Assist unless Home Assistant Cloud is enabled - but I also haven't tried Whisper or enabling STT on my assistant. Maybe that's the issue.
Thanks all, I think I have some work cut out for me this evening! Super excited (and I hope my Voice PE gets shipped quickly...)
this is Day 4 with HA and so far I'm really enjoying it!
one more thing: there's no HA addon for Whisper using CUDA (of which I am aware, at least). You'd have to install it as a docker container separately from HA and use it from there. depending on how you have HA installed, that might be easier or harder for you
(In hindsight I maybe should have ordered 3 or 4 Voice PE's for various areas of my apartment)
Thanks. I'm using HA core on my main desktop which has the 3090 but will be migrating HA core to an odroid H4 this week when it arrives. I see from the Whisper integration it asks for a host so I guess the challenge boils down to getting whatever that's needed working with cuda on my desktop?
yes. and then being able to access it from the server running HA. but it's not rocket science if you know the tiniest bit about networking. it might even work out of the box
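A quick way to confirm that the HA machine can actually reach the Whisper server across the network (or firewall) is a plain TCP connection test. This is only a minimal sketch: the function name is made up for illustration, port 10300 is assumed because it is the port commonly used in Wyoming Whisper examples, and the IP address in the usage comment is a placeholder for the desktop with the GPU.

```python
import socket


def wyoming_port_reachable(host: str, port: int = 10300, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    This only proves the port is open and reachable; it does not check
    that the service behind it actually speaks the Wyoming protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage (placeholder address, run from the machine hosting HA):
#   wyoming_port_reachable("192.168.1.50")
```

If this returns False from the HA host but True on the desktop itself (using 127.0.0.1), the firewall or VLAN rules are the likely culprit rather than Whisper.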
I'd need to punch a hole in the firewall to get from my iot vlan to my desktop for that specific connection but I'm not too worried about that
but until I get the odroid it all lives on the same box (my desktop)
having the HA server on the iot VLAN does not sound like a good plan if you ask me, but that's off-topic for here
I'd be curious as to why you think that? should we open a new thread somewhere?
#1284966540617449515 is more relevant, probably, but VLANs and HA don't always play well together
Dumping everything on one VLAN (including HA) should be ok, as then effectively no VLANs are in play
is the GPU even used by the default addon or did you modify it?
What makes you think I use add-ons
that's why i asked if you modified it (https://github.com/rhasspy/wyoming-faster-whisper)
Docker all the way
but how do you get the amd gpu to run it?
I thought wyoming-faster-whisper used faster-whisper, and the faster-whisper README says:
GPU
GPU execution requires the following NVIDIA libraries to be installed:
cuBLAS for CUDA 12
cuDNN 9 for CUDA 12
?

You may want to re-read all the above
||At no point did I say I was using the GPU||
just to be sure: the voice PE does not have STT onboard. Sorry if you know that, but it sounded like you assumed that.
because of "I'm wondering if there are any local options or any other options that can also work until my Voice PE arrives"
aaah ok so, are you referring to the wyoming-faster-whisper container itself? Because the one from the main repo does not support GPU entirely yet, it requires some modification and forcing CUDA/CUBLAS drivers into the container. However, there's a group that has already done this for us that myself and many others use with great success: https://docs.linuxserver.io/images/docker-faster-whisper/
They also have a CUDA accelerated Piper container: https://docs.linuxserver.io/images/docker-piper/
And if it wasn't made clear, the add-ons in Home Assistant can't utilize a GPU, so you'd need to run these outside of HA on a server or computer that has Docker with a GPU made available to it, and the NVIDIA Container Toolkit installed.
I have whisper installed but I'm unsure how to adapt it to the Wyoming protocol that I think HA expects... The integration dialog says: "Support for Whisper devices is provided by Wyoming Protocol. Do you want to continue?"
You'd have to write some sort of wrapper service to connect Wyoming protocol to your whisper server. That's what the wyoming-faster-whisper docker container is more or less
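To give a feel for what such a wrapper has to do on the wire: as far as I understand the published Wyoming spec, each event is one line of JSON (with `type` and optional `data_length` / `payload_length` fields), followed by that many bytes of extra JSON data and then the binary payload, typically PCM audio. Below is a rough sketch of that framing only. The helper names `write_event` and `read_event` are hypothetical and are not the API of the `wyoming` Python package, so check the details against the spec before relying on it.

```python
import io
import json
from typing import Any, BinaryIO, Dict, Optional, Tuple


def write_event(stream: BinaryIO, event_type: str,
                data: Optional[Dict[str, Any]] = None,
                payload: bytes = b"") -> None:
    """Write one event: a JSON header line, then optional data bytes and payload."""
    data_bytes = json.dumps(data).encode("utf-8") if data else b""
    header: Dict[str, Any] = {"type": event_type}
    if data_bytes:
        header["data_length"] = len(data_bytes)
    if payload:
        header["payload_length"] = len(payload)
    stream.write(json.dumps(header).encode("utf-8") + b"\n")
    stream.write(data_bytes)
    stream.write(payload)


def read_event(stream: BinaryIO) -> Tuple[str, Dict[str, Any], bytes]:
    """Read one event and return (type, data, payload)."""
    line = stream.readline()
    if not line:
        raise EOFError("stream closed before an event header was read")
    header = json.loads(line)
    # Inline header data (if any) is merged with the additional data block.
    data: Dict[str, Any] = dict(header.get("data") or {})
    data_length = header.get("data_length") or 0
    if data_length:
        data.update(json.loads(stream.read(data_length)))
    payload_length = header.get("payload_length") or 0
    payload = stream.read(payload_length) if payload_length else b""
    return header["type"], data, payload


# Round trip through an in-memory buffer instead of a real socket.
buf = io.BytesIO()
write_event(buf, "audio-chunk", {"rate": 16000, "width": 2, "channels": 1}, b"\x00\x01" * 4)
buf.seek(0)
event_type, event_data, event_payload = read_event(buf)
```

A real wrapper would sit on a TCP socket, collect the audio chunks it receives, hand them to the Whisper server, and send the resulting text back as a transcript event. That plumbing is what the ready-made wyoming-faster-whisper container already provides.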
Ah. Thankfully AUR has a working wyoming-piper so that's ready to go. Will work on one of the whisper solutions.
There's already a container that is ready to go that I posted above
Thanks, I'll try without Docker and fall back to it if needed