New suggestions, planning new HA with GPU Acceleration | Home Assistant

covert geode May 20, 2025, 12:46 AM

#

I am trying to plan out a new setup. Right now I am using an Orange Pi 5 Plus with an HA Supervised install. I also have Docker on the same install with Pi-hole, Plex, and Vaultwarden.

My primary goal would be to move everything over to the spare machine that I use for AI work, which runs ollama for the HA voice assistant and a few other things:
i7-7700K, 64GB RAM, with an RTX 3090

I am planning on just centralizing everything into that box.

I am looking for suggestions because I would like to effectively replace my Alexa devices with HA Voice Assistant, but right now Piper and Whisper are kinda slow to respond. So I was thinking that if I run them on something with a GPU, they would be much snappier.
The conversation agent is nice and snappy since that is already running on ollama with the RTX 3090 backing it.

What would be the suggested setup?

Windows with docker?
Linux with docker and a native HA install?

Keep the Orange Pi as the primary HA box and, if possible, run Whisper and Piper on the 3090 box?

I am open to suggestions.
Any help is greatly appreciated!

glacial scroll May 20, 2025, 12:57 AM

#

covert geode I am trying to plan out a new setup. Right now I am using an Orange Pi 5 Plus wi...

so a few things:
the Supervised install method is being deprecated, so it's a good idea to move away from it.
also, running additional software on the same system as a supervised install is unsupported.

on your AI box you could run proxmox as a hypervisor. this would allow you to run HAOS as a VM,
and separately run a linux distro in a VM or LXC with the GPU passed through. you could then run your docker images on the linux install with GPU support whilst having a fully supported HAOS install

another option would be to use a linux distro and a container install of home assistant, which does limit your ability to use native addons and integrated updating.

another option is to keep the orange pi as your HA system, but you should move it over to HAOS instead. then have linux on your AI box and move your docker stuff over to that along with whisper

covert geode May 20, 2025, 1:03 AM

#

glacial scroll so a few things: the Supervised install method is being deprecated, so it's a good ide...

@glacial scroll,
I like that plan, one snag is I don't think there is an HAOS image that will work on the Orange Pi 5 Plus.

I didn't consider proxmox. that's a good suggestion. if running that way, the gpu passthrough is dedicated though, right? you cannot share the GPU across VMs?

glacial scroll May 20, 2025, 1:08 AM

#

covert geode @glacial scroll, I like that plan, one snag is I don't think there is an H...

i am not totally sure about haos on the opi5+ tbh. if there isn't an image then i wouldn't use it, maybe pair it with a screen and use it as the base for a dashboard frontend somewhere instead..

if you pass the GPU through to a specific VM then yes, it's used exclusively by that VM. however if you have the drivers running on the proxmox host and run your other linux stuff inside LXCs instead of a VM, then multiple LXCs can use it.

I have a proxmox system with an ubuntu VM that has a gpu passed through to it, and i run everything that needs the GPU on that VM:
whisper/ollama/jellyfin/etc...
using proxmox is definitely what i would recommend, the exact config will depend on what exactly you want to do and what you have available to you

covert geode May 20, 2025, 1:10 AM

#

glacial scroll i am not totally sure about haos on the opi5+ tbh. if there isn't an image then i woul...

@glacial scroll, thank you for your input! this is all great info. In your setup, where are you running your HA? I also have a spare Raspberry Pi 4B that is not doing anything, but I fear it might be slow for the Voice Assistant parts

glacial scroll May 20, 2025, 1:12 AM

#

covert geode @glacial scroll, thank you for your input! this is all great info. In you...

My HA is installed on an N150 mini PC (on proxmox), but it offloads voice stuff to ubuntu running in proxmox on a separate system.
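
A minimal sketch for checking the network side of a split setup like this one, assuming the voice services are exposed over TCP on the GPU box. The IP address in the usage comment is hypothetical, and ports 10300 (Whisper) and 10200 (Piper) are the ones commonly seen in Wyoming server examples, so treat them as assumptions and substitute your own. It only times the TCP connection. It does not speak the Wyoming protocol or measure transcription time.

```python
import socket
import time
from typing import Dict, Optional, Tuple


def tcp_connect_ms(host: str, port: int, timeout: float = 2.0) -> float:
    """Return how long opening a TCP connection to host:port takes, in ms.

    Raises OSError (refused, timed out, no route) if the port is unreachable.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection opened successfully; close it straight away
    return (time.perf_counter() - start) * 1000.0


def check_services(
    services: Dict[str, Tuple[str, int]]
) -> Dict[str, Optional[float]]:
    """Map each service name to its connect time in ms, or None if unreachable."""
    results: Dict[str, Optional[float]] = {}
    for name, (host, port) in services.items():
        try:
            results[name] = tcp_connect_ms(host, port)
        except OSError:
            results[name] = None
    return results


# Example usage (hypothetical address and assumed ports, adjust to your setup):
#
#   for name, ms in check_services({
#       "whisper": ("192.168.1.50", 10300),
#       "piper": ("192.168.1.50", 10200),
#   }).items():
#       print(name, "unreachable" if ms is None else f"{ms:.1f} ms")
```

On a wired LAN the connect time is normally a tiny fraction of the whole voice round trip, so a large number here points at the network rather than at Whisper or Piper.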

covert geode May 20, 2025, 1:13 AM

#

glacial scroll My HA is installed on an N150 mini PC (on proxmox), but it offloads voice stuff to...

Do you have any noticeable latency? that is one of the biggest things i am trying to get away from. After all, my wife will complain if it's not fast like "Alexa" haha

glacial scroll May 20, 2025, 1:14 AM

#

covert geode @glacial scroll, thank you for your input! this is all great info. In you...

you could run the ha server on the rpi4 and offload the voice processing to the AI box if you wanted to, but there's probably no point and i don't really recommend using rpi's for HA anyway

glacial scroll May 20, 2025, 1:15 AM

#

covert geode Do you have any noticeable latency? that is one of the biggest parts i am trying...

if it's just a basic "locally processed" command it's pretty damn quick. if it's hitting the LLM then it takes longer

#

it really depends what your goal with it is

covert geode May 20, 2025, 1:18 AM

#

glacial scroll if it's just a basic "locally processed" command it's pretty damn quick. if it's hit...

I run the conversation agent with allenporter/assist-llm via ollama on the AI box and it's been prob one of the best I have used

glacial scroll May 20, 2025, 1:19 AM

#

this is the debug from a command that doesn't have to hit the LLM

#

and this one does hit the llm

#

the delay comes from the LLM more than anything else

#

currently powered using qwen3:14b
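
To see where the LLM time actually goes, one option is to read the timing fields Ollama returns with a non-streaming `/api/generate` response (`total_duration`, `load_duration`, `prompt_eval_duration`, `eval_count`, `eval_duration`, all durations in nanoseconds). The sketch below is only an illustration: the helper names are made up, the base URL assumes Ollama's default port on the same machine, and the model name is simply the one mentioned above.

```python
import json
import urllib.request

NS_PER_S = 1_000_000_000  # Ollama reports durations in nanoseconds


def summarize_timings(resp: dict) -> dict:
    """Convert Ollama's nanosecond timing fields into seconds and tokens/s."""
    eval_s = resp.get("eval_duration", 0) / NS_PER_S
    tokens = resp.get("eval_count", 0)
    return {
        "total_s": resp.get("total_duration", 0) / NS_PER_S,
        "load_s": resp.get("load_duration", 0) / NS_PER_S,      # model load
        "prompt_s": resp.get("prompt_eval_duration", 0) / NS_PER_S,
        "generate_s": eval_s,                                   # token output
        "tokens_per_s": tokens / eval_s if eval_s > 0 else 0.0,
    }


def query_ollama(prompt: str, model: str = "qwen3:14b",
                 base_url: str = "http://localhost:11434") -> dict:
    """Send one non-streaming request to /api/generate and return the JSON."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as r:
        return json.loads(r.read())


# Example usage (needs a running Ollama instance with the model pulled):
#
#   print(summarize_timings(query_ollama("turn on the kitchen lights")))
```

A large `load_s` suggests the model was not resident in VRAM when the request arrived, which is a different problem from slow generation.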

#

with llm processing stuff, it's probably never going to be as "fast as alexa", but my rack under the stairs is not a datacentre that costs multiple billions

#

open voice stuff is getting better constantly, there is work being put into making stuff quicker and more responsive all the time. but there is a certain amount of "pick your battles"

covert geode May 20, 2025, 1:27 AM

#

yep i agree. if i can get sub 4 seconds I think I would be happy.

glacial scroll May 20, 2025, 1:30 AM

#

i haven't experimented with smaller models but that doesn't sound like an unobtainable goal

covert geode May 20, 2025, 1:36 AM

#

question: how did you get the pipeline logs like that?

glacial scroll May 20, 2025, 1:37 AM

#

covert geode question: how did you get the pipeline logs like that?

#

it's the debug option in the dropdown of your pipeline in the voice assistant settings

#

can see the breakdown of the last 5 uses of the pipeline
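
Once you have per-stage timings from that breakdown, checking them against a target is simple arithmetic. The sketch below assumes you copy the stage durations out by hand. The stage names and the numbers in the usage comment are hypothetical and are not taken from the screenshots in this thread. The 4 second default is the goal mentioned earlier.

```python
from typing import Dict

TARGET_S = 4.0  # the "sub 4 seconds" goal from earlier in the thread


def budget_report(stages: Dict[str, float], target_s: float = TARGET_S) -> dict:
    """Sum per-stage durations (seconds) and compare them with a target."""
    total = sum(stages.values())
    if not stages or total <= 0:
        raise ValueError("need at least one stage with a positive duration")
    slowest = max(stages, key=lambda name: stages[name])
    return {
        "total_s": round(total, 2),
        "slowest_stage": slowest,
        "share_of_total": round(stages[slowest] / total, 2),
        "within_target": total < target_s,
    }


# Example usage with made-up numbers:
#
#   budget_report({"speech_to_text": 0.9, "intent": 2.1, "text_to_speech": 0.6})
```

If one stage dominates the total, that is the stage worth moving to the GPU or swapping for a smaller model first.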

covert geode May 20, 2025, 1:43 AM

#

neat! how did i miss that.... this is with faster-whisper and piper on the orange pi and ollama on the AI box.... it's actually not as bad as i thought

#

glacial scroll May 20, 2025, 1:46 AM

#

yeah that's pretty quick

#

i feel that sometimes the tts part lies about its speed but i can't be sure 😛
