#New suggestions, planning new HA with GPU Acceleration

covert geode
#

I am trying to plan out a new setup. Right now I am using an Orange Pi 5 Plus with an HA Supervised install. I also have Docker on the same install with Pi-hole, Plex, and Vaultwarden.

My primary goal would be to move everything over to my spare machine that I use for AI work, which runs Ollama for the HA voice assistant and a few other things:
i7-7700K, 64GB RAM, with an RTX 3090

I am planning on just centralizing everything into that box.

I am looking for suggestions because I would like to effectively replace my Alexa devices with HA Voice Assistant, but right now Piper and Whisper are kinda slow to respond. So I was thinking that if I run them on something with a GPU, it would be much snappier.
The conversation agent is nice and snappy since it is already running on Ollama with the RTX 3090 backing it.

What would be the suggested setup?

Windows with docker?
Linux with docker and a native HA install?

Keep the Orange Pi as the primary HA box and, if possible, run Whisper and Piper on the 3090 box?

I am open to suggestions.
Any help is greatly appreciated!

glacial scroll
# covert geode I am trying to plan out a new setup, Right now I am using an OrangePI 5 Plus wi...

so a few things:
the Supervised install method is being deprecated, so it's a good idea to move away from it.
also, running additional software on the same system as a Supervised install is unsupported.

on your AI box you could run Proxmox as a hypervisor. this would allow you to run HAOS as a VM,
and separately run a Linux distro in a VM or LXC with the GPU passed through. you could then run your Docker images on the Linux install with GPU support whilst having a fully supported HAOS install

another option would be to use a Linux distro with a container install of Home Assistant, which does limit your ability to use native add-ons and integrated updating.

another option is to keep the Orange Pi as your HA system, but you should move it over to HAOS instead. then have Linux on your AI box and move your Docker stuff over to that along with Whisper
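
If Whisper (and possibly Piper) end up on the AI box, Home Assistant has to reach them over the network, so it can help to confirm the ports are reachable first. A minimal sketch, assuming the services are exposed over the Wyoming protocol on ports 10300 (Whisper) and 10200 (Piper); those are commonly used defaults but are an assumption here, and the function names are made up for illustration:

```python
import socket

# Assumed defaults for the Wyoming Whisper and Piper services.
# Adjust these to whatever ports your containers actually publish.
WYOMING_SERVICES = {"whisper": 10300, "piper": 10200}


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_services(host: str, services: dict[str, int]) -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_reachable(host, port) for name, port in services.items()}
```

Calling `check_services("ai-box.local", WYOMING_SERVICES)` (with `ai-box.local` standing in for the real hostname or IP) returns a dict of service name to `True`/`False`. This only checks that something is listening, not that the service speaks Wyoming correctly.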

covert geode
glacial scroll
# covert geode I like that plan, one snag is I dont think there is an H...

I am not totally sure about HAOS on the OPi 5 Plus tbh. if there isn't an image then I wouldn't use it. maybe pair it with a screen and use it as the base for a dashboard frontend somewhere instead.

if you pass the GPU through to a specific VM then yes, it's used exclusively by that VM. however, if you have the drivers running on the Proxmox host and run your other Linux stuff inside LXCs instead of a VM, then multiple LXCs can use it.

I have a Proxmox system with an Ubuntu VM that has a GPU passed through to it, and I run everything that needs the GPU on that VM:
whisper/ollama/jellyfin/etc...
using Proxmox is definitely what I would recommend. the exact config will depend on what exactly you want to do and what you have available to you
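
Whichever way the GPU is exposed (VM passthrough or shared with LXCs), a quick check from inside the guest shows whether the card is actually visible there. A rough sketch, assuming an NVIDIA card with the driver tools installed in the guest; `visible_gpus` is a made-up helper name, and `nvidia-smi -L` is the standard command for listing GPUs:

```python
import shutil
import subprocess


def visible_gpus() -> list[str]:
    """Return the GPU lines reported by `nvidia-smi -L`, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        # Driver tools are not installed or not exposed to this guest.
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        # The tool exists but could not talk to the driver or device.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

An empty list from inside the VM or LXC suggests the passthrough or driver setup is not working yet. A non-empty list only confirms the device is visible, not that a given container is configured to use it.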

glacial scroll
#

it really depends on what your goal with it is

covert geode
glacial scroll
#

this is the debug from a command that doesn't have to hit the LLM

#

and this one does hit the LLM

#

the delay comes from the LLM more than anything else

#

currently powered using qwen3:14b

#

with LLM processing stuff, it's probably never going to be as "fast as Alexa", but my rack under the stairs is not a datacentre that costs multiple billions

#

open voice stuff is getting better constantly, there is work being put into making stuff quicker and more responsive all the time. but there is a certain amount of "pick your battles"

covert geode
#

yep i agree. if i can get sub 4 seconds I think I would be happy.

glacial scroll
#

I haven't experimented with smaller models, but that doesn't sound like an unobtainable goal

covert geode
#

question: how did you get the pipeline logs like that?

glacial scroll
#

debug on the dropdown of your pipeline in the voice assistant settings

#

you can see the breakdown of the last 5 uses of the pipeline
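
The per-stage timings from that debug view can be added up and compared against a target such as the sub-4-second goal mentioned earlier. A minimal sketch; the stage names and helper functions are made up for illustration, and any numbers passed in should be read off your own debug view:

```python
TARGET_SECONDS = 4.0  # the "sub 4 seconds" goal from this thread


def total_latency(stage_seconds: dict[str, float]) -> float:
    """Sum the per-stage durations (in seconds) of one pipeline run."""
    return sum(stage_seconds.values())


def slowest_stage(stage_seconds: dict[str, float]) -> str:
    """Return the name of the stage that took the longest."""
    return max(stage_seconds, key=stage_seconds.get)


def within_target(
    stage_seconds: dict[str, float], target: float = TARGET_SECONDS
) -> bool:
    """True if the whole run finished in under the target time."""
    return total_latency(stage_seconds) < target
```

With one run's timings in a dict keyed by stage (for example speech-to-text, conversation agent, text-to-speech), `slowest_stage` points at where to spend effort first.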

covert geode
#

neat! how did I miss that.... this is with faster-whisper and Piper on the Orange Pi and Ollama on the AI box.... it's actually not as bad as I thought

glacial scroll
#

yeah that's pretty quick

#

I feel that sometimes the TTS part lies about its speed but I can't be sure 😛