#External private models

1 messages · Page 1 of 1 (latest)

candid kite
#

Hi, I have a general question/issue atm. I have a dedicated desktop/gpu setup with whisper.cpp, llama.cpp, and piper running servers, all loaded in memory. I use this setup in a separate project, but I was wondering if Assist is setup/could be modified to allow someone to use custom IP/URL endpoints for model completions?

#

It'd be a lot more convenient, as my proxmox/HAOS is running on a minimial minipc

steady stream
#

There's Ollama integration that allows it.

candid kite
#

Even for piper and whisper? I think those would not work.

Ollama lags behind llama.cpp, and while convenient for simple prototyping, is less optimal than either llama.cpp or even llamafile as a wrapper.

I can understand using ollama for mainline llm support. It’s easy and quick to download/run/integrate. It would still be extremely nice for performance/modularity/anti-lock-in reasons, to expose custom urls as a config for all components of the voice chain. Every LLM wrapper I’ve mentioned supports the OpenAI style endpoints, and I know for a fact the packets for whisper and piper are dead simple, so I don’t see a compatibility issue with a url swap. The parameters for LLM need to be supplied, but if posed as a non-default route, onus could be placed on advanced users to handle that.

I’d have no problem working on this, but I have no familiarity w/ the code base. Would it make the most sense to make add-ons that expose options to the voice assistant config?

steady stream
#

Piper/whisper can be used over the network as well. I have them in Docker on separate server

candid kite
#

what are you using for those two? I have figured out the LLM portion. I am still googling for the whisper add-on or repository that allows custom endpoints.

candid kite
#

good to know

steady stream
#

You just put Piper and Whisper to Docker, and add Wyoming integration to HA with API endpoint pointing to container.