#Can your Agent do this? Opensource buil - kdb8756
wow... how do you do this?
that is sick
That blows my Discord VC channel out of the water lol
Mine is also all local, as a base for my home assistant. But on a crappy machine. Not sure if you're streaming that voice, you should be. But how did you do the visual avatar with a local model? That's impressive
Wow!
I was gonna say latency makes it unusable, then I saw it was gemma4
Can't wait for your OSS
Realtime is slow with the all-local consumer GPU pipeline. It may work best as a utility for producing longer-format agent reports and research recap videos overnight, to go along with the written reports.
GitHub on this?
The ai-receptionist or homelab frontend use case is huge. I'm trying to figure out how to have a low-latency model answer whilst the smarter models work in the background to add output to the queue.
Not trivial. Like gemma-supermini instantly answers "Yeah, sure, if I think about that for a second..." and meanwhile gpt5.4-fast is collating a preliminary response, "I'd say you should keep the oven temp. low to avoid burning the crust..." or something like that. It's a pretty human response to use filler phrases and words to hold attention whilst thinking about how to answer a question. For us, practically, it's getting several bots' responses in parallel, with varying levels of depth, and stringing them together into one coherent stream.
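A rough sketch of that filler-then-answer idea, in case it helps. This assumes Python with asyncio; `fast_model` and `slow_model` are hypothetical stand-ins for the real model calls (the sleeps only simulate latency), not anyone's actual API. Both calls start at once, the quick filler is emitted as soon as it is ready, and the deeper answer follows when it lands. If the deeper answer somehow arrives first, the filler is skipped.

```python
import asyncio
from typing import AsyncIterator


# Hypothetical stand-ins for real model calls; the sleeps only simulate latency.
async def fast_model(question: str) -> str:
    await asyncio.sleep(0.01)
    return "Yeah, sure, let me think about that for a second..."


async def slow_model(question: str) -> str:
    await asyncio.sleep(0.05)
    return "I'd say keep the oven temp low to avoid burning the crust."


async def respond(question: str) -> AsyncIterator[str]:
    """Yield the quick filler first, then the deeper answer once it is ready."""
    # Start both calls together so the slow model works while the filler plays.
    fast_task = asyncio.create_task(fast_model(question))
    slow_task = asyncio.create_task(slow_model(question))

    done, _ = await asyncio.wait(
        {fast_task, slow_task}, return_when=asyncio.FIRST_COMPLETED
    )
    if slow_task in done:
        # The deep answer is already here, so the filler is not needed.
        fast_task.cancel()
        yield slow_task.result()
        return

    yield fast_task.result()
    yield await slow_task


async def collect(question: str) -> list[str]:
    """Gather the streamed chunks into a list, in the order they were emitted."""
    return [chunk async for chunk in respond(question)]


if __name__ == "__main__":
    for chunk in asyncio.run(collect("How do I stop the crust from burning?")):
        print(chunk)
```

In a voice pipeline each yielded chunk would go to TTS as it arrives instead of being collected. Making the filler and the real answer sound like one coherent reply is the hard part, and this sketch does not try to solve it.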
Nice update for 4.24. Web talk with gpt-realtime including query triage and routing to main.
I've actually not gotten this to work, maybe you'll show me
I had it working in web chat. Do the enablement, purchase API tokens (five bucks is enough), config GPT-realtime as your voice bot and click on the ))o(( icon in web gui chat to talk. I didn't get much further than that as I immediately switched to trying to get it to work on my home assistant puck.