#OpenAI Realtime API for Assist?

lusty juniper
#

Are there any plans to support the OpenAI Realtime API in Assist in the future?

I understand it would take a lot of work to get it working fully, but a lot of the important groundwork seems to have been done over the past months and years. Being able to start a longer conversation session from my Voice PE to the Realtime API would be awesome for more advanced tasks.

finite lintel
#

I would love to be able to trigger the Realtime API's advanced voice mode and talk to it without having to use my phone.

naive narwhal
#

I think Home Assistant shouldn't become an LLM product. It's a home automation tool. What would this functionality bring to automation?

finite lintel
# naive narwhal I think Home Assistant shouldn't become the LLM product. It's home automation to...

I would say it is more than an automation tool. It is a smart home tool. For example, add-ons like Frigate can be set up for your security cameras.
An example of how OpenAI's Realtime API advanced voice mode could improve automation: 1. A security camera can see your bins. 2. It sends video to Frigate, which detects that your recycling bin hasn't been taken out on bin night. 3. Frigate passes that information to the voice assistant. 4. The voice assistant starts a conversation with you, reminding you to take the bin out (this is more effective than a phone notification, as not everyone wants to have their phone on them all the time).
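
The four steps above boil down to one decision: given what the camera pipeline reports, should the assistant speak up? Here is a minimal Python sketch of that decision only. Everything in it is hypothetical (`BinObservation`, `should_remind`, `reminder_text`, and the 18:00 cut-off are made up for illustration); it does not use any real Frigate or Home Assistant API, and how the observation is produced or how the sentence gets spoken is left out.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class BinObservation:
    """Hypothetical summary of what the camera pipeline reports."""
    bin_at_kerb: bool      # True if the bin was seen at the kerb
    observed_at: datetime  # when the observation was made


def should_remind(obs: BinObservation, bin_night_weekday: int,
                  earliest_hour: int = 18) -> bool:
    """Remind only on bin night, in the evening, when the bin is not out.

    bin_night_weekday follows datetime.weekday(): Monday is 0, Sunday is 6.
    earliest_hour is an assumed cut-off so the reminder is not sent too early.
    """
    is_bin_night = obs.observed_at.weekday() == bin_night_weekday
    late_enough = obs.observed_at.hour >= earliest_hour
    return is_bin_night and late_enough and not obs.bin_at_kerb


def reminder_text(obs: BinObservation) -> str:
    """Build the sentence the voice assistant would open the conversation with."""
    return (f"It's bin night and the recycling bin was still in at "
            f"{obs.observed_at:%H:%M}. Do you want a reminder later?")


if __name__ == "__main__":
    # 7 January 2025 is a Tuesday, so weekday() is 1.
    obs = BinObservation(bin_at_kerb=False,
                         observed_at=datetime(2025, 1, 7, 19, 0))
    if should_remind(obs, bin_night_weekday=1):
        print(reminder_text(obs))
```

Keeping the decision in a pure function like this means it can be tested without a camera, and the same result could feed either a plain TTS announcement or a full voice conversation.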

naive narwhal
finite lintel
#

Yeah, but it sucks. I want a futuristic voice conversation with my smart home. OpenAI's Realtime API advanced voice mode achieves the same thing, but sounds better.

naive narwhal
ocean wing
# finite lintel yeah, but it sucks. I want a voice conversation with my smart home that is futur...

Within the current implementation of the Wyoming protocol, this is definitely not possible. Perhaps an external program that interacts with Home Assistant as an MCP server would be the best solution.

At the moment, sequential request-response is a sufficient solution for interacting with a smart home. We just need to wait for streaming audio generation, which will reduce latency when interacting with slow LLMs (work in progress).
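
To show why streaming audio generation helps with slow LLMs, here is a toy Python model of the time until the user hears the first audio. It is only a sketch under stated assumptions: the reply is split into chunks (for example sentences), each with an assumed LLM generation time and TTS synthesis time, and the function names and all timings are made up for illustration, not measured from any real pipeline.

```python
from typing import List, Tuple

# Each chunk is (llm_seconds, tts_seconds) for one piece of the reply.
Chunk = Tuple[float, float]


def sequential_time_to_first_audio(stt_s: float, chunks: List[Chunk]) -> float:
    """Request-response: the whole reply is generated and synthesized
    before playback starts."""
    llm_total = sum(llm for llm, _ in chunks)
    tts_total = sum(tts for _, tts in chunks)
    return stt_s + llm_total + tts_total


def streaming_time_to_first_audio(stt_s: float, chunks: List[Chunk]) -> float:
    """Streaming: playback starts as soon as the first chunk has been
    generated and synthesized; later chunks are produced during playback."""
    first_llm, first_tts = chunks[0]
    return stt_s + first_llm + first_tts


if __name__ == "__main__":
    # Assumed timings: three sentences, 1.0 s of LLM time and 0.5 s of TTS each.
    reply = [(1.0, 0.5), (1.0, 0.5), (1.0, 0.5)]
    print("sequential:", sequential_time_to_first_audio(0.5, reply))
    print("streaming: ", streaming_time_to_first_audio(0.5, reply))
```

In this model the sequential wait grows with the length of the reply, while the streaming wait depends only on the first chunk, which is where the latency win for slow LLMs comes from.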

stoic cave
#

Just Amazon or OpenAI TTS and STT would be great, or Google for that matter.

#

And Google's realtime API is more flexible than the OpenAI version.