Wyoming protocol implementation | Home Assistant | Page 1

light gale Apr 29, 2025, 4:02 AM

#

Hey! Sorry if this is too technical or the wrong thread for it.
I want to dig into writing a Wyoming satellite. I found the protocol documentation here https://github.com/rhasspy/wyoming and a Python example of the default Wyoming satellite here https://github.com/rhasspy/wyoming-satellite/. That should be enough to get me started, but I'm curious whether there are any other projects or docs I can read to better understand the communication patterns between client and server, as well as the exact setup for HA auto-discovery and other important niche stuff.
Sorry for pinging @worthy granite, I know you're the main guru there. 🙂

Thanks!

worthy granite Apr 29, 2025, 8:14 PM

#

Hey @light gale! Those are definitely the resources I would recommend (besides me of course 😄)

#

I know the docs aren't the greatest, unfortunately. Maybe we can improve them a bit together.

light gale Apr 29, 2025, 8:15 PM

#

worthy granite Hey @light gale! Those are definitely the resources I would recommend ...

Ohhh I wish I could borrow your brain for a week

light gale Apr 29, 2025, 8:16 PM

#

worthy granite I know the docs aren't the greatest, unfortunately. Maybe we can improve them a ...

Well, at least there's working implementation! So I can refer to that.

worthy granite Apr 29, 2025, 8:17 PM

#

Something that isn't communicated well is which messages are required and which are optional in different contexts. The protocol fortunately isn't too complicated, but it can be tricky.

light gale Apr 29, 2025, 8:17 PM

#

Could it be that this project https://github.com/Nailik/rhasspy_mobile contains something too? Or it's pure Rhasspy?

#

(I'm trying to put that to Android, and probably will try to port OWW or MWW too, if it's not too hard...)

worthy granite Apr 29, 2025, 8:19 PM

#

I'm not familiar with that project, but it almost surely implements things over MQTT. Wyoming is purely peer-to-peer TCP.

light gale Apr 29, 2025, 8:20 PM

#

worthy granite I'm not familiar with that project, but it almost surely implements things over ...

Nice to know! I believe it's pure websocket connection, right? I already understand that it has JSON and raw data mixed, so it's definitely not restful 🙂

#

Good thing that I just had experience with websockets on MA mobile client.

worthy granite Apr 29, 2025, 8:21 PM

#

Nope, not even websocket. Just straight TCP. I had originally considered websocket, but I wanted it to be easy to implement on microcontrollers.

light gale Apr 29, 2025, 8:22 PM

#

Oh geez! Okay, well, I guess simple wrapper on a byte stream will do 🙂

worthy granite Apr 29, 2025, 8:23 PM

#

The low level protocol is extremely basic:

Open a TCP connection and send a line of JSON with the event type, data_length, and optionally payload_length
Write UTF-8 encoded JSON that is data_length bytes with the event data
Optionally write payload_length bytes with the binary payload (usually audio)
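The three steps above can be sketched in Python as follows. This is a minimal illustration of the framing as described in this thread, not the reference implementation: the helper names `write_event`, `read_exact` and `read_event` are made up for this example, and the header only carries the fields mentioned here (type, data_length, payload_length).

```python
import io
import json


def write_event(stream, event_type, data=None, payload=b""):
    """Write one event: header line, then data JSON, then binary payload."""
    data_bytes = json.dumps(data).encode("utf-8") if data else b""
    header = {"type": event_type}
    if data_bytes:
        header["data_length"] = len(data_bytes)
    if payload:
        header["payload_length"] = len(payload)
    # Step 1: a single line of JSON, terminated by a newline.
    stream.write(json.dumps(header).encode("utf-8") + b"\n")
    # Step 2: exactly data_length bytes of UTF-8 JSON.
    stream.write(data_bytes)
    # Step 3: exactly payload_length bytes of binary data (usually audio).
    stream.write(payload)


def read_exact(stream, n):
    """Read exactly n bytes, or raise if the stream ends early."""
    buf = b""
    while len(buf) < n:
        piece = stream.read(n - len(buf))
        if not piece:
            raise EOFError("stream closed in the middle of an event")
        buf += piece
    return buf


def read_event(stream):
    """Read one event; returns (type, data, payload), or None at end of stream."""
    line = stream.readline()
    if not line:
        return None
    header = json.loads(line)
    data_bytes = read_exact(stream, header.get("data_length") or 0)
    payload = read_exact(stream, header.get("payload_length") or 0)
    data = json.loads(data_bytes) if data_bytes else {}
    return header["type"], data, payload


# Round trip through an in-memory stream instead of a real TCP socket.
buf = io.BytesIO()
write_event(buf, "audio-chunk", {"rate": 16000, "width": 2, "channels": 1}, b"\x00" * 256)
buf.seek(0)
event_type, data, payload = read_event(buf)
```

With a real connection, the same functions should work on the binary file object returned by `socket.makefile("rwb")`, followed by a `flush()` after each write.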

#

Everything else is just which events are expected when.

light gale Apr 29, 2025, 8:24 PM

#

Got you. No checksums?

worthy granite Apr 29, 2025, 8:24 PM

#

Nope, I assume TCP is working 😄

light gale Apr 29, 2025, 8:24 PM

#

Fingers crossed. 🙂
Thank you! Will try to work it out. 🙂

#

Oh, one more thing: purely from the perspective of getting it working, should I try Openwakeword or Microwakeword? They both use a hell of a lot of Python libraries that I might get stuck on, but maybe one is easier than the other? 🙂
Porcupine is too expensive for my goal...

#

(basically I'd like to make everything open source and available readily)

light gale May 8, 2025, 3:50 AM

#

@worthy granite sorry for bothering you again and thanks for your help in advance! 🙂
So far I've ported the events to Android, and I'm starting to port the satellite itself. As a first stage I'll go with full streaming to get it working, and then decide what to port for wake word and VAD.

At this stage, it would be really helpful to have some test server running to test the communication. Is there anything like this, or should I actually spin up a Home Assistant instance in Docker? That would be alright, but it's easier to debug if something is running in the console... 🙂

Thanks!

#

Also, any hints on the correct event flow would be super cool. I can see the Wyoming satellite code, but it's Python, which I read like a foreign language (enough to understand, not enough to dive into).

lone tree May 9, 2025, 3:01 PM

#

Have you looked at this implementation? https://github.com/AlexxIT/go2rtc/tree/master/internal/wyoming

light gale May 9, 2025, 3:20 PM

#

lone tree Have you looked at this implementation? https://github.com/AlexxIT/go2rtc/tree/m...

Yeah, found it too, thanks! At least some clarity 🙂

light gale May 9, 2025, 3:38 PM

#

Actually i have more luck with https://github.com/tarocco/wyoming-say/blob/main/wyoming_say/handler.py

worthy granite May 15, 2025, 8:10 PM

#

So many Wyoming things I've never seen 😄
Do you mean a test server that mimics HA's side of the communication with a Wyoming satellite?

light gale May 15, 2025, 8:58 PM

#

worthy granite So many Wyoming things I've never seen 😄 Do you mean a test server that mimics...

Yes, exactly.
I spent a couple of days investigating, and I think I'm halfway there. At least I understand the flow now (or so it looks). 🙂

light gale Jun 4, 2025, 1:04 AM

#

Hey @worthy granite! I'm digging into it bit by bit. 🙂
I already have the satellite connected to Home Assistant, and I can announce to it. However, I'm stuck on sending the audio stream from the satellite.
(I'm building it iteratively; the first step is a PoC of an always-streaming satellite.)

What I have:
16-bit PCM, 16 kHz, single channel. I also write it to a WAV file in parallel for debugging, and I can hear the voice clearly.
The communication looks like it does in the original satellite:

Getting "run-satellite"
Getting "describe"
Sending "info"
Sending "streaming-started"
Sending "audio-start" with rate=16000, width=2, channels=1
Streaming "audio-chunk" continuously (around 2k bits per chunk; the size is chosen by Android. Maybe I need to make that smaller?)
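For a sense of scale, the audio format above fixes how much time one chunk covers. A small worked example follows; `chunk_duration_ms` is a made-up helper, and the 2048-byte figure is only an assumption about what "around 2k" means in bytes.

```python
def chunk_duration_ms(num_bytes, rate=16000, width=2, channels=1):
    """Duration of a raw PCM chunk in milliseconds."""
    # 16000 samples/s * 2 bytes/sample * 1 channel = 32000 bytes per second.
    bytes_per_second = rate * width * channels
    return 1000.0 * num_bytes / bytes_per_second


small = chunk_duration_ms(256)    # 256 bytes  -> 8.0 ms of audio
larger = chunk_duration_ms(2048)  # 2048 bytes -> 64.0 ms of audio
```

Either size is a small fraction of a second of audio per chunk.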

Where am i wrong? Something obvious?
Sorry for bothering, and thanks in advance for your help!

light gale Jun 4, 2025, 1:21 AM

#

Sending message means sending metadata -> new line -> data JSON bytes -> payload bytes.

#

Here's the log of the start of communication and the 1st chunk (I've reduced the buffer to 256 bytes, no luck):

Received: {"type": "run-satellite", "version": "1.5.4"}
Received: {"type": "describe", "version": "1.5.4"}
Sending: {"type":"streaming-started","version":"1.0.0"}
Sending: {"type":"info","version":"1.0.0","data_length":223}
Sending data: {"satellite":{"name":"Android Wyoming Satellite","attribution":{"name":"formatbce","url":"https://github.com"},"installed":true,"description":"Wyoming satellite on Android platform","version":"0.0.1-alpha","area":"Office"}}
Sending: {"type":"audio-start","version":"1.0.0","data_length":37}
Sending data: {"rate":16000,"width":2,"channels":1}
Sending: {"type":"audio-chunk","version":"1.0.0","data_length":37,"payload_length":256}
Sending data: {"rate":16000,"width":2,"channels":1}
Sending 256 bytes
..... repeating with new data

light gale Jun 4, 2025, 2:28 PM

#

Well, believe it or not, the solution came to me in a dream. I was missing the "run-pipeline" message.

#

Works now
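A minimal sketch of that fix, using only the replies that show up in this thread: "info" for "describe", "run-pipeline" for "run-satellite", and "pong" for "ping". The function names and the trimmed `INFO` dictionary are hypothetical, and the stage names are copied from a log posted later in this thread, so other setups may need different ones.

```python
import json

# Hypothetical, trimmed-down description; a real "info" message carries more fields.
INFO = {"satellite": {"name": "Example Satellite"}}


def replies_for(event_type):
    """Return the (type, data) events to send back for one received event."""
    if event_type == "describe":
        return [("info", INFO)]
    if event_type == "run-satellite":
        # The step that was missing: ask HA to actually start a pipeline.
        return [("run-pipeline", {"start_stage": "asr", "end_stage": "tts"})]
    if event_type == "ping":
        return [("pong", None)]
    return []


def encode(event_type, data=None):
    """Frame one event as header line + data JSON (no binary payload here)."""
    data_bytes = json.dumps(data, separators=(",", ":")).encode("utf-8") if data else b""
    header = {"type": event_type}
    if data_bytes:
        header["data_length"] = len(data_bytes)
    return json.dumps(header, separators=(",", ":")).encode("utf-8") + b"\n" + data_bytes


# Bytes to write to the socket after HA sends "run-satellite".
outgoing = b"".join(encode(t, d) for t, d in replies_for("run-satellite"))
```

Audio streaming would then start only once the pipeline has been requested, rather than right after "info".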

#

Now there's another problem: how to correctly organize the flow of voice data while TTS is playing, and why HA disconnects from the satellite after TTS...

lone tree Jun 4, 2025, 4:42 PM

#

I would be very interested to learn from @worthy granite how streaming generation is planned to be implemented. I’ve reviewed all related PRs, but I still don’t see an answer to this question. Will there be a function responsible for accumulating the LLM response, splitting it into groups of sentences and transmitting them to existing TTS servers? Or is it planned to implement a TTS server that directly handles the stream of chunks?
What system components still need refinement, and is there an approximate release date for this feature?

light gale Jun 4, 2025, 7:03 PM

#

lone tree I would be very interested to learn from @worthy granite how streaming ge...

Sorry, I opened this thread to talk about the already existing Wyoming implementation. Don't bomb it. 🙂
Moreover, I think streaming generation will come to PE first, not to Wyoming...

light gale Jun 5, 2025, 12:17 AM

#

Debugging the connection. It looks like the audio data is hanging the satellite (especially if I'm streaming continuously and HA starts announcing back). I'm not sure how to ease these flows. Also, for some reason it announces once, and after that HA just switches the satellite entity to "responding" and doesn't send any following announcements at all...

steep flower Jun 5, 2025, 7:32 AM

#

lone tree I would be very interested to learn from @worthy granite how streaming ge...

I am interested in the voice streaming support as well. This would give the entire voice experience the certain snappiness.

lone tree Jun 5, 2025, 2:35 PM

#

light gale Sorry, i opened this thread for conversation about already existing implementati...

The Wyoming TTS client in the system still does not support async_supports_streaming_input() and related things. But we will probably have to wait for the release of at least one TTS service with the new function to find out what solution will be applied. By the way, have you seen this project? https://github.com/jeffc/hassmic/ The app itself is a standalone satellite that also integrates via the Wyoming protocol (unfortunately, it requires a pipeline with an external wake word service).

light gale Jun 5, 2025, 2:56 PM

#

lone tree Wyoming tts client in the system still does not support async_supports_streaming...

Thanks! Will check that app!

light gale Jun 5, 2025, 2:59 PM

#

lone tree Wyoming tts client in the system still does not support async_supports_streaming...

Unfortunately, that app is React Native, not native Android. Also, the Android part is written in an already deprecated way, in Java...

worthy granite Jun 9, 2025, 3:09 PM

#

Sorry for the late reply, I've been out for over a week! @light gale are you sending "audio-stop" and "played" from the satellite? The "played" event is needed to tell HA when the TTS announcement has finished.

worthy granite Jun 9, 2025, 3:11 PM

#

lone tree I would be very interested to learn from @worthy granite how streaming ge...

The TTS server itself will be responsible for accumulating/splitting/grouping chunks of text. Wyoming will need some additional messages for this to work. The reason for this is because each TTS system knows the best way to split text for itself. If it doesn't support streaming at all, the original Wyoming message with the full text will still be sent.

light gale Jun 9, 2025, 3:11 PM

#

worthy granite Sorry for the late reply, I've been out for over a week! @light gale a...

Yeah, thanks Mike! I've now solved the basic communication back and forth, and even added VAD. I'll be porting Openwakeword now 🙂
The main trouble was with memory, of course. And it looks like I have to be very careful about what I'm sending to the server. E.g. sending run-pipeline somewhere in the middle crashes the connection 🙂

#

Also, I guess I don't need to send audio-start and audio-stop to the server, and streaming-started and streaming-stopped seem to be single-use too? With VAD I wanted to stop sending audio properly, but after sending streaming-stopped and then streaming-started again, HA dropped the connection.

worthy granite Jun 9, 2025, 3:23 PM

#

Are you sending the ping/pong messages?

light gale Jun 9, 2025, 4:15 PM

#

worthy granite Are you sending the ping/pong messages?

I'm responding to the server's ping, yes. On my side, I send a ping only if I haven't heard from the server for 5 seconds, but it never works (basically, if the server hasn't sent me a ping for 5 seconds, the connection is already broken).

light gale Jun 12, 2025, 2:55 PM

#

worthy granite Are you sending the ping/pong messages?

Okay, well, it looks like the server doesn't respond to ping messages at all..?
Anyway, I don't care about that; I just have a watchdog now that considers the connection broken if there's no ping from the server for 10 seconds.
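A watchdog like the one described could look roughly like this. It is only a sketch: the class name `PingWatchdog` and its methods are made up for illustration, and the 10-second default is the value mentioned above, not something defined by the protocol.

```python
import time


class PingWatchdog:
    """Treats the connection as broken when no ping arrives within timeout_s."""

    def __init__(self, timeout_s=10.0, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_seen = clock()

    def feed(self):
        """Call whenever a ping is received from the server."""
        self._last_seen = self._clock()

    def expired(self):
        """True if the server has been silent for longer than the timeout."""
        return self._clock() - self._last_seen > self._timeout_s


# Demonstration with a fake clock that is advanced by hand.
now = [0.0]
watchdog = PingWatchdog(timeout_s=10.0, clock=lambda: now[0])
now[0] = 9.0
alive_at_9s = not watchdog.expired()
watchdog.feed()              # a ping arrived at t = 9 s
now[0] = 18.0
alive_at_18s = not watchdog.expired()
now[0] = 19.5
broken_at_19_5s = watchdog.expired()
```

In a real satellite, `expired()` would be checked periodically, and the socket closed and reopened for listening once it returns true.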

Now I have some other things... 🙂

I wanted to stop streaming audio from the satellite while TTS is being received back. But if I send streaming-stopped and then streaming-started after played, the server breaks the connection...
Should I just stop sending audio chunks instead? That seems to work. But then I don't understand what those streaming start/stop events are for...
Is there a mechanism to tell the server that the satellite is disconnecting? When I close the connection on my device, HA still shows the satellite as available...
The initial connection process isn't clear... I'm sending info for every describe and adding the satellite to HA, but I don't get run-satellite from the server until I physically restart my satellite (sometimes only after 2 restarts). After that, it starts working properly (I get run-satellite on each restart). Should I explicitly reconnect after describe -> info?... I don't see that logic anywhere in other projects...

Thanks in advance Mike!

lone tree Jun 12, 2025, 8:08 PM

#

I have implemented streaming (with synthesize_and_stream method); perhaps this integration will be useful to you in your development. https://community.home-assistant.io/t/streaming-support-for-wyoming-tts/900708

light gale Jun 12, 2025, 11:08 PM

#

lone tree I have implemented streaming (with synthesize_and_stream method); perhaps this i...

That's really great thing!

I would ask you please (again) to create another thread for this - although it's really cool and highly appreciated, it's bombing my thread. I created it for learning about the existing protocol props, not for new feature discussions.

#

@worthy granite For some reason I stopped receiving pings and run-satellite:

2025-06-12 16:03:16.610 DEBUG (MainThread) [homeassistant.components.wyoming.config_flow] Zeroconf discovery info: ZeroconfServiceInfo(ip_address=ZeroconfIPv4Address('192.168.1.118'), ip_addresses=[ZeroconfIPv4Address('192.168.1.118'), ZeroconfIPv6Address('fe80::bc90:edff:fec0:2fe0')], port=10700, hostname='Android_1VME4EDT.local.', type='_wyoming._tcp.local.', name='android-wyoming-satellite._wyoming._tcp.local.', properties={'': None})
2025-06-12 16:03:25.204 DEBUG (MainThread) [homeassistant.components.wyoming.assist_satellite] Connecting to satellite at 192.168.1.118:10700
2025-06-12 16:03:25.366 DEBUG (MainThread) [homeassistant.components.wyoming.assist_satellite] Connected to satellite
2025-06-12 16:03:30.370 DEBUG (MainThread) [homeassistant.components.wyoming.assist_satellite] TimeoutError: 
2025-06-12 16:03:30.370 WARNING (MainThread) [homeassistant.components.wyoming.assist_satellite] Satellite has been disconnected. Reconnecting in 10 second(s)
2025-06-12 16:03:33.371 DEBUG (MainThread) [homeassistant.components.wyoming.assist_satellite] Disconnecting from satellite
2025-06-12 16:03:33.372 DEBUG (MainThread) [homeassistant.components.wyoming.assist_satellite] Connecting to satellite at 192.168.1.118:10700
2025-06-12 16:03:33.410 DEBUG (MainThread) [homeassistant.components.wyoming.assist_satellite] Connected to satellite
2025-06-12 16:03:38.412 DEBUG (MainThread) [homeassistant.components.wyoming.assist_satellite] TimeoutError: 
2025-06-12 16:03:38.412 WARNING (MainThread) [homeassistant.components.wyoming.assist_satellite] Satellite has been disconnected. Reconnecting in 10 second(s)
...

It connects, sends describe - i'm responding with info, and that's it...

light gale Jun 13, 2025, 12:07 AM

#

Could it be that it's something with newest HA version maybe?

#

Because my logic didn't change at all, and yet I can't for the life of me get my HA to register the satellite

#

It sees my info message, because without that it wouldn't let it appear in Discovered devices.

#

But it can't add satellite, trying to reconnect... And doesn't send anything to the satellite.

light gale Jun 13, 2025, 12:24 AM

#

Moved back couple days in my code - it also doesn't work, which means something changed on HA side probably?...

light gale Jun 13, 2025, 1:36 AM

#

Still struggling with it. My loop, waiting on socket input channel, returns nulls constantly, no unknown data, nothing...

#

I actually cannot remember if something was there before. As far as i remember, describe -> info pair is pretty much everything i had - after that there was just run-satellite...
But still HA says "Unable to connect". Which is ridiculous, because it just sent describe and received info on that socket....
Should i try resetting connection right after info sending?...

light gale Jun 13, 2025, 2:25 PM

#

Today I succeeded in connecting the satellite to HA. No code changes on my side. The config entry had been added yesterday and kept trying to connect all night. Today I restarted the satellite app several times, and eventually the config entry connected and showed all the corresponding entities. Then I launched the "Set up voice assistant" flow and set up the pipeline with OWW.
However, run-satellite was sent only after another satellite restart, and I responded with run-pipeline.
But I guess the pipeline still wasn't ready; the server was just sending pings, and that's it.
And only the next time, when I restarted the satellite once more, I got run-satellite, responded with run-pipeline and got detect back, so I was able to start streaming...

#

Well huh, after the next restart I didn't receive anything but describe again, so I'm back to square one.

prime mist Jun 13, 2025, 10:47 PM

#

Hello, you might find my project interesting! https://github.com/roryeckel/wyoming_openai
It hosts a wyoming server

light gale Jun 14, 2025, 3:12 AM

#

prime mist Hello, you might find my project interesting! https://github.com/roryeckel/wyomi...

Well, you're using wyoming.server as dependency, while I'm trying to implement it... 🙂

prime mist Jun 14, 2025, 4:42 AM

#

light gale Well, you're using wyoming.server as dependency, while I'm trying to implement i...

wyoming.server uses AsyncEventHandler to listen to a stream of events from (usually) a tcp socket. Look at async_read_event in https://github.com/OHF-Voice/wyoming/blob/master/wyoming/event.py

light gale Jun 14, 2025, 8:53 PM

#

prime mist wyoming.server uses AsyncEventHandler to listen to a stream of events from (usua...

I have that already implemented. And it works. And I already had successful session with HA as voice assistant, so server code is correct.
My struggle now is about the order of events that server/client are exchanging with. It worked before, now it doesn't for some reason.
Let me know if you need more deep-diving, and thanks for your wish to help.

light gale Jun 15, 2025, 1:32 AM

#

Okay, I found one of the problems. It's mDNS.
It looks like, if I have mDNS on, HA cannot add or connect to the satellite. After each restart, HA seems to treat the satellite as a completely new device.
I'm not sure how to avoid this, but for now I've just disabled zeroconf completely, and it pseudo-works.

Now another problem is that I have to restart the satellite at least 2 times to get it connected the first time:

Launching the satellite and initiating the config entry on the HA side with IP/port. The describe->info exchange happens and HA adds the config entry, in a disabled "Unable to connect" state. No further Wyoming interaction from HA (no ping, nothing).
Restarting the satellite. HA connects to the satellite, the describe->info exchange happens and the config entry initializes its entities, but again there is no further Wyoming interaction from HA (no ping, nothing).
Restarting the satellite again. HA connects to the satellite, the describe->info exchange happens, and HA starts sending ping and sends the run-satellite command. The flow is initialized successfully, and everything works as expected afterwards.

I need to understand why these socket disconnections are happening.
It would also be nice to have mDNS working as expected.
Also (unrelated to previous) if the TTS response is long, and i start streaming right after playing it, HA disconnects socket too... Should i wait for some time before streaming again?..

lone tree Jun 15, 2025, 1:58 AM

#

light gale Okay, i found one of the problems. It's mDNS. Looks like, if i have mDNS on, HA ...

if the TTS response is long, and i start streaming right after playing it, HA disconnects socket too..

Is this possible with the current WS client implementation? I added a few diagrams to the project repository while trying to figure out the data transfer situation—maybe you'll find them useful.

light gale Jun 15, 2025, 2:21 AM

#

lone tree >if the TTS response is long, and i start streaming right after playing it, HA d...

Thanks, the diagram seems right, but my situation is about what happens after audio-stop is received. My streaming satellite stops sending audio-chunk while the pipeline is running, up until it has received audio-stop and detect. Then it waits for all the audio to be played by the local player, sends played and starts sending the audio-chunk stream again (no local wake word).
It works if the speech was relatively short, but if it's fairly long, I get a "broken pipe" error after several packets, which means HA disconnected from the socket.

worthy granite Jun 16, 2025, 7:34 PM

#

Is HA disconnecting because of a missing ping/pong? There is an overall timeout per pipeline too, so maybe this is being hit.

light gale Jun 16, 2025, 8:41 PM

#

worthy granite Is HA disconnecting because of a missing ping/pong? There is an overall timeout ...

No, ping-pong doesn't happen at all - just initial describe call from server.

light gale Jun 27, 2025, 12:48 AM

#

worthy granite Is HA disconnecting because of a missing ping/pong? There is an overall timeout ...

OK, I went through everything. Apparently, HA disconnects right after receiving "info" and wants the satellite to restart the socket. This happens several times during the initial setup, but I've now got it handled (even onboarding shows up).
Now I'm basically done with the implementation shenanigans; the only thing that's bothering me is the lack of communication from the server.
Can you help me, please?
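One way to read "restart the socket" is that the satellite keeps listening and treats every new connection from HA as a fresh session. The sketch below shows that idea with asyncio, under that assumption. All names here (`handle_connection`, `serve`, the trimmed `INFO`) are hypothetical; port 10700 is the one visible in the HA log earlier in this thread.

```python
import asyncio
import json

# Hypothetical, trimmed-down description; a real "info" message carries more fields.
INFO = {"satellite": {"name": "Example Satellite"}}


async def read_event(reader):
    """Read one framed event; returns (type, data, payload) or None on EOF."""
    line = await reader.readline()
    if not line:
        return None
    header = json.loads(line)
    data_bytes = await reader.readexactly(header.get("data_length") or 0)
    payload = await reader.readexactly(header.get("payload_length") or 0)
    return header["type"], (json.loads(data_bytes) if data_bytes else {}), payload


async def write_event(writer, event_type, data=None):
    """Write one framed event without a binary payload."""
    data_bytes = json.dumps(data).encode("utf-8") if data else b""
    header = {"type": event_type}
    if data_bytes:
        header["data_length"] = len(data_bytes)
    writer.write(json.dumps(header).encode("utf-8") + b"\n" + data_bytes)
    await writer.drain()


async def handle_connection(reader, writer):
    """Serve one connection from scratch; HA may drop it and come back later."""
    try:
        while (event := await read_event(reader)) is not None:
            if event[0] == "describe":
                await write_event(writer, "info", INFO)
            # Other events (run-satellite, ping, ...) would be handled here.
    except (asyncio.IncompleteReadError, ConnectionError):
        pass  # peer went away mid-event; just wait for the next connection
    finally:
        writer.close()


async def serve(host="0.0.0.0", port=10700):
    """Keep accepting connections; each one gets its own handle_connection."""
    server = await asyncio.start_server(handle_connection, host, port)
    async with server:
        await server.serve_forever()
```

Running `asyncio.run(serve())` would keep the listener alive across HA's disconnects, so no per-session state survives from one connection to the next.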

#

First and foremost: how to distinguish "nevermind"?
Normal communication from server is

"voice-started"
... voice...
"voice-stopped"
 "transcript"
... intent, tts...
"synthesize"
"audio-start"
... chunks ...
"audio-stop"

But if I say "nevermind", it stops after transcript. And since there might be something like 10 seconds between transcript and synthesize, I can't even set a decent timeout to go back to idling...
So the question is: is there some indication that the pipeline has ended?

#

Another thing: is it possible to get info about timers on start? Like, when I've restarted the satellite, can I get the active timers from HA? I don't know how, if it's possible at all... 🙂

#

Thank you in advance, Mike! I hope these are simple questions.

light gale Jun 30, 2025, 8:37 PM

#

worthy granite Sorry for the late reply, I've been out for over a week! @light gale a...

Hi Mike! With the new streaming stuff, will it send the audio stream twice now? I see it uses chunks first, but then sends the old data for compatibility. I guess relying on audio-start and audio-chunk won't be enough for things to work correctly now?

worthy granite Jun 30, 2025, 9:40 PM

#

For the "nevermind" issue, you should get a run-end message. If no TTS events or audio has been sent, this means the pipeline ended early.

worthy granite Jun 30, 2025, 9:41 PM

#

light gale Hi Mike! With new streaming stuff, will it send audio stream twice now? I see it...

The audio shouldn't be sent twice, but the text to synthesize is: first in chunks and then all together in the final synthesize message. If you're processing the text stream, then you should ignore the final synthesize event.

#

The audio start/stop also end up being used for each audio chunk, which is why I added synthesize-stopped to indicate that the TTS system is completely done producing audio chunks.

lone tree Jun 30, 2025, 9:44 PM

#

The two methods are used for compatibility; as far as I understand, this is not a mandatory requirement. In my alternative client, I simply check the server for streaming capability. This places a little more responsibility on the user (do not change the server type on the fly).

light gale Jun 30, 2025, 9:57 PM

#

worthy granite For the "nevermind" issue, you should get a `run-end` message. If no TTS events ...

I'm not receiving run-end...

Sending: {"type":"run-pipeline","version":"1.5.4","data_length":39}
Sending data: {"start_stage":"asr","end_stage":"tts"}
Received: {"type": "transcribe", "version": "1.5.4", "data_length": 21}
Received: {"type": "voice-started", "version": "1.5.4", "data_length": 19}
Received: {"type": "voice-stopped", "version": "1.5.4", "data_length": 19}
Received: {"type": "transcript", "version": "1.5.4", "data_length": 23}

And that's it

light gale Jun 30, 2025, 10:00 PM

#

worthy granite The audio shouldn't be sent twice, but the text to synthesize is: first in chunk...

I meant this part:

Streaming:

→ synthesize-start event (required)
→ synthesize-chunk event (required)
Text chunks are sent as they're produced
← audio-start, audio-chunk (one or more), audio-stop
Audio chunks are sent as they're produced with start/stop
→ synthesize event
Sent for backwards compatibility
→ synthesize-stop event
End of text stream
← Final audio must be sent
audio-start, audio-chunk (one or more), audio-stop
← synthesize-stopped
Tells server that final audio has been sent

in here: https://github.com/OHF-Voice/wyoming
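On the receiving side of that text stream, the advice earlier in this thread (ignore the final synthesize event if you are processing the text stream) can be sketched like this. The class `StreamingTextFilter` is made up for illustration, and reading the text from a `"text"` field of the event data is an assumption of this sketch.

```python
class StreamingTextFilter:
    """Decides which incoming text should be synthesized, avoiding duplicates."""

    def __init__(self):
        self.streaming = False

    def handle(self, event_type, data=None):
        """Return text to synthesize for this event, or None if there is none."""
        text = (data or {}).get("text", "")  # assumed field name
        if event_type == "synthesize-start":
            self.streaming = True
            return None
        if event_type == "synthesize-chunk":
            return text or None
        if event_type == "synthesize":
            # While streaming, this is the backwards-compatible copy of the
            # full text that was already delivered chunk by chunk: ignore it.
            return None if self.streaming else text
        if event_type == "synthesize-stop":
            self.streaming = False
            return None
        return None


# Streaming sequence, in the order listed above.
f = StreamingTextFilter()
events = [
    ("synthesize-start", {}),
    ("synthesize-chunk", {"text": "Hello "}),
    ("synthesize-chunk", {"text": "world."}),
    ("synthesize", {"text": "Hello world."}),
    ("synthesize-stop", {}),
]
spoken = [t for t in (f.handle(e, d) for e, d in events) if t]
```

This sketch passes chunks straight through. A real TTS server would additionally group the chunks into sentences in whatever way suits it, and send synthesize-stopped after the final audio.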

worthy granite Jun 30, 2025, 10:03 PM

#

light gale I'm not receiving run-end... ``` Sending: {"type":"run-pipeline","version":"1.5....

The server should be sending run-end. I think I'm confused about which perspective we're talking about here.

light gale Jun 30, 2025, 10:04 PM

#

With Wyoming it's a bit confusing, eh?
Socket-wise the satellite is the server, and HA is the client. But logically HA is the server.
So if i'm saying "nevermind" that speech goes to HA, then HA should send "run-end" to satellite, so it knows that pipeline is shut. But i don't see run-end on satellite side....

light gale Jun 30, 2025, 10:09 PM

#

worthy granite The server should be sending `run-end`. I think I'm confused about which perspec...

See, i'm receiving from HA that voice started, then voice ended, then transcript - but after that it's just silence in the socket...

#

I'm logging every line i'm receiving on satellite side.

worthy granite Jun 30, 2025, 10:19 PM

#

Oh, I guess that's only internal to HA. Maybe I was expecting run-satellite to be sent multiple times 🤔
I should probably add some kind of end event there 😄

light gale Jun 30, 2025, 10:20 PM

#

worthy granite Oh, I guess that's only internal to HA. Maybe I was expecting `run-satellite` to...

Would be nice. 🙂
BTW README is really good.

#

I'd actually also make timers a bit more robust... Other things seem to be working excellently.

lone tree Jun 30, 2025, 11:25 PM

#

worthy granite Oh, I guess that's only internal to HA. Maybe I was expecting `run-satellite` to...

Michael, have you considered adding a command to the protocol for controlling the volume on the satellite?

light gale Jun 30, 2025, 11:27 PM

#

There's also a lot of info that satellite is sending on wake words etc - would be nice to have the controls in HA actually working :))

worthy granite Jul 2, 2025, 3:23 PM

#

Definitely! I also want to have different pipelines for different wake words, a "stop" like the VPE, etc. 😄

worthy granite Jul 2, 2025, 3:24 PM

#

lone tree Michael, have you considered adding a command to the protocol for controlling th...

Yes, I have a branch with media messages that I'm working on and this is on the TODO list.

light gale Jul 3, 2025, 4:49 PM

#

worthy granite The audio shouldn't be sent twice, but the text to synthesize is: first in chunk...

Hey Mike!
After moving to HA 2025.7.0, after audio-start i'm getting this:

Received: ����|�~�{�y�y�{�|�x�x�u�v�v�w�x�v�x�x�x�z�w�w�|�~�������}�~����������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������....

#

And of course parser is failing.

light gale Jul 3, 2025, 5:04 PM

#

Oh okay, it's actually happening because of an exception, which in turn happens because the payload length on the audio chunk is 700+ kilobytes. Is that normal?

#

@worthy granite I realized that for cached responses a single audio-chunk with the full response is returned now. Is that intended, or is it a bug? Because if it's intended, I'll have to rework the data logic (right now I'm using a byte array pool, but it has a restricted max size)..

#

What doesn't kill us makes us stronger, right? I'm now allocating pools of up to 1MB... Hope that's enough.
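An alternative to growing the pool is to consume a large payload in fixed-size pieces, so memory use stays bounded no matter what payload_length says. A minimal sketch, with a made-up helper name `iter_payload` and an arbitrary 4096-byte piece size:

```python
import io


def iter_payload(stream, payload_length, piece_size=4096):
    """Yield the payload of one event in pieces of at most piece_size bytes."""
    remaining = payload_length
    while remaining > 0:
        piece = stream.read(min(piece_size, remaining))
        if not piece:
            raise EOFError("connection closed in the middle of a payload")
        remaining -= len(piece)
        yield piece


# Simulate one 700 KB audio payload followed by the start of the next event.
stream = io.BytesIO(b"\x00" * (700 * 1024) + b'{"type": "audio-stop"}\n')
total = 0
largest = 0
for piece in iter_payload(stream, 700 * 1024):
    total += len(piece)                 # here each piece would go to the audio player
    largest = max(largest, len(piece))
next_line = stream.readline()           # the stream is positioned at the next header
```

Each piece can be handed to the player as soon as it is read, so a single small buffer is enough for both the 4096-byte and the several-hundred-kilobyte case.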

lone tree Jul 4, 2025, 12:13 AM

#

light gale @worthy granite i realized, that for cached responses single audio-chunk i...

The components have a choice of which method to request from tts, depending on the type of data being sent. If it receives a string, full synthesis will occur; if it receives an asynchronous generator, streaming will begin.
Currently, there is no smart selection (as when working with LLM) for announcement actions, tts.speak, and full responses, and a full message is always sent. This may change over time.
This is what the solution looks like, returning streaming at the response stage when automation passes the full text to set_conversation_response.
To maintain compatibility with older methods, you will probably have to use a solution with an increased buffer.

light gale Jul 4, 2025, 1:00 AM

#

lone tree The components have a choice of which method to request from tts, depending on t...

That's fine. I'm just stumped by the situation where the first request returns audio in 4096-byte chunks, and a second identical request returns 300KB in a single chunk.
