#A question about onnx vs tflite


keen tapir
#

I'm noticing a divergence in voice assistant wake word architecture. I can't be the first one to comment on this, but I also can't really find anyone else talking about it. Almost all the wake word training libraries output onnx, and all the software that's being developed wants to use tflite files because, I suppose, it's better for extremely low-power devices.

Is there a server somewhere that I can set up that integrates nicely into the HA voice pipeline and is based on OpenWakeWord or something else that uses .onnx files? It's possible to convert onnx -> tflite and I've written scripts to do it, but depending on what software generates the onnx file there are hidden settings (flipped axes are the big one) that need to be addressed, and there's no particularly easy way to convert. It's just a pain right now.

Are there any projects addressing this? Are we gonna decide on one or the other at some point?

I realize this is an esoteric question - I just don't understand why training software / scripts output one and all the wake-word software wants another.

humble lance
# keen tapir I'm noticing a divergence in voice assistant wake word architecture. I can't be ...

The devices handling wake word detection are often microcontrollers. To run even a small AI model they need very lightweight, optimized versions of that software. In this case they run tflite, which is optimized by Google to run on very little compute/memory. But people learn to do ML using different libraries, so they use what they know. So they can implement these older CNNs in TF, ONNX, Torch, etc. That's why you have to convert. That is the easy step....

gloomy lance
#

In the first implementations of voice satellites, the Wyoming satellite used the OWW engine, which works with the onnx format (Torch is used to train models). Later, developers from the OHF team created the MWW project, which even runs on the esp32. This project uses TF, and the models have the tflite extension. It's all quite simple. Also, all modern satellite projects (LVA for Linux, AVA for Android) are switching to MWW.

strange skiff
#

What I'm looking for would be HA accepting either format and converting it into the required one, or providing a tool for that. That "script search" is what makes it really hard for people like me to get a customized voice assistant: I'm about to extract all German JARVIS snippets from the Marvel movies, as there's only an English model out there that sounds terrible when "forced" to speak German. Then I need to train the model, convert it, and finally set HA up to use it.

keen tapir
# humble lance The devices handling the wake word detection are often microcontrollers. To han...

No, I understand that this is the current reason why tflite gets used, and I get that current ML libraries use onnx. What I don't get is why converting to tflite remains such a chore, and why some ML libraries can't be retooled to output tflite, or why there isn't an easy conversion utility that deals with the flipped axis issue. I only know how to resolve that issue because some random redditor pointed it out. I would never have come up with that solution on my own.
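
One common source of "flipped axes" when going from onnx to tflite is tensor layout: PyTorch-style exports usually put channels first (NCHW), while TensorFlow and tflite default to channels last (NHWC). Whether that is the exact issue meant here is an assumption, since the thread does not say which axes are involved. A minimal Python sketch of working out the permutation, using hypothetical helper names and a made-up input shape:

```python
from typing import Sequence, Tuple


def axis_permutation(src_layout: str, dst_layout: str) -> Tuple[int, ...]:
    """Return the axis order that rearranges src_layout into dst_layout.

    Layouts are strings of single-letter axis labels, e.g. "NCHW".
    (Hypothetical helper for illustration, not part of any library.)
    """
    if len(set(src_layout)) != len(src_layout):
        raise ValueError("axis labels must be unique")
    if sorted(src_layout) != sorted(dst_layout):
        raise ValueError("layouts must use the same axis labels")
    # For each axis wanted in the destination, find where it sits in the source.
    return tuple(src_layout.index(axis) for axis in dst_layout)


def permute_shape(shape: Sequence[int], perm: Sequence[int]) -> Tuple[int, ...]:
    """Apply an axis permutation to a shape tuple."""
    return tuple(shape[i] for i in perm)


# Assumption: the onnx model expects channels-first input and the
# tflite model expects channels-last input.
perm = axis_permutation("NCHW", "NHWC")

# Made-up example shape: batch 1, 1 channel, 16 x 96 feature map.
onnx_shape = (1, 1, 16, 96)
tflite_shape = permute_shape(onnx_shape, perm)

print(perm, tflite_shape)
```

The resulting tuple is what you would hand to a transpose call (for example `numpy.transpose(x, perm)`) on the input tensor before feeding the converted model. Comparing the input shapes reported by both model files is a quick way to check whether this kind of permutation applies at all.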

humble lance
# keen tapir No I understand that this is the current reason why tflite gets used, and I get ...

I’m not sure what you are actually asking. Any of the microWakeWord pipelines output tflite as the last step. Kind of like exporting to a PDF. It takes almost nothing to do.

If you are asking why we cannot convert between microWakeWord and openWakeWord, that’s not just a formatting issue. The neural networks are different, and conversion, as far as I know, is not possible….

keen tapir
#

It's possible I'm not familiar with the ones you're using. There's the classic "always-broken dscripka" notebook, and a few other remakes that tried to address that floating around, but most of them output onnx files. Are there some you recommend?

humble lance
#

This is the easy button

#

I haven’t used the latest version that has the full UI, but it’s based on his older notebooks, which are great

keen tapir
#

Not sure how I missed this one, I'll certainly give it a try. I have a very large dataset from the Mycroft days (RIP) that I can use for positive data.