I’ve been playing with a Respeaker Lite. It’s all working using formatBCE’s YAML (which is great, and not relevant to the issues I’m seeing), but I was wondering if anyone had any tips to improve the micro wakeword detection. When it hears the word it’s marvellous, but best case that’s about 50% of the time. Annoyingly that’s the Alexa mode. Reducing the probability to 0.5 improves things a little, but of course the false positives creep in. Basically it’s not quite good enough to use “in production”, which is surprising and disappointing. Hopefully it’ll be improved in the coming months, but I’m impatient, so any tips would be welcome.
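To make the trade-off with the probability setting concrete, here is a minimal Python sketch. It assumes a simplified detector that averages per-frame scores over a short sliding window and fires when the average exceeds a cutoff. The function name `detects`, the window size, and all the scores are made up for illustration; this is not the actual micro wake word implementation.

```python
from collections import deque


def detects(frame_probs, cutoff, window=5):
    """Return True if the mean of any `window` consecutive frame
    probabilities exceeds `cutoff` (a hypothetical simplification)."""
    recent = deque(maxlen=window)
    for p in frame_probs:
        recent.append(p)
        if len(recent) == window and sum(recent) / window > cutoff:
            return True
    return False


# Made-up per-frame scores, for illustration only.
clear_wake_word = [0.1, 0.8, 0.95, 0.97, 0.99, 0.98, 0.9, 0.2]
mumbled_wake_word = [0.1, 0.5, 0.7, 0.75, 0.7, 0.65, 0.6, 0.2]
similar_sounding_noise = [0.1, 0.4, 0.6, 0.65, 0.6, 0.55, 0.5, 0.1]

for cutoff in (0.9, 0.5):
    print(
        f"cutoff={cutoff}:",
        "clear:", detects(clear_wake_word, cutoff),
        "mumbled:", detects(mumbled_wake_word, cutoff),
        "noise:", detects(similar_sounding_noise, cutoff),
    )
```

With these invented numbers, the high cutoff misses the weaker utterance (a false negative), while the low cutoff catches it but also fires on the similar-sounding noise (a false positive). Real scores depend on the model and the voice, so the right setting has to be found by experiment.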
#Micro wakeword false negatives with Respeaker Lite
Thanks for kind words! 🙂
So you use the "alexa" wake word? That model isn't the best (actually, none of them except "ok, Nabu" are great).
The thing is, only the Nabu model was trained with real voice samples (and even so, it lacks support for female and kids' voices because not enough samples were gathered). All the other models were created with a synthetic dataset (basically, the word spoken by TTS engines). So they don't work well with real voices, unfortunately.
I found "hey Jarvis" to work better than "alexa". I haven't tried "hey Mycroft". But I use "ok Nabu" everywhere in the house now, since it's the best quality...
We could probably start an initiative to gather samples for wake words other than Nabu, but it would take a lot of effort and promotion in the community... After that we could ask Kevin to train the models. There's still no way for me to train an MWW model on my own.
Hello there! That’s really interesting, thanks. I’ve played with “Okay Nabu” a bit more. It seems it really does not like my fairly deep southern England male tones. But if I attempt to impersonate some other accents it has a much better chance of matching. My son (age 11) managed to match it a few times too.
So the question is how do I go about helping with the training? Happy to ask my son to record himself saying each of the trigger words too.
This clearly has massive potential but I think only with good training will it be viable for the wider community?
Here you go! 🙂 https://ohf-voice.github.io/wake-word-collective/
Do you know how often they bundle up the contributions and do a new release of the model? Looking at the GitHub it looks like the last release was 2 months ago?