Both are in stock and both cost about the same.
Other than the touchscreen aspect of the ESP device, which would you say is the best purchase? I'm specifically looking to get it working for LLM voice (which I've already pretty much got set up with an M5Stack Echo, but the lack of RAM and the speaker/mic quality make it only really useful for testing purposes, not for use in the house).
#Which would you recommend out of an ESP32-S3-BOX-3 and a HA Voice PE?
The Voice PE: it has the XMOS chip, which makes the mics a lot better, and a 3.5mm jack, which gives you output options.
The mic of the Box 3 is not that bad. Perhaps the PE will improve through firmware updates, but as it stands today its microphone is not much better. I also saw some videos suggesting the performance improves if it's put on a stand. But all in all, today it's not that much better.
Upside of the Box 3 is the screen, and with BigBobbas' custom firmware it's possible to pipe the response to another speaker.
It's possible to modify the Voice PE firmware to also redirect the audio, though it's a bit clunky at the moment due to the different audio stack it uses.
Also of note is that the Voice PE can be used as a music "cast" device for things like Music Assistant, broadcasts, TTS, etc.
I think if you are looking for something even more advanced, you may want to wait and see what happens with the FutureProofHomes Satellite1. It has 4 mics, so they may eventually update the firmware to support true beamforming, diarization, etc., which would make its mics more precise.
Thanks for the replies. As long as I can use it to ask an LLM how to do stuff, I'm happy. I've already had Alexa devices all over the house for years (which have gradually got more and more crap).
I'd wait for the other makes to come out but, frankly, the Voice PE device is already quite expensive and no doubt better ones will be even more so. Given what Alexa devices cost, it's gonna be an expensive replacement process.
So do you think the Voice PE (or the ESP one) is in a position to be good enough for "family use" yet? Is it simply software updates that are needed to make it better, or is the hardware the issue, so I should really wait for something better?
Hard to say. For my uses they work great, even in a living room with a tv blaring and two very active young boys running around and yelling. No false positive wakes, hears me well as long as I duck the tv audio (I have mine right below my tv, probably would work better if I moved it further away but works for me). I have another in my workshop hooked up to a 2000w amp, that I have set up with Music Assistant. I have music blasting out of it, and it is still able to hear me, wake up and duck the audio, and process my commands without issue.
That said, it's a combo of software and hardware for these. For instance, the Voice PE has two microphones, which makes true diarization/beamforming not easily feasible, as the algorithm requires 3 points of reference. Google makes this work by using fancy AI algorithms to mimic how our human ears work, but the simpler algorithms do triangulation math to make it work. So that may be one limitation of the Voice PE. The Satellite1 aims to fix that as it uses 4 mics, so they plan to have a future firmware that adds proper diarization/beamforming.
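To make the "points of reference" idea a bit more concrete, here is a rough sketch of what a single pair of mics can tell you from the arrival delay alone. It assumes a far-field source and a made-up mic spacing of 6 cm (not the real spacing of the Voice PE or any other device), and the function names are just illustrative. It is not how any of these firmwares actually do it.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, rough value at room temperature


def delay_for_angle(angle_deg, mic_spacing_m):
    """Delay (seconds) between two mics for a far-field source.

    angle_deg is measured from broadside, i.e. from the line
    perpendicular to the axis joining the two mics.
    """
    return mic_spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND


def arrival_angle_deg(delay_s, mic_spacing_m):
    """Angle from broadside implied by a measured delay between two mics."""
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp to tolerate measurement noise
    return math.degrees(math.asin(ratio))


SPACING = 0.06  # hypothetical 6 cm spacing, for illustration only

front = delay_for_angle(30.0, SPACING)    # source in front of the pair
behind = delay_for_angle(150.0, SPACING)  # mirrored source behind the pair

# Both positions give (numerically) the same delay, so the pair alone
# cannot tell them apart; the recovered angle is the same for both.
print(front, behind)
print(arrival_angle_deg(front, SPACING), arrival_angle_deg(behind, SPACING))
```

A third or fourth mic that is not on the same line gives extra delays that break this mirror ambiguity, which is roughly why more mics help with locating the speaker.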
In environments without music (vocals), TV (vocals), or multi-person conversations, the PE or ESP can already meet daily needs. They work well with minimal background noise. What is missing is software algorithms to separate out the voice of the person issuing the instructions. The addition of beamforming algorithms would greatly improve the performance. This will take a long time.
Depends on the family composition - my two kids can't activate it. Ok Nabu is hit and miss, Hey Jarvis is a disaster. And then there is the language: it cannot do dual language, which in our house is a must. So for now... it's my toy.
I had it all:
- self-made ESP32-S3 based satellites with INMP microphone and MAX98357 amplifier
- Pi-zero-2W with Respeaker 2-mic hat as Wyoming satellite
- ESP32-S3-BOX and BOX3
- have the Satellite1
- (now using) Respeaker Lite with ESP32-S3 (Koala Satellite)
So the last one I built is very similar to the PE (also XMOS XU-316, 2 mics, same ESP chip, better speaker, pretty much the same software, based on the PE software). I have 5 of them in different rooms, and the only one that gives me a bit of trouble is the one in the living room, when the TV is on.
Yes, my kids have trouble with the wake word. I tweaked some settings, and my wife is using it successfully; the kids often just go and push the button to activate it. I ditched Alexa completely around 7-8 months ago. It wasn't a simple replacement, I had to write some scripts for routine tasks - but I'm not using an LLM (well, I do, but just for common knowledge, no control).
The BOX/BOX3 have a very bad speaker and right now they can't use the newest media_player (it's being worked on). Ditched into the drawer. Maybe gonna use them as some visualization panels... (although the displays also aren't great)
Once I asked my daughter to mimic my voice (just barely), she could activate the wake word. Try that with your little ones, it might work out.
TTS then works great with her normal voice
Think the wake word will improve as people submit more samples, hopefully some with kiddos to help on that end.
Though I personally am kinda glad my kids can't activate it.
you haven't shared the secret about the button then, I guess?
Nooope!
The PE, it's purpose-built. The S3 Box was a retrofit for existing hardware.
ordered a PE last night. should get it today and have a play around with it