#DIY setup for ESP Based Voice Assistant
ok i will put everything in here
i have put all voice related configs and instructions so far https://github.com/BigBobbas/esphome_firmware
I am working on this constantly at the moment so it will be updated daily and probably several times.
Are my components pretty common:
max98357 I2S 3W amp
INMP441 Microphone I2S
plus the ESP32 and ESP32 S3
Is the mic decent? Is the audio quality good enough to play music on? Wondering whether, with a decent speaker, it would be on par with a Google Home Mini (early gen) and/or Echo Dot (early gen).
they are basically the go to components and they work! so you should be well prepared
the link above has samples recorded on an inmp441 and an esp32
Oh nice. That sounds really good. Amazing what they can squeeze into a small package
oh absolutely! they work really well.
Is the schematic on the GH page?
not yet ... hopefully i will have one on tomorrow. this is one that was done for a livestream with JLO https://github.com/jlpouffier/voice-assistant-esphome-tutorial/blob/main/wirring.png
I'd really like to find a way to 3d print a nice case for these in the same form factor as the first gen Echo Dots. Would be really cool to make the case where the components like the esp and amp could be pushed into place on some retaining clips and then the mic would have a little slot that it could be slipped into. User could then use the dupont cables to wire it. Would bring the barrier of entry way down IF these components could be sourced without the need to solder. Soldering not an issue for me but I know others may have the desire but lack the confidence or finances to do it themselves.
yeah i get what you are saying. the only problem is that duponts are not the most reliable way of connecting and do cause a number of issues, which can lead to hours of head scratching. an interesting design, along the lines of what you are thinking, is this https://www.youtube.com/watch?v=4Ft5RxS9Ob4 3d printable, totally modular and tool free
Yep. I know what you mean. I wouldn't use it outside of prototyping but I've been dabbling with electronics for about 30 years. Pretty sure dupont didn't exist when I first started or I had no way to know about them at least! But in the end, they do make things a lot more accessible and maybe will lead that user to better ways of doing things. Makes putting together projects much more Lego like.
That video is great. It's definitely what I'm meaning in regards to the mounting. I guess it can get tricky as you almost have to use the same type or size sensors otherwise it has to be redesigned if something changes. I'm a far ways off from doing something like that myself. I think I need to take a few weeks off of work and learn a bunch of stuff instead of trying to shoehorn it in while trying to do my work too! ha
yes... there are so many different aspects, and to be a master of all is a pretty tall order! I try and do as much as I can but there are some things that I just haven't got into, and one is the design side of 3d printing. I will happily download stuff or I will just use prusa slicer and basically mangle objects together in order to try and make something that will fit my project.. it's not usually pretty but i get away with it ....
Voice assist is really coming together. I'm hoping that this work and the diy stuff will spur some creativity that all of us can benefit from. Really really exciting to think that I'll be able to replace these Alexa and GH devices. I can't tell you how much they frustrate me.
I've spent a TON of time working on the Echo Show replacement due to Amazon constantly showing me things I don't want to see. Really I just want to see the clock! Now after about a month I can do about 80-90% of what I want to do with it via custom sentences, web ip cam to stream audio to Stream Assist for audio into assist on the tablet, Fully Kiosk with Browser Mod to control display and expose the tablet as a media player device. Hoping to wrap up and share this with community with the hopes that folks smarter than me (most folks) can improve it for themselves and reshare with me and community.
sounds like you have put in a lot of work and it also sounds like you have a worthy entry for the Voice Assistant contest! you should enter. not long to go. I would love to see your project. I have created a similar project to replace my Lenovo google display. I turned the voice facility off on it a long time ago and now it serves as a digital photo frame and occasional speaker in another room. (i really need to look into doing a conversion on it)
I have replaced it with a 10" HDMI touch screen, with a Pi4 fixed to the back, the speaker from a google mini and an inmp441 mic. I run FydeOS on the pi and use chromium as the browser to display the dash view. I use browsermod to darken the screen when there is no presence in the room, and also to navigate to a camera view if someone is on the drive at the front of the house, or the doorbell is pressed. Currently I have been driving the mic and speaker off 2 separate esp32's in order to have the media player separate, however I have a solution for that now that will let me use just 1 esp32 for both media player and mic with no issues. As with everything else in my house, this is not a completed project... it is functional and works well, it just needs a little tidying up around the back and some tweaking of the frontend to make it better.
Thanks for the kind words. The thing is I really don't have the skills to do any of this the 'right' way but hopefully my proof-of-concept will spur others to pick this up and do it better. I think I may enter it into the contest. While it is different than most other things I've seen it does fill a void in regards to replacing existing hardware from Amazon/Google/etc I think. It's a crucial piece in my family's and my willingness to attempt to switch to something different.
^^^ Link to video
The video was done about a week ago and I've added a few more items since then. Hoping to get some stuff done today on wikipedia search and grocery list along with tidying up some backend things.
THAT IS AWESOME!!!! brilliant work
ok ... so i create a timer helper for each time duration i want. i created them in 5min blocks up to 2 hours (i'm sure this could be templateable, but i've had it set up since forever)
i then have an automation that contains all of the helper IDs as triggers, so if any state changes to idle, that means the timer has stopped. i then have a generic 'your timer has finished' to save doing loads of different ones
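A minimal sketch of the helper-per-duration approach described above, in Home Assistant automation YAML. The timer entity IDs, the `tts.piper` engine and the `media_player.kitchen_speaker` target are hypothetical names, not taken from the actual setup. One caveat under these assumptions: a state change from `active` to `idle` also happens when a timer is cancelled, so this announces in that case too.

```yaml
# Sketch only: all entity IDs below are placeholders.
automation:
  - alias: "Announce finished timer"
    trigger:
      - platform: state
        entity_id:
          - timer.timer_5_min      # one helper per duration, as described
          - timer.timer_10_min
          - timer.timer_15_min
        from: "active"
        to: "idle"                 # idle means the timer is no longer running
    action:
      - service: tts.speak
        target:
          entity_id: tts.piper     # hypothetical TTS entity
        data:
          media_player_entity_id: media_player.kitchen_speaker
          message: "Your timer has finished"
```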
So check this out. Custom sentence uses trigger id 'timer':
- conditions:
    - condition: trigger
      id:
        - timer
  sequence:
    - set_conversation_response: Timer set for {{ trigger.slots.name }}
      enabled: true
    - service: timer.set_duration
      data:
        duration: "{{ trigger.slots.name }}"
      target:
        entity_id: timer.assist_timer_1
      enabled: true
    - service: timer.start
      metadata: {}
      data:
        duration: "{{ trigger.slots.name }}"
      target:
        entity_id: timer.assist_timer_1
      enabled: true
This will take a sentence 'Set timer for XX seconds', pass along the 'XX' value, call timer.set_duration (homie integration) and then start the timer
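For context, the trigger side of such an automation might look like the sketch below. It assumes a Home Assistant sentence trigger with a wildcard slot called `name` (matching `trigger.slots.name` in the snippet) and the trigger id `timer`; the exact sentence wording is an assumption.

```yaml
# Sketch only: the sentence text is a guess at what was used.
trigger:
  - platform: conversation
    command:
      - "set [a] timer for {name} seconds"   # {name} is a wildcard slot
    id: timer                                # matched by the 'condition: trigger' block
```

A bare number passed as `duration` is generally treated as seconds, which is why the spoken value can be handed straight to the timer calls.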
well that is a million times better than my solution hahaha i haven't even started with any of that stuff lol .. really need to start
I don't know that there's a stock way to set a timer. Well now I've got to figure out how to display this one
Will eventually try to add a way to do minutes/hours too.
Grr. Might as well do that now
and dont forget! only about a week to enter the contest! and you need to enter.... because your creation is a thing of beauty !
Thanks. I know. I also have a huge project that I've been spearheading for the past year at work that is rolling out end of next week so my time is really limited. Fun times
So a spin over in #templates-archived and I have something that SHOULD work for converting seconds/minutes/hours to seconds but now I can't figure out how to get assist to accept my sentence. It's things like these that eat away at the day. Oh well. More to follow I'm sure
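One way the unit conversion could be written, as a sketch only: it assumes the sentence yields two hypothetical slots, `amount` (a number) and `unit` (second(s), minute(s) or hour(s)), and it does not address the separate problem of getting Assist to match the sentence.

```yaml
# Sketch only: slot names 'amount' and 'unit' are assumptions.
sequence:
  - variables:
      seconds: >-
        {% set factor = {'second': 1, 'seconds': 1,
                         'minute': 60, 'minutes': 60,
                         'hour': 3600, 'hours': 3600} %}
        {{ (trigger.slots.amount | int(0)) * factor.get(trigger.slots.unit | lower, 1) }}
  - service: timer.start
    target:
      entity_id: timer.assist_timer_1
    data:
      duration: "{{ seconds }}"   # unknown units fall back to a factor of 1
```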
unfortunately this is where I am of no use
Thanks again for the encouragement. I am uploading a video now with a full demo and will post on the contest page in the forum.
In other news, I've received most of my parts for the voice satellite build and will start work on putting that together soon.
that is excellent... so glad you are going to enter it. you have done an amazing amount of work, i can't begin to imagine how many hours you have put in. you have covered so much functionality and it works flawlessly. good luck with the contest! If you need any pointers with the satellite build feel free to ping me .. thanks for updating me, and i will check out your entry
https://community.home-assistant.io/t/view-assist-visual-feedback-for-assist-voice-assistant/699659
Yep. Posted. Certainly was a lot of work and still a lot more to do but I'm pretty pleased with what I cobbled together. Really it was good folks like you and others that make these things possible due to all the questions that are being answered. I absolutely could not have gotten what I have without a lot of help and annoying people! haha
aghh not in any way annoying. i love challenges and i love being able to help out when I can ,if it's something i can help with or at least give pointers.
I was hoping someone could help me. I have followed all the steps to get voice assistant working and have run the mic test with successful recordings in home assistant, but cannot get wake word to work. I have tried multiple boards.
no problem, can certainly go through things with you. i will create a thread in the voice channel to keep your posts in their own place
i have created the thread here https://discord.com/channels/330944238910963714/1214869527339208724
excellent thanks for that
Hey friend. I am taking a short (hopefully very short) break from working on View Assist to try and get one of these esphome satellites going. Is this wiring diagram still the one to use? Also, where is the best wakeword on device esphome config to use? Thanks
Hi there, yes the wiring is fine, however I would look at using an ESP32-S3 dev board. depending on the size of the enclosure you have in mind there are a few options available. if you have space then I would look at using an ESP32-S3 N16R8. the pins on those boards are pretty much all multi use so you don't have to use particular pins. any questions you have at all feel free to ask away.
ESP32-S3 Development Board 2.4G Wifi Module for Arduino ESP IDF ESP32-S3-WROOM-1 N8R2 N16R8 44Pin Type-C 8M PSRAM ESP32 S3 https://www.aliexpress.us/item/3256806080061048.html
This is the one I got. Hopefully it's the one you are referring to. Can you point me to the esphome config file for this one?
it's close enough... it's just less flash and less psram , but still fine for this application. you will need to solder the 5v pads to enable them to output 5v for the DAC
the pads in question are directly next to pins 11 and 12 .
this one should be enough to get you up and running https://github.com/BigBobbas/esphome_firmware/blob/main/kahrendt_micro_wake_word/obww_esp32_s3_mic_and_speaker.yaml the voice model is set to hey_jarvis but it can be changed to alexa or ok_nabu
looking at the board there are also a pair of solder pads for the onboard addressable LED which would also need bridging if you wanted to use that light as wake word detection etc. the config is already configured for this
just noticed there is the N16R8 on the link you sent too. depending on if you have the r8 or r2 you may need to make a change to the config , but if its the R8 then you will be fine
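The R8 versus R2 difference usually comes down to the flash size and PSRAM mode settings. A hedged sketch of what that part of an ESPHome config could look like, assuming the ESP-IDF framework and a generic `esp32-s3-devkitc-1` board definition (the linked config may differ):

```yaml
# Sketch only: values shown for an N16R8 module.
esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB      # N16 = 16 MB flash; an N8 module would use 8MB
  framework:
    type: esp-idf

psram:
  mode: octal           # R8 modules use octal PSRAM; R2 modules use quad
  speed: 80MHz
```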
I'm struggling to remember how to flash. Do I need to flash generically from USB and then adopt in esphome and then update the code? I struggled with this a bit the last time and didn't take notes
best way is to make a new device in the dashboard. choose any random board, it doesn't matter as you will overwrite the config. don't click install at the end of the wizard, click 'skip'. then a new card will be created. copy and paste the config i linked into the card, overwriting everything. click save, then install, and choose 'manual'. at the end of the compile select modern and you will then be able to save the .bin file. then go to https://web.esphome.com and connect, select the com port and then install.
hopefully that should be fairly smooth going doing it that way you don't need to adopt the device as you have manually created it
yes
I'm getting a compile error when I use that. Not a valid model name
Do I need to use quotes? I see that your example is not using quotes so I followed suit
I am running esphome 2024.3.2 is that modern enough?
Found it. It's spelled out okay and not ok. All good.
aha my bad for not checking
no worries. I figured it was something simple. You gave me a big boost otherwise I would still be searching for stuff
oh and yes that version is fine. i think 2024.4.0 will be out next week and there are some voice enhancements included
My chip is different than the one listed on the wiring diagram. Can I assume that A0 on my board is the same as that on the diagram? I know it's critical to get these pins to match
pardon. My board just shows numbers. Can I assume that the numbers match the gpio values from the wiring diagram
This may be a struggle. First one I went to connect is GPIO34 and I don't have that one as an option. Any limitations to me moving these to available pins?
Most of these don't exist. GPIO means general purpose IO right? I should be able to use any of the free ones right?
@gritty wren Hmm. So looking at the code I'm seeing mic is using 2,3,4 I do not know what they correspond to though.
My mic board has these values silkscreened:
WS
SCK
SD
The config is showing these needing to connect:
i2s_audio:
  - id: i2s_mic
    i2s_lrclk_pin: GPIO3
    i2s_bclk_pin: GPIO2
microphone:
  i2s_din_pin: GPIO4
How do I know what matches to what?
ws goes to lrclk gpio3
sck goes to bclk gpio2
sd goes to din pin gpio4
you also need to connect L/R to gnd
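Put together, the microphone wiring above maps onto the config roughly as follows. This is a sketch assuming the stock ESPHome `i2s_audio` microphone platform; the `satellite_mic` id and the `bits_per_sample` value are assumptions, not copied from the linked config.

```yaml
# Sketch only: pin numbers are the ones quoted in the thread.
i2s_audio:
  - id: i2s_mic
    i2s_lrclk_pin: GPIO3    # INMP441 WS
    i2s_bclk_pin: GPIO2     # INMP441 SCK

microphone:
  - platform: i2s_audio
    id: satellite_mic       # hypothetical id
    i2s_audio_id: i2s_mic
    adc_type: external
    i2s_din_pin: GPIO4      # INMP441 SD
    channel: left           # L/R tied to GND selects the left channel
    pdm: false
    bits_per_sample: 32bit  # assumption
```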
Thanks. I'm assuming I'll need to know this for speaker too. I can pull the names from the speaker device in a few minutes
yeah no probs i'll have a look
Got the mic working. Thanks for sure. Let me see if I can do the same for the speaker
ah great
The speaker board has:
LRC
BCLK
DIN
i2s_audio:
  - id: i2s_spk
    i2s_lrclk_pin: GPIO6
    i2s_bclk_pin: GPIO7
speaker:
  i2s_dout_pin: GPIO8
So this is straightforward it looks like. LRC = LRCLK, BCLK=BCLK, DIN=dout ?
correct
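The speaker side follows the same pattern. A sketch assuming the stock ESPHome `i2s_audio` speaker platform; the `satellite_speaker` id is hypothetical, and the data pin is "out" from the ESP's point of view even though the amp labels it DIN.

```yaml
# Sketch only: pin numbers are the ones quoted in the thread.
i2s_audio:
  - id: i2s_spk
    i2s_lrclk_pin: GPIO6    # MAX98357 LRC
    i2s_bclk_pin: GPIO7     # MAX98357 BCLK

speaker:
  - platform: i2s_audio
    id: satellite_speaker   # hypothetical id
    i2s_audio_id: i2s_spk
    dac_type: external
    i2s_dout_pin: GPIO8     # MAX98357 DIN (ESP data out -> amp data in)
```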
Great. I am SUPER impressed at how fast it detects the wakeword.
yes it works really well with handling it on the device rather than streaming it
hmmff.. no sound output
is it a Max98357 DAC ?
Max98357 I2S 3W Class D Amplifier Breakout Interface Dac Decoder Module Filterless Audio Board For Raspberry Pi Esp32
powering with 5V?
Yep. Just checked and all seems good. Swapped out for another speaker/DAC and it doesn't work either.
did you bridge the pads on the esp32 for the 5v output
I doubt it. All looks good. I'll break out the meter tomorrow and see if I've got a disconnect somewhere. I don't hear these speakers make a single sound. Shouldn't it move a tiny bit when power is turned on?
no there are 2 pads that you have to solder in order to get 5v output on the 5v pin
yes
My old eyes really love this challenge. Thanks. I glossed over that as I assumed you were talking about the LED. Wish me luck. I'm wearing two pair of reading glasses for this one!
you need them lol... that's the same for me
Well that did it. Thanks for the help. I should have read what you wrote
So no streaming music to these devices?
yes you can. you can use media_player instead of speaker, but there is a better option: an external component which allows duplex audio, so that mic and speaker can be active at the same time, whereas with the current ESPHome media player you have to stop voice assistant in order to stream media. It works very well and was updated yesterday. i'll give you the link and you can take a look.. if you decide you want to try it then you can just copy the relevant parts from the sample configs. I'm using it on all builds now.
Yes. I will definitely want to do this. I can't work on it anymore today. Would you mind helping me with this later? I have used esphome devices before but will admit that this is the most complicated config file I've ever used and will probably need help modifying.
yes no problem at all . im on uk time so off to bed shortly , but will be around again in the morning
Hey I'm back for more. Can you help me get things set up to expose media_player?
i can indeed, i'm just in the middle of some other stuff , so may be a little slow ... but no problem
all good. Multitasking here as well. Just let me know what I need to look at when you free up enough to juggle this too. Thanks
probably easiest and quickest if you post your full config as it is now and i can make the changes , rather than going back and forth and i can comment the lines so you know whats changed
Here you go. Not much changed except the wake word and wifi credentials. Thank you
Can the name be changed without causing an orphaned device in HA? I had not noticed the default name before
yes , if you delete the device from HA then add it back in after flashing , that should be fine
just testing it compiles then should be done. before installing, if you click on the 3 dot menu on the device card in the dashboard and click 'clean build files' you can then go ahead and install
it will be a few mins and i'll paste a link. i'll stick it on my GH as it will do as an example, you will just need to change the wifi creds and the name if you want
Flashed, but not seeing a media_player entity. Should I see one?
https://dpaste.org/U6APA
These are the logs on the device. Not sure about that last error
warning
The device does work as it did before. Just don't see the media_player device
that's odd if it's not connecting to logs but it works. did you get the uploading bar after it compiled?
errr.. I guess I didn't notice. Let me give it another go
I did find it weird the name did not change in HA either
sounds like it hasn't flashed
I should be able to install wirelessly right?
yes
INFO Successfully compiled program.
INFO Resolving IP address of assistsatesp32masterbath.local
ERROR Error resolving IP address of assistsatesp32masterbath.local. Is it connected to WiFi?
ERROR (If this error persists, please set a static IP address: https://esphome.io/components/wifi.html#manual-ips)
ERROR Error resolving IP address: Error resolving address with mDNS: Did not respond. Maybe the device is offline., [Errno -2] Name or service not known
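The log's suggestion of a static IP can be sketched as below. The addresses and secret names are placeholders; `use_address` is included on the assumption that the upload should target a fixed address rather than the mDNS hostname.

```yaml
# Sketch only: replace the addresses with ones valid on your network.
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  use_address: 192.168.0.50   # address the dashboard uses for OTA and logs
  manual_ip:
    static_ip: 192.168.0.50
    gateway: 192.168.0.1
    subnet: 255.255.255.0
```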
I'll try flashing it by usb
did you delete it from ha ?
Yep. That did it. I will say that it is much slower in getting the response
I did not delete it. It seems to have updated itself. Is that part of the problem?
yes, it wont be able to resolve the mDNS address , is the media player showing mow?
*now
yes it is showing. It is functional but just slow in getting the reply. I say 'what time is it' and the light flashes a few seconds then it answers. Before it was nearly instant
it shouldn't be any different in speed, thats a head scratcher
Okay. I'll try removing it and adding it back. Noticing the media player is called media_player.media_player .. Is that normal?
It's working much better now.
Now the question about the music. Seems like Music Assistant tries to play for about ten seconds and then gives up
i'm not overly familiar with music assistant. someone i spoke to yesterday said that they had to enable something, i'll see if i can find it
That'd be great
in their words, they had to 'force mp3 decoding'. i don't know where or how as I don't use it
Can you link that message or can I find it with a search? Would like to ping that person
Got it
In case someone asks it's Settings -> Players -> esp32device -> three dots -> configure -> audio -> Enforce (lossy) mp3 stream
I can tell you that this is FANTASTIC! I'll still need to do some work to get it incorporated with my View Assist devices but it won't take much. This $3 speaker sounds pretty good surprisingly. Then I'm on to 3d printing a case. That should be another challenge! haha..
Thanks so much for getting me here so quickly
you're more than welcome, always happy to help if I can. yes 3d printing is a challenge, i just hope and pray i find something usable on thingiverse or printables for my enclosures. good luck
I didn't find much when I looked but I did mostly like the egg one posted in the contest. Would prefer something that looked like a GH or Alexa but not sure how great I'll be at design.
Can you tell me if there's a way to change the esp32's media_player name to something better than media_player.media_player ? I'm pretty sure it'll just keep naming them that way and that's not ideal
in the device config you will see name: Media player. if you change that to something different it should just show as that name. i think because the device wasn't deleted when you changed the device name it appends that to it, so going forward, providing the device name is unique, it shouldn't happen
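The `name:` field in question sits on the media player entry. A sketch assuming the stock `i2s_audio` media player platform (the config in use here relies on an external component, so its platform and options may differ); the name shown is hypothetical.

```yaml
# Sketch only: platform and options are assumptions.
media_player:
  - platform: i2s_audio
    name: "Master bath speaker"   # change this instead of the default "Media player"
    dac_type: external
    i2s_dout_pin: GPIO8
    i2s_audio_id: i2s_spk
    mode: mono
```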
Do you see this device reboot itself somewhat frequently when playing media to it? I have an album queued and if I skip a few songs it appears to restart sometimes. The lights on the board flash and volume goes back to 100%. It may be a shoddy connection or solder job but not sure.
i've not come across it, no. if you view serial logs over usb from https://esphome.io it may give more info, if you paste the logs from the point directly before during and after the 'crash' i can take a look
I wasn't able to get to this today but I think the device is doing a bunch of resets. I'm not positive though but it does flash the light a lot and becomes unresponsive. I am watching logs wirelessly. Is that okay for diagnosing? Will the logs update. Isn't there an uptime sensor I can factor into the config to confirm what I am thinking?
viewing over wireless should show the fact that it has disconnected from the api, but for further info you would need serial logs over usb. There is an uptime sensor you can add, but that won't really help https://esphome.io/components/sensor/uptime.html
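For reference, adding the uptime sensor is only a few lines. The sketch below also adds the ESPHome `debug` component's reset reason text sensor, on the assumption that knowing why the chip restarted is more useful than knowing when; the entity names are hypothetical.

```yaml
# Sketch only: entity names are placeholders.
sensor:
  - platform: uptime
    name: "Satellite uptime"
    update_interval: 60s      # drops back to near zero after every reboot

debug:
  update_interval: 5s

text_sensor:
  - platform: debug
    reset_reason:
      name: "Satellite reset reason"
```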
So I hooked up to usb. I am trying to issue a command. Looks like wake word is responding but then nothing. I end up seeing the same stuff in the log each time. Not sure what it means but so far the device is not responding as it was before. Can you take a look and see if you can make something of it?
can you try the logs again from the other usb port on the device? it should get identified as a JTAG/debug port when you plug in and connect. i need to see the esphome logs too, the logs you pasted are just bootloader logs
Other than this, does it matter what port I use for power?
[07:04:56][D][media_player:066]: Media URL: http://192.168.0.25:8123/local/viewassist/alarm1.mp3?authSig=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiIzY2YwNjMwYzg5ZjY0MzYxOTAwNWViZWQ1NmIzNGU3YyIsInBhdGgiOiIvbG9jYWwvdmlld2Fzc2lzdC9hbGFybTEubXAzIiwicGFyYW1zIjpbXSwiaWF0IjoxNzEzNTI4Mjk1LCJleHAiOjE3MTM2MTQ2OTV9.Em8XVMPchqUjgUU4plRyxbEfRuNCikqOXK-clZoQQg0
I'm seeing this. This is something I was doing yesterday. Somehow this is being retained?
https://dpaste.org/ESfwp
And now it appears to be totally crashed.
for normal wifi use you can use either port. but for logs you need the port labelled USB
Okay. This thing is not stable at all at the moment. I don't know enough to know why though. It was working well a few days ago.
have you tried with a different power source
Yes. It is happening when plugged into the computer (now) and on a different 5V source as well. I can try another but would be surprised to get a different result as I'm using two different ones with the same result
do any of the chips on the esp or the DAC feel hot to the touch at all? do you have the mic and dac connected with dupont/jumper wires? or soldered?
Don't seem hot. Yeah dupont wires for now
I don't know. I flashed a different ESP32S3 with the same config and it is doing the exact same thing. I am using a different DAC and speaker and different dupont wires. Could it be something in the config?
Yep. It's definitely something with the config. I reflashed without the media player device and it is working as it was before. Something with media_player is giving me problems
ive got it up and running now and have a couple of hours or so before the house wakes, so far i have run several commands with tts responses and played a couple of radio browser stations without issue. was there something specific causing your issues
still going strong listening to the radio, not as much as a blip in the audio so far
Thanks for testing. I really can't understand it. Really it was playing fine with me streaming audio from Music Assistant for hours without problem.
I find it odd that the logs show that alarm sound effect that I was trying to play. I know for MA I had to set the audio to lossy for it to work. Other than that it appeared to work right out of the box. The power supply I'm using pushes 1A. Is that enough? Just odd that it worked for a long while and then had problems on two separate devices.
I just did an upgrade when I put the media_player version on the first chip. Should I be doing a full erase before doing this? The second chip was brand new with nothing flashed and it had the same issues.
Just reflashed and using a 3A power supply. same hardware. So far all is well. Will let it ride for a bit and see if that was the problem.
Yeah so far so good. The only thing I can say that isn't so great is that it is taking a long while to process my requests. I can say 'what time is it' and it will sit there a few seconds before answering. With the 'standard' version it was a lot quicker. Does that have to do with the aggressiveness level?
Spoke too soon. It is back in a weird state. I do not know why. I can say that it does detect wake word some times but others not. It does not respond at all now. I have tried restarting the chip as well as reloading in HA. I am also seeing the RGB led flash periodically too.
just had a read back through this post. the ESP32 S3 board that you linked has 2 options the N8R2 and the N16R8 - which one do you have ? you might need to check to see which one was shipped / invoiced for as it's not very clear on Ali at times
ok , i found it by going through your logs ,
must be the n16r8 - just trying to figure out why you would be having such issues. I occasionally get a slow response to commands, but not often, and I just put it down to HA being busy at that time, or network traffic. it's never bothered me enough to look too deep into it. Currently the pipeline uses UDP between the ESP and HA. this will change in the 2024.5.0 release of HA, which will use the ESP Native API instead, and that may help. the changes are already in place in ESPHome 2024.4.0 but require HA 2024.5.0 to take effect.
With regards to the device acting randomly, i would certainly rule out wiring by soldering connections, duponts are notoriously problematic, and it only takes one of the connections to not be 100% sound to cause issues.
I took your advice and posted an issue. One of the devs is looking at a bug which may be causing this but I'm not certain I am reading his response correctly. Understood it is still early days for this. I'm really excited. It was magical before the issues and I'm sure it'll get back to stable soon
from looking at the response, and logs. it looks like more than 1 file is being streamed in quick succession (within the same second ) which i'm guessing is what's causing the issue.
Right. I'm not sure how that is happening though or how to stop it
I THINK it may be due to this alarm I'm working on. I have it repeatedly play an alarm mp3 file until the user turns off the alarm. I might have a bug in it or something else that the esp device doesn't like. It works fine on my android tablet media players but maybe that's it. The device appears to be working again after a reboot and I guess it cleared that out. I don't understand why it would be sending that many requests though.
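If the repeating alarm is what floods the device, one possible shape for the loop is to wait for the player to finish before sending the next request. This is a sketch only: the `input_boolean.alarm_active` helper and `media_player.satellite` entity are hypothetical, the media path is inferred from the log URL, and it has not been verified against the ESPHome media player.

```yaml
# Sketch only: entity IDs are placeholders.
script:
  play_alarm:
    sequence:
      - repeat:
          while:
            - condition: state
              entity_id: input_boolean.alarm_active   # turned off by the user
              state: "on"
          sequence:
            - service: media_player.play_media
              target:
                entity_id: media_player.satellite
              data:
                media_content_id: /local/viewassist/alarm1.mp3
                media_content_type: music
            # Wait until playback ends so requests never overlap.
            - wait_for_trigger:
                - platform: state
                  entity_id: media_player.satellite
                  to: "idle"
              timeout: "00:00:30"
            - delay: "00:00:01"
```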
The only thing I'm noting now is that it is really slow in replying. Five or six seconds after I issue the command does the device stop flashing and respond.
if you watch the 'assist in progress' binary sensor in the ESPHome integration for the device and it shows active when you issue the command, that would indicate that HA is receiving the instruction and the delay is possibly in processing and returning the response, which could be down to the tts engine you are using or wifi related.
So the switch turns on immediately but does take about 10 seconds before I receive a response. Something really odd about this as when I was using the default config it was really fast.
Hi, any major difference between n16r8 and n8r2 for voice assistant? Or it's more or less the same
both will work , but for the difference in price the n16 r8 is a no-brainer for a little future proofing
Saw some demos using the super mini n8r2 and it looks pretty good. I guess I'll try both of them