#ESP voice assistant media player stuck on playing after TTS response


surreal venture
#

Hi all, I was just wondering if anyone could help. I updated to HA 2025.4.0 yesterday, and since then I've noticed an issue with my ESPHome-based voice assistants that use the media player component. Once they've given their TTS response, the state of the media player entity is stuck on playing. Then, roughly every minute, the media player repeats the previous TTS response. The only way to stop it is to restart the ESP device. I was just wondering if anyone knows what this could be? I've not changed any YAML or anything. I also tried using an older ESPHome builder on a single device, but it made no difference. Thanks 👍

torpid kayak
surreal venture
surreal venture
torpid kayak
surreal venture
torpid kayak
surreal venture
torpid kayak
#

do you have a copy of the code with the changes you made?

torpid kayak
#

oh i changed the id on the media player

#

that one should be closer to your original, entity-id wise

surreal venture
# torpid kayak something like this should work

Thanks so much Michael, but unfortunately there's no change: the media player is still stuck on playing, and repeats. As a test, I changed the YAML so that the voice assistant part is removed altogether (turning the device into just a media player). Now if I relay TTS responses through it, there are no problems (no stuck on playing, and no repeated responses). So the problem seems to occur only when I use the voice assistant part. Here's the YAML, just so you can see exactly what I mean.

torpid kayak
#

tbh i am thinking specifically the `on_tts_end` section but can just rip them all out and lose the light stuff for testing

#

so would just be:

```yaml
voice_assistant:
  microphone: mic
  use_wake_word: false
  noise_suppression_level: 1
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  media_player: media_out
  id: assist
```
weak plaza
surreal venture
weak plaza
surreal venture
# torpid kayak tbh i am thinking specifically the `on_tts_end`section but can just rip them all...

Hi again Michael, thanks again for your help, but it didn't solve the problem. I'm guessing that something has changed in the 2025.4 release which affects the media player component, but only when using arduino as the framework type (I forgot, but I have another ESP32 set up as just a media player, which uses arduino, and this has the same problem when sending TTS responses through it). As mentioned above, using the esp-idf framework with the media player component alone, without the voice assistant component, is fine, with no freezing or repeated responses...

torpid kayak
surreal venture
#

I can also see that there are some yellow warnings when flashing the media_player code to the ESP (even though it does work). I think it says something about I2S being deprecated or something...

torpid kayak
#

warnings are "fine"

surreal venture
# torpid kayak warnings are "fine"

OK thanks. Another thing I've noticed since using this new Speaker Audio Media Player (I have just been looking at it on the ESPHome site) is that the speaker output is quieter than with the media_player on the arduino code, plus the output quality seems worse. Is there anything that can be done about the volume (even at max volume in the HA entity, it's still quieter than before)?

torpid kayak
torpid kayak
# surreal venture Only if you have time, greatly appreciated that you want to use time to help, so...

for the arduino framework side:

so on the positive side: I can replicate the issue
on the negative side: I am not able to find an easy solution

I believe that the issue lies with ESPHome, and recent changes in HA to add support for other things have just highlighted it. the i2s_media_player component is not reporting a state change when the streamed audio is complete, therefore nothing else knows that playback has finished.

I have looked at the component code and can't see anything, but I do not claim to be good at reading and understanding the C.
it could also be an issue upstream in the framework, but that doesn't seem likely and it's beyond my ability to troubleshoot.
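
For readers following along, the arduino-framework setup being discussed looks roughly like the sketch below. This is only an illustration of the affected component, not the exact config from this thread: the pin numbers and the `i2s_out` id are placeholder assumptions, and `media_out` is reused from the snippet earlier in the thread.

```yaml
# Sketch of an arduino-framework I2S media player (the setup that gets stuck on "playing").
# All GPIO numbers below are hypothetical; use the pins your board is actually wired to.
esp32:
  board: esp32dev
  framework:
    type: arduino

i2s_audio:
  - id: i2s_out              # hypothetical id
    i2s_lrclk_pin: GPIO26    # hypothetical pin
    i2s_bclk_pin: GPIO27     # hypothetical pin

media_player:
  - platform: i2s_audio      # the media player talks to the I2S data pin directly
    id: media_out
    name: "Media Player"
    i2s_audio_id: i2s_out
    dac_type: external
    i2s_dout_pin: GPIO25     # hypothetical pin
    mode: mono
```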

realistically my recommendation still remains, switch to esp-idf

#

i need to convert my project over to esp-idf and want to rewire some stuff too so ill see if i get any issues when i do that.

surreal venture
# torpid kayak for the arduino framework side: so on the positive side: I can replicate the is...

Thanks for looking into this, much appreciated. Sounds like there's not really a solution right now. All I can do is route the TTS output to an external speaker, which isn't possible everywhere. Just so I clearly understand, do you have any working ESP32-based voice assistants which use the media player component with HA 2025.4 (might be a stupid question, considering all your suggestions, but I just wanted to check)?

torpid kayak
surreal venture
#

Great, that was all I wanted to check. I also have a Voice PE and another Respeaker ESP32-S3 board, which work fine. It's just all my other 'normal' ESP32s that have the problem.

torpid kayak
#

yeah, they use custom components anyway, with some code which has not been upstreamed yet

torpid kayak
#

so it turns out the old dev board I have is too shit for doing what i wanted to do.
I did get the voice assistant "working" with esp-idf but it struggled and sometimes crashed.
end of the line for my experimenting for now I think. if i was gunna get a better board for it then i would probably just go the respeaker direction instead

neat jacinth
torpid kayak
neat jacinth
surreal venture
#

Thanks for all the input. The only problem I personally have with the speaker component (unless I'm mistaken) is that you cannot control the volume of the speaker (via an entity in HA), which is why I was really happy using the media player component... But if the volume can be controlled using the speaker, please let me know!

torpid kayak
surreal venture
torpid kayak
neat jacinth
# surreal venture Thanks again Michael, but as you might remember (please read above), once I use ...

I am getting similar issues with my project. Short responses work most of the time, but I get lots of buffer errors in the logs, and sometimes the device crashes in a way that power cycling and reloading will not always fix. I do not get any of those errors with the old code; the player just will not stop after announcement playback.

Could someone please provide some very basic smart speaker example code that I could use as a baseline?

surreal venture
neat jacinth
blissful gate
#

Did you try using the code from the PE repository? That thing seems to be working.

surreal venture
blissful gate
surreal venture
surreal venture
#

So, after more testing, I'm realising that I can't get the voice assistant to work on my WROOM boards anymore if I use the esp-idf framework type (but the media player works). I've tried many times on two devices now, and it's the same issue. If I flash with Arduino, then it works, but if I flash with esp-idf, then the mic will not pick up the wake word. I even have push-to-talk set up on one of them, and that will activate the assistant, but it will not recognise any speech (when using the esp-idf code).

torpid kayak
surreal venture
surreal venture
torpid kayak
weak plaza
#

Hi, has anyone tested if the voice repetition still happens in 2025.4.2 when using the Arduino platform for the media player?

torpid kayak
muted oak
#

I've been having this exact issue since 2025.4, has anyone been able to get Onju voice working again?

surreal venture
pure raven
surreal venture
#

@pure raven Yes, this was the original suggestion, which I have tried, and tried again, and tried again. The problem is that if I use esp-idf, then I cannot use the media_player component (I want to use this because I want to control the volume of the speaker on the satellite). Then I heard about using the speaker media player component (the new combination of the speaker and media player components, introduced earlier this year). However, if I use this, then... Well, just read the above. For me, it doesn't work: the TTS output repeats every minute, and the media player is frozen on playing. The only way to stop it is to reset the ESP device... So right now, the only options are to use another device as the media player (annoying, as that means extra devices), or to just use the speaker component, which means I can't control the volume in HA.

torpid kayak
#

you don't point the media player directly at the pin like you do with arduino
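
A minimal sketch of what that means in practice, assuming the `speaker` media player platform on esp-idf: the I2S pin is declared on a `speaker`, and the media player then references that speaker by id. The pin numbers and the `i2s_out` and `spk` ids are placeholder assumptions, not taken from anyone's config in this thread; `media_out` matches the id used in the `voice_assistant` snippet above.

```yaml
# Sketch only: GPIO numbers and the i2s_out / spk ids are hypothetical.
esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf

i2s_audio:
  - id: i2s_out
    i2s_lrclk_pin: GPIO6     # hypothetical pin
    i2s_bclk_pin: GPIO7      # hypothetical pin

speaker:
  - platform: i2s_audio      # the pin lives on the speaker, not on the media player
    id: spk
    i2s_audio_id: i2s_out
    dac_type: external
    i2s_dout_pin: GPIO8      # hypothetical pin

media_player:
  - platform: speaker        # the media player points at the speaker by id
    id: media_out
    name: "Media Player"
    announcement_pipeline:
      speaker: spk
      format: FLAC
      sample_rate: 48000
      num_channels: 1
```

With this layout the `media_player: media_out` line in `voice_assistant` stays the same, and the volume should be controllable from the media player entity in HA.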

surreal venture
torpid kayak
#

i replicated it on arduino but when switched to esp-idf it worked fine? i ended up upgrading the devboard in my custom speaker and have been using it to mess with some AI stuff for a few days now actually

#

do you have a copy of your esp-idf code that doesn't work? i can have a look and see if i can spot any issues

lime atlas
surreal venture
surreal venture
torpid kayak
torpid kayak
torpid kayak
#

i ask because i did have some stuttering when i was using an older dev board, i moved to using a newer s3 devboard and it worked better

surreal venture
#

So*

torpid kayak
#

which then means i am using

```yaml
esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf
```
surreal venture
#

Ok thanks, in not using an S3, just a normal ESP-32S...

#

I'm*

torpid kayak
#

it could just be that the older boards are struggling to keep up with the newer esphome versions. i don't pretend to know exactly what the hardware differences/limits are and how they interact with newer versions of the components in esphome. but perhaps it's just a simple matter of upgrading to a newer board to solve your issue. which I know is a pain...

surreal venture
muted oak
#

sad to say, but I believe that's our only choice. Thanks for the feedback though

surreal venture
#

So... Just to give an update from my side. I just received an ESP32-S3. I used the code @torpid kayak provided near the start of this thread... And it works perfectly (no more freezing on the playing state, or repetition of TTS)! So my conclusion is that the ESP32 I was using was simply too old...

torpid kayak
weak plaza
#

Looks like this problem is not yet fixed in 2025.4.4.

Is HA dropping support for the media player i2s component in the Arduino framework?

blissful gate
#

I guess so. Esp-idf is the way to go now.