silero_tts repeating voice clip bug | Text Generation WebUI

Having a weird bug whenever I use 'start reply with': it causes the speech file to replay whenever a new response comes in.
Contrived Example:
AI reply 0:
"WOW, look at all this gold!"
Me:
"Shh, we need to keep quiet" I whisper.

AI reply 1:
speaks in a whisper "Oh, right, of course, we don't want to wake the dragon."
Me:
I whisper: "Let's just grab as much gold as we can and get the heck out of here."
AI reply 2:
speaks in a whisper "That sounds like a good idea."

So with the above conversation, AI reply 0 works perfectly. Then I add the 'start reply with' text (here, speaks in a whisper).

When AI reply 1 starts generating it repeats AI reply 0, and then plays AI reply 1 when it's ready.

Then when AI reply 2 is generating it replays AI reply 1 then plays AI reply 2 when it's ready.

Then when AI reply 3 starts generating, it replays AI reply 1 and AI reply 2, both overlapping each other, and then plays AI reply 3 when it's ready.
Then even if I turn 'start reply with' off, it will always replay AI reply 1 and AI reply 2 whenever a new reply is generated.

Things that don't fix it:
Reloading UI
Changing characters and going back again
Completely shutting down the WebUI CMD and re-launching it
Changing the model
Turning Silero off and back on.
Turning my PC off and back on.

No matter what I do, that conversation replays those messages forever.
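
Since it survives restarts, my guess is that the replay is being carried by the saved conversation itself. Here is a minimal sketch for checking that. It assumes (not confirmed) that the extension embeds each clip in the message text as an HTML <audio> tag with an autoplay attribute, and that the saved chat log can be loaded as nested lists/dicts of strings. The function names are hypothetical, not part of the WebUI or the extension.

```python
import re

# Assumed markup: an opening <audio ...> tag embedded in the message text.
AUDIO_TAG = re.compile(r"<audio\b[^>]*>", re.IGNORECASE)
# A standalone autoplay attribute, with or without an ="..." value.
AUTOPLAY_ATTR = re.compile(
    r"\s+autoplay\b(?:\s*=\s*(?:\"[^\"]*\"|'[^']*'|[^\s>]+))?",
    re.IGNORECASE,
)


def strip_autoplay(text):
    """Return (cleaned_text, n): n is how many audio tags had autoplay removed."""
    count = 0

    def clean_tag(match):
        nonlocal count
        tag = match.group(0)
        cleaned = AUTOPLAY_ATTR.sub("", tag)
        if cleaned != tag:
            count += 1
        return cleaned

    return AUDIO_TAG.sub(clean_tag, text), count


def strip_autoplay_everywhere(node):
    """Walk nested lists/dicts (e.g. a loaded JSON chat log) and clean every string."""
    if isinstance(node, str):
        return strip_autoplay(node)
    if isinstance(node, list):
        out, total = [], 0
        for item in node:
            cleaned, n = strip_autoplay_everywhere(item)
            out.append(cleaned)
            total += n
        return out, total
    if isinstance(node, dict):
        out, total = {}, 0
        for key, value in node.items():
            cleaned, n = strip_autoplay_everywhere(value)
            out[key] = cleaned
            total += n
        return out, total
    # Numbers, booleans, None: nothing to clean.
    return node, 0
```

If the count comes back non-zero for old replies, that would at least be consistent with what I'm hearing. Back up the log before writing anything cleaned back to disk.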

I have recreated the problem from scratch using different models and different characters, so it's not model-specific.

Any idea why it's happening or how to stop it? (Other than avoiding using Start reply with obviously.)

Windows 11
RTX 3090
64 GB RAM
Using WebUI with whisper_stt and silero_tts
Model 1: reeducator_bluemoonrp-13b
llama.cpp
amd
Model 2: TheBloke_Pygmalion-13B-SuperHOT-8K-GPTQ
ExLlama
