#What about it seems complicated?


dense wing
#

How do I know that the second "read" should be pronounced "reed", so that I can include the correct SSML to fix the pronunciation?

uncut pecan
#

In this case, the teacher who sets the text would have to listen to the audio, flag that it sounds wrong, and say what the correct pronunciation would be.

I'm also kinda wondering if we could feed the phrase (and possibly the audio) into Gemini or another LLM to see if it detects an error and how to correct it.

dense wing
#

Teachers will hate me for that, but yes, I can force another LLM generation for the problematic phrase once it is flagged as bad.

uncut pecan
#

Hate you for which part? Making them check? And is that worse than pronouncing it wrong?

dense wing
#

Making them check. I will need to research a little more. This will be a tough sell.

Thanks for your help.

uncut pecan
#

good luck!

dense wing
#

I found an interesting thing. If you change the period to a comma so the two sentences become one phrase, like "I love to read, I read every night before I go to bed.", the second "read" is pronounced correctly.

dense wing
#

Hi @uncut pecan, just an update on the solution. Several models, not only Google's, show the same incorrect behavior.
The solution I implemented: if the text contains the word "read", I send the sentence to an LLM and ask it to rewrite the sentence using SSML. With the SSML I was able to generate the correct audio.
I followed your suggestion; there was no need to send the audio. The prompt had to force the LLM to review and analyze the sentence first, and only then write the SSML.
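
For anyone following along, here is a minimal sketch of that flow in Python. It is not my exact code: the function names, the prompt wording, and the IPA strings are assumptions for illustration, and the LLM call itself is left out. It shows the "read" check, a prompt that asks for analysis before SSML, and the kind of `<phoneme>` markup the LLM is expected to return. `<phoneme>` is a standard SSML element, but how well it is supported depends on the TTS engine.

```python
import re
from xml.sax.saxutils import escape

# Assumed IPA transcriptions for the two pronunciations of "read".
READ_IPA = {"present": "riːd", "past": "rɛd"}

READ_PATTERN = re.compile(r"\bread\b", flags=re.IGNORECASE)


def needs_review(sentence: str) -> bool:
    """Return True if the sentence contains the standalone word "read"."""
    return READ_PATTERN.search(sentence) is not None


def build_prompt(sentence: str) -> str:
    """Build a hypothetical prompt that forces analysis before SSML output."""
    return (
        "Review the sentence below. For each occurrence of the word 'read', "
        "first analyze whether it is present tense (sounds like 'reed') or "
        "past tense (sounds like 'red'). Only after that analysis, rewrite "
        "the sentence as SSML using <phoneme> tags.\n\n"
        f"Sentence: {sentence}"
    )


def tag_read(sentence: str, tenses: list[str]) -> str:
    """Wrap each "read" in an SSML <phoneme> tag.

    `tenses` holds "present" or "past" for each occurrence, in order.
    """
    count = len(READ_PATTERN.findall(sentence))
    if count != len(tenses):
        raise ValueError(f"expected {count} tenses, got {len(tenses)}")

    remaining = iter(tenses)

    def wrap(match: re.Match) -> str:
        ipa = READ_IPA[next(remaining)]
        return f'<phoneme alphabet="ipa" ph="{ipa}">{match.group(0)}</phoneme>'

    # Escape XML special characters first, then insert the tags.
    body = READ_PATTERN.sub(wrap, escape(sentence))
    return f"<speak>{body}</speak>"


sentence = "I love to read. I read every night before I go to bed."
if needs_review(sentence):
    print(build_prompt(sentence))
    print(tag_read(sentence, ["present", "present"]))
```

In practice the LLM decides the tense of each occurrence. `tag_read` only shows the target markup, and it could also serve as a fallback if the LLM returns the tenses instead of finished SSML.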

uncut pecan
#

Nod. Good approach!
There are other words besides "read" that might trigger that, tho none leap to mind at the moment. But that sounds like a good first stab at a solution.
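
If it helps, the "read" check could be widened to a small word list. This is only a sketch: the list is a few well-known English heteronyms picked for illustration, nowhere near complete, and the function name is made up.

```python
import re

# Illustrative, incomplete list of English heteronyms
# (same spelling, different pronunciation depending on meaning).
HETERONYMS = {"read", "lead", "live", "wind", "tear", "bow", "close", "present"}


def find_heteronyms(text: str) -> list[str]:
    """Return the listed heteronyms that appear in the text, sorted."""
    words = re.findall(r"[a-z']+", text.lower())
    return sorted({word for word in words if word in HETERONYMS})


print(find_heteronyms("I read that the wind will tear the sail."))
```

It only matches exact word forms, so inflected forms such as "reads" or "winds" are not caught.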

uncut pecan
#

@dense wing - Unless you object, I'd like to discuss this issue on a future episode of my podcast. (I can cite you if you wish, or not mention you, or whatever in between. Whatever you're comfortable with.)