#What about it seems complicated?
How do I know that the second "read" should be pronounced "reed", so that I can include the correct SSML to fix the pronunciation?
In this case, the teacher who sets the text would have to listen to the audio, flag that it sounds wrong, and say what the correct pronunciation would be.
I'm also kinda wondering if we could feed the phrase (and possibly the audio) into Gemini or another LLM to see if it detects an error and how to correct it.
Teachers will hate me for that, but yes, I can force another LLM generation for the problematic phrase once it is flagged bad.
Hate you for which part? Making them check? And is that worse than pronouncing it wrong?
Making them check. I will need to research a little more. This will be a tough sell.
Thanks for your help.
good luck!
I found an interesting thing. If you change the period to a comma so the phrase becomes one sentence, like "I love to read, I read every night before I go to bed.", the second "read" is pronounced correctly.
Hi @uncut pecan, just an update on the solution. Several models, not only Google's, show the same incorrect behavior.
The solution I implemented: if the text contains the word "read", I send the sentence to an LLM and ask it to rewrite it using SSML. With the SSML, I was able to generate the correct audio.
I followed your suggestion; there was no need to send the audio. The prompt had to force the LLM to review and analyze the sentence first, and only then write the SSML.
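For anyone reading later, the flow above might be sketched roughly like this in Python. This is only an illustration under assumptions, not the code actually used: `ask_llm` is a hypothetical callable standing in for whatever LLM client you have, the prompt wording is made up, and the sketch assumes the model returns its SSML inside a `<speak>` element after its analysis.

```python
import re
from typing import Callable

# Hypothetical prompt: asks the model to analyze first, then emit SSML.
PROMPT_TEMPLATE = (
    "Review the sentence below. First analyze how each occurrence of the "
    "word 'read' should be pronounced in context ('reed' or 'red'). Then "
    "rewrite the sentence as SSML wrapped in <speak>...</speak> so that each "
    "occurrence is spoken correctly.\n\nSentence: {sentence}"
)

# Whole-word match, so "already" or "bread" do not trigger the LLM call.
READ_PATTERN = re.compile(r"\bread\b", re.IGNORECASE)


def needs_ssml(text: str) -> bool:
    """Return True if the text contains the standalone word 'read'."""
    return READ_PATTERN.search(text) is not None


def to_tts_input(text: str, ask_llm: Callable[[str], str]) -> str:
    """Return SSML from the LLM when 'read' is present, else the plain text.

    ask_llm is an assumed helper: it takes a prompt and returns the reply.
    """
    if not needs_ssml(text):
        return text
    reply = ask_llm(PROMPT_TEMPLATE.format(sentence=text))
    # The reply may contain the model's analysis; keep only the SSML part.
    match = re.search(r"<speak>.*?</speak>", reply, re.DOTALL)
    # Fall back to the plain text if no SSML block came back.
    return match.group(0) if match else text
```

The returned string would then go to the TTS engine as SSML input instead of plain text. Falling back to the original text when no `<speak>` block is found keeps audio generation from failing on a bad LLM reply, at the cost of possibly keeping the wrong pronunciation.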
Nod. Good approach!
There are other words besides "read" that might trigger that, tho none leap to mind at the moment. But that sounds like a good first stab at a solution.
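A rough sketch of how the trigger could be widened beyond "read", assuming you keep your own word list. The words below are just common English heteronyms picked for illustration (they were not named in this thread), and the list is nowhere near complete.

```python
import re

# Illustrative, non-exhaustive list of heteronyms (an assumption for this sketch).
HETERONYMS = {"read", "lead", "live", "wind", "tear", "bow", "close", "minute"}

# Split text into words made of letters and apostrophes.
WORD_RE = re.compile(r"[A-Za-z']+")


def find_heteronyms(text: str) -> list[str]:
    """Return the listed heteronyms found in text, lowercased, in order."""
    return [w.lower() for w in WORD_RE.findall(text) if w.lower() in HETERONYMS]
```

A non-empty result would be the signal to send the sentence through the LLM rewrite step; an empty one means the plain text can go straight to TTS.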
@dense wing - Unless you object, I'd like to discuss this issue on a future episode of my podcast. (I can cite you if you wish, or not mention you, or whatever in between. Whatever you're comfortable with.)