If I have a custom sentence that targets an intent for playing content on a media player (other scenarios apply too), is there a general approach for fuzzy-matching the selection made by voice?
For example, I want to play "Wish you were here" by Pink Floyd, but in my library that album happens to be titled "Wish you were here [2011 - Remaster]". STT can't reasonably be expected to produce that exact title, and I wouldn't want to have to say it anyway. I'm already using ChatGPT as a conversation agent, with permission to assist/control but with a preference for local handling. Even with the local preference turned off, I can't get the content selected from a partial match.
This occurs in other parts of my setup too, such as apps on a TV: I can easily get it to select the "YouTube" app by voice, but getting it to switch to "Disney+" is impossible because the STT delivers "Disney plus" as the input.
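To make the question concrete, here is a rough sketch of the kind of matching I have in mind, in plain Python using the standard library's difflib. It is not Home Assistant code: `normalize` and `best_match` are hypothetical helper names, and the 0.6 cutoff is an arbitrary assumption that would need tuning.

```python
import difflib
import re
from typing import List, Optional


def normalize(name: str) -> str:
    """Reduce a library title or a spoken phrase to a comparable form."""
    name = name.lower()
    # Drop bracketed suffixes such as "[2011 - Remaster]" or "(Deluxe)".
    name = re.sub(r"[\[(].*?[\])]", " ", name)
    # Spell out "+" the way STT tends to transcribe it.
    name = name.replace("+", " plus ")
    # Replace remaining punctuation with spaces, then collapse whitespace.
    name = re.sub(r"[^a-z0-9 ]", " ", name)
    return " ".join(name.split())


def best_match(spoken: str, candidates: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the candidate closest to the spoken text, or None if nothing is close."""
    # Map normalized names back to the original titles.
    by_norm = {normalize(c): c for c in candidates}
    hits = difflib.get_close_matches(normalize(spoken), list(by_norm), n=1, cutoff=cutoff)
    return by_norm[hits[0]] if hits else None


albums = ["Wish you were here [2011 - Remaster]", "The Wall", "Animals"]
apps = ["YouTube", "Disney+", "Netflix"]

print(best_match("Wish you were here", albums))
print(best_match("Disney plus", apps))
```

With something like this, the spoken text would only need to be close to the normalized name rather than identical to the stored one.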
How do people typically deal with this problem?