#General approach to fuzzing selections that form slots in custom intents?


normal spade
#

If I have a custom sentence that targets an intent for playing content on a media player (other scenarios also apply), is there a general approach for fuzzing the selection made by voice?

For example, I want to play "Wish you were here" by Pink Floyd, but in my library that album happens to be titled "Wish you were here [2011 - Remaster]" which is an impossible thing to expect STT to get right and is undesirable to attempt. I'm already using ChatGPT as a conversation agent, allowing it the ability to assist/control but with a preference for local handling. Even without the local preference, it remains impossible to get the content selected based on a partial match.

This occurs in other specific parts of my setup too, such as apps on a TV - I can easily get it to select the "YouTube" app by voice, but getting it to switch to "Disney+" is impossible because the STT delivers "Disney plus" as the input.

How do people typically deal with this problem?

vague geyser
#

not sure how your app intent is set up, but I have one similar to this and just made it so it maps "Disney+" or "Disney Plus" to the same app ID to launch. As for the music piece, I think that kinda comes down to the search engine on the music service's backend. The search engine that retrieves the song based on that input should be smart enough to match the track without the title being exact.
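
A minimal sketch of that kind of mapping, in plain Python rather than any particular Home Assistant config. The app IDs and the `resolve_app` helper are made up for illustration; the real IDs depend on the TV integration. The idea is to normalize whatever STT delivers before the lookup, so "Disney+" and "Disney plus" land on the same key.

```python
import re
from typing import Optional

# Hypothetical app IDs; real values depend on the TV integration in use.
APP_ALIASES = {
    "com.example.disneyplus": ["Disney+", "Disney Plus"],
    "com.example.youtube": ["YouTube", "You Tube"],
}


def normalize(text: str) -> str:
    """Lowercase, spell out '+', and collapse punctuation and whitespace."""
    text = text.lower().replace("+", " plus ")
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return " ".join(text.split())


# Build a lookup table from every normalized alias to its app ID.
LOOKUP = {
    normalize(alias): app_id
    for app_id, aliases in APP_ALIASES.items()
    for alias in aliases
}


def resolve_app(spoken: str) -> Optional[str]:
    """Return the app ID for a spoken app name, or None if it is unknown."""
    return LOOKUP.get(normalize(spoken))
```

With this, `resolve_app("Disney plus")` and `resolve_app("Disney+")` both resolve to the same (hypothetical) ID, and an unknown name returns `None` so the caller can fall back to something else.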

normal spade
#

mapping things is too brittle and too high maintenance 🙂 The same issue can apply to device names, locations, all kinds of things - hence wondering if there was a general solution to the class of problem (rather than a specific solution to a single instance).

I'd assumed at first that context would be passed to the conversation agent and it would help. I've tried playing with the prompt to ChatGPT to tell it explicitly to make best-guess matches on voice input, but this yielded little success.

vague geyser
#

I mean launching apps on your TV isn't a native HA function, hence I assumed you had something keeping track of the app IDs to instruct the TV what to launch

#

as for device names, that's what Aliases are for 😉

#

an LLM can usually figure out areas, I think, using its own fuzzy matching of sorts

#

or really just language understanding
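
For cases where no LLM is in the loop, a rough sketch of generic fuzzy matching using Python's standard `difflib`. The `best_match` helper, the 0.6 cutoff, and the rule of stripping bracketed suffixes are assumptions for illustration, not something Home Assistant provides. It picks the closest candidate (album title, app name, device name) for whatever STT produced.

```python
import difflib
import re
from typing import Iterable, Optional


def normalize(text: str) -> str:
    """Lowercase, drop bracketed suffixes, spell out '+', strip punctuation."""
    text = text.lower()
    text = re.sub(r"[\[\(].*?[\]\)]", " ", text)  # e.g. "[2011 - Remaster]"
    text = text.replace("+", " plus ")
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return " ".join(text.split())


def best_match(spoken: str, candidates: Iterable[str],
               cutoff: float = 0.6) -> Optional[str]:
    """Return the original candidate closest to the spoken text, or None."""
    # Map each normalized form back to the first original that produced it.
    by_normalized = {}
    for candidate in candidates:
        by_normalized.setdefault(normalize(candidate), candidate)
    hits = difflib.get_close_matches(
        normalize(spoken), list(by_normalized), n=1, cutoff=cutoff
    )
    return by_normalized[hits[0]] if hits else None


library = ["Wish You Were Here [2011 - Remaster]", "The Wall", "Animals"]
apps = ["YouTube", "Disney+", "Netflix"]
album = best_match("wish you were here", library)
app = best_match("Disney plus", apps)
```

The same function works for any list of names, which is the appeal over hand-maintained mappings; the trade-off is that a loose cutoff can pick the wrong item, so the threshold would need tuning per use case.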