#Regarding the Global Search

tired thistle

Problem Statement

The existing emote search algorithm, while effective for finding emotes with exact name matches, suffers from a significant drawback: it prioritizes exact text matches within the default_name field over the overall popularity and prevalence of an emote. This can lead to counter-intuitive search results where rarely used emotes with exact name matches outrank popular and widely used emotes that have partial matches or are relevant through tags.

Examples of the Problem:
https://discord.com/channels/817075418054000661/1317190353295507519

My Proposal

I propose a somewhat complex, maybe janky, solution in three parts:

  1. A Prevalence Metric (channel_count): Adding a new field to the emote data model to track the number of channels in which a particular emote is used
  2. Modifying the current Sorting Algorithm: Adjusting the sorting formula to incorporate the emote's prevalence alongside text relevance and overall popularity
  3. Implementing Sort Mode Options: Providing users with the ability to choose between different sorting modes, allowing them to prioritize exact matches or popularity as needed

Walkthrough

1) channel_count field

The objective of this is to obtain a quantifiable measure of how widely an emote is used.

#[derive(Debug, Clone, Default, serde::Deserialize, serde::Serialize, TypesenseCollection)]
#[typesense(collection_name = "emotes")]
#[serde(deny_unknown_fields)]
pub struct Emote {
    // ... other fields ...
    pub channel_count: i32,
}

The channel_count field should be updated in response to the following events:

  • When a user adds an emote to their channel's emote set.
  • When a user removes an emote from their channel's emote set.
  • Upon the creation of a new emote, its channel_count should be initialized to zero.
  • When an emote is deleted, its channel_count becomes irrelevant. The record might be deleted or marked as deleted.
    ... maybe something else?
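
To make the update rules above concrete, here is a rough sketch of how the four events could drive the count. Everything in it is an assumption for illustration: the `EmoteEvent` enum, the `ChannelCounts` store, and the string IDs are hypothetical names, and a plain in-memory map stands in for wherever the real `channel_count` would be persisted. It also tracks the set of channels per emote rather than a bare counter, so a duplicated add or remove event cannot skew the number; a simple increment/decrement would work too if events are guaranteed to arrive exactly once.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical event type; the proposal does not name the real events.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EmoteEvent {
    Created { emote_id: String },
    AddedToChannel { emote_id: String, channel_id: String },
    RemovedFromChannel { emote_id: String, channel_id: String },
    Deleted { emote_id: String },
}

/// Hypothetical in-memory stand-in for the store behind `channel_count`.
#[derive(Debug, Default)]
pub struct ChannelCounts {
    // emote id -> ids of channels whose emote set contains the emote
    channels: HashMap<String, HashSet<String>>,
}

impl ChannelCounts {
    /// Applies one event to the tracked state.
    pub fn apply(&mut self, event: EmoteEvent) {
        match event {
            // A new emote starts with no channels, i.e. a count of zero.
            EmoteEvent::Created { emote_id } => {
                self.channels.entry(emote_id).or_default();
            }
            // Inserting into a set makes a repeated add a no-op.
            EmoteEvent::AddedToChannel { emote_id, channel_id } => {
                self.channels.entry(emote_id).or_default().insert(channel_id);
            }
            // Removing a channel that was never added changes nothing.
            EmoteEvent::RemovedFromChannel { emote_id, channel_id } => {
                if let Some(set) = self.channels.get_mut(&emote_id) {
                    set.remove(&channel_id);
                }
            }
            // This sketch drops the record; marking it as deleted is the other option.
            EmoteEvent::Deleted { emote_id } => {
                self.channels.remove(&emote_id);
            }
        }
    }

    /// Returns the value that would be written to `Emote::channel_count`,
    /// or `None` if the emote is unknown or has been deleted.
    pub fn channel_count(&self, emote_id: &str) -> Option<i32> {
        self.channels.get(emote_id).map(|set| set.len() as i32)
    }
}
```

After each `apply`, the value from `channel_count` would be copied into the emote's search document. Whether that write happens per event or in periodic batches is left open here.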