I'm making a bot to query Javadocs. I've already parsed and formatted the messages the way I want, but I'm unsure whether my caching strategy is the best one. Maybe someone can offer some insight:
Discord requirements
- Autocompletions must be answered ASAP, as Discord only allows 3 seconds to respond to them. Ideally they should already be in memory.
- Autocompletion values (what is actually sent when selecting an autocompletion, not what is shown) must be at most 100 characters long.
- The actual messages can take longer to be sent.
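As a minimal sketch of enforcing these limits before answering an autocomplete interaction, something like the following could work. The `Choice` shape and the `toValidChoices` helper are hypothetical names, and the cap of 25 choices per response is my understanding of the Discord documentation rather than something stated above, so it is worth double-checking.

```typescript
// Limits as I understand them from the Discord docs (assumption: verify there).
const MAX_CHOICES = 25; // choices allowed in one autocomplete response
const MAX_VALUE_LENGTH = 100; // characters allowed in a string choice value

interface Choice {
  name: string; // what the user sees in the autocomplete list
  value: string; // what is actually sent when the choice is selected
}

// Hypothetical helper: drops choices with an empty or overlong value,
// then caps the list at the maximum number of choices.
function toValidChoices(candidates: Choice[]): Choice[] {
  return candidates
    .filter((c) => c.value.length > 0 && c.value.length <= MAX_VALUE_LENGTH)
    .slice(0, MAX_CHOICES);
}
```

Because this is a pure in-memory function, it adds practically nothing to the 3-second budget.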
My approach
- After parsing, separate into two types of entities: objects (classes, interfaces, enum classes, annotations) and members (fields, methods, enum values, annotation elements).
- For each entity, create an ID using its name and an appended short ID, to avoid possible conflicts. For instance: JavaPlugin-LZbCb.
- The whole Javadoc cache has a folder inside a __cache__ folder, for instance, __cache__/spigot-1.20.1. The files mentioned later will always be under this folder.
- Save autocompletions in a data.json with this schema:
{
  d: <timestamp>,
  // Holds objects
  o: {
    <ID>: <Autocomplete Message>
  },
  // Holds members
  m: {
    <ID>: <Autocomplete Message>
  }
}
Objects and members are kept apart in the schema so they can be searched differently. For instance, the bot can query members only if the user has typed a . or #.
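The ID scheme and the object/member split could be sketched roughly as follows. Everything here is illustrative: `makeId` and `search` are hypothetical names, the random 5-character suffix only mirrors the look of JavaPlugin-LZbCb, and the plain substring match assumes the autocomplete messages contain the text the user types.

```typescript
// Shape of data.json as described in the schema.
interface AutocompleteData {
  d: number; // timestamp
  o: Record<string, string>; // object ID -> autocomplete message
  m: Record<string, string>; // member ID -> autocomplete message
}

interface Choice {
  name: string;
  value: string;
}

const ID_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

// Hypothetical ID builder: name plus "-" and a random 5-character suffix,
// retried on collision. The name is cut so the ID never exceeds 100 characters.
function makeId(name: string, taken: Set<string>): string {
  for (;;) {
    let suffix = "";
    for (let i = 0; i < 5; i++) {
      suffix += ID_ALPHABET.charAt(Math.floor(Math.random() * ID_ALPHABET.length));
    }
    const id = name.slice(0, 94) + "-" + suffix;
    if (!taken.has(id)) {
      taken.add(id);
      return id;
    }
  }
}

// Hypothetical search: a '.' or '#' in the input switches to the member index.
function search(data: AutocompleteData, input: string, limit = 25): Choice[] {
  const wantsMembers = input.indexOf(".") !== -1 || input.indexOf("#") !== -1;
  const index = wantsMembers ? data.m : data.o;
  const needle = input.toLowerCase();
  const results: Choice[] = [];
  for (const id of Object.keys(index)) {
    const message = index[id];
    if (message === undefined) continue;
    if (message.toLowerCase().indexOf(needle) !== -1) {
      results.push({ name: message, value: id }); // the ID is the value sent back
      if (results.length >= limit) break;
    }
  }
  return results;
}
```

A linear scan like this is the simplest option; whether it is fast enough for the full index is something I would measure rather than assume.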
- Save actual messages in objects and members folders, for instance: objects/JavaPlugin-LZbCb.txt, using a promise queue of fixed size.
- At runtime, keep autocompletions (each data.json) in memory, but only load full messages when queried. I can also keep an in-memory LRU cache for frequent queries.
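The lazy loading with an LRU cache in front of it could look something like this. It is only a sketch: `LruCache` and `cachedLoader` are hypothetical names, and the `load` function stands in for whatever actually reads objects/<ID>.txt or members/<ID>.txt from disk.

```typescript
// Minimal LRU cache relying on Map's insertion order (oldest key first).
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

type Loader = (id: string) => Promise<string>;

// Wraps a loader (assumed to read the message file for an ID) with the cache,
// so only cache misses touch the disk.
function cachedLoader(load: Loader, capacity: number): Loader {
  const cache = new LruCache<string>(capacity);
  return async (id) => {
    const hit = cache.get(id);
    if (hit !== undefined) return hit;
    const text = await load(id);
    cache.set(id, text);
    return text;
  };
}
```

Two simultaneous misses for the same ID would both hit the disk in this version; caching the pending promise instead of the text would avoid that, at the cost of a little more bookkeeping.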
A remark after testing with Spigot 1.21.3: while the data.json is surprisingly small (<7 MB), there's a massive number of files (>65.7k), though their total size isn't that big (30 MB).
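Writing that many files is where the fixed-size promise queue matters. For reference, a minimal sketch of such a queue, assuming "fixed size" means a cap on how many writes run at once; `createQueue` and `enqueue` are hypothetical names, not taken from any library.

```typescript
// Hypothetical fixed-size promise queue: at most `limit` tasks run at once,
// the rest wait in FIFO order.
function createQueue(limit: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  // Starts the next waiting task if a slot is free.
  const next = (): void => {
    if (active >= limit) return;
    const start = waiting.shift();
    if (start) start();
  };

  return function enqueue<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      waiting.push(() => {
        active++;
        // Promise.resolve().then(task) turns a synchronous throw into a rejection,
        // so the slot is always released afterwards.
        Promise.resolve()
          .then(task)
          .then(resolve, reject)
          .then(() => {
            active--;
            next();
          });
      });
      next();
    });
  };
}
```

Each file write would then be wrapped as `enqueue(() => writeFile(path, text))`, with `writeFile` coming from `node:fs/promises`, and the results collected with `Promise.all`.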