#slackAskBot (open-source)

torpid wigeon
#

Use GPT-3 to do semantic search over exported Slack messages to find answers to questions about previously discussed topics

slackAskBot is a project that uses natural language processing (NLP) to search through a given dataset for messages that are similar to a given search string. It uses OpenAI's text-embedding-ada-002 engine to generate embeddings for the search string and the messages in the dataset, and then uses cosine similarity to find the most similar messages. It then prints out the top n results, and uses OpenAI's text-davinci-003 engine to generate a summary of the context and answer the question.
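To make the ranking step concrete, here is a minimal sketch of a cosine-similarity top-n search. It is not the repo's code: the function names and the toy 3-dimensional vectors are made up for illustration, and real embeddings from text-embedding-ada-002 would be much longer.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def top_n(query_embedding, messages, n=3):
    """Return the n (score, text) pairs most similar to the query.

    messages is a list of (text, embedding) pairs.
    """
    scored = [
        (cosine_similarity(query_embedding, emb), text)
        for text, emb in messages
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]

# Toy vectors standing in for real embeddings (hypothetical data).
messages = [
    ("Deploys happen every Tuesday", [0.9, 0.1, 0.0]),
    ("Lunch order is due by noon", [0.0, 0.2, 0.9]),
    ("Rollback steps are in the wiki", [0.7, 0.6, 0.1]),
]
query = [1.0, 0.2, 0.0]

results = top_n(query, messages, n=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

With these toy vectors, the two deployment-related messages rank above the lunch message, which is the behavior the search relies on.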

https://github.com/scottleibrand/slackAskBot

It does a good job of answering questions about topics that were previously discussed in the exported Slack history.

If you're a Slack workspace or org Owner/Admin, you can export your Slack history by following the steps at https://slack.com/help/articles/201658943-Export-your-workspace-data
If you're not a Slack admin, you can ask an admin to export only your Slack history.

Once you've got a Slack export, the repo's scripts handle all the necessary processing: they extract just the Slack messages, then generate and store (as files) the embeddings needed to find the content most relevant to a search inquiry.

Once you've done all the initial processing on your Slack export, you can simply run search.py everything.json "your topical inquiry or question" and it will find the most relevant results, and ask InstructGPT to use them to summarize the context most relevant to your inquiry and answer any explicit question you asked.
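As a rough illustration of the summarize-and-answer step, the sketch below assembles retrieved messages and the question into a single completion prompt. The prompt wording, the `build_prompt` name, and the character budget are all assumptions for illustration, not what search.py actually does, and the call to the model itself is left out.

```python
def build_prompt(question, snippets, max_chars=6000):
    """Assemble one completion prompt from retrieved messages.

    snippets: message texts, most relevant first. Later snippets are
    dropped once the context would exceed max_chars (a crude stand-in
    for a token budget).
    """
    context_lines = []
    used = 0
    for text in snippets:
        line = f"- {text}"
        if used + len(line) > max_chars:
            break
        context_lines.append(line)
        used += len(line) + 1  # +1 for the newline joining the lines
    context = "\n".join(context_lines)
    return (
        "Here are Slack messages that may be relevant:\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Summarize the relevant context and answer the question. "
        "If the messages do not contain the answer, say so.\n"
        "Answer:"
    )

prompt = build_prompt(
    "When do we deploy?",
    ["Deploys happen every Tuesday", "Rollback steps are in the wiki"],
)
print(prompt)
```

Putting the most relevant messages first means that, if the budget runs out, it is the weakest matches that get dropped.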

Future work:

  • Make a Slack bot that runs search.py
  • Orchestrate all the above scripts to run all of the necessary preprocessing on a Slack export
  • Figure out how to extend this to Discord

wispy idol
#

Did someone test this out?

#

Curious about the quality of the answers after the embedding step

#

Using davinci

sudden raft
#

Hi, I have extensive experience with Discord bot development.
I'd like to work on your project.

torpid wigeon
# wispy idol Curious about the quality of the answers after the embedding step

My tests so far have been on 30 days’ worth of Slack exports. It did a good job of finding context and answering questions about things discussed in that export, and it mostly, but not completely, avoided pulling in unrelated context and making up incorrect answers when the topic wasn’t covered. For further public evaluation I’d probably need to do an export on a Slack channel for an open-source project I’m admin for, so I can actually share the results. Or if you have access to a Slack export, you can try it.

torpid wigeon
# sudden raft Hi I have rich experience on discord bot development I want to work on your proj...

Go for it! First step is probably to get a history export from a Discord server. That seems to require admin perms. Then you can modify the initial processing scripts to extract the same data from the Discord exports as we do from the Slack ones, and the rest of the scripts should work with minimal modification. Once you have Discord export search working on the command line, it would probably make sense to think about a bot that can do such searches on behalf of users.

wispy idol
sudden raft
#

Hi scottleibrand

#

I understand your requirements. What are your deadline and budget for this project?

wind garden
torpid wigeon
#

Tens of cents IIRC

#

Embeddings are super cheap. DaVinci costs a few cents per query. Typical query is about 1000-3000 tokens IIRC, so you can do the math on that.
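Doing that math as a quick sketch: the per-token rate below is an assumption (roughly what a davinci-class completion model cost at the time), so plug in current pricing before relying on the result.

```python
# Assumed USD price per 1,000 tokens for the completion model.
PRICE_PER_1K_TOKENS = 0.02

def query_cost(tokens, price_per_1k=PRICE_PER_1K_TOKENS):
    """Estimated cost in USD of one query using `tokens` tokens."""
    return tokens / 1000 * price_per_1k

low = query_cost(1000)   # small query
high = query_cost(3000)  # large query
print(f"${low:.2f} to ${high:.2f} per query")
```

At that assumed rate, a 1000-3000 token query works out to a few cents, which matches the estimate above.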

wind garden
#

Awesome.

torpid wigeon
#

Obviously if cost were a concern you could do additional tricks to semantic search more narrowly to get specific context and only provide that to DaVinci (or just return unsummarized search results to the user).

wind garden
#

Yeah, absolutely. I just wanted an initial estimate, because the 30d Slack export seemed like a hefty amount of data and makes a good reference point.

broken oracle
#

Hi @torpid wigeon, this sounds interesting. Do you think I could use this to create embeddings for ebooks or web pages? Based on your code, if I replace process_slack_export.py with something like read_ebooks_export.py 😜, should this work?

torpid wigeon
#

Not documented yet, but I got it working as a Slack bot, using DaVinci to come up with good search terms, searching for those with Slack search, semantic searching over the “most relevant” results, and then using DaVinci to answer the question. It supports both DM and @ mentions with direct questions. https://github.com/scottleibrand/slackAskBot/blob/main/slackAskBot.py
