#RAG categorizing user input before a search

1 messages · Page 1 of 1 (latest)

wicked hawk
#

Hi, I'm trying to do this kind of intention checking prompt for RAG on GPT-4o:
Upon receiving the user input, the bot will try to categorize the input into 3 categories: Ambiguous/ can be answered with conversation history/ require another VectorDB search. Then it will take action accordingly (e.g. Return an action field, in JSON format to be retrievable):

  • Ambiguous: return {"action": "clarify"}
  • Can be answered with conversation history: return {"action": "answer"}
  • Require a VectorDB search, it will also provide good standalone question for this search: return {"action": "search", "standalone_question": ""}
glacial currentBOT
#

Hi @wicked hawk. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

Based on the knowledge sources provided, it seems you're looking to implement a system that can categorize user input and decide whether to clarify the query, answer it using conversation history, or perform a search using VectorDB. While the specific implementation details for such a system are not provided in the sources, we can discuss the general approach using Retrieval Augmented Generation (RAG) and other techniques.

RAG is a process that retrieves relevant contextual information from a data source and passes that information to a large language model (LLM) alongside the user’s prompt. This information is used to improve the model’s output by augmenting the model’s base knowledge source (https://help.openai.com/en/articles/8868588-retrieval-augmented-generation-rag-and-semantic-search-for-gpts).

To implement your system, you might consider the following steps:

  1. Ambiguous: If the user's input is ambiguous, you could use a smaller, fine-tuned model to determine this. If the model determines the input is ambiguous, it could return {"action": "clarify"} source (https://platform.openai.com/docs/guides/latency-optimization/example).

  2. Can be answered with conversation history: If the user's input can be answered with the conversation history, you could use the LLM to generate a response based on the conversation history and return {"action": "answer"}.

  3. Require a VectorDB search: If the user's input requires a VectorDB search, you could use RAG to retrieve relevant information from your database. The standalone question for the search could be the user's original input or a reformulation of it. The system could then return {"action": "search", "standalone_question": "<user's question>"} source (https://platform.openai.com/docs/guides/optimizing-llm-accuracy).

#

Remember, the success of this system heavily depends on the accuracy of your categorization model and the effectiveness of your retrieval system. You might need to optimize your retrieval system to return the right results, include less noise, and provide more information in each retrieved result source (https://platform.openai.com/docs/guides/optimizing-llm-accuracy).

Please note that this is a high-level approach and the specific implementation details might vary based on your exact requirements and the tools you're using.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@wicked hawk kapa.ai is still learning and improving, please let us know how it did by reacting below

wicked hawk
#

Can we do all the 3 categorization tasks in a single call to OpenAI GPT-4o? If yes, can you give an example system prompt to start with? @glacial current

glacial currentBOT
#

Let me check the knowledge base for you. One moment! :books:

#

Yes, it's possible to use GPT-4o to perform all three categorization tasks in a single call. GPT-4o is a powerful model that can handle complex tasks involving both text and image inputs, and generate text outputs source (https://cookbook.openai.com/examples/gpt4o/introduction_to_gpt4o).

However, please note that the specific implementation of such a system would depend on the details of your application and the nature of the user inputs. Here's a general example of how you might structure your system prompt:

system_prompt = """
You are an AI assistant that categorizes user inputs into three categories: 'Ambiguous', 'Can be answered with conversation history', and 'Requires a VectorDB search'. 

For each category, you should take the following actions:
- 'Ambiguous': Return {"action": "clarify"}
- 'Can be answered with conversation history': Return {"action": "answer"}
- 'Requires a VectorDB search': Return {"action": "search", "standalone_question": "<user's question>"}

Given the following user input, categorize it and take the appropriate action:

User: "<user's input>"
"""

In this prompt, you would replace <user's input> with the actual input from the user. The model would then generate a response based on the instructions provided in the prompt.

Remember, this is a high-level example and the specific implementation might vary based on your exact requirements and the tools you're using. You might need to fine-tune the model or adjust the prompt to get the best results for your specific use case.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon: