#When to ask the model for recommended questions

1 messages · Page 1 of 1 (latest)

lucid skiff
#

I have a working RAG. I am using GPT-4o for the following 2 tasks:

  • Refine the queries to something better for retrieve the documents
  • Compose the response using the document, without using any GPT knowledge.

Now I want to add another step for recommending the next question based on the past conversation. This will also be done by using GPT-4o. When should this step happen? Running this right after compose concurrently can be time consuming, is there any workaround to this?

noble pulsarBOT
#

Hi @lucid skiff. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

Based on the knowledge sources provided, there isn't specific information about when to ask the model for recommended questions in the context of a Retriever-Augmented Generation (RAG) setup or using GPT-4o for refining queries, composing responses, and recommending the next question.

However, the sources do provide a general principle that could be applied to optimize the latency of your system: "Make fewer requests". Each time you make a request, you incur some round-trip latency. If you have sequential steps for the model to perform, consider putting them in a single prompt and getting them all in a single response. This could potentially reduce the additional round-trip latency and also reduce the complexity of processing multiple responses.

In your case, you could consider combining the tasks of refining queries, composing responses, and recommending the next question into a single request to the model, if possible. This could be done by collecting your steps in an enumerated list in the combined prompt, and then requesting the model to return the results in named fields in a JSON. This way you can easily parse out and reference each result.

However, the specifics of how to implement this would depend on the details of your setup and the capabilities of the model you are using. You may need to experiment with different approaches to find the one that works best for your needs.

For more detailed guidance, you may want to refer to the OpenAI platform documentation (https://platform.openai.com/docs/guides/latency-optimization/).

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@lucid skiff Kapa.ai is still learning and improving, please let me know how I did by reacting below.

lucid skiff
#

@noble pulsar can you recommend a prompt to this task (recommend related questions)

noble pulsarBOT
#

Let me check the knowledge base for you again. One moment! :books:

#

Based on the knowledge sources provided, there isn't a specific example of a prompt for recommending related questions. However, you can construct a prompt based on the principles discussed in the sources. Here's a general example:

{
  "documents": ["[retrieved documents]"],
  "conversation": ["[previous conversation history]"],
  "task": "Given the previous conversation and the retrieved documents, generate a relevant follow-up question."
}

In this prompt, you would replace [retrieved documents] with the documents retrieved by your RAG system and [previous conversation history] with the history of the conversation up to this point. The model would then generate a follow-up question based on this information.

Please note that this is a general example and may need to be adjusted based on the specifics of your setup and the capabilities of the model you are using. You may need to experiment with different approaches to find the one that works best for your needs.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon: