# Verification of data for respective prompt

errant bison

I am trying to verify if the knowledge data matches the prompt given by the user.
Backstory:

  1. I am using smollm2 from Ollama as my main LLM and LangChain/CrewAI as my main framework.
  2. I have a custom-built tool that searches Google and returns web URLs based on the search query produced from the user's prompt.
  3. These web URLs and the content within will be the main data source for the LLM to give an answer.

My main question is: how do I verify whether the data source matches the prompt given by the user?

My solutions (not that great):

  1. Created a tool using smollm2 to verify the context of the data source and the prompt: Very ambiguous results
  2. Tried similarity scores using embedding models: varying scores, unable to set a threshold

Pls do feel free to share your ideas or thought processes.
Thanks in advance!

errant bison

some help pls

errant bison

Tried it (mentioned above), got very poor results.

This is the template I used:
verifier_template = """
You are an expert at verifying data based on the given input sentence.

Sentence: {input_sentence}

First, extract the context from the sentence. Next, analyze the provided data.

Data: {data}

Verify if the data aligns with the concept extracted from the sentence. Be precise and accurate in your evaluation.

Respond in either "Yes" or "No." Explain your thought process.

"""

gleaming rapids

You don't need to call an LLM separately to filter out irrelevant information. In a RAG-based approach, you can use a single system prompt. For example:

Using the relevant information from the knowledge base, respond to the user's query.

Knowledge Base:
"""
{knowledge_base}
"""

User's Query:
"""
{user_query}
"""
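
If it helps, here is a rough sketch of how that single prompt could be filled in from Python. The names `RAG_PROMPT`, `build_rag_prompt` and `max_chars` are made up for illustration, and the character budget is only an assumption to keep the scraped text within whatever context window your model is configured for.

```python
# Hypothetical helper: fills the single RAG prompt with scraped page text.
RAG_PROMPT = '''Using the relevant information from the knowledge base, respond to the user's query.

Knowledge Base:
"""
{knowledge_base}
"""

User's Query:
"""
{user_query}
"""'''


def build_rag_prompt(pages, user_query, max_chars=4000):
    """Join the non-empty page texts and insert them into the prompt."""
    # Drop empty pages and surrounding whitespace before joining.
    cleaned = [page.strip() for page in pages if page.strip()]
    # max_chars is an assumed budget, not a recommended value.
    knowledge_base = "\n\n".join(cleaned)[:max_chars]
    return RAG_PROMPT.format(
        knowledge_base=knowledge_base,
        user_query=user_query.strip(),
    )
```

The resulting string can then be passed to the model as the system prompt, with the page texts coming from whatever your search tool returns.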

However, if you still wish to use a separate LLM call to filter out irrelevant information, we can enhance the prompt by employing a relevance scale (1 to 10) instead of a binary output (Yes/No). This allows the LLM to more effectively articulate the relevance between the data and the sentence. We can subsequently establish a threshold based on preliminary tests. For instance:

You are an expert at verifying data based on the given input sentence.

Sentence: {input_sentence}

First, extract the context from the sentence. Next, analyze the provided data.

Data: {data}

Verify if the data aligns with the concept extracted from the sentence. Be precise and accurate in your evaluation.

Output a relevance score from 1 to 10, where 10 signifies the highest relevance and 1 the lowest. Explain your reasoning.
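
To apply a threshold, the score has to be pulled out of the model's free-text reply first. Below is a minimal sketch of one way to do that; `parse_relevance_score`, `is_relevant` and the default threshold of 7 are all hypothetical, and `llm` stands for any function that takes the prompt string and returns the reply text. The parser simply takes the first standalone integer from 1 to 10, so it can be fooled if the model repeats the scale before giving its score.

```python
import re


def parse_relevance_score(reply):
    """Return the first standalone integer from 1 to 10 in the reply, or None."""
    match = re.search(r"\b(10|[1-9])\b", reply)
    return int(match.group(1)) if match else None


def is_relevant(llm, prompt, threshold=7):
    """Call the model once and keep the data only if the score meets the threshold.

    llm: any callable taking a prompt string and returning the reply text.
    threshold: placeholder value; tune it on your own preliminary tests.
    """
    score = parse_relevance_score(llm(prompt))
    # Treat an unparseable reply as "not relevant" rather than guessing.
    return score is not None and score >= threshold
```

Small models do not always follow the output format, so treating an unparseable reply as irrelevant (or retrying) is a design choice you may want to revisit.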