#What are the best techniques and tools to have the model 'self-correct'?

shy ore
CONTEXT

I'm a noob building an app that analyses financial transactions to find the max/min/avg balance for every month and year. My users have accounts in multiple countries/languages that aren't covered by Plaid, so I can't rely on Plaid -- I have to analyze account statement PDFs.

Extracting financial transactions like "2021-04-28 | 452.10 | credit" almost works, but on most runs the model hallucinates and creates some transactions that don't exist. It's always just one or two transactions where it fails.

I've now read about Prompt Chaining and thought it might be a good idea to have the model check its own output. I could ask "given this list of transactions, can you check they're all present in this account statement?" or, to get it 100% right, go far more granular and ask "is this one transaction present in this page of the account statement?" transaction by transaction, and have the model correct itself.

QUESTIONS:

  1. Is using the model to self-correct a good idea?

  2. How could this be achieved?

  3. Should I use the regular API for chaining outputs, or LangChain or something? I still don't understand the benefits of these tools.

More context:

  • I started by using Docling to OCR the PDF, then feeding the markdown to the LLM (both in its entirety and in hierarchical chunks). It wasn't accurate; it didn't extract the transactions correctly.
  • I then moved on to Llama vision, which seems to yield much better results at extracting transactions, but it still makes some mistakes.
  • My next step, before doing what I've described above, is to improve my prompt and experiment with temperature, top_p, etc., which I have not tried so far!
south vault BOT
Hi @shy ore. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

To address the challenges you're facing with extracting and verifying financial transactions from account statements, here are some techniques and tools that can help the model self-correct and improve accuracy:

1) Is using the model to self-correct a good idea?

Yes, having the model check its own output can help. Techniques like self-consistency and verifier models improve the reliability of the model's outputs by evaluating candidate responses and then either keeping the most consistent one or rejecting and regenerating the ones that fail.

2) How could this be achieved?

Self-Consistency Technique:

  • This involves generating multiple outputs for the same input and selecting the most consistent result. It is particularly useful for tasks with a discrete set of answers. You can implement it by sampling multiple explanations and answers from the model and choosing the one that appears most frequently. The technique is detailed in the paper "Self-Consistency Improves Chain of Thought Reasoning in Language Models" (https://arxiv.org/abs/2203.11171).
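For transaction extraction, one way to adapt this idea is to run the extraction several times and keep only the transactions that most runs agree on. The following is a minimal sketch, not a definitive implementation: the `Transaction` tuple shape and the `majority_vote` helper are assumptions made for illustration, and amounts are assumed to be normalized to the same string format before voting.

```python
from collections import Counter
from typing import Iterable

# Assumed shape: (ISO date, amount as a normalized string, "credit" | "debit").
Transaction = tuple[str, str, str]


def majority_vote(
    runs: Iterable[list[Transaction]], min_share: float = 0.5
) -> list[Transaction]:
    """Keep transactions that appear in more than `min_share` of the runs."""
    runs = list(runs)
    votes: Counter = Counter()
    for run in runs:
        seen: Counter = Counter()
        for tx in run:
            # Vote on the k-th copy of a transaction so that two genuinely
            # identical transactions on the same day are not merged into one.
            seen[tx] += 1
            votes[(tx, seen[tx])] += 1
    threshold = min_share * len(runs)
    return sorted(tx for (tx, _k), n in votes.items() if n > threshold)
```

A transaction that is hallucinated in only one of several runs falls below the threshold and is dropped, while transactions every run finds are kept.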

Verifier Models:

  • Train a separate verifier model to evaluate the outputs of the main generative model. If the verifier rejects an output, you can resample until you get an acceptable result. This method has been shown to improve accuracy significantly, as described in the paper "Training Verifiers to Solve Math Word Problems" (https://arxiv.org/abs/2110.14168).
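The reject-and-resample loop can be sketched without training anything. The example below swaps the trained verifier for a simple rule-based check, which is a different technique from the one in the paper: it assumes the statement shows an opening and a closing balance, and accepts an extraction only if the transactions reconcile the two. The names `extract`, `balances_reconcile`, and `extract_until_verified` are hypothetical; `extract` stands for whatever function wraps your model call.

```python
from decimal import Decimal
from typing import Callable, Optional

# Assumed shape: (ISO date, amount, "credit" | "debit").
Transaction = tuple[str, Decimal, str]


def balances_reconcile(
    opening: Decimal, closing: Decimal, txs: list[Transaction]
) -> bool:
    """Rule-based stand-in for a verifier: opening + credits - debits == closing."""
    net = sum(
        (amount if kind == "credit" else -amount for _date, amount, kind in txs),
        Decimal("0"),
    )
    return opening + net == closing


def extract_until_verified(
    extract: Callable[[], list[Transaction]],
    verify: Callable[[list[Transaction]], bool],
    max_attempts: int = 5,
) -> Optional[list[Transaction]]:
    """Resample until the verifier accepts; return None if every attempt fails."""
    for _ in range(max_attempts):
        candidate = extract()
        if verify(candidate):
            return candidate
    return None
```

A check like this catches an invented or missing transaction unless several errors happen to cancel out, so it is best treated as one signal rather than a guarantee.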

Prompt Chaining:

  • Break down the task into smaller, more manageable steps. For example, first extract transactions, then verify each transaction against the original document. This approach is similar to the "split complex tasks into simpler tasks" technique, which can help the model focus on specific parts of the task sequentially.
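The verification step of such a chain could look roughly like the sketch below. It is only an illustration under stated assumptions: `ask_model` is a hypothetical callable that sends a prompt to whichever model API you use and returns its text reply, the prompt wording is made up, and the page is assumed to be available as text (with a vision model you would pass the page image instead).

```python
from typing import Callable

# Assumed shape: (date, amount, "credit" | "debit"), all as strings.
Transaction = tuple[str, str, str]

VERIFY_PROMPT = (
    "Here is one page of an account statement:\n\n{page}\n\n"
    "Is the following transaction present on this page, with exactly this "
    "date, amount and type? Answer only YES or NO.\n\n{date} | {amount} | {kind}"
)


def verify_transactions(
    page_text: str,
    transactions: list[Transaction],
    ask_model: Callable[[str], str],
) -> tuple[list[Transaction], list[Transaction]]:
    """Second link of the chain: ask about each extracted transaction separately."""
    confirmed: list[Transaction] = []
    rejected: list[Transaction] = []
    for date, amount, kind in transactions:
        prompt = VERIFY_PROMPT.format(
            page=page_text, date=date, amount=amount, kind=kind
        )
        answer = ask_model(prompt).strip().upper()
        target = confirmed if answer.startswith("YES") else rejected
        target.append((date, amount, kind))
    return confirmed, rejected
```

The rejected list can then be dropped, sent back to the model for re-extraction, or flagged for manual review. Because the chain is just ordinary function calls, this kind of orchestration works with the regular API and does not require a framework.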

3) Should I use the regular API for chaining outputs, or LangChain or something?

Regular API vs. LangChain:

  • The regular API can be used for chaining outputs by manually orchestrating the sequence of prompts and responses. However, using a framework like LangChain can simplify this process by providing tools to manage prompt sequences, handle context, and integrate retrieval-augmented generation (RAG) if needed.

Benefits of LangChain:

  • LangChain can help manage complex workflows by allowing you to define chains of prompts and responses, making it easier to implement techniques like prompt chaining and self-consistency. It also provides utilities for integrating external data sources, which can be useful if you need to augment the model's context with additional information.

Additional Considerations

  • Prompt Engineering: Continue to refine your prompts to ensure clarity and specificity. Experiment with different prompt structures and parameters like temperature and top_p to find the optimal settings for your task.
  • Retrieval-Augmented Generation (RAG): If you have access to a structured database or knowledge base, consider using RAG to provide the model with relevant context before generating responses. This can help reduce hallucinations by grounding the model's outputs in factual data.
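When experimenting with prompts, temperature, and top_p, it helps to score each configuration against a few statements whose transactions you have labelled by hand. The sketch below is one possible way to do that: `run_extraction` is a hypothetical function wrapping your model call, and the grid values are arbitrary examples. Low precision points to hallucinated transactions, low recall to missed ones.

```python
from itertools import product
from typing import Callable, Sequence

# Assumed shape: (date, amount, "credit" | "debit"), all as strings.
Transaction = tuple[str, str, str]


def precision_recall(
    predicted: Sequence[Transaction], expected: Sequence[Transaction]
) -> tuple[float, float]:
    """Compare an extraction with hand-labelled transactions.

    Sets are used for simplicity, so exact duplicate rows count once.
    """
    pred, gold = set(predicted), set(expected)
    hits = len(pred & gold)
    precision = hits / len(pred) if pred else 1.0
    recall = hits / len(gold) if gold else 1.0
    return precision, recall


def sweep(
    run_extraction: Callable[[float, float], Sequence[Transaction]],
    expected: Sequence[Transaction],
    temperatures: Sequence[float] = (0.0, 0.3, 0.7),
    top_ps: Sequence[float] = (0.5, 1.0),
) -> dict[tuple[float, float], tuple[float, float]]:
    """Score every (temperature, top_p) pair in the grid."""
    results = {}
    for temperature, top_p in product(temperatures, top_ps):
        predicted = run_extraction(temperature, top_p)
        results[(temperature, top_p)] = precision_recall(predicted, expected)
    return results
```

Since sampling is random at non-zero temperature, a single run per setting can be misleading; repeating each configuration a few times and averaging gives a fairer comparison.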

By implementing these techniques and tools, you can enhance the model's ability to self-correct and improve the accuracy of transaction extraction from account statements.
