# Troubleshooting RAG System Performance with Increased PDF Volume in Offline Setup


barren dune

Hello everyone,

This is my first message here, so if this isn’t the right place to ask, I apologize.

I'm currently working on developing a Retrieval-Augmented Generation system using only offline solutions.

The tools I’m using are as follows:

Ollama
LLM Llama 3.1/3.2
Qdrant
LangChain

Here’s what I’ve observed and don’t quite understand:
If I have one or two PDFs in my system and ask a question, I get the correct answer; it works perfectly.

However, if I add about twenty more PDFs and ask the exact same question, I don’t get any response.

The parameters are exactly the same in both cases, so something is eluding me… does anyone have any idea what might explain this phenomenon?

Thanks in advance for your help!

Sébastien

burnt lake

LangChain is hard to debug. If I had to guess, it’s having a hard time parsing some of your other PDFs and getting stuck there. It’s hard to say for sure without a copy of the setup in front of me, or without error codes/terminal output.
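
One way to test that guess is to load each PDF on its own and log what comes back, so a file that fails or yields no text stands out. The sketch below is only an illustration, not the original poster's pipeline: `check_pdfs` and the `load_pdf` callable are hypothetical names, and the loader is passed in so the check does not depend on any particular library.

```python
import logging
from pathlib import Path
from typing import Callable, Dict, Optional, Sequence

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(name)s: %(message)s")
log = logging.getLogger("pdf_check")


def check_pdfs(
    folder: str,
    load_pdf: Callable[[Path], Sequence[str]],
) -> Dict[str, Optional[int]]:
    """Load every *.pdf in `folder` separately and report extracted text size.

    `load_pdf` is a hypothetical callable (an assumption of this sketch) that
    takes a path and returns one string of text per page.
    Returns {file name: character count}, with None for files that raised.
    """
    report: Dict[str, Optional[int]] = {}
    for path in sorted(Path(folder).glob("*.pdf")):
        try:
            pages = load_pdf(path)
        except Exception:
            # Log the full traceback, then keep going with the next file.
            log.exception("Failed to load %s", path.name)
            report[path.name] = None
            continue
        chars = sum(len(page) for page in pages)
        if chars == 0:
            # Often a scanned or image-only PDF: nothing to embed or retrieve.
            log.warning("%s loaded but produced no text", path.name)
        else:
            log.info("%s: %d pages, %d characters", path.name, len(pages), chars)
        report[path.name] = chars
    return report
```

With LangChain, `load_pdf` could plausibly be something like `lambda p: [d.page_content for d in PyPDFLoader(str(p)).load()]`, assuming `PyPDFLoader` from `langchain_community.document_loaders` is the loader in use. Any file reported as `None` or `0` is a candidate for the one that breaks ingestion.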

barren dune
#

Thank you for your response!

I'm really new to this subject.

Could there be a blockage that causes some of the PDFs not to be processed?

What can I enable to get logs or traces that would give as many clues as possible about what's happening?

burnt lake