Hi everyone,
I’m running into a performance issue when working with CSV files, especially those larger than 1 MB. Processing takes so long that it is impractical for real use cases.
After looking into the code, it seems that the aadd_documents method delegates to add_documents, which then calls add_texts. From what I can see, all documents are being processed in a single operation without chunking or splitting, which might be contributing to the slowdown.
A couple of questions:
- Is this expected behavior when dealing with large files, or is there a recommended way to improve processing times?
- Is the method optimized for handling large datasets, or are there improvements planned to support larger files more efficiently?
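
For reference, here is a minimal sketch of the kind of manual batching I have in mind as a workaround. It only assumes the store exposes an add_documents method that accepts a list of documents and returns a list of ids. The helper names (batched, add_in_batches) and the default batch_size of 100 are hypothetical and not part of any library.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List


def batched(items: Iterable[Any], batch_size: int) -> Iterator[List[Any]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def add_in_batches(store: Any, documents: Iterable[Any], batch_size: int = 100) -> List[str]:
    """Call store.add_documents once per batch and collect the returned ids.

    `store` is assumed to be any object with an add_documents(list) method
    that returns a list of ids; batch_size is an arbitrary default.
    """
    ids: List[str] = []
    for batch in batched(documents, batch_size):
        ids.extend(store.add_documents(batch))
    return ids
```

Whether this helps depends on where the time is actually spent (for example embedding calls versus inserts), so it is only a guess at a workaround.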