# Performance Issues When Handling Large CSV Files


sage sun

Hi everyone,

I’m running into a performance issue when working with CSV files, especially those larger than 1 MB. Processing takes a very long time, which makes it impractical for real use cases.

After looking into the code, it seems that the `aadd_documents` method delegates to `add_documents`, which then calls `add_texts`. From what I can see, all documents are processed in a single operation without chunking or splitting, which might be contributing to the slowdown.
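To illustrate what I mean by chunking, here is a minimal sketch of client-side batching. `batched` and `add_documents_in_batches` are hypothetical helpers of my own, not part of the library. The batch size of 100 is an arbitrary guess, and I'm assuming `store` is any object that exposes `add_documents`. I haven't confirmed that this actually resolves the slowdown.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List


def batched(items: Iterable[Any], batch_size: int) -> Iterator[List[Any]]:
    """Yield successive lists of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def add_documents_in_batches(
    store: Any, documents: Iterable[Any], batch_size: int = 100
) -> int:
    """Call store.add_documents once per batch and return the number of calls.

    Assumption: `store` exposes an add_documents(list) method; the batch size
    is an illustrative default, not a recommended value.
    """
    calls = 0
    for batch in batched(documents, batch_size):
        store.add_documents(batch)
        calls += 1
    return calls
```

With this, the rows loaded from the CSV would be passed as `add_documents_in_batches(store, docs, batch_size=100)` instead of a single `store.add_documents(docs)` call, so each call handles a bounded number of documents.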

Some points and questions:

  • Is this expected behavior when dealing with large files, or is there a recommended way to improve processing times?
  • Is the method optimized for handling large datasets, or are there improvements planned to support larger files more efficiently?