#[RAG] Bandwidth optimization on search


rancid moss
#

I'm assuming that, because of filters and such, it may not be possible to make a version of .search that only returns the score + entryId? One nice way to save bandwidth with vector search was that it returned a minimal result. I have a documents table holding entryIds and thus have no use for the matching chunk. Although I may end up storing the whole document with the RAG component....

deft siren
#

If you want, you don't have to store anything in the chunk data. You can pass in chunks with a custom embedding set and the text content set to "" - then the fetch results are lightweight.
wdyt?

rancid moss
#

That would work assuming I wasn't also using the chunk text elsewhere.
For RAG AI context I use the chunk text.
For RAG user search I want to return the chunk's parent document, so the chunk info is useless.

I did this in a few places with vector search, where I'd just return the IDs, remove duplicates, and fill in the table via a waterfall. It matters especially here, as you often get multiple chunks for the same "document".
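
A rough sketch of that dedupe-then-fill pattern, in case it helps frame the request. Everything here is hypothetical: the `SearchHit` shape, `dedupeByEntry`, `hydrate`, and the `fetchDoc` callback are names made up for illustration, not the RAG component's actual API.

```typescript
// Hypothetical shape of a minimal search hit: just the parent entry and a score.
interface SearchHit {
  entryId: string;
  score: number;
}

// Collapse chunk-level hits to one hit per entry, keeping the best score,
// then order the survivors from highest to lowest score.
function dedupeByEntry(hits: SearchHit[]): SearchHit[] {
  const best = new Map<string, SearchHit>();
  for (const hit of hits) {
    const current = best.get(hit.entryId);
    if (current === undefined || hit.score > current.score) {
      best.set(hit.entryId, { entryId: hit.entryId, score: hit.score });
    }
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}

// Second step of the waterfall: load each parent document once.
// `fetchDoc` stands in for whatever lookup the documents table offers.
async function hydrate<D>(
  hits: SearchHit[],
  fetchDoc: (entryId: string) => Promise<D | null>,
): Promise<Array<{ doc: D; score: number }>> {
  const loaded = await Promise.all(
    hits.map(async (hit) => ({ doc: await fetchDoc(hit.entryId), score: hit.score })),
  );
  // Drop entries whose document no longer exists.
  return loaded.filter(
    (row): row is { doc: D; score: number } => row.doc !== null,
  );
}
```

With this shape, only ids and scores cross the wire in the first step, and each document is fetched once no matter how many of its chunks matched.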

deft siren
#

Gotcha - yeah I store the content separately from the entity - so it's technically possible to provide this, but it's not exposed. Though I'm considering adding text search, which would mean storing the text search portion on the entity.
So it could be a future optimization, but not on the roadmap right now - if it's important let's track it in a ticket

rancid moss
#

I'm considering how to handle huge docs as I hit the 1 MB limit, possibly using RAG chunking when the document exceeds 1 MB (1 MB chunks). And I definitely wouldn't want the chunk data returned on search.
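
For the splitting itself, something like the sketch below might be enough. It is only an illustration under assumptions: `splitByBytes` and `utf8Size` are hypothetical helpers, the byte budget is passed in by the caller rather than taken from any documented limit, and sizes are counted as UTF-8 bytes so a multi-byte character is never cut in half.

```typescript
// Number of bytes a single code point occupies in UTF-8.
function utf8Size(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

// Split `text` into consecutive parts, each at most `maxBytes` when
// encoded as UTF-8. Concatenating the parts gives back the original text.
function splitByBytes(text: string, maxBytes: number): string[] {
  if (maxBytes < 4) {
    throw new RangeError("maxBytes must be at least 4 to fit any character");
  }
  const parts: string[] = [];
  let start = 0; // index where the current part begins
  let bytes = 0; // UTF-8 size of the current part so far
  let i = 0;
  while (i < text.length) {
    const cp = text.codePointAt(i) as number;
    const size = utf8Size(cp);
    if (bytes + size > maxBytes) {
      // Current part is full: close it and start a new one at this character.
      parts.push(text.slice(start, i));
      start = i;
      bytes = 0;
    }
    bytes += size;
    // Code points above U+FFFF take two UTF-16 code units in a JS string.
    i += cp > 0xffff ? 2 : 1;
  }
  if (start < text.length) parts.push(text.slice(start));
  return parts;
}
```

In practice the budget would probably sit somewhat below the real limit to leave room for the other fields stored alongside the text.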

deft siren
#

wow yeah 1MB chunks are pretty big - are you getting good embeddings over that much data?