#Large JSON inputs for GPT3.5
maybe just use the json as your prompt?
It's larger than the token limit, roughly 2 years of transaction data.
is it like a Q&A?
Yeah, mostly
or do you want to summarize it all at once?
The exact use case: take 2 years of transactions as input, and answer questions about them
Super useful, thanks!
so in this case, would our transactions JSON be a single embedding?
no, you should cut it into as many embeddings as possible
but don't cut it into just a few words lol, aim for around 50 words per embedding
also, you should label each small piece with pre-set questions and embed those questions instead, so the hit rate is higher
okay, so I can take our transactions JSON, cut it into CSV-like data (one row per transaction), and create an embedding per transaction?
yes
gotcha
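A minimal sketch of the "one row per transaction" step, assuming the JSON is a flat array of transaction objects. The field names (`date`, `amount`, `merchant`) and the helper `transactions_to_rows` are made up for illustration; the real schema isn't shown in this thread. Each row would then be sent to an embeddings endpoint, which is left out here.

```python
import json


def transactions_to_rows(raw_json: str) -> list[str]:
    """Turn a JSON array of transaction objects into one text row per transaction."""
    transactions = json.loads(raw_json)
    rows = []
    for tx in transactions:
        # Sort the keys so every row lists its fields in the same order.
        rows.append(", ".join(f"{key}={tx[key]}" for key in sorted(tx)))
    return rows


# Hypothetical sample data; the field names are assumptions.
sample = (
    '[{"date": "2023-01-05", "amount": -42.5, "merchant": "Grocer"},'
    ' {"date": "2023-01-06", "amount": 1200, "merchant": "Payroll"}]'
)
rows = transactions_to_rows(sample)
for row in rows:
    print(row)
```

If single transactions turn out too short to embed well, several rows could be joined into one chunk before embedding, closer to the ~50 words suggested above.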
and I have to store these embeddings myself
or use some hosted version of OpenAI APIs
the OpenAI API doesn't store any data for you
you need to store them yourself
or use a vector DB
there's a cookbook for that in the link I gave you
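A rough sketch of what "store it yourself" could look like before reaching for a vector DB: keep each text row next to its vector in memory and rank by cosine similarity at query time. `InMemoryVectorStore` is a hypothetical helper, not part of any library, and the three-number vectors below are hand-written toy values standing in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Hypothetical minimal store: a list of (text, vector) pairs."""

    def __init__(self):
        self._items = []

    def add(self, text, vector):
        self._items.append((text, list(vector)))

    def search(self, query_vector, top_k=3):
        # Score every stored vector against the query, highest similarity first.
        scored = [
            (cosine_similarity(query_vector, vector), text)
            for text, vector in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


store = InMemoryVectorStore()
# Toy vectors, NOT real embeddings; real ones would come from an embeddings API.
store.add("amount=-42.5, merchant=Grocer", [0.9, 0.1, 0.0])
store.add("amount=1200, merchant=Payroll", [0.0, 0.2, 0.9])

results = store.search([1.0, 0.0, 0.0], top_k=1)
print(results[0][1])
```

The top hits from `search` would then be pasted into the prompt as context for the question. A linear scan like this is fine for a few thousand rows; a vector DB does the same job with indexing and persistence.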
what if they're all roughly the same cosine similarity to the prompt?