So I have created a number of small open-ai sample use cases for my company. But the main issue I am running into handling large datasets without using embeddings which seem inadequate. I have tried embeddings but when a question requires divergent information across the dataset I really want all the information loaded.
The demo yesterday used the tax code which is very close to what I am looking for. Embeddings would not solve that problem as the entire solution set needs to be sent.
So my question is how are people handling the lack of training in the traditional sense (not fine-tuning)? I would love to be able to send 500 pages of data to a model and have it preserved for future interactions as priority data. Is there a product available that allows training of that sort?