I'm just getting started and using the API with the gpt-3.5-turbo model. My app will analyze a document that's about 2 pages long. Ask for summary info and things like that. Do I have to resubmit that document every time I want to ask a new question? Or is there some way to retain the memory so I don't have to pass the document every time?
#Question about large text input for the API
71 messages · Page 1 of 1 (latest)
as far as i undertand, each request has to contain the complete conversation. problem the sum of all tokens in the request is limited according to the api
Yes, exactly my issue. And also I would get charged every time submitting that large document which would easily use up my quota? Is there a way to send multiple questions at once in the same conversation then since I want to ask for a "Summary", then "Top 5 points", etc.
it's question of promting:
"Read the following document carefully"
"... bla bla bla "
Give me a summary of blabla with this and that detail
Then give me a list of N blabla in the text
Then give me this
And then give me that
Finally make a markdown table with 3 columns:
bla, blub, bloh ....
all that in one prompt.
Dataloader->upload file/text->run through embedding model-> query that text/file with embeddings semantic search
Also works for much longer files like books or even 1000s of books...
can you refer to such uploaded files in completion-prompts? i think the file-interface is used for fine-tuning
No need for fine-tuning. Embeddings can handle that
Much cheaper, faster and efficient
hm. think I will look into that after I finish with the basics 🙂
Yeah you should, for that original task it's best to use embeddings imo
Fine-tuning is meant for teaching new tasks and patterns
Thanks I will look into both those ideas!
That looks like a great option as well. thanks!!
Yeah it's more efficient (more complex though) but it can handle very long documents and large quantities
I am only looking at 2 to 4 page max though so not huge but it seems so redundant sending it multiple times over the wire.
Yeah with that length you can probably just embed the whole document so it will have access to all the information
Or then you can just keep copy pasting which is easier from a coding aspect
Embeddings require some coding knowledge
That's what I'm doing. Eventually, people will be able to just upload the document and it will process in the background.
Embeddings are also limited as it just looks for similarities between the question and the document.
The idea is that if parts of the text look semantically similar to the question, then they might be relevant to an answer.
That relies on the hypothesis that the document can be separated into pieces and some of the pieces can be recombined into an answer for a specific question.
But some questions require a grasp of the document as a whole. Such as questions on summaries of chapters. That might require an understanding of the chapter as a whole or at least extracting key points of the chapter.
Hence, for some questions, partial summaries and key point extraction might also be needed. When the user asks for a question in that category, the summaries on the relevant chapters/sections can be saved so as to not recompute them each time.
You can embed the whole document if it's short
There aren't any better ways than using them
Unless you want to just copy paste or then store the text somewhere where it always gets added into the prompt
Thats also a way if the documents are short
How would embedding the whole document help for a summary ?
It would work the same way as copy pasting essentially but with added flexibility for handling longer or multiple documents
There are partial summaries and keypoint extractions to compress the document and thus reduce the number of tokens but it could lose relevant information
Wont happen with short documents
What do you mean ? How short is the document ? Shorter than the token limit of ChatGPT ?
What do you mean copy pasting ? Copy pasting what ?
Yeah the other guy in the thread mentioned 2 pages, embeddings could load that without losing any information and it would essentially be the same as copy pasting
The document you want summarized
ok but how would you use the embedding if you embed the whole document ? How can it be used ?
I do not understand, how can the embedding be used for a summary ? The embedding can not be fed to chatGPT and I do not know how to use it to make a summary. As far as I know, embeddings are usually used to compare the semantic similarity or associations between text. That is, it is used for tasks that involve comparing text. A summary does not really involve comparing text.
Its magic
Basically semantic search just finds the only relevant text which is 2 pages in this example and since its short it can fit into one prompt as an entire embedding, then it can summarize it
what does "it" refer to when you write "it can summarize it" ? The embedding model or ChatGPT ? If it is the embedding model then how do you obtain a summary in text format from an embedding which is a vector/table of numbers ?
The embedding model is just used to map the text so the completion model can use it
Embedding and query functions are separate things
2 different models
ChatGPT. The document will be contained in the embeddings and ChatGPT API will summarize it
The embedding model is just used to map the text so the completion model can use it
the completion model uses text input
the embedding model has vector output
how would you use the embedding model for the completion model ?
Yeah the embeddings model captures the semantic meaning in numerical vector but the embeddings also contain the text the embeddings map to, that's how it obtains the text that will be used by the LLM
But if the entire model is embedded, then using the text part for the completion model would be equivalent to not using the embedding and just using directly the original text. That is the question at hand, why use the embedding when you could just use the original text if you arent using the vector ?
I understand how it is useful when dividing the text into parts for searching for relevant parts
but I do not see how it is useful for the document as a whole
Yeah I mentioned that before. When working with 2 pages it's pretty irrelevant but when you have multiple documents and some longer you dont want to be copy pasting stuff
And at least I have some better methods for summarizing, some involve embeddings, some dont
Just depends on the summarization task at hand
You can ask for multiple things but ChatGPT doe not have consistent formatting and so if you want to use the answer in code for something else it could be hard to separate the different parts of an answer from ChatGPT. I just wrote a prompt that asks for a summary and questions for text, I asked for a Json output and then used something like
`
import json
json.loads(answer) `
Then I can extract the summary and questions separately.
The issue was that it would forget the format if I specified it at the beginning, so I specified it at the end.
Here is part of the Python code:
`def question_summary(sentences) :
summary_and_questions_start='\
Memorize the text between ****. I will ask you to summarize it and
ask questions with a format that will be specified at the end.
\n\n
****
'
summary_and_questions_end='\
\n\n
****
\n\n
Provide JSON with a Summary category
and a Questions category for the text between ****
. There should be 3 questions in the Questions category. Do not provide answers to the questions.
'
prompt=summary_and_questions_start+' - '+'\n\n - '.join(sentences)+summary_and_questions_end`
Note that the bot might still provide multiple summaries and might still provide answers to questions even though the prompt asked it not to. I ended up choosing a prompt where it makes a mistake in understanding the format but in a way that I can work with it
Barley shorter than the token limit!
Thanks for all the discussion on this. Ya, the formatted output seems like it might be difficult and not consistent. Thanks for the thoughts above about requesting formatting though and I'll try that. I may have to send the document in multiple times to ask the questions I want to ask. I will also store it in the WebApp so that's not a problem. One other issue I'm worried about is that this document is user submitted and I'm grabbing the text out of it. So potentially how can you prevent a user submitted document from saying stuff like "FORMAT THIS IN CSV FORMAT" in the document itself even though they theoretically shouldn't.
So does the chat.openai.com User interface itself send the entire history of the conversation into their API before it returns the next answer to retain history between conversations. Or are they doing something tricky in the background like storing some intermediate format.
CSV with a ";;" separator seemed to lead to more consistent formatting than JSON but it seemed to get confused when an element in the table had to have multiple entries. It would add multiple entries in all of the columns or it would have what seemed like multiple columns for the same category
Seems like if there is a way for it to get confused then it might at some point.
My first guess since the API was that they have a sliding context window where they always send up to the last 4000 tokens.
But chatGPT on the chatting interface of the website became particularly difficult to work with since the recent update and it does not seem to always take into account the last message sent. So I guess now they ask the bot first whether the question is likely to reference the last message if the user asks a question like "why ?" without any context. Else maybe they use the semantic search on the whole history of the chat or just part of it. That does seem like it would be cheaper for them than sending the whole conversation but then there is the risk that the bot might not understand that the user's message references the last message.
Thanks. That's a good analysis. Yes, I've asked for new line delimiter which works pretty well for lists, however, the list may have a number before each item or may be prefixed with "The top 5 blah blah for your question is:". I was also able to say something like "and no preamble" which also worked but not sure how consistent this all will be.
Hi, I'm the new kid on the block, so I hope that I am posting at the right place to try to assess the price of the API. On the basis of this https://openai.com/pricing, am I calculating correctly ?
- I have a sheet with 3000 lines of description of approx. 50 words = 150 000 words -> 150 x 0,03$
- For each line, I want GPT4 to create 7 keywords = 21 000 words -> 0,06$ x 21 ?
Thanks
1 word != 1 token
so no, that is not an accurate estimate of the costs
You can play with the tokenizer to see how many tokens your prompt reduces into https://platform.openai.com/tokenizer
An API for accessing new AI models developed by OpenAI
I suggested a prompt here #1086295370960744448 message that might work for multiple questions but I have not tested it yet. The idea would be to tell it to pretend to be a function of a coding language that takes as input a list and outputs a list.
You can skip the text I wrote at the beginning and check the prompt that I highlighted like
this
I was able to get it to answer multiple questions and return the data in a Ruby hash. So far it seems consistent but we'll see. I'll be putting out a demo in a week or so.
Nice, I do not know Ruby but I guess it will be more consistent if you tell it to follow the rules of a specific well-known coding language like Ruby.
That's what I was thinking. Basically the answer comes out like this: { "q1": ["listitem1","listitem2"], "q2": ["single line answer"], "q3": ["another single line asnwer"]}