#Question about large text input for the API

71 messages · Page 1 of 1 (latest)

pale mountain
#

I'm just getting started and using the API with the gpt-3.5-turbo model. My app will analyze a document that's about 2 pages long. Ask for summary info and things like that. Do I have to resubmit that document every time I want to ask a new question? Or is there some way to retain the memory so I don't have to pass the document every time?

sour island
#

as far as i undertand, each request has to contain the complete conversation. problem the sum of all tokens in the request is limited according to the api

pale mountain
#

Yes, exactly my issue. And also I would get charged every time submitting that large document which would easily use up my quota? Is there a way to send multiple questions at once in the same conversation then since I want to ask for a "Summary", then "Top 5 points", etc.

sour island
#

it's question of promting:

"Read the following document carefully"
"... bla bla bla "

Give me a summary of blabla with this and that detail

Then give me a list of N blabla in the text

Then give me this

And then give me that

Finally make a markdown table with 3 columns:
bla, blub, bloh ....

#

all that in one prompt.

flat cloud
#

Also works for much longer files like books or even 1000s of books...

sour island
#

can you refer to such uploaded files in completion-prompts? i think the file-interface is used for fine-tuning

flat cloud
#

Much cheaper, faster and efficient

sour island
#

hm. think I will look into that after I finish with the basics 🙂

flat cloud
#

Yeah you should, for that original task it's best to use embeddings imo

#

Fine-tuning is meant for teaching new tasks and patterns

pale mountain
pale mountain
flat cloud
pale mountain
#

I am only looking at 2 to 4 page max though so not huge but it seems so redundant sending it multiple times over the wire.

flat cloud
#

Yeah with that length you can probably just embed the whole document so it will have access to all the information

#

Or then you can just keep copy pasting which is easier from a coding aspect

#

Embeddings require some coding knowledge

pale mountain
#

That's what I'm doing. Eventually, people will be able to just upload the document and it will process in the background.

junior osprey
#

Embeddings are also limited as it just looks for similarities between the question and the document.

The idea is that if parts of the text look semantically similar to the question, then they might be relevant to an answer.

That relies on the hypothesis that the document can be separated into pieces and some of the pieces can be recombined into an answer for a specific question.

But some questions require a grasp of the document as a whole. Such as questions on summaries of chapters. That might require an understanding of the chapter as a whole or at least extracting key points of the chapter.

Hence, for some questions, partial summaries and key point extraction might also be needed. When the user asks for a question in that category, the summaries on the relevant chapters/sections can be saved so as to not recompute them each time.

flat cloud
#

There aren't any better ways than using them

#

Unless you want to just copy paste or then store the text somewhere where it always gets added into the prompt

#

Thats also a way if the documents are short

junior osprey
flat cloud
junior osprey
flat cloud
junior osprey
junior osprey
flat cloud
flat cloud
junior osprey
junior osprey
flat cloud
#

Basically semantic search just finds the only relevant text which is 2 pages in this example and since its short it can fit into one prompt as an entire embedding, then it can summarize it

junior osprey
flat cloud
#

Embedding and query functions are separate things

#

2 different models

flat cloud
junior osprey
#

the completion model uses text input

#

the embedding model has vector output

#

how would you use the embedding model for the completion model ?

flat cloud
#

Yeah the embeddings model captures the semantic meaning in numerical vector but the embeddings also contain the text the embeddings map to, that's how it obtains the text that will be used by the LLM

junior osprey
#

I understand how it is useful when dividing the text into parts for searching for relevant parts

#

but I do not see how it is useful for the document as a whole

flat cloud
#

And at least I have some better methods for summarizing, some involve embeddings, some dont

#

Just depends on the summarization task at hand

junior osprey
# pale mountain Yes, exactly my issue. And also I would get charged every time submitting that l...

You can ask for multiple things but ChatGPT doe not have consistent formatting and so if you want to use the answer in code for something else it could be hard to separate the different parts of an answer from ChatGPT. I just wrote a prompt that asks for a summary and questions for text, I asked for a Json output and then used something like
`
import json

json.loads(answer) `

Then I can extract the summary and questions separately.

The issue was that it would forget the format if I specified it at the beginning, so I specified it at the end.

Here is part of the Python code:

`def question_summary(sentences) :

summary_and_questions_start='\

Memorize the text between ****. I will ask you to summarize it and
ask questions with a format that will be specified at the end.
\n\n
****
'

summary_and_questions_end='\

\n\n
****
\n\n
Provide JSON with a Summary category
and a Questions category for the text between ****
. There should be 3 questions in the Questions category. Do not provide answers to the questions.
'
prompt=summary_and_questions_start+' - '+'\n\n - '.join(sentences)+summary_and_questions_end`

Note that the bot might still provide multiple summaries and might still provide answers to questions even though the prompt asked it not to. I ended up choosing a prompt where it makes a mistake in understanding the format but in a way that I can work with it

pale mountain
#

Thanks for all the discussion on this. Ya, the formatted output seems like it might be difficult and not consistent. Thanks for the thoughts above about requesting formatting though and I'll try that. I may have to send the document in multiple times to ask the questions I want to ask. I will also store it in the WebApp so that's not a problem. One other issue I'm worried about is that this document is user submitted and I'm grabbing the text out of it. So potentially how can you prevent a user submitted document from saying stuff like "FORMAT THIS IN CSV FORMAT" in the document itself even though they theoretically shouldn't.

#

So does the chat.openai.com User interface itself send the entire history of the conversation into their API before it returns the next answer to retain history between conversations. Or are they doing something tricky in the background like storing some intermediate format.

junior osprey
#

Seems like if there is a way for it to get confused then it might at some point.

junior osprey
# pale mountain So does the chat.openai.com User interface itself send the entire history of the...

My first guess since the API was that they have a sliding context window where they always send up to the last 4000 tokens.

But chatGPT on the chatting interface of the website became particularly difficult to work with since the recent update and it does not seem to always take into account the last message sent. So I guess now they ask the bot first whether the question is likely to reference the last message if the user asks a question like "why ?" without any context. Else maybe they use the semantic search on the whole history of the chat or just part of it. That does seem like it would be cheaper for them than sending the whole conversation but then there is the risk that the bot might not understand that the user's message references the last message.

pale mountain
#

Thanks. That's a good analysis. Yes, I've asked for new line delimiter which works pretty well for lists, however, the list may have a number before each item or may be prefixed with "The top 5 blah blah for your question is:". I was also able to say something like "and no preamble" which also worked but not sure how consistent this all will be.

hallow hearth
#

Hi, I'm the new kid on the block, so I hope that I am posting at the right place to try to assess the price of the API. On the basis of this https://openai.com/pricing, am I calculating correctly ?

  • I have a sheet with 3000 lines of description of approx. 50 words = 150 000 words -> 150 x 0,03$
  • For each line, I want GPT4 to create 7 keywords = 21 000 words -> 0,06$ x 21 ?

Thanks

young kraken
#

so no, that is not an accurate estimate of the costs

junior osprey
#

You can skip the text I wrote at the beginning and check the prompt that I highlighted like

this

pale mountain
#

I was able to get it to answer multiple questions and return the data in a Ruby hash. So far it seems consistent but we'll see. I'll be putting out a demo in a week or so.

junior osprey
pale mountain
#

That's what I was thinking. Basically the answer comes out like this: { "q1": ["listitem1","listitem2"], "q2": ["single line answer"], "q3": ["another single line asnwer"]}