#Context related questions consultation
Can you give us some more information? Do you mean something like a chat bot?
Yes. OpenAI's API has a maximum token limit, so if a response is too long to be returned in one go, you need to ask the model to continue it. The official documentation does not seem to explain this clearly.
Yeah, you're restricted by the token limit
Beyond that, you need to implement other strategies, such as using embeddings or creating summaries of the chat,
in order to prolong the chat
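The summary strategy mentioned above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: `compact_history`, `rough_token_count`, and the `summarize` callback are hypothetical names, and the word-based counter is only a stand-in for a real tokenizer.

```python
from typing import Callable, List


def rough_token_count(text: str) -> int:
    # Stand-in only: counts whitespace-separated words, NOT real model tokens.
    return len(text.split())


def compact_history(
    history: List[str],
    budget: int,
    summarize: Callable[[List[str]], str],
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[str]:
    """Keep the most recent turns within `budget` tokens and replace
    everything older with a single summary turn."""
    kept: List[str] = []
    used = 0
    # Walk backwards so the newest turns are kept verbatim.
    for turn in reversed(history):
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()

    older = history[: len(history) - len(kept)]
    if older:
        # Assumption: `summarize` is supplied by the caller (for example, a
        # separate model call). The summary's own tokens are not counted
        # against `budget` here, so leave some headroom for it.
        kept.insert(0, summarize(older))
    return kept
```

With `history = ["a b c", "d e", "f g h i"]` and `budget=6`, the two newest turns are kept as they are and the oldest one is passed to `summarize`.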
Can you give me a demo? After I merge the conversation history into the prompt, I always get the error "This model's maximum context length is 4097 tokens, however you requested 4190 tokens, Please reduce your prompt; or completion length." OpenAI seems to count the prompt's tokens as well. As the conversation continues, the history makes the prompt bigger and bigger, and it soon exceeds the limit. I don't know if there is a better solution.
Yes, the prompt token length is also included
So you need to tokenize your prompt, and then subtract the number of tokens in your prompt from the context limit (4097 according to your error message)
You can use one of the tokenizer libraries available online for GPT-2 and GPT-3
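The subtraction described above might look like the sketch below. The names `completion_budget` and `toy_encode` are hypothetical, the 4097 default is simply the figure quoted in the error message and varies by model, and the toy encoder only keeps the example free of dependencies.

```python
from typing import Callable, List, Sequence

CONTEXT_LIMIT = 4097  # figure quoted in the error message; varies by model


def completion_budget(
    prompt: str,
    encode: Callable[[str], Sequence[int]],
    context_limit: int = CONTEXT_LIMIT,
) -> int:
    """Return how many tokens remain for the completion once the
    prompt's tokens have been subtracted from the context limit."""
    prompt_tokens = len(encode(prompt))
    remaining = context_limit - prompt_tokens
    if remaining <= 0:
        raise ValueError(
            f"prompt uses {prompt_tokens} tokens, limit is {context_limit}"
        )
    return remaining


# With Hugging Face transformers installed, `encode` could be:
#   from transformers import GPT2Tokenizer
#   encode = GPT2Tokenizer.from_pretrained("gpt2").encode
# The toy encoder below (one fake "token" per word) is NOT a real tokenizer.
def toy_encode(text: str) -> List[int]:
    return list(range(len(text.split())))
```

The returned value is what you would pass as the completion length, so that prompt plus completion stays within the limit.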
But the tokenizer's count is not very accurate. It seems this is the only option for now. Thank you.
The tokenizer's counts match exactly for me
I use the GPT2Tokenizer from Hugging Face