I have an issue with my gpt-3.5 API chatbot. When a conversation approaches the token limit, the request errors out because it submits too many tokens.
For instance, when I use the OpenAI Python library to send a message, the API reports that I sent 4154 tokens, when by my own count I sent only 3736.
What's going on?
Here's my Python code for token encoding. It also counts the number of tokens in messages.content.
def tokenizer(self, prompt2):
    self.splitIntoTokens = self.tokenModel.encode(prompt2)
    #print(self.splitIntoTokens)
    return len(self.splitIntoTokens)
Here's my confusion.
If I send this request body to the API using cURL:
{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "How many tokens is this message"}]
}
By my calculations, that should be 6 tokens.
I get back:
"usage": {"prompt_tokens":14,"completion_tokens":32,"total_tokens":46},
"choices":[{"message":{"role":"assistant","content":"As an AI language model, I do not have access to the message you are referring to. Please provide the message so I can determine the number of tokens."}
Why does it say prompt tokens is 14?
I found this tiktoken example in the OpenAI cookbook:
https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
tokens_per_message = 4  # every message follows <|start|>{role/name}\n{content}<|end|>\n
tokens_per_name = -1  # if there's a name, the role is omitted
num_tokens = 0
for message in messages:
    num_tokens += tokens_per_message
    for key, value in message.items():
        num_tokens += len(encoding.encode(value))
        if key == "name":
            num_tokens += tokens_per_name
num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
return num_tokens
Could someone explain this code to me?
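To make the question concrete, here is a minimal sketch of how the cookbook excerpt above seems to apply to the cURL request. The function name count_prompt_tokens and the whitespace_encode helper are hypothetical, introduced only for illustration. whitespace_encode is a stand-in, not a real tokenizer. It happens to split this particular message into the same 6 pieces I counted, and it treats the role "user" as 1 token, which is an assumption. With the real library one would presumably pass tiktoken.encoding_for_model("gpt-3.5-turbo").encode instead.

```python
from typing import Callable, Dict, List, Sequence


def count_prompt_tokens(
    messages: List[Dict[str, str]],
    encode: Callable[[str], Sequence],
    tokens_per_message: int = 4,
    tokens_per_name: int = -1,
    reply_priming: int = 3,
) -> int:
    """Estimate prompt tokens using the same overheads as the cookbook excerpt."""
    total = 0
    for message in messages:
        # Fixed per-message overhead for the wrapper around role and content.
        total += tokens_per_message
        for key, value in message.items():
            # Every value is encoded, including the role, not just the content.
            total += len(encode(value))
            if key == "name":
                # When a name is present the role is omitted, so adjust.
                total += tokens_per_name
    # The reply is primed with a few tokens that also count toward the prompt.
    return total + reply_priming


def whitespace_encode(text: str) -> List[str]:
    """Stand-in encoder (assumption): one token per whitespace-separated word."""
    return text.split()


messages = [{"role": "user", "content": "How many tokens is this message"}]
estimate = count_prompt_tokens(messages, whitespace_encode)
print(estimate)  # 4 (message overhead) + 1 (role) + 6 (content) + 3 (priming) = 14
```

If this reading is right, counting only messages.content would miss the per-message overhead and the role on every message, which could explain why my own totals come out lower than what the API reports.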