#How to calculate max_tokens?


thorny wigeon
#

I am using the completions API (Davinci model), which has a 'max_tokens' parameter with a maximum of 4096 per the documentation. I want to let the API generate as many tokens as possible (up to 4096), but I often get errors because I can't work out how to set max_tokens accurately: it has to be reduced by the number of tokens that my input prompt consumes.

I am using C# and don't know how to accurately calculate the number of tokens in the input prompt. Word counts don't work, and neither does the estimate of 1 token per 4 characters, which the documentation suggests as a rough measure.

For example:

var prompt = "what is the capital of tennessee?";
var promptTokens = (prompt.Length / 4);

So, I set max_tokens in my API request to:

max_tokens = (4096 - promptTokens)

The error I get is: "This model's maximum context length is 4097 tokens, however you requested 4099 tokens (9 in your prompt; 4090 for the completion). Please reduce your prompt; or completion length."

How do I solve this problem? Thanks!


fringe pelican
#

Hi there, you can use the GPT2Tokenizer or the new GPT3Tokenizer. I currently use the GPT2Tokenizer in my projects and it works great and always gives me the exact token count.

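from transformers import GPT2TokenizerFast

# Load the GPT-2 byte-pair-encoding tokenizer (downloads the vocabulary on first use)
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

text = "what is the capital of tennessee?"  # the prompt to measure

# "input_ids" holds one integer ID per token, so its length is the token count
res = tokenizer(text)["input_ids"]
num_tokens = len(res)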
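Once you have an exact token count, the rest is subtraction. Here is a minimal sketch, assuming the 4097-token context limit reported in the error message above applies to your model; the helper name `max_tokens_for` and the `CONTEXT_LIMIT` constant are made up for illustration and are not part of any API.

```python
# Context limit taken from the error message in this thread (an assumption:
# other models may have a different limit).
CONTEXT_LIMIT = 4097


def max_tokens_for(prompt_token_count, context_limit=CONTEXT_LIMIT):
    """Return the largest max_tokens value that still fits in the context window.

    Hypothetical helper: prompt tokens plus completion tokens must not
    exceed the model's context limit.
    """
    remaining = context_limit - prompt_token_count
    if remaining < 1:
        raise ValueError("The prompt alone fills the context window.")
    return remaining


# With the 9 prompt tokens reported in the error message:
print(max_tokens_for(9))  # 4088
```

In practice you would pass in the `num_tokens` value computed by the tokenizer, and you may want to subtract a few extra tokens as a margin in case the tokenizer and the API disagree on edge cases.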

covert raptor
fringe pelican
#

Not that I can see. I'm sure there are differences in edge cases, but the GPT2 tokenizer has worked beautifully for me so far

#

but it's probably a better idea to switch over to the GPT3Tokenizer, cuz why not

covert raptor
#

hmmm true thx man ill look into adding that to my CLI project

thorny wigeon
#

I am using C# / .NET - where can I find this tokenizer for C# instead of python or javascript?

fringe pelican
thorny wigeon