I am using the completions API (Davinci model), which has a 'max_tokens' parameter with a maximum of 4096, per the documentation. I want to let the API generate as many tokens as possible (up to 4096), but I often get errors because I can't work out how to set max_tokens accurately: it has to be reduced by the number of tokens that my input prompt consumes.
I am using C# and don't know how to accurately calculate the number of tokens in the input prompt. Word counts don't work, and neither does the estimate of 1 token per 4 characters, which the documentation suggests as a loosely accurate measure.
For example:
var prompt = "what is the capital of tennessee?";
var promptTokens = (prompt.Length / 4);
So, I set max_tokens in my API request to:
max_tokens = (4096 - promptTokens)
The error I get is: "This model's maximum context length is 4097 tokens, however you requested 4099 tokens (9 in your prompt; 4090 for the completion). Please reduce your prompt; or completion length."
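To make the arithmetic in that error concrete, here is a minimal sketch of the check I think the API is applying. It assumes the 4097 limit quoted in the error message covers prompt tokens plus max_tokens combined. The names TokenBudget, ContextLimit, EstimateTokens, MaxTokensFor, and Fits are hypothetical helpers of mine, not part of any OpenAI library.

```csharp
using System;

public static class TokenBudget
{
    // Assumption: the limit quoted in the error message applies to
    // prompt tokens + completion tokens together.
    public const int ContextLimit = 4097;

    // The rough rule from the documentation: about 1 token per 4 characters.
    // This is only an estimate; the API's own count can be higher.
    public static int EstimateTokens(string prompt)
    {
        return prompt.Length / 4; // integer division, as in my code above
    }

    // Largest max_tokens that still fits for a given prompt token count.
    public static int MaxTokensFor(int promptTokens)
    {
        return Math.Max(0, ContextLimit - promptTokens);
    }

    // True when the request stays within the context limit.
    public static bool Fits(int promptTokens, int maxTokens)
    {
        return promptTokens + maxTokens <= ContextLimit;
    }
}
```

With the numbers from the error, TokenBudget.Fits(9, 4090) is false because 9 + 4090 = 4099, which exceeds 4097, while TokenBudget.MaxTokensFor(9) gives 4088. The estimate for my example prompt is 8 (33 characters / 4), lower than the 9 tokens the API reported, so any budget built on the estimate can overshoot.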
How do I solve this problem? Thanks!