#Token usage
1 messages · Page 1 of 1 (latest)
Hi @deep gyro. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
The length of an answer generated by GPT-4 Turbo or any other model in the GPT series is determined by several factors, including the max_tokens parameter and the total number of tokens used in the API call.
-
Max Tokens Parameter: The
max_tokensparameter is a key factor in controlling the length of a model's response. This parameter sets the maximum length of the generated response. However, the total maximum length depends on the specific model used for the request. You can adjust themax_tokensparameter to get longer responses from the model (source (https://community.openai.com/t/729349)). -
Total Tokens in API Call: The total number of tokens used in an API call includes both input and output tokens. For example, if your API call used 10 tokens in the message input and you received 20 tokens in the message output, you would be billed for 30 tokens. The total tokens must be below the model’s maximum limit. For
gpt-3.5-turbo, the maximum limit is 4097 tokens (source (https://platform.openai.com/docs/guides/text-generation)). -
Model's Context Window: The model has a context window, which is the maximum memory of tokens it can handle at once. For
gpt-4-turbo, you can include over 120,000 tokens. However, the sum of input and output tokens must stay within this limit (source (https://community.openai.com/t/732733)). -
Long Conversations: Very long conversations are more likely to receive incomplete replies. For example, a
gpt-3.5-turboconversation that is 4090 tokens long will have its reply cut off after just 6 tokens (source (https://platform.openai.com/docs/guides/text-generation)).
To see how many tokens are used by an API call, you can check the usage field in the API response (e.g., response**'usage']['total_tokens']). If a conversation has too many tokens to fit within a model’s maximum limit, you will have to truncate, omit, or otherwise shrink your text until it fits ([source** (https://platform.openai.com/docs/guides/text-generation)).