#rate limits in headers
Hi @hazy inlet. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
Sure, the rate limit information is returned in the headers of the HTTP response whenever you make a request to the GPT-4 API. Here is an example of how to read those headers using the OpenAI Python API library:
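from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

# with_raw_response returns the raw HTTP response, so the headers are available
response = client.chat.completions.with_raw_response.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "Say hi!",
        }
    ],
)
print(response.headers.get("x-ratelimit-reset-requests"))

# parse() converts the raw response into the usual completion object
completion = response.parse()
print(completion.model_dump_json(indent=2))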
In this example, the response.headers.get("x-ratelimit-reset-requests") line is used to access the x-ratelimit-reset-requests field from the headers of the HTTP response. This field indicates the time until the rate limit (based on requests) resets to its initial state.
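If you want to act on the reset value (for example, to sleep until the limit resets), it helps to turn it into a number. The following is a minimal sketch that assumes the header value is a duration string such as "1s", "6m0s", or "20ms"; that format is an assumption based on commonly seen values, and reset_to_seconds is a hypothetical helper name, not part of the OpenAI library.

```python
import re

# Assumed units for the reset duration string (e.g. "1s", "6m0s", "20ms").
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}

# "ms" is listed before "m" so that "20ms" is not read as 20 minutes.
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def reset_to_seconds(value: str) -> float:
    """Convert a duration string such as "6m0s" into seconds (hypothetical helper)."""
    parts = _PART.findall(value)
    # Reject anything that is not made up entirely of number+unit pairs.
    if not parts or "".join(number + unit for number, unit in parts) != value:
        raise ValueError(f"unrecognized reset duration: {value!r}")
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)
```

You could then call reset_to_seconds(response.headers.get("x-ratelimit-reset-requests")) and pass the result to time.sleep, after checking that the header is actually present.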
You can replace "x-ratelimit-reset-requests" with any of the following to get the corresponding rate limit information:
"x-ratelimit-limit-requests": The maximum number of requests that are permitted before exhausting the rate limit.
"x-ratelimit-limit-tokens": The maximum number of tokens that are permitted before exhausting the rate limit.
"x-ratelimit-remaining-requests": The remaining number of requests that are permitted before exhausting the rate limit.
"x-ratelimit-remaining-tokens": The remaining number of tokens that are permitted before exhausting the rate limit.
"x-ratelimit-reset-tokens": The time until the rate limit (based on tokens) resets to its initial state.
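To read all of these fields at once instead of one at a time, you could collect them into a dictionary. This is only a sketch: rate_limit_info is a hypothetical helper name, and it assumes the headers object supports .get() the way response.headers does in the example above.

```python
# The six rate limit header names discussed in this thread.
RATE_LIMIT_HEADERS = (
    "x-ratelimit-limit-requests",
    "x-ratelimit-limit-tokens",
    "x-ratelimit-remaining-requests",
    "x-ratelimit-remaining-tokens",
    "x-ratelimit-reset-requests",
    "x-ratelimit-reset-tokens",
)


def rate_limit_info(headers):
    """Return a dict of the rate limit headers; missing ones map to None."""
    return {name: headers.get(name) for name in RATE_LIMIT_HEADERS}
```

For example, rate_limit_info(response.headers) would give you every field in one place. Note that a plain dict is case-sensitive, so if you pass your own mapping the keys need to be lowercase as listed.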
Please note that this example is for the OpenAI Python API library. If you are using a different client, the method to access the headers may be different.
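As one illustration of a different client, here is a rough sketch that uses only Python's standard library (urllib) to call the chat completions endpoint directly and read the same header. The function names build_request and fetch_reset_requests are hypothetical, the request body mirrors the example above, and you would need to supply your own API key; treat it as a starting point rather than a tested integration.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request. Hypothetical helper."""
    body = json.dumps(
        {"model": "gpt-4", "messages": [{"role": "user", "content": "Say hi!"}]}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_reset_requests(api_key: str):
    """Send the request and return the x-ratelimit-reset-requests header, if any."""
    with urllib.request.urlopen(build_request(api_key)) as resp:
        # resp.headers supports .get(), returning None when the header is absent.
        return resp.headers.get("x-ratelimit-reset-requests")
```

Other HTTP clients expose headers in a similar dictionary-like way, so the same header names apply even though the call to reach them differs.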