Context: My application has a time limit to respond, so I'm using stream=True to make sure it always returns something. In the first two days in production (03/22 and 03/23), some requests took longer than the time limit to return the first partial response (sometimes more than 30 seconds). That led me to believe I was exceeding the API rate limits and that my requests were being queued as a result. I've updated my code to handle this case, but now, even with more traffic, it no longer happens.
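To make the setup concrete, here is a minimal sketch of a first-chunk deadline, not my exact production code. It assumes the streaming call returns a plain iterator of partial responses. The names `stream_with_first_chunk_deadline`, `FirstChunkTimeout`, and `fake_stream` are hypothetical and only for illustration; `fake_stream` stands in for the real API call.

```python
import queue
import threading
import time


class FirstChunkTimeout(Exception):
    """Raised when no partial response arrives before the deadline (hypothetical name)."""


def stream_with_first_chunk_deadline(chunks, first_chunk_timeout):
    """Yield items from `chunks`, failing if the first one is too slow.

    `chunks` is assumed to be any iterator of partial responses.
    """
    q = queue.Queue()
    done = object()  # sentinel marking the end of the stream

    def pump():
        # Read the stream on a background thread so the caller can time out.
        try:
            for chunk in chunks:
                q.put(chunk)
        except Exception as exc:  # forward stream errors to the consumer
            q.put(exc)
        finally:
            q.put(done)

    threading.Thread(target=pump, daemon=True).start()

    try:
        # Only the first item is subject to the deadline.
        item = q.get(timeout=first_chunk_timeout)
    except queue.Empty:
        raise FirstChunkTimeout(
            f"no partial response within {first_chunk_timeout}s"
        ) from None

    while item is not done:
        if isinstance(item, Exception):
            raise item
        yield item
        item = q.get()


def fake_stream(delay, parts):
    """Stand-in for a streaming API call: waits `delay` seconds, then yields parts."""
    time.sleep(delay)
    for part in parts:
        yield part


if __name__ == "__main__":
    try:
        for part in stream_with_first_chunk_deadline(fake_stream(0.5, ["Hi"]), 0.1):
            print(part)
    except FirstChunkTimeout as err:
        print("fallback response:", err)
```

Note that the background thread keeps reading after a timeout; a real implementation would probably also close the underlying connection.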
I would like to know whether each partial response with stream=True counts as a request. If so, is the request limit the same for stream=False and stream=True? (A single stream=True call can easily produce hundreds of partial responses.)