#Incredibly slow response times

frozen kestrel

I'm getting response times of around 15 seconds (give or take about 5 seconds) from the chat completion endpoint with GPT-4. It's pretty consistently in that range for requests of 200-600 total tokens.

Is this standard? Of course there's some heavy-duty work going on during the call, but 15 seconds is far too long for my application (and, I suspect, for most applications).

slate cobalt

Are you seeing the 15-second response time for a completed chat response, or for a stream of tokens?

frozen kestrel

This is for a completed chat, though a streamed response takes about the same amount of time to finish.
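To make the completed-vs-streamed comparison concrete, here is a minimal sketch of how the two numbers could be measured separately: time to the first chunk and time to the last chunk. The names `time_stream` and `fake_stream` are hypothetical and not part of any client library. `fake_stream` is only a stand-in for a streamed response; in real use you would pass whatever iterator of text chunks your client returns.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def time_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float, str]:
    """Return (seconds to first chunk, seconds to last chunk, full text)."""
    start = clock()
    first = None
    parts = []
    for chunk in chunks:
        if first is None:
            # Latency until the first piece of output arrives.
            first = clock() - start
        parts.append(chunk)
    total = clock() - start
    if first is None:
        # Empty stream: nothing arrived, so both timings are the same.
        first = total
    return first, total, "".join(parts)


def fake_stream(n_chunks: int = 5, delay: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streamed chat completion."""
    for i in range(n_chunks):
        time.sleep(delay)  # simulated per-chunk generation time
        yield f"tok{i} "


if __name__ == "__main__":
    first, total, text = time_stream(fake_stream())
    print(f"first chunk after {first:.3f}s, complete after {total:.3f}s")
```

If the total time is the same either way, streaming will not finish sooner, but the first-chunk figure shows how early something could be displayed to the user.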

proud lily
#

I see, ahahaha. Maybe most developers are shifting their products from 3.5 turbo to 4 🤣 3.5 turbo is faster now than in my previous runs, and 4 is overloaded, I guess.