#Incredibly slow api calls for gpt-4, gpt-3.5. Hoping to contribute to solving this!

22 messages · Page 1 of 1 (latest)

daring spear
#

hey there! seems this has been posted a few times, but have yet to see a solution set that applies for me. i’m building an app (a discord bot) that is cloud hosted with a powerful vm, and runs GPT-4 (8k). i am not a free user. api calls tend to take several minutes to fully complete— and occasionally i’ll get the error message attached. curious if other folks have found solution sets here, or if it truly rests entirely with openai as they scale to meet the massive demand. thank you all! would love to contribute to finding solves here, or being pointed toward solutions that i may have missed in my search.

dusky storm
#

Can you share some of your request IDs that were failing?

daring spear
#

thank you for the response! this one is from 10s ago @dusky storm

dusky storm
daring spear
#

@dusky storm can i dm or call you?

stiff cosmos
daring spear
#

nothing yet, unfortunately

strange flame
stiff cosmos
strange flame
stiff cosmos
storm sparrow
#

it seems to be random to me, if I'm calling the api at 3am or 3pm I personally don't notice a difference in how many times it tells me there's an error. pretty much only happens when I'm throwing a lot of text at it

spice wadi
spice wadi
#

in python

daring spear
#

it’s just stream=true,

#

but the latency delta may be negligible (or impactful!) depending on your application; may also be helpful as a preventative measure for timeouts. unsure of the theoretical impact (if any) of default use across devs being streaming, though

daring spear
stiff cosmos
#

Thanks for this piece of useful information. The stream method is not useful in my opinion, except the application emulates the typing style like CHATGPT did🙂. Really thankful for the sharing.

#

I am not sure the inconsistent time taken for completions. But from testing time ,I guess there is peak time where most users used them. The model became very slow in response due to massive requests. It's very painful for the bad experience, especially customers based applications.