#GPT 3.5 has become 10x slower than 2 weeks ago
This request took 1 minute:
{"model":"gpt-3.5-turbo","messages":[…],"temperature":0.5,"max_tokens":3000}
completion_tokens: 201,
total_tokens: 349
Is 1 minute OK for these numbers?
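For a rough sense of scale, the numbers above can be turned into a generation rate. This is only a back-of-the-envelope sketch: it assumes "1 minute" means exactly 60 seconds of wall-clock time, and the helper name `tokens_per_second` is made up for illustration.

```python
def tokens_per_second(completion_tokens: int, elapsed_seconds: float) -> float:
    """Generated tokens divided by wall-clock time for the whole request."""
    return completion_tokens / elapsed_seconds


# Values reported in the post above.
completion_tokens = 201
total_tokens = 349
elapsed_seconds = 60.0  # assumption: "1 minute" treated as exactly 60 s

# The prompt size is whatever part of the total was not generated.
prompt_tokens = total_tokens - completion_tokens
rate = tokens_per_second(completion_tokens, elapsed_seconds)

print(f"prompt_tokens={prompt_tokens}, rate={rate:.2f} tokens/s")
# prints: prompt_tokens=148, rate=3.35 tokens/s
```

The elapsed time here includes network and queueing delay, so this is end-to-end throughput rather than pure model speed.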
Try lowering max_tokens to 1000.
Lowering the tokens may work, but what is the point if you can’t use the API as intended? These lag issues make it difficult to actually build products on OpenAI. I’ve also been testing over the last 4 months and the same issues keep popping up, mainly the endpoint for gpt3 turbo being overloaded, where the error response is to try again later or contact OpenAI. More hardware and resources to process more requests is the only real solution to this ongoing problem imo. Anyone else have an opinion?
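Until capacity improves, a common client-side workaround for "overloaded, try again later" responses is to retry with exponential backoff. The sketch below is not tied to any particular client library: `OverloadedError`, `call_with_backoff`, and `send_request` are hypothetical names, and you would map your client's real overload or rate-limit exception onto `OverloadedError`.

```python
import random
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for an 'overloaded, try again later' API error."""


def call_with_backoff(send_request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send_request(), retrying on OverloadedError with growing delays.

    send_request is any zero-argument callable that performs the API call.
    The sleep function is injectable so the helper can be tested without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return send_request()
        except OverloadedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Delay doubles each attempt (about 1 s, 2 s, 4 s, ...) plus up to 10% jitter.
            delay = base_delay * (2 ** attempt) * (1 + random.random() * 0.1)
            sleep(delay)
```

Usage would look like `call_with_backoff(lambda: my_client_call(payload))`, where `my_client_call` is whatever function you already use to send the request. This hides transient overload errors but does not make a slow response any faster.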
@mild pulsar I'm having a possibly related problem.
The backend is using gpt-3.5-turbo-0301 instead, which is 10x more expensive, and maybe also slower.
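One way to see which model actually served a request is to compare the `model` field of the response body with the name that was requested. A minimal sketch, assuming the response is JSON with a top-level `model` field; the sample body below is made up from values mentioned in this thread, not a captured response, and `served_model` is a hypothetical helper.

```python
import json


def served_model(response_body: str) -> str:
    """Return the model name reported in a JSON response body."""
    return json.loads(response_body)["model"]


requested = "gpt-3.5-turbo"

# Hypothetical response body, assembled from values quoted in this thread.
sample_body = (
    '{"model": "gpt-3.5-turbo-0301", '
    '"usage": {"completion_tokens": 201, "total_tokens": 349}}'
)

actual = served_model(sample_body)
if actual != requested:
    print(f"requested {requested!r} but response reports {actual!r}")
```

A mismatch on its own may just mean the requested name is an alias for a dated snapshot, so it is worth checking pricing for the reported name before concluding that it costs more.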
I was forced to switch to davinci-003, because the app is unusable at this speed. davinci renders it in 15 sec.
Yeah, I agree they need more scalability. Way more.