#How can I reduce the time of getting an answer from the API to 1-2 seconds?
Response time depends mostly on the current load on the server. About the only thing you can do on your side is shorten your prompt.
I've read that lowering max tokens can help too, since a shorter reply takes less time to generate. I agree, the API is slow.
Streaming the answer helps it feel faster.
Since yours is a chat, I'd suggest going with streaming like webhead suggested. It isn't actually faster, but it feels faster because the user sees the first words right away, the same thing you see in the ChatGPT UI. If that doesn't work for you, see if this message helps in your use case: #1143773908139515965
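To make the streaming point concrete, here is a rough sketch of why it feels faster: the total time stays about the same, but the wait until the first visible text drops to a fraction of it. Nothing here calls a real API. `fake_model_stream` and `consume_stream` are hypothetical names made up for the illustration, and the per-word delay is an invented number. With the OpenAI Python client you would request a stream with `stream=True` and feed the text of each chunk into the same kind of loop.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def fake_model_stream(text: str, delay_s: float = 0.05) -> Iterator[str]:
    """Hypothetical stand-in for a streaming API response: yields one word at a time."""
    for word in text.split():
        time.sleep(delay_s)  # simulated generation time per chunk (made-up value)
        yield word + " "


def consume_stream(
    chunks: Iterable[str], on_chunk: Callable[[str], None]
) -> Tuple[str, float, float]:
    """Forward each chunk as soon as it arrives.

    Returns (full_text, seconds_until_first_chunk, seconds_until_done).
    """
    start = time.perf_counter()
    first_chunk_s = None
    parts = []
    for chunk in chunks:
        if first_chunk_s is None:
            first_chunk_s = time.perf_counter() - start
        on_chunk(chunk)  # e.g. append to the chat window
        parts.append(chunk)
    total_s = time.perf_counter() - start
    if first_chunk_s is None:  # empty stream: nothing was ever shown
        first_chunk_s = total_s
    return "".join(parts), first_chunk_s, total_s


if __name__ == "__main__":
    reply = "Streaming shows the first words long before the reply is complete."
    text, first, total = consume_stream(
        fake_model_stream(reply),
        lambda c: print(c, end="", flush=True),
    )
    print(f"\nfirst chunk after {first:.2f}s, full reply after {total:.2f}s")
```

Without streaming, the user stares at a spinner for the whole "full reply" time. With streaming, they start reading after the "first chunk" time, which is why it feels closer to the 1-2 seconds asked about even though the reply takes just as long to finish.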