#Performance when stream=false via the chatCompletion API.
Why not just set stream=true?
When the response starts (using stream=true, as you see on chat.openai.com), the model has no idea how the sentence will end while it is outputting the first words. It's literally generating its response on the fly. So you either wait for the complete response or stream it, and streaming is best suited for user interactions 🙂
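To make the trade-off concrete, here is a rough simulation rather than a real API call. `fake_model`, `complete`, and the 10 ms per-token delay are all made up for illustration. The only point is that total generation time is about the same either way, while the wait before the user sees anything is much shorter with streaming.

```python
import time
from typing import Iterator, List, Tuple


def fake_model(tokens: List[str], delay: float) -> Iterator[str]:
    """Hypothetical stand-in for a model: produces one token at a time."""
    for tok in tokens:
        time.sleep(delay)  # pretend each token takes `delay` seconds to generate
        yield tok


def complete(tokens: List[str], stream: bool, delay: float = 0.01) -> Tuple[float, float, str]:
    """Return (seconds until the caller sees any output, total seconds, full text)."""
    start = time.perf_counter()
    first_output = None
    parts = []
    for tok in fake_model(tokens, delay):
        parts.append(tok)
        if stream and first_output is None:
            # With streaming, the first token is visible as soon as it exists.
            first_output = time.perf_counter() - start
    total = time.perf_counter() - start
    if not stream:
        # Without streaming, nothing is visible until the whole response is done.
        first_output = total
    return first_output, total, "".join(parts)


TOKENS = ["Streaming ", "shows ", "text ", "as ", "it ", "is ", "generated."]

stream_first, stream_total, stream_text = complete(TOKENS, stream=True)
block_first, block_total, block_text = complete(TOKENS, stream=False)

print(f"stream=true : first output after {stream_first:.3f}s, done after {stream_total:.3f}s")
print(f"stream=false: first output after {block_first:.3f}s, done after {block_total:.3f}s")
```

Both runs produce the same text in roughly the same total time. With stream=false the first-output time simply equals the total time, which is why it feels slower in a chat UI.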