#Performance when stream=false via the chatCompletion API.

3 messages · Page 1 of 1 (latest)

dawn tinsel
#

Is there a way to expedite the response time for chatCompletion calls when stream is set to false? It appears that the model needs to generate all the text before returning to the user. However, does it already have all the text, or will it only produce the subsequent words?

charred gust
#

why not just set the stream to true?

tribal igloo
#

When the response starts (using stream=true, as you experience on chat.openai.com), the model, when outputting the first words, have no idea how the sentence ends. It's literally generating it's response on the fly. So you either wait for a complete response or stream, which is best suited for user interactions 🙂