#Continue generating after hitting max output length.
1 messages · Page 1 of 1 (latest)
Example:
the message is merged into the previous one and allows to continue generating even after cutoff.
This is already implemented. The API will send a finish_reason field in the chat completion response which tells us if it got to this point. Except with LibreChat, you can also hit "continue generating" at any time but it only seems to work well if the previous message was cut off
I'm re-reading your message to make sure I understand
in OpenAI interface that once it hits the token output limit, it can keep on generating to continue the text since in most cases the context length is longer than the output length limit
do you mean it will continue generating by itself?
Yes I don’t know I think the model hit the output limit of 4K tokens for 1 message and stopped abruptly, there was no continue generating option.
It was likely a failure of the API to give that finish reason field but I can also double check. In short this functionality is already implemented and expected behavior
Ok thanks. I will check if I have the logs when I get home.
I went ahead and checked, there seems to be no finish_reason.
which endpoint are you using?
looks like Bing..
Yes, it is Bing.
Bing doesn't return a finish_reason like OpenAI. It's an unofficial API