I'm using the chat/completion API with gpt-3.5-turbo. This appear to work, but when I send a relatively small prompt, I get socket hangup on request unless I specify in my prompt to keep the result under 300 characters.
The total tokens being used is not high. When I do receive a response back after specifying a limit, I see 70 tokens used.
What could cause a socket hangup with no further error?