#API Response Time

3 messages · Page 1 of 1 (latest)

twilit ravine
#

Hello guys, I'm currently using the api to request the mistral-small model. But I'm seeing a fairly long response time (around 30 seconds), knowing that the expected response is in the form of a JSON and contains around 1500 characters.

First of all, is it normal to have such a long response time? And if not, do you know of anything that could help me reduce this time? If need be, I'd change the way I request the api by splitting it into several requests, but I'd like, as far as possible, not to have to do this.

Thanks for your help !

near canyon
#

you can try streaming and see if you get the next tokens in a reasonable time

twilit ravine