#slow Response
Yes, same here. Responses are pretty slow with all models but worst with OpenAI models
I used to get responses from PaLM in about 3 seconds, but now it's crazy: around 20 seconds, and OpenAI models take 30 seconds or more
Yeah, it must be an issue on OpenRouter’s end. Calls to OpenAI API directly are fine and quick.
Hopefully this is fixed soon!
Just look at https://movieuncover.com , it's so slow now. It used to load in 10 seconds at most. I'm frustrated
Same here
Can you try again? We just made a change that should have sped things up a lot
Just tried turbo 3.5 1106. 28 seconds slower than OpenAI for me
PaLM 2 responds in 11 seconds. It used to be 3-5 seconds from you guys.
@dapper oyster I've been noticing the same. Extremely long waiting times to start getting a response, more than 10 seconds. Happens with all the models I use, including Mistral 7b OpenHermes
Can we know the cause?
I hope this is fixed!
Same, I'm at around 20-30s with MythoMax, using a simple hello world message without any context. Sometimes I get a reply in 10-15s, but very rarely.
just waiting it out is the best
or just pushing through with a 10 minute conversation with 3 messages
Working on adding new providers so we have more options for the same models. Right now, we can’t really control the “Host” times on the activity page except by deploying the model ourselves, and when we do, it faces its own load issues
Will be scaling up our own deployments to help
Getting pretty slow responses with MythoMax 13B.
Mistral 7B
Meta Llama-2 13B
GPT 3.5 16k
In all of them my query was "Hello there"
From the OpenRouter activity page, the speed seems fine though
My internet is fast, and I'm using the openai npm package to make requests with streaming on.
Here is the difference when using a direct call to OpenAI with the same parameters:
Please let me know if you need any more info that might be useful to debug this, and thanks for the support.
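For anyone who wants to report comparable numbers, here is a minimal sketch of how the time to first chunk and the total time of a streamed response could be measured. `StreamTimer` and `timeStream` are hypothetical helper names, not part of the openai package, and the sketch only assumes that the stream is an async iterable.

```typescript
// Hypothetical helpers for timing a streamed response.
type Clock = () => number;

interface StreamTiming {
  firstChunkMs: number | null; // request start to first chunk, null if no chunk arrived
  totalMs: number;             // request start to end of stream
  chunks: number;              // number of chunks received
}

class StreamTimer {
  private readonly now: Clock;
  private readonly start: number;
  private firstChunkAt: number | null = null;
  private chunks = 0;

  // The clock is injectable so the timer can be tested without real delays.
  constructor(now: Clock = Date.now) {
    this.now = now;
    this.start = now();
  }

  // Call once for every chunk received.
  onChunk(): void {
    if (this.firstChunkAt === null) this.firstChunkAt = this.now();
    this.chunks += 1;
  }

  // Call when the stream has ended.
  finish(): StreamTiming {
    const end = this.now();
    return {
      firstChunkMs: this.firstChunkAt === null ? null : this.firstChunkAt - this.start,
      totalMs: end - this.start,
      chunks: this.chunks,
    };
  }
}

// Times any async-iterable stream. The timer starts before the request is
// sent, so connection and queueing time are included in both numbers.
async function timeStream<T>(
  startRequest: () => Promise<AsyncIterable<T>>,
  now: Clock = Date.now,
): Promise<StreamTiming> {
  const timer = new StreamTimer(now);
  const stream = await startRequest();
  for await (const _chunk of stream) timer.onChunk();
  return timer.finish();
}
```

Assuming v4 of the openai package, where a `chat.completions.create` call with `stream: true` resolves to an async iterable, that call could be passed as `startRequest`, once with the OpenRouter base URL and once with the OpenAI one, to compare the two with the same parameters.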
This is wild. We're working on it - launching a few experiments today. Part of the issue is that the main Mistral and Llama providers are going down more than normal. OpenAI is also showing massive latency increases, but it's very sporadic. And if OpenAI returns a 502, we redirect to Azure, which seems to be happening more now
And the "Host" column in Activity currently shows only the latency for the one successful host. Other latency metrics are on the way
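A rough sketch of why the "Host" number can look fine while the request still feels slow: if a failed attempt comes before the fallback, a metric that only covers the successful host leaves out the time spent on the failure. This is an illustration under stated assumptions, not OpenRouter's actual routing code. The `Attempt` shape, the `summarizeAttempts` name, the host names, and the latencies are all made up.

```typescript
// Hypothetical record of one attempt against one provider.
interface Attempt {
  host: string;      // provider that was tried (illustrative names)
  latencyMs: number; // how long this attempt took
  ok: boolean;       // false for a failure such as a 502
}

interface LatencySummary {
  successfulHost: string | null;
  hostLatencyMs: number | null; // what a "successful host only" metric shows
  totalLatencyMs: number;       // what the caller actually waited
  failedAttempts: number;
}

// Assumes attempts run one after another and stop at the first success.
function summarizeAttempts(attempts: Attempt[]): LatencySummary {
  let total = 0;
  let failed = 0;
  for (const attempt of attempts) {
    total += attempt.latencyMs;
    if (attempt.ok) {
      return {
        successfulHost: attempt.host,
        hostLatencyMs: attempt.latencyMs,
        totalLatencyMs: total,
        failedAttempts: failed,
      };
    }
    failed += 1;
  }
  // Every attempt failed.
  return { successfulHost: null, hostLatencyMs: null, totalLatencyMs: total, failedAttempts: failed };
}

// Illustrative numbers only, not measurements.
const summary = summarizeAttempts([
  { host: "openai", latencyMs: 20000, ok: false }, // slow failure, e.g. a 502
  { host: "azure", latencyMs: 2000, ok: true },    // fast fallback
]);
console.log(summary);
```

With these made-up numbers the successful host took 2 seconds, but the caller waited 22 seconds in total.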
Thanks for the quick reply and for looking into this @dapper oyster !
I just want to point out that this issue is not limited to these models.
I just tried the same query in the following models and here are the total times it took for each:
Claude-1:
Waiting for server response: 15s
total: 36s
Claude-2:
Waiting for server response: 50.61s
total: 51s
Zephyr:
Waiting for server response: 26s
total: 54s
PaLM-2:
Waiting for server response: 50s
total: 50s
I can't concretely deduce anything but it seems to be an issue from OpenRouter's server itself. Is it possible to check the CPU and RAM usage/limits to verify?
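To make the numbers above easier to compare, here is a small sketch that works out what share of each total was spent waiting for the server. It assumes "Waiting for server response" means the time before the first byte arrives, as in browser dev tools. The `Timing` shape and the `waitingShare` name are just for illustration.

```typescript
interface Timing {
  model: string;
  waitingS: number; // "Waiting for server response", in seconds
  totalS: number;   // total request time, in seconds
}

// Values copied from the measurements listed above.
const timings: Timing[] = [
  { model: "Claude-1", waitingS: 15, totalS: 36 },
  { model: "Claude-2", waitingS: 50.61, totalS: 51 },
  { model: "Zephyr", waitingS: 26, totalS: 54 },
  { model: "PaLM-2", waitingS: 50, totalS: 50 },
];

// Percentage of the total time spent waiting, rounded to a whole number.
function waitingShare(timing: Timing): number {
  return Math.round((timing.waitingS / timing.totalS) * 100);
}

for (const timing of timings) {
  console.log(`${timing.model}: ${waitingShare(timing)}% of the total was spent waiting`);
}
```

For Claude-2 and PaLM-2 almost the entire request is spent waiting, while Claude-1 and Zephyr spend a bit under half of it waiting and the rest receiving the response.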
Yeah we are definitely doing something wrong - possibly when autoscaling. We're switching to new infra to try to fix this
Cross reference with host time from activity if you need it