slow Response | OpenRouter | Page 1

minor palm Nov 17, 2023, 5:22 PM

#

I’ve been using PaLM 2 Code Chat and used to get responses within 5 to 7 seconds. But for a while now it’s been 20 to 30 seconds. I tried other models too and they’re just as slow, so I guess the problem is on OpenRouter’s side.

frank scaffold Nov 17, 2023, 5:42 PM

#

Yes, same here. Responses are pretty slow with all models but worst with OpenAI models

minor palm Nov 17, 2023, 6:01 PM

#

frank scaffold Yes, same here. Responses are pretty slow with all models but worst with OpenAI ...

I used to get responses from PaLM in as little as 3 seconds, but now it’s crazy: around 20 seconds, and OpenAI models take 30 seconds or more.

frank scaffold Nov 17, 2023, 6:02 PM

#

Yeah, it must be an issue on OpenRouter’s end. Calls to OpenAI API directly are fine and quick.
Hopefully this is fixed soon!

minor palm Nov 17, 2023, 6:04 PM

#

frank scaffold Yeah, it must be an issue on OpenRouter’s end. Calls to OpenAI API directly are ...

Just see https://movieuncover.com, it’s so slow now. It used to load in 10 seconds at most. I’m frustrated.

MovieUncover - Describe, Search, and Watch

Get movie recommendations by simply describing the movie you're thinking of, and the platform will provide you with movies that match your description.

exotic ingot Nov 17, 2023, 6:23 PM

#

Same here

dapper oyster Nov 19, 2023, 8:04 AM

#

Can you try again? We just made a change that should have sped things up a lot

minor palm Nov 19, 2023, 8:53 AM

#

dapper oyster Can you try again? We just made a change that should have sped things up a lot

Just tried turbo 3.5 1106. 28 seconds slower than OpenAI for me

minor palm Nov 19, 2023, 8:58 AM

#

dapper oyster Can you try again? We just made a change that should have sped things up a lot

PaLM 2 responds in 11 seconds. It used to be 3 to 5 seconds from you guys.

orchid junco Nov 19, 2023, 2:47 PM

#

@dapper oyster I've been noticing the same. Extremely long waiting times to start getting a response, more than 10 seconds. Happens with all the models I use, including Mistral 7b OpenHermes

#

Can we know the cause?

frank scaffold Nov 20, 2023, 8:57 AM

#

I hope this is fixed!

frank tendon Nov 20, 2023, 3:44 PM

#

minor palm Just tried turbo 3.5 1106. 28 seconds slower than OpenAI for me

Same, I'm at around 20-30s with MythoMax, even with a simple hello world message without any context. Sometimes I get a reply in 10-15s, but very rarely.

glad vapor Nov 20, 2023, 3:48 PM

#

just waiting it out is the best

#

or just pushing through with a 10 minute conversation with 3 messages

dapper oyster Nov 20, 2023, 4:32 PM

#

Working on adding new providers so we have more options for the same models. Right now, we can’t really control the “Host” times on the activity page except by deploying the model ourselves, and when we do, it faces its own load issues

#

Will be scaling up our own deployments to help

frank scaffold Nov 20, 2023, 4:57 PM

#

Getting pretty slow responses with MythoMax 13B.

formal saddle Nov 20, 2023, 7:55 PM

#

Mistral 7B

#

Meta Llama-2 13B

#

GPT 3.5 16k

#

In all of them my query was "Hello there"

#

From the OpenRouter activity page, the speed seems fine though

#

My internet is fast, and I'm using the openai npm package to make requests with streaming on.

Here is the difference when using a direct call to OpenAI with the same parameters:

#

Please let me know if you need any more info that might be useful to debug this, and thanks for the support.
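
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of timing a streamed reply. It is written in Python rather than with the openai npm package used above, and `time_stream` and `fake_stream` are hypothetical helpers, not part of any library. The fake stream stands in for a real streamed completion, so the numbers it prints say nothing about OpenRouter or OpenAI.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def time_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float, str]:
    """Return (seconds_to_first_chunk, total_seconds, full_text) for a chunk stream."""
    start = clock()
    first = None
    parts = []
    for chunk in chunks:
        if first is None:
            # Time spent waiting before anything arrived.
            first = clock() - start
        parts.append(chunk)
    total = clock() - start
    # An empty stream has no first chunk; report the total wait instead.
    return (total if first is None else first, total, "".join(parts))


def fake_stream(delay_s: float = 0.05) -> Iterator[str]:
    """Hypothetical stand-in for a streamed completion: waits, then yields three chunks."""
    time.sleep(delay_s)
    yield "Hello"
    yield " "
    yield "there"


ttfb, total, text = time_stream(fake_stream())
print(f"first chunk after {ttfb:.2f}s, total {total:.2f}s, text={text!r}")
```

Pointing the same helper at the chunk iterator of a real streaming client, once through OpenRouter and once directly against the provider, would give the two numbers being compared in this thread.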

dapper oyster Nov 20, 2023, 8:27 PM

#

formal saddle Mistral 7B

This is wild. We're working on it and launching a few experiments today. Part of the issue is that the main Mistral and Llama providers are going down more than normal. OpenAI is also showing massive latency increases, but it's very sporadic. And if OpenAI returns a 502, we redirect to Azure, which seems to be happening more now.

#

And the "Host" column in Activity currently shows only the latency for the one successful host. Other latency metrics are on the way.
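
To illustrate why a 502 followed by a redirect can add latency that a single "Host" figure hides, here is a rough sketch of a fallback loop that records the time spent on every attempt. This is not OpenRouter's implementation: `UpstreamError`, `complete_with_fallback`, and the two fake providers are all hypothetical names made up for the example.

```python
import time
from typing import Callable, List, Tuple


class UpstreamError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed upstream call."""

    def __init__(self, status: int):
        super().__init__(f"upstream returned HTTP {status}")
        self.status = status


Provider = Tuple[str, Callable[[str], str]]


def complete_with_fallback(
    prompt: str, providers: List[Provider]
) -> Tuple[str, List[Tuple[str, float]]]:
    """Try providers in order, moving on only when one fails with a 502.

    Returns the reply plus (provider_name, seconds) for every attempt,
    including the failed ones.
    """
    attempts: List[Tuple[str, float]] = []
    for name, call in providers:
        start = time.perf_counter()
        try:
            reply = call(prompt)
        except UpstreamError as err:
            attempts.append((name, time.perf_counter() - start))
            if err.status != 502:
                raise  # only a 502 triggers the fallback in this sketch
            continue
        attempts.append((name, time.perf_counter() - start))
        return reply, attempts
    raise RuntimeError("all providers returned 502")


def flaky_primary(prompt: str) -> str:
    """Fake provider that always fails with a 502."""
    raise UpstreamError(502)


def healthy_secondary(prompt: str) -> str:
    """Fake provider that always succeeds."""
    return "echo: " + prompt


reply, attempts = complete_with_fallback(
    "Hello there",
    [("primary", flaky_primary), ("secondary", healthy_secondary)],
)
print(reply)
for name, seconds in attempts:
    print(f"{name}: {seconds:.4f}s")
```

The user-visible wait is the sum over all attempts, while a metric that reports only the successful host would show just the last entry.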

formal saddle Nov 20, 2023, 9:37 PM

#

dapper oyster This is wild. We're working on it - launching a few experiments today. Part of t...

Thanks for the quick reply and for looking into this @dapper oyster !

I just want to point out that this issue is not limited to these models.

I just tried the same query in the following models and here are the total times it took for each:

Claude-1:
Waiting for server response: 15s
total: 36s

Claude-2:
Waiting for server response: 50.61s
total: 51s

Zephyr:
Waiting for server response: 26s
total: 54s

PaLM-2:
Waiting for server response: 50s
total: 50s

I can't concretely deduce anything but it seems to be an issue from OpenRouter's server itself. Is it possible to check the CPU and RAM usage/limits to verify?
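
The split between waiting and streaming in the numbers above can be made explicit with a few lines of Python. The figures are copied from the message; `wait_share` is a hypothetical helper name. The sketch only shows how much of each request was spent before the first byte arrived. It cannot tell whether that wait happened on OpenRouter's servers or at the upstream host.

```python
# (model, seconds waiting for server response, total seconds) as reported above
measurements = [
    ("Claude-1", 15.0, 36.0),
    ("Claude-2", 50.61, 51.0),
    ("Zephyr", 26.0, 54.0),
    ("PaLM-2", 50.0, 50.0),
]


def wait_share(wait_s: float, total_s: float) -> float:
    """Fraction of the total request time spent waiting for the first byte."""
    return wait_s / total_s


for model, wait_s, total_s in measurements:
    streaming_s = total_s - wait_s  # time left over for receiving the reply
    print(
        f"{model}: waited {wait_s:.2f}s of {total_s:.2f}s "
        f"({wait_share(wait_s, total_s):.0%}), streaming took {streaming_s:.2f}s"
    )
```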

dapper oyster Nov 20, 2023, 9:47 PM

#

Yeah we are definitely doing something wrong - possibly when autoscaling. We're switching to new infra to try to fix this

formal saddle Nov 20, 2023, 9:49 PM

#

Cross reference with host time from activity if you need it
