Wrong tokens count // when Transformed by OpenRouter? | OpenRouter | Page 1

shrewd kite Sep 20, 2023, 8:11 AM

#

Hi,

I've been using OpenRouter for 1-2 months and never had issues.

I hadn't used it for a few days, but now I see a persistent issue, with errors such as:
This model's maximum context length is 8000 tokens. However, you requested 11147 tokens (11147 in the messages, 0 in the completion). Please reduce the length of either one, or use the "middle-out" transform to compress your prompt automatically.

Do you apply transforms by default?

Because I can't find even one request where the displayed input/output token size is correct: mostly they are higher, even for the output.

I ask because, for example, this 11147-token prompt (lol) is in fact at MOST 6k tokens (and supposedly it was <4k).
We have implemented our own token-count handlers, which keep things very safe and fall back to bigger models as soon as the message + 30% reaches 95% of the model's token limit... 😦
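
For readers following along, the fallback rule described above (message size plus 30%, compared against 95% of the model's limit) could look roughly like the sketch below. The model names and limits are hypothetical placeholders, and the real handlers in this setup may differ.

```python
from typing import Optional

# Hypothetical context limits (in tokens); real values depend on the provider.
MODEL_LIMITS = {
    "small-model": 8_000,
    "medium-model": 16_000,
    "large-model": 100_000,
}

SAFETY_MARGIN = 1.30   # message size + 30%
LIMIT_FRACTION = 0.95  # fall back once 95% of the limit is reached


def pick_model(message_tokens: int) -> Optional[str]:
    """Return the smallest model whose limit the padded message stays under."""
    padded = message_tokens * SAFETY_MARGIN
    for name, limit in sorted(MODEL_LIMITS.items(), key=lambda kv: kv[1]):
        if padded < LIMIT_FRACTION * limit:
            return name
    return None  # nothing fits; the caller has to truncate or fail
```

Under these assumed limits, a locally counted 4k-token message stays on the 8k model, so an 8000-token overflow error only makes sense if the provider counts the same text differently.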

Thanks for your help

#

P.S. We're running the requests through a custom LangChain-based wrapper

P.P.S. Did you maybe update/change the instruct mode?

eternal pine Sep 20, 2023, 3:51 PM

#

We have changed the default instruct formatting for most of the open source models in the past few weeks, but you can opt out by formatting prompts yourself and passing prompt in your request body instead of messages
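
As an illustration of the opt-out described above, a request body carrying a pre-formatted prompt instead of messages might be built as in this sketch. Only the prompt/messages distinction comes from this thread; the helper name, the model id and the prompt text are placeholders.

```python
import json


def build_prompt_body(model: str, prompt: str, max_tokens: int = 256) -> str:
    """Serialize a request body that carries `prompt` and no `messages`."""
    body = {
        "model": model,            # placeholder model id
        "prompt": prompt,          # already formatted by the caller
        "max_tokens": max_tokens,
    }
    return json.dumps(body)


body = build_prompt_body(
    "some/open-source-model",
    "### Instruction:\nSay hello.\n\n### Response:\n",
)
```

Because the caller formats the string, the caller also controls exactly how many tokens the formatting adds.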

#

Which model are you using and what tokenizer are you using to count tokens? cc @rotund valve

slow orbit Sep 20, 2023, 4:20 PM

#

I'm also getting weird behaviour related to this, using Anthropic's Claude v2. Just by sending "hello" I get:

"error"=>{"message"=>"max_tokens limit exceeded: 99999 > 98016. To increase: visit https://openrouter.ai/account and add more credits", "code"=>402}

eternal pine Sep 20, 2023, 5:23 PM

#

@slow orbit weird, I'm not able to reproduce this, even on an account with 0 balance and without specifying max_tokens. Are you specifying max_tokens manually? What happens when you use openrouter.ai/playground?

shrewd kite Sep 21, 2023, 1:39 PM

#

Hi @eternal pine
Thanks for your answer

Do you have any guidelines for using prompt instead of messages with LangChain, please?

#

Token counting: the classic tokenizer in base 100
Models: all of them, but it seems to happen primarily with palm2

#

Btw, is it possible that your instruct format doubles the tokens count? 👀

#

Also, can't we use a param like "instruct: false" instead of switching from "messages" to "prompt"?

Because it seems to undermine using LangChain's ChatOpenAI wrapper seamlessly with OpenRouter, and forces us to write a custom wrapper in that case, no?

slow orbit Sep 21, 2023, 2:13 PM

#

it stopped happening, for the account causing troubles

shrewd kite Sep 21, 2023, 2:14 PM

#

slow orbit it stopped happening, for the account causing troubles

great, maybe due to the recent update 🙂
I haven't tried since then...

eternal pine Sep 21, 2023, 2:25 PM

#

shrewd kite Btw, is it possible that your instruct format doubles the tokens count? 👀

The instruct formatting does increase token counts for open source models, in order to format the messages array into a single prompt string.

I'm not sure how to get LangChain to submit a prompt instead of messages, but check their docs because it must be doable for them to support e.g. Davinci models

shrewd kite Sep 21, 2023, 2:26 PM

#

@eternal pine okay thanks
but is palm2 also considered open source?
and can this increase really reach twice the initial size or more?

eternal pine Sep 21, 2023, 2:27 PM

#

no, palm is closed and accepts a messages array natively

shrewd kite Sep 21, 2023, 2:27 PM

#

👀 wtf, because the issue happened with palm lol

#

Btw, if we send both prompt and messages, what happens?

eternal pine Sep 21, 2023, 2:27 PM

#

it depends on what the initial size is. Alpaca prompt transformation wraps all messages in ### Instruction: or ### Response: for example

eternal pine Sep 21, 2023, 2:28 PM

#

shrewd kite Btw, if we send both prompt and messages, what happens?

shouldn't be allowed

shrewd kite Sep 21, 2023, 2:28 PM

#

isn't it possible to allow it and make prompt override the messages? this would be a solution
even if the best option remains a simple param to disable instruct formatting

shrewd kite Sep 21, 2023, 2:29 PM

#

eternal pine it depends on what the initial size is. Alpaca prompt transformation wraps all m...

4-5k initial -> 11-12k after formatting?

eternal pine Sep 21, 2023, 2:29 PM

#

native tokens for Palm are characters
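
A quick sanity check of that explanation, using only the numbers quoted in this thread: if the reported 11147 is a character count, the implied characters per tokenizer token can be computed as in this sketch (the function name is illustrative).

```python
def implied_chars_per_token(billed_characters: int, tokenizer_tokens: int) -> float:
    """Characters per tokenizer token if the billed count is really characters."""
    return billed_characters / tokenizer_tokens


# Numbers from this thread: 11147 reported; under 4k to at most 6k counted locally.
low = implied_chars_per_token(11147, 6000)   # about 1.9
high = implied_chars_per_token(11147, 4000)  # about 2.8
```

A ratio of roughly 2 to 3 characters per token is plausible for dense or non-English text, so counting characters instead of tokens could account for the gap without any instruct formatting being involved.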
