16K max for GPT32K | OpenRouter | Page 1

hard thicket Aug 24, 2023, 5:08 AM

#

I can never seem to get more than 16k tokens in total out of the 32K model. If I supply 15K of input and ask for a comprehensive response I get a heavily concatenated response back (less than 1k).

sharp crater Aug 24, 2023, 6:09 AM

#

Just to confirm, this is with openai/gpt-4-32k right? - have you tried tweaking the max_tokens setting?

hard thicket Aug 24, 2023, 7:05 AM

#

Yes it's gpt-4-32k. The max setting only allows me up to 25k tokens. Anything higher and it gives an error that I am requesting more than 32k tokens. But even at 25k tokens it still only gives me maximum 16k.

#

Could it be as a result of the chat memory? The chat memory has a minimum of 2. And 2 x 16k = 32k. Does it keep the extra 16K for chat memory?

sharp crater Aug 24, 2023, 8:56 AM

#

Note that 32K is the total context size afaik, so your prompt will also be counted as well.

#

So I'm not sure how large your prompt was but if it gives you max 16K on completion, and your prompt was roughly 16K, I think that's the limit (?)...
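The budget described here is simple arithmetic: whatever the prompt uses is no longer available for the completion. A minimal sketch, assuming a 32,768-token window for gpt-4-32k; the function name is made up for illustration.

```python
CONTEXT_WINDOW = 32_768  # assumed total context size for gpt-4-32k, in tokens


def max_completion_tokens(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens are left for the reply once the prompt is counted."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


# A 15K-token prompt leaves roughly 17K tokens for the completion.
print(max_completion_tokens(15_000))  # 17768
```

By this count a 15K prompt should leave room for a reply well above 1K, so the window alone does not explain a 1K response.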

hard thicket Aug 24, 2023, 9:05 AM

#

It doesn't give me 16K on completion. I would have loved it if it did. Look at the image I posted.

#

It gave me a 1K response after a 15k request. 15+1 = 16. I need a 15k response.

#

brave trail Aug 24, 2023, 9:39 AM

#

imho, it depends on the instruction and hyperparameters. If you instruct "say hi", it will not generate 16k tokens of "hi".

hard thicket Aug 24, 2023, 9:46 AM

#

As I said, I asked it for a comprehensive reply. I supplied a 15k script and asked it to edit it. It supplied a highly concatenated response that's only 1k in size.

#

Can you confirm that you're able to get any response bigger than 16k? And give me the prompt so I can test it on my end?

#

Here's an example of a 16K prompt that gives a 1k (concatenated) response. Can you see if you get the same on your end?

📎 16K_Prompt.txt

#

hard thicket Aug 24, 2023, 11:15 AM

#

Maybe your GPT-4 32k is in fact GPT-3.5 Turbo 16k?

brave trail Aug 24, 2023, 12:33 PM

#

hard thicket Maybe your GPT-4 32k is in fact GPT-3.5 Turbo 16k?

turbo can't solve this problem:

hard thicket Aug 24, 2023, 12:36 PM

#

Did you try the sample prompt I gave? Do you get 32K tokens?

#

Is it something to do with my account?

brave trail Aug 24, 2023, 12:41 PM

#

Unlikely. Most probably it's a problem with the prompt itself, or with hyperparameters like temperature, penalties, top_p, etc. If finish_reason is "stop", that means the model decided to emit the "end of text" token and has nothing else to add.

Here's me testing on a big prompt:

$ head -n 8000 /usr/share/dict/words > prompt && bash prompt.sh
{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "anchoretism\nanchoretist\nanchorettish\nanchorhold\nanchorite\nanchoritic\nanchoritical\nanchoritically\nanchoritism\nanchorless\nanchorlike\nanchorman\nanchormen\nanchorness\nanchorwise\nanchovy\nanchusin\nanchusine\nancience\nanciency\nancient\nancientism\nanciently\nancientness\nancientry\nancienty\nancile\nancilla\nancillary\nancipital\nancipitous\nAncistrocladaceae\nancistrocladaceous\nAncistrocladus\nancon\nanconad\nanconagra\nanconal\nancone\nanconeal\nanconeous\nanconeus\nanconitis\nanconoid\nancony\nancora\nancoral\nancylose\nancylotomy\nancyroid\nAnd\nand\nAnda\nAndaman\nAndamanese\nAndamooka\nandante\nandantino\nandarum\nAndaste\nandean\nandeite\nAndeles\nAndersen\nAndersenian\nandesine\nandesinite\nandesitization\nandesitize\nAndi\nAndian\nAndira\nandirin\nandirine\nandiroba\nAndoke\nandorite\nAndorobo\nAndorra\nAndorran\nAndouillet\nAndouin\nandradite\nandragogy\nandrajo\nAndranatomy\nAndranip\nAndranis\nAndras\nAndre\nandreaea\nAndreaea\nAndreaeaceae\nandreaeaceous\nAndreaeales\nAndrean\nAndrei\nAndrene\nandrewsite\nAndria\nAndriana\nAndrias\nandric\nAndrieu\nAndries\nandring\nandrite\nandrodioecism\nandrodioecious\nandrodynamous\nandroecial\nandroecium\nAndroecium\nandrogen\nandrogenesis\nandrogenetic\nandrogenic\nandrogenous\nandroginous\nandrogonia\nandrogonial\n"
      },
      "finish_reason": "length"
    }
  ],
  "model": "gpt-4-32k-0613",
  "usage": {
    "prompt_tokens": 32319,
    "completion_tokens": 449,
    "total_tokens": 32768
  },
  "id": "gen-asY1YtIbdltgtwQvrEDD7OEK"
}

Script is attached for you to reproduce.

📎 message.txt

#

In this run, finish_reason is "length", so generation stopped because it reached the end of the context (32768 tokens) or the max response length.
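The same check can be done in code. A rough sketch, assuming the response has the shape shown in the run above (a "choices" list and a "usage" object); the helper name is hypothetical.

```python
import json


def explain_finish(response: dict) -> str:
    """Summarise why generation stopped, based on the first choice."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    reason = choice.get("finish_reason")
    used = usage.get("total_tokens")
    if reason == "length":
        return f"cut off by a length limit (total_tokens={used})"
    if reason == "stop":
        return f"model ended the reply on its own (total_tokens={used})"
    return f"finish_reason={reason!r} (total_tokens={used})"


# Trimmed copy of the response from the run above.
raw = """{"choices": [{"index": 0, "finish_reason": "length"}],
          "usage": {"prompt_tokens": 32319, "completion_tokens": 449, "total_tokens": 32768}}"""
print(explain_finish(json.loads(raw)))
```

A short reply with "stop" points at the prompt or sampling settings, while "length" points at the context window or max_tokens.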

hard thicket Aug 24, 2023, 12:43 PM

#

Thanks, I will try it. How can I see the finish reason?

brave trail Aug 24, 2023, 12:44 PM

#

it's in the response json above

hard thicket Aug 24, 2023, 12:44 PM

#

How do I get the response json for prompts I create on your website?

brave trail Aug 24, 2023, 12:45 PM

#

I'm an OpenRouter user

#

I don't think their playground is meant for anything but checking whether a model works and comparing reasoning capabilities.
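To see the raw response JSON, the request can be sent to the API instead of going through the playground. A minimal sketch, assuming OpenRouter's OpenAI-compatible chat completions endpoint and bearer-token auth; the URL, the function names and the placeholder key are assumptions to check against the current OpenRouter docs.

```python
import json
import urllib.request

# Assumed endpoint; verify against the OpenRouter documentation.
API_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(api_key: str, prompt: str, max_tokens: int) -> urllib.request.Request:
    """Prepare (but do not send) a chat completion request."""
    body = {
        "model": "openai/gpt-4-32k",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_response(request: urllib.request.Request) -> dict:
    """Send the request and return the parsed response JSON."""
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# "YOUR_API_KEY" is a placeholder; nothing is sent until fetch_response is called.
request = build_request("YOUR_API_KEY", "Say hi.", max_tokens=100)
```

The dictionary returned by fetch_response(request) should then contain finish_reason and usage fields like those in the run above.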

brave trail Aug 24, 2023, 12:50 PM

#

hard thicket Here's an example of a 16K prompt that gives a 1k (concatenated) response. Can y...

In general, LLMs don't work at the letter level. They work at the token level. Also, repetition is an LLM weakness. And instructions should come after the data to work on.

So you're throwing some of the most complex tasks for an LLM into a single prompt: replacing a letter inside a word, on input that is super repetitive, without paying attention to how an LLM's attention mechanism works. You also haven't told it how long you want the response to be, so it loses track of what it should do in the middle of the response.

Have a read of OAI's best practices for prompting: https://platform.openai.com/docs/guides/gpt-best-practices

This applies to all LLMs, not just OAI -- Anthropic's docs basically say the same, just worded differently: https://docs.anthropic.com/claude/docs/prompt-troubleshooting-checklist

You're not explaining the task simply and clearly.
You haven't asked the LLM how it understands the task.
Since it doesn't work, you haven't tried to break the task down into subtasks.
Your instructions are before the data to work on, not after.
You haven't given examples of the task executed perfectly.

Another document by OpenAI saying the same - https://github.com/openai/openai-cookbook/blob/main/techniques_to_improve_reliability.md

hard thicket Aug 24, 2023, 1:16 PM

#

I did try to get the AI to edit a script, several times with different scripts. It will edit a small script but not something big. The reason I subscribed to your platform is to edit large scripts, which GPT-4 concatenates. But gpt-4-32k does the same, rendering it useless for my purposes.

brave trail Aug 24, 2023, 1:23 PM

#

again, I'm not working for OpenRouter

#

just trying to help a fellow user

#

clearly separate instruction from your script
move your instruction under the script
clarify that the entirety of the script needs to be edited
clarify that "AI must avoid skipping for brevity"
use OpenRouter directly, not openrouter.ai/playground

etc etc.
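One way to apply the points above when assembling the prompt. This is only a sketch: the delimiter lines and the function name are made up, and the exact wording of the instruction is a guess at what might work rather than a tested recipe.

```python
def build_edit_prompt(script: str, instruction: str) -> str:
    """Put the script first, clearly delimited, and the instruction after it."""
    return (
        "SCRIPT START\n"  # hypothetical delimiter separating data from instruction
        f"{script.strip()}\n"
        "SCRIPT END\n\n"
        f"{instruction.strip()}\n"
        "Edit the entire script above, from the first line to the last. "
        "The AI must avoid skipping any part for brevity."
    )


prompt = build_edit_prompt("line one\nline two", "Fix the spelling in this script.")
print(prompt)
```

The resulting string would go in the user message of an API request, not into the playground.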

#

@hard thicket hope this helps

#

Prompting an LLM is a skill that needs to be learned. It's not human; you need to explain to it what to do in a very clear, unambiguous way.

#

If the LLM is not doing something that you think is implied by your other instructions, then you need to tell it to do that.

#

Also, an LLM doesn't always understand negatives in your words: "do not say bob" may be read as "do say bob". Verbs are more powerful than nouns - "AI must avoid mentioning Bob" will work better.

hard thicket Aug 24, 2023, 1:33 PM

#

Ok thank you, I will keep trying.

patent glen Aug 25, 2023, 12:34 AM

#

hard thicket Here's an example of a 16K prompt that gives a 1k (concatenated) response. Can y...

this is just an annoying problem with openai's models

#

they are trained to give short responses

#

even when you tell it to respond with the whole thing, it will sometimes randomly stop if it's too long

patent glen Aug 25, 2023, 12:37 AM

#

hard thicket I did try to get the AI to edit a script, several times with different scripts. ...

Unfortunately, the best option here seems to be chunking up the script

#

I too am experiencing the same issues trying to get it to edit a document

patent glen Aug 25, 2023, 12:41 AM

#

hard thicket I did try to get the AI to edit a script, several times with different scripts. ...

Here's what I would do in your situation for the best results (and what I am working on right now)

Chunk up the text with a text chunking algorithm
Feed each chunk to GPT-4, with a summarized version of the text before it and after it
Tell GPT-4 to edit the chunk provided

#

Merge all edited chunks together
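The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the poster's actual code: chunks are cut on line boundaries by character count instead of tokens, and edit_chunk and summarize are hypothetical callables standing in for the model calls.

```python
from typing import Callable, List


def chunk_lines(text: str, max_chars: int) -> List[str]:
    """Split text into chunks of whole lines, each at most max_chars long
    (a single line longer than max_chars becomes its own chunk)."""
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for line in text.splitlines(keepends=True):
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks


def edit_in_chunks(
    text: str,
    max_chars: int,
    edit_chunk: Callable[[str, str, str], str],
    summarize: Callable[[str], str],
) -> str:
    """Edit each chunk with summaries of what comes before and after it,
    then merge the edited chunks in their original order."""
    chunks = chunk_lines(text, max_chars)
    edited = []
    for i, chunk in enumerate(chunks):
        before = summarize("".join(chunks[:i])) if i > 0 else ""
        after = summarize("".join(chunks[i + 1:])) if i + 1 < len(chunks) else ""
        edited.append(edit_chunk(chunk, before, after))
    return "".join(edited)


# Stand-ins for the model calls, just to show the data flow.
result = edit_in_chunks(
    "a\nb\nc\n",
    max_chars=4,
    edit_chunk=lambda chunk, before, after: chunk.upper(),
    summarize=lambda text: text[:20],
)
print(repr(result))
```

In real use edit_chunk would send the chunk and both summaries to the model, and a token-based splitter would be a safer way to size the chunks.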
