#Getting around token limit

10 messages · Page 1 of 1 (latest)

tepid saddle
#

I am trying to send data to the ChatCompletion api, but the data + the prompt is over the 4096 token limit. Is there a way i can continue the response when it runs out of tokens similar to how the continue button works on the chatgpt website?

humble veldt
#

ask it to break it down into chunks and confirm before posting the next chunk. It will simply wait for you to acknowledge the response before continuing.

A number of clients also offer ways around this.

It's a system designed to emulate human language - talk to it like a human. Use terms like "Can you break it down for me" and "If you reach your limit confirm before moving to the next response." Sometimes it chains them together, sometimes it will want a new prompt - work with the model not against it.

old wharf
tepid saddle
#

@old wharf @humble veldt is there a way to only send the 4k token and then send another request if there are more tokens?

old wharf
#

Sure, but the model would only have the context of the new tokens.

tepid saddle
#

is there a python tool for counting tokens? how would i do that

old wharf
#

There's one mentioned in the docs. I don't remember the exact name. Something about "tictokenizer". Also, you get the used tokens count in the API response.

tepid saddle
#

hmm thanks. im basically getting dates and event data from a pdf, so i need every date and it cannot be summarized in any way. im thinking that I count tokens before i send the request and split it into multiple requests if need be

humble veldt
#

There's quite a few mrthods and specialised embeddings for doing this.

Document processing can be done using 'chaining' for large inputs - there's a lot of talk about this and a variety of tools

tiktoken i think is the lib ...?

tame shoal
#

Why not use the new 16k token 3.5 model? That’s a huge increase over 4096…