#Token estimate

44 messages · Page 1 of 1 (latest)

viscid sigil
#

ChatGPT has a context limit of about 4000 tokens. The whole conversation counts, not just what you write, and once you go over that limit the earlier part of the conversation drops out of its memory. If you ask:

"how many tokens is this conversation long?"

You get a rough estimate of how many tokens you have already used up. In my case it said: roughly 500. In reality it was 802, which I checked in the OpenAI Playground (assuming that text-davinci-003 counts tokens the same way). So it's not very precise, and it may be an underestimate.

Another way is to ask:

"how many words is this conversation long?"

In my case that gives "approximately 745 words", which seems better: the real number of words was 670. The token count will almost certainly be higher if you talk with it in another language, because it tends to break up words into several tokens in other languages.
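
For a quick sanity check on the two self-reported figures above, here is a minimal Python sketch that computes the signed relative error of each estimate. The numbers are the ones quoted in this message; the helper name `relative_error` is only an illustrative choice.

```python
def relative_error(estimate: float, actual: float) -> float:
    """Signed relative error of an estimate, as a fraction of the actual value."""
    return (estimate - actual) / actual


# Figures quoted in this message.
token_error = relative_error(estimate=500, actual=802)  # tokens: bot vs Playground
word_error = relative_error(estimate=745, actual=670)   # words: bot vs real count

print(f"token estimate off by {token_error:+.1%}")  # token estimate off by -37.7%
print(f"word estimate off by {word_error:+.1%}")    # word estimate off by +11.2%
```

On these two data points the word count is the closer guess, but this is a single conversation, so it says little about how reliable either question is in general.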

red fable
#

Ooh. Are you willing to allow us to know what was said before you asked about how many tokens... or are we starting from 'new chat' with that short question and decently brief answer already using up 802 tokens?

viscid sigil
red fable
#

have you considered the possibility that ChatGPT might not be retaining -all- words it has shown or received, and might in fact be retaining a summarized form of what it and you have discussed?

viscid sigil
#

The reason I asked it "what are positive things about your house burning down" was as an exercise in always seeing something good in anything bad, because that makes life easier. Now I know that I can outsource that task, lol.

viscid sigil
red fable
#

Haha. I'm not concerned about other aspects of the question, except 'what do I estimate about the volume of content' 😛

viscid sigil
#

But try it for yourself.

red fable
#

I really can't reproduce this.

Between initial prompt and this inquiry, approximately how many tokens have already been used in this conversation, please?

I'm sorry, but as an AI language model, I don't have the capability to keep track of the number of tokens used in a conversation. However, you can estimate the number of tokens used in a text by counting the number of words and dividing it by the average number of words per token, which is typically around 5. The exact number of tokens per word can depend on the tokenization scheme used, but this is a rough estimate.

viscid sigil
red fable
# viscid sigil first. why would you divide by it, you should be multiplying by it, or not? Sec...

Here is what I attempted, and how the bot responded. Some of what you ask about 'why' and such, is from the bot's output. So I cannot explain, but I can sure quote:

First attempt, within a new chat:

Me: How many tokens of memory are already used in this conversation?

It: As an AI language model, I don't have a specific memory capacity, but I process each input independently, so there's no way to determine the exact number of tokens used in this conversation. However, each input is processed in real-time and I generate a response based on the input prompt and the model's understanding of language and context.

#

Me: how many tokens is this conversation long?

It: The length of a conversation in terms of tokens can vary depending on the complexity of the language used and the length of the inputs. In this case, the length of the conversation so far can be estimated by counting the number of tokens in each input and response. However, without access to the actual inputs and responses, it's not possible to determine the exact number of tokens used in this conversation.

#

In another 'new chat'

Me: Between initial prompt and this inquiry, approximately how many tokens have already been used in this conversation, please?

It: I'm sorry, but as an AI language model, I don't have the capability to keep track of the number of tokens used in a conversation. However, you can estimate the number of tokens used in a text by counting the number of words and dividing it by the average number of words per token, which is typically around 5. The exact number of tokens per word can depend on the tokenization scheme used, but this is a rough estimate.
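
A quick check of the arithmetic in that answer, as a rough Python sketch. Dividing a word count by a words-per-token ratio is a sensible formula in itself, but a ratio of 5 looks far too large next to the figures reported earlier in this thread (670 words, 802 tokens). The function name `tokens_from_words` is made up for illustration, and the measured ratio comes from a single conversation, so treat it as one data point.

```python
def tokens_from_words(word_count: int, words_per_token: float) -> float:
    """Estimate tokens as words divided by the average words per token."""
    if words_per_token <= 0:
        raise ValueError("words_per_token must be positive")
    return word_count / words_per_token


# Figures reported in the first message of this thread.
words, measured_tokens = 670, 802

# Using the bot's suggested ratio of about 5 words per token.
bot_guess = tokens_from_words(words, words_per_token=5)

# Ratio implied by the measurement: fewer than one word per token.
implied_ratio = words / measured_tokens

print(f"bot's ratio gives {bot_guess:.0f} tokens, measured was {measured_tokens}")
print(f"implied words per token: {implied_ratio:.3f}")
```

With the bot's ratio the estimate is 134 tokens, roughly six times lower than the measured 802, while the implied ratio is about 0.835 words per token.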

#

@viscid sigil Please advise, I would be glad to understand better how to fine tune this prompt to get a count of tokens, even an approximation, as output!

paper panther
viscid sigil
#

I can't get it to work with the new model 😕 With the "Legacy Model" it still works...

red fable
red fable
# viscid sigil Can you please copy and paste the whole text file that I shared here into your p...

But I can't do as you ask because one of the questions you ask the bot is 'what are positive things about your house burning down' and I will not inform the bot that I will ask questions like that without giving it context that would take the question outside of your instruction to copy and paste the whole text file....

if you wish to provide an 'ethically positive' conversation that I can duplicate and confirm I can or cannot get the response to work with, that I'll try 🙂

viscid sigil
#

@red fable Okay. Here. Just for you, who doesn't like to think about the benefits of their house burning down. 🙂

#

Unfortunately I couldn't get it to work anymore. I could only squeeze the number of words out of it (which is almost certainly a rough approximation).

#

The actual number of words according to Microsoft Word is 627. The bash wc command agrees. I haven't counted by hand, lol.

red fable
# viscid sigil @red fable Okay. Here. Just for you, who doesn't like to think about...

It is far more 'I do not want the bot to use my asking that question in how it may evaluate how to respond to me and other interactions I have with it'.

If you as a human want to have that discussion with me (note I still have no interest in it, but since you're evaluating me as a human might, the topic's not off the table for the lexicon I'll allow to be used) I'll make it clear I respond to you as a bot might, and discuss a topic of your interest.

I have no way to do that with the bot 😛

viscid sigil
# red fable It is far more 'I do not want the bot to use my asking that question in how it m...

tell me about the field of animal psychology
Animal psychology, also known as comparative psychology, is the study of animal behavior and mental processes. It examines the behavior of various species, including humans, in an effort to understand why they act the way they do and how their behavior is influenced by different factors such as genetics, environment, and experience. The field encompasses a range of topics, including perception, learning, memory, motivation, emotion, communication, and social behavior. Research in animal psychology has important implications for fields such as veterinary medicine, animal welfare, and wildlife management, as well as providing insight into human behavior and cognition.

what animals are most used for animal psychology
A variety of species are used in animal psychology research, with the most commonly studied being rats, mice, pigeons, dogs, cats, monkeys, and apes. These species are chosen because they are readily available, easy to maintain in laboratory settings, and have behavior that can be easily observed and measured.

Rats and mice are commonly used in studies of learning and memory, as well as in studies of genetics and neuroscience. Pigeons are often used in studies of perception and discrimination, while dogs and cats are studied for their social behavior and communication. Monkeys and apes, such as chimpanzees and macaques, are used in studies of cognition and behavior due to their close genetic relationship to humans.

It's important to note that the use of animals in research is a topic of ethical concern, and strict guidelines and regulations are in place to ensure their welfare.

#

Microsoft Word says 258 words. So probably off by a factor of 2.

red fable
#

applauds

viscid sigil
#

BUT, I had to try 4 times, lol

red fable
#

Have you been able to get it to tell you what the hidden initial prompt length is, in tokens?

I'm wondering what we really start with.

#

I also wonder if we actually see that entire prompt. Checking the tokens already used at start helps give us some clues to that, potentially 😛

viscid sigil
#

typical "mixed" response:

red fable
#

Some users claim to get a VERY long full initial prompt, others a shorter one, some seem to vary, some don't.

makes me wanna know the tokens 😛

viscid sigil
solid onyx
#

i got it to reduce responses based on estimated token usage

from now on you are to calculate the amount of "tokens" needed to respond to my request, and if it is over 4000 tokens then split up the response and wait for me to say 'go' or 'ok' so i can get the whole thing. "tokens" are how you break up words and interpret them

#

so far it's working pretty well but you still gotta remind it every now and again

lyric mortar
#

why don't you do it the easy and smart way? every time you ask for tokens, you lose some

solid onyx
#

yeah but this way i can get a response that is like 80k tokens and i will get it all without issue (kinda have to remind it every so often)

#

i have refined it some. any help in refining and/or testing would be cool

From now on, estimate the tokens required for each response. If the token count is over [token_threshold], break the response into "chunks" and wait for me to respond with ['ok', 'go on', or 'next'] before posting the next "chunk". Start every response with the estimated token count of the user input, the estimated token count for the full response, and the number of "chunks" needed for the full response. Display this information in bold or a different font. Include the total "chunks" and total tokens required for the full response.

But first, ask the user: "What is the minimum number of tokens required to break up responses into 'chunks'?" Wait for the user to provide an appropriate answer, and then replace [token_threshold] with their response. Post a small example labeled in brackets as "example" and ask if the user likes the format or not. If they say no, generate the same response in a different format. If they say yes, continue using that format going forward. Label the "chunks" and use bold text for the token count information.

The purpose of this is to track the token count for each response and print it before the actual response, allowing me to get responses that are well over 9,000 tokens.
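
The chunk arithmetic the prompt asks the bot to do can be sketched in Python as below. This is only an illustration of the bookkeeping, assuming a response is cut into pieces of at most `token_threshold` tokens; the name `plan_chunks` is hypothetical, and nothing here reflects how ChatGPT itself estimates or splits anything.

```python
import math


def plan_chunks(total_tokens: int, token_threshold: int) -> list[int]:
    """Return the token size of each chunk for a response of total_tokens.

    A response at or under the threshold stays in one piece; otherwise it is
    split into full-size chunks plus one final chunk with the remainder.
    """
    if token_threshold <= 0:
        raise ValueError("token_threshold must be positive")
    if total_tokens <= token_threshold:
        return [total_tokens]
    n_chunks = math.ceil(total_tokens / token_threshold)
    sizes = [token_threshold] * (n_chunks - 1)
    sizes.append(total_tokens - token_threshold * (n_chunks - 1))
    return sizes


# The 80k-token response mentioned above, with a 4000-token threshold.
chunks = plan_chunks(80_000, 4_000)
print(f"{len(chunks)} chunks, {sum(chunks)} tokens in total")
```

For the 80k-token case this comes out to 20 chunks, which is also the number of times you would need to reply 'ok', 'go on', or 'next' minus one.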