I understand that an LLM generates text based on the last given word, but how does it know when to stop talking? Does it have some kind of value that represents how long the sentence should be based on the input? This question isn't specifically about neuro, it's more about LLMs in general. Sorry if this isn't the best place to ask this, but I couldn't find a clear answer online, so I guessed asking on a discord server about an LLM would be my best bet.
#How does neuro know how much to say?
does that mean that the length of an output depends on the amount of tokens used in the input? for example if the input was 10 tokens long the output could be 4087 tokens long.
the output cannot use more tokens than allowed
you can compare it to a character limit
discord has a 2k character limit
I understand that part, but is that purely how the AI decides how long the output should be? If so, wouldn't that mean you get a longer output for a shorter prompt and a shorter output for a longer prompt? That seems counter-productive
prompt has nothing to do with tokens
they are separate
prompt = instruction to follow
tokens = character limit
but the link that you shared says, "Depending on the model used, requests can use up to 4097 tokens shared between prompt and completion. If your prompt is 4000 tokens, your completion can be 97 tokens at most."
"requests can use up to 4097 tokens shared between prompt and completion"
so that means what I said before right, the longer the prompt the shorter the output.
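For reference, the quoted rule is just a subtraction. Here is a minimal sketch, assuming the 4097-token figure from the quote (real limits differ by model) and a made-up helper name `max_completion_tokens`. It only gives the ceiling on the completion, not how long the model will actually choose to talk:

```python
# Figure taken from the quoted docs; other models have other limits.
CONTEXT_WINDOW = 4097


def max_completion_tokens(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Return the largest completion that still fits next to the prompt.

    Hypothetical helper for illustration only. It is an upper bound:
    the model can (and usually does) stop well before reaching it.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Prompt and completion share one budget, so the remainder is the cap.
    return max(context_window - prompt_tokens, 0)


print(max_completion_tokens(4000))  # 97
print(max_completion_tokens(10))    # 4087
```

So a shorter prompt raises the ceiling, but it does not force a longer answer.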
But I wonder if the AI will always try to use the full tokens provided, or will it choose to only output some
might depend on the prompt
In the past I tried to make a text-generation AI using an LSTM, which worked by telling it to run for a certain number of characters and then stop, but the generated text would just be cut off in the middle of a sentence at the end, which isn't really a good idea for a chatbot
so I am confused on how a chatbot knows when the prompt has been answered
this might be a question for chat-gpt. bots answering questions about bots, it is truly the AI take over

chat-gpt says that the AI doesn't know when the prompt has been answered but can decide when to stop based on context or information given in the prompt, such as how long the sentence should be.
I would like to ask vedal how he implemented it into neuro, but I feel like that might be confidential
it's not really confidential. LLMs know when to stop based on either a predefined token limit, or when they generate a predefined stop sequence
does "predefined stop sequence" mean like a fullstop in a sentence or a specific set of words?
could it be that you have some other neural network that reads the generated text and decides when to stop?
Yes, a stop sequence is a word or string of characters. When the thing running the LLM gets that sequence back from the model, it stops generating text. It also stops when a set number of tokens is reached.
A common sequence for roleplay models is "### Response:"
i.e. when the LLM outputs "### Response:", you stop it from generating any more output
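A rough sketch of those two stopping rules, assuming a fake "model" that just replays a scripted list of tokens. `generate` and `make_scripted_model` are made-up names for illustration, not any real library's API:

```python
from typing import Callable, List


def generate(next_token: Callable[[str], str], prompt: str,
             stop_sequences: List[str], max_new_tokens: int) -> str:
    """Ask for tokens until a stop sequence shows up or the limit is hit."""
    output = ""
    for _ in range(max_new_tokens):          # rule 1: hard token limit
        output += next_token(prompt + output)
        for stop in stop_sequences:          # rule 2: stop sequence
            idx = output.find(stop)
            if idx != -1:
                # Drop the stop sequence and anything after it.
                return output[:idx]
    return output


def make_scripted_model(tokens: List[str]) -> Callable[[str], str]:
    """Stand-in for a real LLM: ignores the context, replays fixed tokens."""
    it = iter(tokens)
    return lambda _context: next(it)


script = ["Hello", " there", "!", "\n", "### Response:", " extra", " text"]
prompt = "### Instruction: greet the user\n"

# Stops because the stop sequence appeared.
print(repr(generate(make_scripted_model(script), prompt, ["### Response:"], 50)))
# Stops because the token limit ran out first, mid-thought.
print(repr(generate(make_scripted_model(script), prompt, ["### Response:"], 2)))
```

The second call shows the cut-off-mid-sentence problem: with only a token limit, the text ends wherever the budget runs out.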
that seems like an odd sequence, when would an LLM output that?
Well... When it determines that's the next sequence of most probable characters to generate
so when training an LLM would you want to put the stop sequence at the end of any training data
Yeah, like when you train an LLM, you have these input and output sequences so it learns how posting works
It generates the output sequence when it reaches the end of its post
ok I think I'm starting to understand now
Token generation by LLMs is all just "what is the next most probable token based on the prior tokens"
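A toy version of that idea, assuming a word-level model that only counts which word follows which (real LLMs use a neural network over tokens, this is only an illustration). The `<END>` marker and the function names are made up. Because every training post ends with the marker, "stop here" becomes just another prediction the model can make:

```python
from collections import Counter, defaultdict

END = "<END>"  # hypothetical end-of-post marker appended to every training post


def train_bigram(posts):
    """Count, for each word, which words followed it in the training posts."""
    counts = defaultdict(Counter)
    for post in posts:
        words = post.split() + [END]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, first_word, max_new_tokens=20):
    """Greedy decoding: always take the most frequent follower."""
    words = [first_word]
    for _ in range(max_new_tokens):      # safety limit
        followers = counts.get(words[-1])
        if not followers:                # word never seen in training
            break
        nxt = followers.most_common(1)[0][0]
        if nxt == END:                   # the model "decided" the post is over
            break
        words.append(nxt)
    return " ".join(words)


posts = ["the cat sat down", "the cat sat down", "a dog sat down"]
model = train_bigram(posts)
print(generate(model, "the"))                    # the cat sat down
print(generate(model, "the", max_new_tokens=1))  # the cat
```

The first call ends on its own because `<END>` is the most likely thing after "down"; the second is cut short by the limit instead.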
would a similar method be used for something like an lstm
LSTM is not something I'm too learned on, but I believe it's not something that is similar to an LLM, but a mechanism built into an LLM in order to generate more meaningful outputs
by giving the model a mechanism that is similar to a "memory"
That sounds about right, but I have to say I have never heard of LSTMs in an LLM before
Well, like I said, I'm not too learned on that one
Alright, after about 10 seconds of googling, it seems LSTMs are a way to design an LLM
Yeah, I have actually fiddled with LSTMs before as a way to make an LLM, but I couldn't get it to make anything good, mostly because I couldn't train it enough on account of my crappy pc
you can always get a premade model
I'm more interested in understanding how AI chatbots work rather than making a functioning chatbot so getting a premade model isn't really what I'm looking for
Enjoy that rabbit hole of research
I'll try my best, also thanks for the help it was fun talking