#How does neuro know how much to say?

1 messages · Page 1 of 1 (latest)

lone torrent
#

I understand that an LLM generates text based on the last given word, but how does it know when to stop talking? Does it have some kind of value that represents how long the sentence should be based on the input? This question isn't specifically about neuro more about LLMs in general. Sorry if this isn't the best place to ask this but I couldn't find a clear answer online so I guessed asking it on a discord server about an LLM would be my best bet.

lone torrent
#

does that mean that the length of an output depends on the amount of tokens used in the input? for example if the input was 10 tokens long the output could be 4087 tokens long.

weary eagle
#

the output cannout use more tokens than allowed

#

you can compare it to a character limit

#

discord has a 2k character limit

lone torrent
#

I understand that part, but is that purely how the AI decides how long the output should be, if so then wouldn't that mean you get a longer output for a shorter prompt and a shorter output for a longer prompt. If so that seems counter-productive

weary eagle
#

prompt has nothing to do with tokens

#

they are separate

#

prompt = instruction to follow
tokens = character limit

lone torrent
#

but the link that you shared says, "Depending on the model used, requests can use up to 4097 tokens shared between prompt and completion. If your prompt is 4000 tokens, your completion can be 97 tokens at most."

weary eagle
#

okay wtf

#

that confused me

lone torrent
#

now we're both confused

#

we need an even smarterer person

weary eagle
#

"requests can use up to 4097 tokens shared between prompt and completion"

lone torrent
#

so that means what I said before right, the longer the prompt the shorter the output.

weary eagle
#

seems like it

#

now that i think about that, it really does seem counter-productive

lone torrent
#

But I wonder if the AI will always try to use the full tokens provided, or will it choose to only output some

lone torrent
#

In the past I tried to make a text-generation AI using an lstm which worked by telling it to run for a certain amount of characters then stop, but at the end of the text generated it would just be cutoff in the middle of a sentence which isnt really a good idea for a chatbot

#

so I am confused on how a chatbot knows when the prompt has been answered

weary eagle
#

you just created your own tokens

#

except it cuts off

lone torrent
#

this might be a question for chat-gpt. bots answering questions about bots, it is truly the AI take over

weary eagle
lone torrent
#

chat-gpt says that the AI doesn't know when the prompt has been answered but can decide when to stop based on context or information given in the prompt, such as how long the sentence should be.

#

I would like to ask vedal how he implemented it into neuro, but I feel like that might be confidential

gilded crescent
#

it's not really confidential. LLMs know when to stop based on either a predefined token limit, or when they generate a predefinied stop sequence

lone torrent
#

does "predefined stop sequence" mean like a fullstop in a sentence or a specific set of words?

#

could it be that you have some other neural network that reads the generated text and decides when to stop?

gilded crescent
#

A stop sequence is like yes a word or string of characters. When the thing running the LLM is returned that sequence, it stops generating text, or when a set number of token is reached, it stops generating text.

#

A common sequence for roleplay models is "### Response:"

#

i.e. when the LLM outputs "### Response:", you stop it from generating any more output

lone torrent
#

that seems like an odd sequence, when would an LLM output that?

gilded crescent
#

Well... When it determines that's the next sequence of most probable characters to generate

lone torrent
#

so when training an LLM would you want to put the stop sequence at the end of any training data

gilded crescent
#

Yeah, like when you train an LLM, you have these input and output sequences so it learns how posting works

#

It generates the output sequence when it reaches the end of it's post

lone torrent
#

ok I think Im starting to understand know

gilded crescent
#

Token generation by LLMs is all just "what is the next most probable character based on the prior characters"

lone torrent
#

would a similar method be used for something like an lstm

gilded crescent
#

LSTM is not something I'm too learned on, but I believe it's not something that is similiar to a LLM, but a mechanism built into an LLM in order to generate more meaningful outputs

#

by giving the model a mechanism that is similiar to a "memory"

lone torrent
#

That sounds about right, but I have to say I have never heard of lstms in an llm before l

gilded crescent
#

Well, like I said, I'm not too learned on that one

#

Alright, after about 10 seconds of googling, it seems LSTMs are a way to design a LLM

lone torrent
#

Yeah, I have actually fiddled with lstms before as a way to make an llm before but I couldn't get it to make anything good, mostly because I couldn't train it enough on account of my crappy pc

gilded crescent
#

you can always get a premade model

lone torrent
#

I'm more interested in understanding how AI chatbots work rather than making a functioning chatbot so getting a premade model isn't really what I'm looking for

gilded crescent
#

Enjoy that rabbit hole of research

lone torrent
#

I'll try my best, also thanks for the help it was fun talking