#?Frequently used Stop Tokens?
22 messages · Page 1 of 1 (latest)
Hi @subtle niche. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
Stop tokens are specific sequences of text that signal to a language model to stop generating further text. The exact stop tokens can vary depending on the specific model and its training data. However, the knowledge sources provided do not list the most frequently used or well-known stop tokens across all languages or models.
In the context of GPT4All, a stop token can be a list, as mentioned in a GitHub issue (https://github.com/nomic-ai/gpt4all/issues/2167). The issue suggests that ideally, everyone would use stop tokens correctly when uploading Hugging Face models, and the conversion scripts would pick that up. However, this is not always done in practice. The issue also suggests that a robust solution would be a user-customizable list of stop tokens, and even customizable stop sequences.
In Python, you can use a custom callback function to stop generation based on specific conditions. Here's an example from the GPT4All Python Generation API documentation (https://docs.gpt4all.io/old/gpt4all_python.html#interrupting-generation):
from gpt4all import GPT4All
model = GPT4All('orca-mini-3b-gguf2-q4_0.gguf')
def stop_on_token_callback(token_id, token_string):
# one sentence is enough:
if '.' in token_string:
return False
else:
return True
response = model.generate('Blue Whales are the biggest animal to ever inhabit the Earth.',
temp=0, callback=stop_on_token_callback)
print(response)
In this example, the model stops generating text when it encounters a period ('.').
For more specific information about commonly used stop tokens, you may need to refer to the documentation or community resources for the specific language model or platform you're using.
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@subtle niche kapa.ai is still learning and improving, please let us know how it did by reacting below
wonderful
@rich cliff
are there any such sequences known, like "stop", or "cease and desist", or "parle toi plus"?
(the answer is blowing in the wind)
@rich cliff you can also answer my previous question if you please, please.
Let me check the knowledge base for you. One moment! :books:
I'm sorry, but I can't provide the answer you're looking for. The knowledge sources provided do not contain information on the most frequently used or well-known stop tokens across all languages or models.
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@subtle niche kapa.ai is still learning and improving, please let us know how it did by reacting below
@rich cliff however, what form could such stop tokens have in the Prompt Template of an LLM?
Let me check the knowledge base for you. One moment! :books:
In the context of GPT4All, a stop token can be a list, as mentioned in a GitHub issue (https://github.com/nomic-ai/gpt4all/issues/2167). However, the exact form of stop tokens in the prompt template of a language model like GPT4All isn't explicitly detailed in the provided knowledge sources.
The prompt template in GPT4All is used to format user and assistant interactions for the chats the model will be used for, as mentioned in the GPT4All Documentation (https://docs.gpt4all.io/gpt4all_desktop/settings.html#model-settings). However, it doesn't specify how stop tokens are incorporated into the prompt template.
In the GPT4All Python SDK Reference (https://docs.gpt4all.io/gpt4all_python/ref.html#gpt4all.gpt4all.GPT4All), the generate method uses a callback function to process the model's response. This function receives the tokens from the model as they are generated and stops the generation by returning False. This could potentially be used to implement stop tokens, but the exact implementation would depend on the specific requirements of your application.
For more specific information about how to use stop tokens in the prompt template of a language model, you may need to refer to the documentation or community resources for the specific language model or platform you're using.
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@subtle niche kapa.ai is still learning and improving, please let us know how it did by reacting below
@rich cliff thanks a bunch
Let me check the knowledge base for you. One moment! :books:
You're welcome! If you have any other questions, feel free to ask. I'm here to help!
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@subtle niche kapa.ai is still learning and improving, please let us know how it did by reacting below
i don't think the bot gave you a good answer, but it depends a bit on what you're trying to achieve. a model's internal stop tokens should typically not be exposed to the user, they're there for a model itself to signal a response is complete and no further tokens should be generated.
Yes, they are pretty much hidden, but I thought that a list of frequently-met such tokens may exist, also words/expressions like "stop" that when sent as a Prompt would make the model stop its reply. With the Chocolatine French model, good old "stop" works inasmuch as the model tells me "I understand that you want me to stop" and goes on afterwards, like a shrink saying to the patient "okay I hear you saying that you want to..." 🙂