I think it may just be a weird model, because the huggingface page says it has no instruction template. My current go-to writing/roleplay model is Sao10K/Fimbulvetr-11B-v2, I highly recommend it. Also, you should experiment with using SillyTavern as an Oobabooga frontend, as it can properly load TavernAI cards, and has much better generation presets and instruction templates.
#What are Instruction and Chat templates, and how do I use them?
I've already heard of SillyTavern; frankly, I'm already swamped with enough settings with all the setup and testing I did for WebUI, I don't need even more of that just to get a fancier front end. I have no interest in image generation, text-to-speech, or speech-to-text. Does it have more benefits besides those?
And thanks for the model suggestion, I'll definitely take a look!
The primary reason I use SillyTavern is actually to not get swamped with settings lol
SillyTavern has a bunch more presets for different settings, meaning I don't have to stare at number sliders all day
So it completely takes over WebUI, you don't need to change settings in both if you use it as a front end?
you need to load a model using Oobabooga's webui, and then SillyTavern can take the wheel from there
Also in my case, I don't think I need TavernAI cards either, since I'm making my character entirely from scratch and not basing it on any existing one
If it's that much simpler, I'll look into it further
I still believe that the issue with your model spewing gibberish is model-related. Most models tell you which instruction template to use under the model info on their huggingface pages, but for some reason Yarn Mistral 7B 128k AWQ doesn't
quite odd
It isn't really gibberish; I haven't done enough testing, but it seemed to just not act as it should. 'Chat' mode works a lot more like expected. I don't particularly understand your suggested model's prompt formats either, though:
Which one of these would I use, and how would they be extended?
I have already written quite long definitions that go on for multiple lines
you can use either template, it understands both
Alright that's cool, but I'm still not sure how to adapt my definitions into that format. What's the difference between Instruction and Input for example?
And under Vicuna, aren't System and Assistant the same?
Unfortunately I'm not an expert on the science behind LLMs and their prompt templates
When you were making your own character, didn't you have to write prompts in one of these formats?
No, the prompt template is just for the LLM to see. You can talk to the LLM in plain human language and it will understand.
I know it'll understand what I'm saying, but I mean if you want it to behave a certain way like I do you'd need to write prompts like those, right?
Depends on what you mean by getting it to respond in a certain way (because there are many different things that can directly change how the LLM generates), give me an example
Like adding example messages, so it knows how the character would respond to certain questions
And in what tone
Stuff like that
Like
{{user}}: Hello!
{{char}}: Greetings
Ohhhh, like that. That stuff is decided by the character card.
or if you dont have a character card, anything typed here
Yeah, that's what I've filled out currently and what makes my character work well on the 'chat' mode
So I dont need to mess with this at all?
I'm guessing that one came prepackaged with the model
I'm not sure if this particular model was built for roleplaying though, it might be made to be used like a sort of assistant or whatever, so I'll try yours too
Yeah Mistral 7B is more general task oriented
Aight I'll switch then, thanks for the advice!
Still very new to this stuff but very happy to learn that all this can be run locally now
I was hoping for stuff like this a couple years ago
I remember being really fascinated with AI Dungeon and NovelAI a couple years ago
amazing that I can run this all locally now
Yeah I was always paranoid using sites like character.ai, I didn't want to lose all the writing I did to get it working and just wanted it on my PC, especially since I was there in the early days and they frequently went down
now I don't have to worry about that or their restrictions
and don't have to worry about other sites' usage limits
for Fimbulvetr-v2, download the quantized versions here https://huggingface.co/collections/Lewdiculous/personal-favorites-65dcbe240e6ad245510519aa
I run the q8_0 version on my RTX 4070S, however depending on your GPU, you might want to opt for the lower quant versions
I suggest q6_K
also, Fimbulvetr-v2 is a GGUF model, which has to be loaded with the llama.cpp loader in Oobabooga. llama.cpp is a bit different than some of the other loaders, as by default it will load the model into CPU memory, and you have to tell it how many layers to offload to the GPU
set the threads to the number of physical cores your CPU has, enable tensorcores, and set n-gpu-layers to maybe like 20-30
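For anyone who would rather see those settings as code: here is a minimal sketch that gathers them into keyword arguments. The key names match the `Llama(...)` constructor in llama-cpp-python as far as I know; the helper name `loader_settings`, the file path, and the core count of 8 are all made up for illustration. The tensorcores checkbox is a webui option, so it is left out here.

```python
def loader_settings(model_path: str, physical_cores: int, gpu_layers: int = 25) -> dict:
    """Collect the llama.cpp settings discussed above as keyword arguments.

    The default of 25 layers is only the middle of the "20-30" suggestion;
    the right value depends on how much VRAM the GPU has.
    """
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    if gpu_layers < 0:
        raise ValueError("gpu_layers cannot be negative")
    return {
        "model_path": model_path,     # hypothetical path to a .gguf file
        "n_threads": physical_cores,  # physical cores, not logical threads
        "n_gpu_layers": gpu_layers,   # layers offloaded to the GPU; 0 keeps everything on the CPU
    }


settings = loader_settings("models/example.Q6_K.gguf", physical_cores=8)
print(settings)

# With llama-cpp-python installed and a real file at that path, these could be
# passed straight through:
#   from llama_cpp import Llama
#   llm = Llama(**settings)
```

If generation runs out of VRAM, lowering `gpu_layers` is the first thing to try; if there is VRAM to spare, raising it should speed things up.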
Yep found out about this already from a tutorial vid
thats great
Thanks for all your help again! I'll try this out rn
when i first started playing with LLMs i downloaded oobabooga and went head first blind
Yep pretty much same, I also tried some other things first that required other complex stuff like Git LFS
also read the Oobabooga wiki on the github if you havent yet, super comprehensive and useful
I have not, I'll take a look
I don't see Fimbulvetr-v2 in here though, I already have the link to it don't I?
oh wait i think i sent the wrong link
I currently have 15 huggingface tabs open so ig i copied the wrong one
not sure, ask the creator
One last thing before I go to bed, heres a screenshot from the oobabooga wiki as to which presets to use for different things
Oh that would have been helpful a few hours ago lol
Instruct ones are for like ChatGPT style stuff, while the Chat ones are more for roleplay and stuff
Thanks!
np
One last thing, on the main page for Fimbulvetr there's this link to the GGUF version
For what reason should I use that version?
Does the main page version not work as well?
the main page version is the full, unquantized (uncompressed) model
which is HUGE and uses like 24GB of memory
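A rough back-of-the-envelope check of those sizes, as a sketch rather than an exact figure. It assumes the "11B" in the name means about 11 billion weights, that the unquantized model stores 16 bits per weight, and that q8_0 and q6_K average about 8.5 and 6.5625 bits per weight. It only counts the weights themselves, so real memory use will be somewhat higher once the context is loaded.

```python
# Approximate bits stored per weight for each format (assumed averages).
BITS_PER_WEIGHT = {"fp16": 16.0, "q8_0": 8.5, "q6_K": 6.5625}

N_PARAMS = 11e9  # assumption: "11B" taken literally


def approx_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{approx_size_gb(N_PARAMS, bits):.1f} GB")
```

Under those assumptions the unquantized weights come out in the same ballpark as the figure above, while the quantized files are around half that or less, which is why the GGUF versions fit on consumer GPUs.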
ah so thats why theres multiple large files
So there's no point in downloading the main one, I should download from the GGUF page instead, yeah?
Yep already do
before I thought I had to use Git LFS so that was a pain to install for nothing
I'm not sure if I should use a test version and I don't want to bother the creator for something this small so I'll just use this version
should still be fine
no problem, night night
General-purpose / promptless LLMs are more for "notebook" mode; you type whatever you want at the start of the notebook, then the LLM starts filling in the rest.
LLMs are trained on all sorts of things: novels, Wikipedia articles, roleplaying chat logs... The model will try to reproduce whatever style you give it, based on what it learned from its training data.
Instruction-following models are fine-tuned to follow a particular prompt format. That's what the templates are about; you need to match the templates that were used in training to have the best effect.
Character templates are just prompts that get added in front of (er... roughly) your actual prompt. If the character prompt is "Fontayne Natrerya, a bard" and you type "tell me a story" into the chat, what the LLM will see is something like:
A conversation between Fontayne Natrerya and "Loading".
Fontayne Natrerya is a bard.
Loading: tell me a story
Fontayne Natrerya:
An instruction template would add something like
System: A conversation between Fontayne Natrerya and Loading.
Fontayne is a bard.
Instruction: Continue the following conversation. Play the role of Fontayne Natrerya.
Loading: tell me a story
Response:
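To make the difference concrete, here is a small sketch that builds both of the prompts above from the same character data. The function names are hypothetical and the wording of each prompt is copied from the two examples, not from any real frontend's template; an actual instruction template would use whatever layout the model was trained on.

```python
def chat_prompt(char: str, user: str, persona: str, message: str) -> str:
    """Plain chat mode: character text, then the conversation so far."""
    lines = [
        f'A conversation between {char} and "{user}".',
        persona,
        f"{user}: {message}",
        f"{char}:",  # the model continues writing from here
    ]
    return "\n".join(lines)


def instruct_prompt(char: str, user: str, persona: str, message: str) -> str:
    """Instruct mode: the same information wrapped in template labels."""
    lines = [
        f"System: A conversation between {char} and {user}.",
        persona,
        f"Instruction: Continue the following conversation. Play the role of {char}.",
        f"{user}: {message}",
        "Response:",  # the model continues writing from here
    ]
    return "\n".join(lines)


print(chat_prompt("Fontayne Natrerya", "Loading",
                  "Fontayne Natrerya is a bard.", "tell me a story"))
print()
print(instruct_prompt("Fontayne Natrerya", "Loading",
                      "Fontayne is a bard.", "tell me a story"))
```

Either way, the character description and the typed message stay the same; only the wrapping around them changes.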
That makes a lot more sense, but where would I put this? Would this replace the definitions and description I put in Parameters/Chat/Character?
It seems to work fine with my format too right now, which is just this:
Steve is a bartender at a small bar in an even smaller town.
{{user}}: Hey bartender!
{{char}}: What can I do ya for?
What's the benefit of transforming my definitions into that format? And are those infinitely expandable, or do they just work a line at a time?
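A side note on the `{{user}}` and `{{char}}` markers in definitions like the one above: they are placeholders that the frontend swaps for the actual names before the model sees the text. A minimal sketch of that substitution, assuming a plain find-and-replace (the function name is hypothetical, and real frontends may handle more placeholders than these two):

```python
def fill_placeholders(text: str, user_name: str, char_name: str) -> str:
    """Replace the {{user}} and {{char}} markers with real names."""
    return text.replace("{{user}}", user_name).replace("{{char}}", char_name)


definition = """Steve is a bartender at a small bar in an even smaller town.
{{user}}: Hey bartender!
{{char}}: What can I do ya for?"""

print(fill_placeholders(definition, user_name="Loading", char_name="Steve"))
```

Because the substitution works on the whole text at once, a definition can span as many lines as needed.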
If you aren't using an instruction-following model, you do not need an instruction template.
If you are using an instruction-following model, you need the template to match the format the model was trained on.
How can I check if I am using an instruction following model? Usually on the model page I just see a bunch of code and technical details I don't understand
I've switched to the model Lé Hönque suggested, Sao10K/Fimbulvetr-11B-v2-GGUF. Is it an instruction following model?
If the model card lists a prompt, that's what the instruction template should spit out; the first one mentioned {none} so you don't need one.
I don't believe there is one on the Fimbulvetr page, but I do see this when I load it:
Prompt Formats - Alpaca or Vicuna is what the model card for that one says
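For reference, here is a rough sketch of how those two formats are commonly laid out. The exact labels and blank lines are my assumption about the usual conventions, so the model card stays the authority. In the Alpaca layout, "Instruction" holds the directions and "Input" holds the text to act on; in the Vicuna layout, "System" holds the directions and "Assistant" marks where the model's own reply begins, so those two are not the same thing.

```python
# Assumed layouts; check the model card for the exact spacing and labels.
ALPACA = "### Instruction:\n{instruction}\n\n### Input:\n{user_message}\n\n### Response:\n"
VICUNA = "System: {instruction}\n\nUser: {user_message}\n\nAssistant:"


def render(template: str, instruction: str, user_message: str) -> str:
    """Fill a template with the directions and the latest user message."""
    return template.format(instruction=instruction, user_message=user_message)


instruction = "Play the role of Steve, a bartender in a small town."
message = "Hey bartender!"

print(render(ALPACA, instruction, message))
print("-----")
print(render(VICUNA, instruction, message))
```

In both cases the prompt ends right where the model is expected to start writing, which is the part the frontend's instruction template takes care of.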
Right so I still don't need to worry about the Instruction template with this one?
Or should I replace the code that's in that box by default with something written in the Alpaca/Vicuna format
Should I change anything in there?
The default that ooba set it to is probably fine
Alright, so I don't need to fill any fields in there like this example you gave?
A conversation between Fontayne Natrerya and "Loading".
Fontayne Natrerya is a bard.
Loading: tell me a story
Fontayne Natrerya:
All of that is just code that makes the AI generate responses correctly
So all I need to worry about are my definitions, and I should use it on 'chat' mode, not 'chat-instruct' mode