#too much prose!
14 messages · Page 1 of 1 (latest)
@stuck shuttle which model are you using?
There are different methods with different models
Dolphin mixtral
Cool - with both models you can
- Use prefilling ( pass a part of the assistant response and it will finish that for you )
- Add details in a system message ( instruct it with words like: this is a concise conversation, only send short responses )
- Give it some example history, as well as a good system message. This involves you editing its first few responses to something ideal - then it should follow the pattern later on
In my testing it seems Dolphin doesnt return too much - so I assume its part of learning from its chat history
It really depends on what you are trying to do
If its a natural character chat something like
When you respond, respond in a natural tone, as if you are messaging somebody, or having a natural conversation with them.
Only return 1, maybe 2 sentences max, unless the user explicitly asks you to return more.
You are talking to {Your Name}, and your name is {Character Name Here}.
{Insert some more context and personality here}
heres an example of that across both models: https://modelbench.ai/shared/chat/15e692b9-1315-4483-9a74-f190d17a3cd8
Absolutely - infact, the clearer the prompt, the better. Just be aware of token limits
Dolphin Mix 8x22b has 64k token context window which is pretty large, so you should be plenty fine. the 8b model is half so it depends how long your chat is - but OpenRouter manages that atm anyway I think
What’s the difference with 8b and 22b