For finetuning a Llama3.1-based instruct model with Unsloth, you should use the "llama-3.1" chat template provided by Unsloth. This template matches the format in your example and is specifically designed for Llama 3.1 instruct models. Apply it using:
from unsloth.chat_templates import get_chat_template
tokenizer = get_chat_template(tokenizer, chat_template="llama-3.1")
For inference, use tokenizer.apply_chat_template with add_generation_prompt=True as in your code. This ensures prompts are formatted correctly for both training and inference. For more details and examples, see the official Unsloth Llama3.1 Conversational notebook and Unsloth documentation.
Would you like a step-by-step example for dataset formatting and training?
Sources:
💡 **Hint:** Mention @RunLLM in the thread for followups.