#Help with SFT Qwen3 model
12 messages · Page 1 of 1 (latest)
Everything seems to go fine but after finetuning the model, it does not seem to be generating the thinking tokens. The training examples have these thinking tokens.
This link contains the notebook I am using https://colab.research.google.com/drive/1SfQ6M96s5pqE6d31x1xlcZaSeorLOy-O?usp=sharing
Feels like this might be something small / silly on my side. Any help will be appreciated!
Your prompt format doesn't quite look right.
Edit: Scratch that, I copied the data into a text editor and parsed it, looks like standard multi-turn. Not every turn has a <think></unthink> tag though\
You can obviously see that the training is working because the assistant is replying IN ALL CAPS ALL THE TIME
But if all of your training examples are like this, with 90% of the turns having no thinking, and only the final turn having thinking, the model will want to mimic that
This makes sense and is super helpful. Thank-you!!
Sorry I googled the names of the medication in the chat, and the data is all there
So the <thinking> section is reasonable
who is on medication?
The person in this fellows dataset 🙂