# Fine-tuning on non-reasoning data but inference with reasoning on

wintry zealot

Hi, I am fine-tuning a Qwen3-32B model on some non-reasoning data. Later, I plan to run inference with thinking mode on. Will this work? I mean, since I'm using LoRA, the core reasoning ability should not degrade during fine-tuning, right?

lost turret BOT

Fine-tuning Qwen3-32B only on non-reasoning data, even with LoRA, can degrade its reasoning ability, especially when you use "thinking mode" at inference. The Unsloth and Qwen3 docs recommend a mix of roughly 75% reasoning and 25% non-reasoning data in the fine-tuning set to preserve reasoning capabilities; training only on non-reasoning data can cause the model to lose those skills, regardless of LoRA's parameter efficiency. LoRA keeps the base weights frozen, but that does not protect core abilities: the adapter still changes the model's outputs, and if the new data contains no reasoning, the adapted model can learn to shorten or skip it.
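To make the 75/25 mix concrete, here is a minimal sketch in plain Python. The function name `mix_datasets`, its parameters, and the example record layout are hypothetical and not taken from the Unsloth or Qwen3 docs; it assumes both datasets are already loaded as lists and that you have enough reasoning examples to reach the target share.

```python
import random


def mix_datasets(reasoning, non_reasoning, reasoning_share=0.75, seed=0):
    """Return a shuffled list in which about `reasoning_share` of the items
    come from `reasoning`.

    Keeps every non-reasoning example and samples (without replacement) as
    many reasoning examples as the target share requires.
    """
    if not 0 < reasoning_share < 1:
        raise ValueError("reasoning_share must be strictly between 0 and 1")

    # Solve n_r / (n_r + n_n) = reasoning_share for n_r.
    wanted = round(len(non_reasoning) * reasoning_share / (1 - reasoning_share))
    if wanted > len(reasoning):
        raise ValueError(
            f"need {wanted} reasoning examples, only {len(reasoning)} available"
        )

    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    mixed = rng.sample(reasoning, wanted) + list(non_reasoning)
    rng.shuffle(mixed)
    return mixed
```

For example, with 100 non-reasoning examples this keeps all 100 and adds 300 sampled reasoning examples. If your own data is the priority, you could instead cap the reasoning share lower and check reasoning quality empirically after training.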

For best results, include a significant portion of reasoning (chain-of-thought) data in your fine-tuning set if you want to use the model for reasoning tasks later. Otherwise, the model may not respond well to "thinking mode" prompts, as noted in both the Unsloth documentation and user reports. Would you like more detail or code examples?
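As a rough way to check how much reasoning data a set actually contains, the sketch below counts assistant responses that carry a non-empty `<think>...</think>` block. It assumes your examples store the chain of thought in Qwen3-style `<think>` tags; the helper names `has_reasoning` and `reasoning_share` are hypothetical, and a dataset that encodes reasoning differently would need a different check.

```python
import re

# Matches a <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def has_reasoning(assistant_text):
    """True if the first <think> block in the text has non-whitespace content."""
    match = THINK_BLOCK.search(assistant_text)
    return bool(match and match.group(1).strip())


def reasoning_share(assistant_texts):
    """Fraction of assistant responses that include a non-empty <think> block."""
    if not assistant_texts:
        return 0.0
    return sum(has_reasoning(t) for t in assistant_texts) / len(assistant_texts)
```

Responses with an empty `<think>` block are counted as non-reasoning here, on the assumption that an empty block marks a non-thinking example.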

wintry zealot

@lost turret what are the pros and cons of using low r, alpha pairs vs high ones?