Has anyone managed to use a training checkpoint for inference or to resume training? We've seen dramatically worse results after resuming from a checkpoint than before, and we haven't been able to load the checkpoint as a LoRA adapter: vLLM returns identical results with and without the LoRA. Any pointers would be appreciated.
#TRL checkpoints with LoRA not working
This issue is known: resuming training or using LoRA adapters from TRL checkpoints can lead to problems where the adapter is not properly loaded, resulting in identical outputs with and without the LoRA. This has been reported and discussed in several Unsloth issues, and fixes have been released. The recommended solution is to update both unsloth and unsloth_zoo to the latest versions and use FastLanguageModel.from_pretrained to load the adapter, not PeftModel or manual checkpoint loading. For inference, always ensure the correct adapter is loaded and activated, and for merging, use save_pretrained_merged for best results. If you still see no effect, check that the LoRA rank and configuration match, and avoid mixing old checkpoints with new library versions, as format changes may break compatibility.
For vLLM, ensure you use the latest Unsloth and vLLM versions, and pass the LoRA adapter using model.load_lora or directly via LoRARequest. If you continue to get identical results, verify that the LoRA weights are not empty or corrupted, and that the adapter is actually being applied (not just loaded). See the linked issues for more troubleshooting steps and code examples. Would you like a step-by-step guide for loading and resuming LoRA checkpoints with Unsloth and vLLM?
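One way to check whether the LoRA weights are empty is to look at the `lora_B` tensors in the checkpoint. PEFT initializes `lora_B` to zeros by default, so an adapter whose `lora_B` tensors are all still zero leaves the base model's outputs unchanged. The sketch below is only an illustration under a few assumptions: the checkpoint directory stores the adapter as `adapter_model.safetensors`, `safetensors` and `torch` are installed, and `lora_b_norms` and `diagnose` are hypothetical helper names rather than part of any library.

```python
from pathlib import Path


def lora_b_norms(adapter_dir):
    """Return {tensor_name: L2 norm} for each lora_B tensor in a PEFT adapter directory."""
    # Lazy import so diagnose() below is usable without safetensors/torch installed.
    from safetensors.torch import load_file

    # Assumption: the checkpoint stores the adapter as adapter_model.safetensors.
    state = load_file(str(Path(adapter_dir) / "adapter_model.safetensors"))
    return {
        name: tensor.float().norm().item()
        for name, tensor in state.items()
        if "lora_B" in name
    }


def diagnose(norms):
    """Summarize the norms returned by lora_b_norms as a short verdict string."""
    if not norms:
        return "no lora_B tensors found"
    zero = [name for name, value in norms.items() if value == 0.0]
    if len(zero) == len(norms):
        return "all lora_B tensors are zero"
    if zero:
        return f"{len(zero)} of {len(norms)} lora_B tensors are zero"
    return "all lora_B tensors are non-zero"
```

For example, `print(diagnose(lora_b_norms("outputs/checkpoint-XXX")))` would report on the checkpoint path used in this thread. A result of "all lora_B tensors are zero" would suggest the adapter was never trained or was saved incorrectly; non-zero tensors suggest the problem is in how the adapter is loaded or applied.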
@lucid current I've confirmed I'm on the latest Unsloth version - could you share a step-by-step guide for loading and resuming LoRA checkpoints with Unsloth and vLLM?
Certainly! Here’s a step-by-step guide for loading and resuming LoRA checkpoints with Unsloth and vLLM:
- Upgrade Unsloth and vLLM

Ensure you have the latest versions:

    pip install --upgrade unsloth unsloth_zoo vllm

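- Resume Training from Checkpoint

When training, set up checkpoint saving:

    from transformers import TrainingArguments
    from trl import SFTTrainer

    trainer = SFTTrainer(
        ...,  # remaining trainer arguments, elided as in the original
        args=TrainingArguments(
            output_dir="outputs",
            save_strategy="steps",
            save_steps=50,  # write a checkpoint every 50 update steps
        ),
    )
    # Raises an error if output_dir does not contain a checkpoint yet;
    # on the very first run, call trainer.train() without the argument.
    trainer.train(resume_from_checkpoint=True)

To resume, just call trainer.train(resume_from_checkpoint=True) and it will pick up from the latest checkpoint in output_dir.
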
- Load LoRA Adapter for Inference

After training, to load the LoRA adapter for inference:

    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="BASE_MODEL_NAME",
        max_seq_length=2048,
        load_in_4bit=True,
    )
    # r, lora_alpha and target_modules should match the values used in training
    model = FastLanguageModel.get_peft_model(
        model,
        r=32,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],
        lora_alpha=32,
    )
    # Load the checkpoint's adapter weights and make them the active adapter
    model.load_adapter("outputs/checkpoint-XXX", adapter_name="my_adapter")
    model.set_adapter("my_adapter")

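- vLLM Inference with LoRA

For vLLM, use the LoRARequest object. Note that model.fast_generate is Unsloth's vLLM-backed generation path, so this assumes the model was loaded with fast_inference=True:

    from vllm import SamplingParams
    from vllm.lora.request import LoRARequest

    prompts = ["..."]  # placeholder: your input strings
    sampling_params = SamplingParams(temperature=0.0, max_tokens=256)  # illustrative values

    outputs = model.fast_generate(
        prompts,
        sampling_params=sampling_params,
        lora_request=LoRARequest("my_adapter", 1, lora_path="outputs/checkpoint-XXX"),
    )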
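
To check that the adapter is actually applied rather than just loaded, it may help to generate the same prompts with and without the LoRARequest and compare the text. The following is a minimal sketch, not a definitive recipe. It assumes plain vLLM rather than Unsloth's wrapper, greedy decoding (temperature 0) so that any difference comes from the adapter, and max_lora_rank=32 to match the r=32 used in the guide. The helper names generate_texts, changed_prompts and compare_with_and_without_lora are hypothetical.

```python
def generate_texts(llm, prompts, sampling_params, lora_request=None):
    """Generate once per prompt and return only the generated strings."""
    outputs = llm.generate(prompts, sampling_params, lora_request=lora_request)
    return [out.outputs[0].text for out in outputs]


def changed_prompts(base_texts, lora_texts):
    """Indices of prompts whose output changed once the adapter was applied."""
    return [i for i, (b, t) in enumerate(zip(base_texts, lora_texts)) if b != t]


def compare_with_and_without_lora(base_model, adapter_path, prompts):
    """Return the indices of prompts where the LoRA adapter changed the output."""
    # Imported here so the helpers above can be used without vLLM installed.
    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    # Assumption: the adapter rank is 32, so vLLM must allow at least that rank.
    llm = LLM(model=base_model, enable_lora=True, max_lora_rank=32)
    # Greedy decoding keeps the two runs comparable.
    params = SamplingParams(temperature=0.0, max_tokens=64)
    adapter = LoRARequest("my_adapter", 1, lora_path=adapter_path)

    base_texts = generate_texts(llm, prompts, params)
    lora_texts = generate_texts(llm, prompts, params, lora_request=adapter)
    return changed_prompts(base_texts, lora_texts)
```

Calling compare_with_and_without_lora("BASE_MODEL_NAME", "outputs/checkpoint-XXX", prompts) with a handful of prompts from the training distribution should return a non-empty list if the adapter has any effect. An empty list would match the symptom described at the top of the thread and points at the adapter weights or the loading path rather than at sampling noise.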