#Saving LoRA weights only
I just tested it again and it works fine in my case -> https://huggingface.co/Erland/test_push_lora/tree/main
Maybe can you elaborate? What model are you using?
oh wait
in CPT, I think you actually save the full Embedding and LM Head
so you didn't do LoRA on them
model = FastLanguageModel.get_peft_model(
    model,
    r = 128,  # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      "embed_tokens", "lm_head",],  # Add for continual pretraining
    lora_alpha = 32,
    lora_dropout = 0,  # Supports any, but = 0 is optimized
    bias = "none",     # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth",  # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = True,    # We support rank stabilized LoRA
    loftq_config = None,  # And LoftQ
)
Those embed_tokens and lm_head are not LoRA; they are trained and saved as full parameters. So it kind of makes sense if the saved files are very big.
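To put rough numbers on that, here is a minimal back-of-the-envelope sketch. The dimensions (hidden size 4096, vocabulary 128,000) and the 2 bytes per parameter are hypothetical placeholders, not the values of any model in this thread; plug in the real ones from your model config.

```python
def lora_params(in_features: int, out_features: int, r: int) -> int:
    """Parameters in one LoRA pair: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


def full_params(vocab_size: int, hidden_size: int) -> int:
    """Parameters in one full embedding or LM head matrix."""
    return vocab_size * hidden_size


def to_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate on-disk size in GiB (2 bytes per param assumes 16-bit storage)."""
    return num_params * bytes_per_param / 1024**3


# Hypothetical dimensions, for illustration only.
HIDDEN, VOCAB, RANK = 4096, 128_000, 128

one_lora_proj = lora_params(HIDDEN, HIDDEN, RANK)  # a single q_proj-sized adapter
full_embed_and_head = 2 * full_params(VOCAB, HIDDEN)  # embed_tokens + lm_head

print(f"one LoRA projection:        {one_lora_proj:,} params")
print(f"full embed_tokens + lm_head: {full_embed_and_head:,} params "
      f"(~{to_gib(full_embed_and_head):.2f} GiB at 2 bytes/param)")
```

With these placeholder numbers the two full matrices hold about a thousand times more parameters than a single rank-128 projection adapter, which is why they can dominate the size of the saved adapter.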
I'll test them tonight to make sure
I have a LoRA after CPT of Phi-4
Worked for me
https://huggingface.co/burgasdotpro/bgGPT-Phi-4-LoRA
I wasn't able to save the full model because of OOM; that's why I saved only the LoRA.
I saw your question:
"Does anyone have actual success in doing CPT+SFT push-to-hub LoRa only ?"
So...
The answer is "Yes".
I have trained LoRA successfully.
The model was Phi-4: CPT + instruction fine-tuning, then I pushed the LoRA to the Hub. The LoRA size was as expected. Worked fine.
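If you want to check what actually takes up space in a saved adapter, one option is to read the safetensors header directly. This is only a sketch: it assumes the adapter was saved as a single safetensors file (typically adapter_model.safetensors), and the rule "a tensor name containing lora_ is a LoRA weight" is a naming assumption, not something confirmed in this thread. The function names are made up for illustration.

```python
import json
import struct
from collections import defaultdict


def read_safetensors_header(path: str) -> dict:
    """Return the tensor table from a safetensors file without loading tensor data.

    The file starts with an 8-byte little-endian unsigned integer giving the
    length of the JSON header that follows.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header


def size_breakdown(path: str) -> dict:
    """Sum tensor bytes into 'lora' and 'non_lora' groups by tensor name."""
    totals = defaultdict(int)
    for name, info in read_safetensors_header(path).items():
        begin, end = info["data_offsets"]
        # Assumption: LoRA tensors carry 'lora_' in their names (lora_A / lora_B).
        group = "lora" if "lora_" in name else "non_lora"
        totals[group] += end - begin
    return dict(totals)
```

Calling size_breakdown("adapter_model.safetensors") on a local copy of the adapter returns byte totals per group; a large non_lora total would be consistent with full embed_tokens and lm_head weights being stored next to the LoRA matrices.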
Maybe I misunderstood something?
It will finish in about 2 hours.
Want me to push the LoRA to the Hub as a test?
The model is R1-distilled-Qwen2.5-14B