Finetuning doesn't learn the LLM anything. | Unsloth AI

quick flicker Mar 6, 2025, 11:31 AM


Hello, I have a test dataset, but the LLM doesn't learn anything from it after finetuning for 30 minutes, converting to .gguf, and starting it with ollama.

I'm really new to this and I can't find any solutions.
Can someone tell me what is wrong? I hoped some experts could help me out :)

train.py: https://hastebin.nl/6cmoaCa
dataset.jsonl:

{"question": "What is the meaning of Shotka?", "answer": "Shotka made by the Shotka VOF is an alcoholic drink with an apres ski theme. More info on https://shotka.nl"}
{"question": "Tell me about Shotka.", "answer": "Shotka made by the Shotka VOF is an alcoholic drink with an apres ski theme. More info on https://shotka.nl"}

vestal leaf Mar 6, 2025, 1:03 PM


quick flicker Hello, I have a test dataset and it doesn't learn the LLM anything after finetun...

to make the model memorize facts, use a high lora rank like 128 and at least 3 epochs (ideally more)
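
A rough sketch of why the rank matters for memorization: LoRA adds two low-rank matrices per adapted weight, so the number of trainable parameters grows linearly with the rank. The 4096 x 4096 layer size below is an assumed example, not taken from the train.py in this thread.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns A (rank x d_in) and B (d_out x rank); the base weight stays frozen.
    """
    return rank * (d_in + d_out)


# Assumed example size: a square 4096 x 4096 projection.
for rank in (16, 128):
    print(rank, lora_trainable_params(4096, 4096, rank))
```

Going from rank 16 to rank 128 gives the adapter eight times as many trainable parameters per layer, which also raises memory use during training and merging.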

quick flicker Mar 6, 2025, 1:09 PM


Oww alright, so the rest of my code is good?


Because it does output things while finetuning

Question:
 What is the meaning of Shotka?
Expected Answer:
 Shotka made by the Shotka VOF is an alcoholic drink with an apres ski theme. More info on https://shotka.nl
Raw Response:
 <reasoning>
Shotka is the Japanese martial arts term for "first blow" or "initial strike." It is often associated with Okinawan karate, particularly in the styles of Goju-ryu and Shorin-ryu. In Shotka, the first movement or strike is considered crucial, as it sets the tone and determines the direction of the flow of the fight.
</reasoning>

<answer>
Shotka is a term referring to the initial strike or first movement in Okinawan karate.
Extracted Answer:
 Shotka is a term referring to the initial strike or first movement in Okinawan karate.

{'loss': 0.0, 'grad_norm': 1.4658467769622803, 'learning_rate': 2.0000000000000003e-06, 'rewards/xmlcount_reward_func': 0.010500000789761543, 'rewards/soft_format_reward_func': 0.0, 'rewards/strict_format_reward_func': 0.0, 'rewards/int_reward_func': 0.0, 'rewards/correctness_reward_func': 0.0, 'reward': 0.010500000789761543, 'reward_std': 0.061941102147102356, 'completion_length': 152.6666717529297, 'kl': 0.0009041924495249987, 'epoch': 5.0}
  4%|█▋                                        | 10/250 [01:18<32:33,  8.14s/it]

And the answers are not the expected answers I put in my dataset. Or does it actually learn from it?

vestal leaf Mar 6, 2025, 1:26 PM


quick flicker Because it does output things while finetuning Question: What is the meanin...

it seems you are using GRPO? I think instruction fine-tuning will work much better for this memorization task
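
A minimal sketch of why a GRPO-style run can stall on a fact the model has never seen. GRPO samples several completions per prompt and learns from how each reward compares with the group average. If no completion ever contains the expected answer, the correctness reward is 0 for all of them and contributes no signal, which matches the `rewards/correctness_reward_func: 0.0` entry in the log. The function below is a simplified illustration, not the trainer's actual code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of sampled completions."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Every sampled answer is wrong, so every correctness reward is 0.0
# and every advantage is 0.0 as well: nothing is reinforced.
print(group_relative_advantages([0.0, 0.0, 0.0, 0.0]))

# Only when at least one sample hits the answer is there something to reinforce.
print(group_relative_advantages([0.0, 0.0, 2.0, 0.0]))
```

Supervised instruction fine-tuning avoids this problem because the expected answer is part of the training text itself.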

quick flicker Mar 6, 2025, 1:27 PM


Are there any examples for using instruction finetuning?

vestal leaf Mar 6, 2025, 1:28 PM


quick flicker Are there any examples for using instruction finetuning?

#notebooks
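
As a small illustration of what instruction fine-tuning data could look like for the dataset.jsonl above, the sketch below turns each question/answer record into a single training text that contains the expected answer. The prompt template, the `</s>` end-of-sequence token, and the function name are assumptions made for illustration; a real run would normally use the prompt or chat template and EOS token that match the chosen model.

```python
import json

# Hypothetical prompt template, for illustration only.
PROMPT_TEMPLATE = "### Question:\n{question}\n\n### Answer:\n{answer}"


def jsonl_to_training_texts(jsonl_text: str, eos_token: str = "</s>") -> list[str]:
    """Turn question/answer JSONL lines into supervised training texts."""
    texts = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Append EOS (assumed token) so the model learns where the answer ends.
        texts.append(PROMPT_TEMPLATE.format(**record) + eos_token)
    return texts


sample = (
    '{"question": "Tell me about Shotka.", '
    '"answer": "Shotka made by the Shotka VOF is an alcoholic drink with an '
    'apres ski theme. More info on https://shotka.nl"}'
)
texts = jsonl_to_training_texts(sample)
print(texts[0])
```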

quick flicker Mar 6, 2025, 1:28 PM


Thanks!

quick flicker Mar 6, 2025, 1:44 PM


I set lora_rank to 128 and num_train_epochs to 5. Now, when training is done and it tries to save, I get an out-of-memory error.
What can I do about this error?

Training complete.
Saving the final trained model to: train_output
Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 29.11 out of 62.58 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...
 31%|█████████████▍                             | 10/32 [00:00<00:00, 30.71it/s]
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/tracy/Desktop/finetuning/train_dataset.py", line 198, in <module>
[rank0]:     main()
[rank0]:   File "/home/tracy/Desktop/finetuning/train_dataset.py", line 191, in main
[rank0]:     model.save_pretrained_merged("model_trained_merged", tokenizer, "merged_16bit")
[rank0]:   File "/home/tracy/Desktop/finetuning/tracy/lib/python3.12/site-packages/unsloth/save.py", line 1313, in unsloth_save_pretrained_merged
[rank0]:     unsloth_save_model(**arguments)
[rank0]:   File "/home/tracy/Desktop/finetuning/tracy/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/tracy/Desktop/finetuning/tracy/lib/python3.12/site-packages/unsloth/save.py", line 569, in unsloth_save_model
[rank0]:     W, bias = _merge_lora(proj, name)
[rank0]:               ^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/tracy/Desktop/finetuning/tracy/lib/python3.12/site-packages/unsloth/save.py", line 177, in _merge_lora
[rank0]:     maximum_element = torch.max(W.min().abs(), W.max())
[rank0]:                                 ^^^^^^^
[rank0]: torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 224.00 MiB. GPU 0 has a total capacity of 23.53 GiB of which 201.50 MiB is free. Process 11890 has 6.04 GiB memory in use. Including non-PyTorch memory, this process has 17.24 GiB memory in use. Of the allocated memory 16.25 GiB is allocated by PyTorch, with 146.00 MiB allocated in private pools (e.g., CUDA Graphs), and 108.76 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
[rank0]:[W306 14:40:29.024894852 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present,  but this warning has only been added since PyTorch 2.4 (function operator())

vestal leaf Mar 6, 2025, 2:04 PM


quick flicker I set lora_rank to 128 and num_train_epochs to 5, now when it is done and try to ...

try restarting your pc and then run unsloth right away


save the lora adapter instead of the merged model
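
A minimal sketch of saving only the adapter, assuming `model` is the PEFT-wrapped model that comes out of training, for which `save_pretrained` writes just the LoRA weights and config rather than a merged 16-bit copy. The helper name and the output directory are hypothetical.

```python
def save_adapter_only(model, tokenizer, out_dir="lora_adapter"):
    """Save LoRA adapter files and the tokenizer, skipping the 16-bit merge."""
    # For a PEFT model this writes the adapter weights and adapter config only.
    model.save_pretrained(out_dir)
    tokenizer.save_pretrained(out_dir)
    return out_dir
```

In the script from the traceback, a call like this would take the place of `model.save_pretrained_merged(...)`, which is the step that ran out of GPU memory.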

quick flicker Mar 6, 2025, 2:05 PM


Alright, I will, thanks

quick flicker Mar 6, 2025, 3:00 PM


vestal leaf save the lora adapter instead of the merged model

Do you maybe know how I can run this in ollama, or do I need to merge it with llama3.1 first?

vestal leaf Mar 6, 2025, 3:00 PM


quick flicker Do you maybe know how i can start this in ollama, or do i need to merge it with ...

you should try merging it. just load the adapter then merge right away
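
One way to "load the adapter then merge" without the GPU memory pressure seen in the traceback is to do the merge on the CPU with the `transformers` and `peft` libraries. This is a sketch under assumptions: the function name, the base model name, and both directories are placeholders, the base model must be the same one the adapter was trained on, and there must be enough system RAM to hold it in 16-bit.

```python
def merge_adapter_on_cpu(base_model_name, adapter_dir, out_dir):
    """Load a base model on the CPU, merge a LoRA adapter into it, and save it."""
    # Imported lazily so the heavy libraries are only needed when merging.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Without a device map the model is loaded into system RAM, not onto the GPU.
    base = AutoModelForCausalLM.from_pretrained(
        base_model_name, torch_dtype=torch.float16
    )
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()  # folds the LoRA weights into the base weights
    merged.save_pretrained(out_dir)

    # Assumes the tokenizer was saved next to the adapter.
    AutoTokenizer.from_pretrained(adapter_dir).save_pretrained(out_dir)
    return out_dir
```

The merged directory can then be converted to .gguf and started with ollama, as in the original workflow.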

quick flicker Mar 6, 2025, 3:01 PM


Alright, thank you, I will try that!

zinc flicker Mar 7, 2025, 5:55 AM


but also beware
https://arxiv.org/abs/2405.05904

@vestal leaf u have a tangible feel for ^ / tradeoffs

arXiv.org

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

When large language models are aligned via supervised fine-tuning, they may encounter new factual information that was not acquired through pre-training. It is often conjectured that this can teach the model the behavior of hallucinating factually incorrect responses, as the model is trained to generate facts that are not grounded in its pre-exi...
