#Fine tuning Llama-3-8B-Lexi-Uncensored on custom dataset - LR not declining

4 messages · Page 1 of 1 (latest)

dusk solar
#

I am trying to create a LLM to create chain texts down in the style here: https://www.dirtychaintexts.com/

To do this, I've gathered 184 example chain texts from various websites and put them into a two column CSV. I'm fine tuning an uncensored Llama model I found here https://huggingface.co/Orenguteng/Llama-3-8B-Lexi-Uncensored

I'm using the demo unsloth notebook as my guide: https://huggingface.co/datasets/unsloth/notebooks/blob/main/Alpaca_%2B_Mistral_7b_full_example.ipynb

I am able to kick off the training workflow successfully however my training loss stays between 1.5-2.0 even after 3 full epochs. What should my first steps be to tweak my training in order to get better results? Are there any examples where folks have fine tuned a model using their own custom curated small datasets? Curious as to whether I should get more training data or invest my time with hyperparamter tuning and think I could learn alot by looking at how others have gone about this

#

and when I go to eventually run inference on the model it just outputs a string of emoji's and then terminates

queen chasm
#

I'm the author of Lexi .You are using incorrect chat template. Lexi uses the exact same chat template as the official instruct models by META. Ensure you use the same chat template.

You should never change chat template of an instruct / tuned model. IF you want to make your own chat templates or different templates, you should only fine tune a base model that has not been tuned previously.

#

Use the unsloth notebook for llama models instead.