#Unsloth with TRL DPO


normal jewel
#

Hi. I found out about Unsloth from the Hugging Face TRL docs on DPO (https://huggingface.co/docs/trl/main/en/dpo_trainer), which suggest using the Unsloth library for acceleration. I tried modifying the code from those docs to use this custom dataset:

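from datasets import Dataset

# Minimal DPO preference dataset: one prompt with a chosen and a rejected reply
dpo_dataset_dict = {
    "prompt": [
        "hello",
    ],
    "chosen": [
        "hi nice to meet you",
    ],
    "rejected": [
        "leave me alone",
    ],
}
train_dataset = Dataset.from_dict(dpo_dataset_dict)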

to fine-tune the Phi-3 model, which I load as in this notebook: https://colab.research.google.com/drive/1lN6hPQveB_mHSnTOYifygFcrO8C1bxq4?usp=sharing
However, I am getting an error when running dpo_trainer.train():

Traceback (most recent call last):
    dpo_trainer.train()
  File "/usr/mambaforge-pypy3/envs/unsloth_env/lib/python3.10/site-packages/transformers/trainer.py", line 1948, in train
    return inner_training_loop(
  File "<string>", line 320, in _fast_inner_training_loop
  File "/usr/mambaforge-pypy3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/data_loader.py", line 463, in __iter__
    current_batch = send_to_device(current_batch, self.device, non_blocking=self._non_blocking)
  File "/usr/mambaforge-pypy3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/utils/operations.py", line 183, in send_to_device
    {
  File "/usr/mambaforge-pypy3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/utils/operations.py", line 184, in <dictcomp>
    k: t if k in skip_keys else send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys)
  File "<string>", line 42, in _fixed_send_to_device
  File "/usr/mambaforge-pypy3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/utils/operations.py", line 81, in honor_type
    return type(obj)(generator)
  File "<string>", line 43, in <genexpr>
TypeError: 'str' object is not callable

Does Unsloth, or perhaps this particular model, require a different format for dpo_dataset_dict than the one in the Hugging Face docs?
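
For anyone comparing formats, here is a minimal sketch of a check that a dict follows the three-column prompt/chosen/rejected layout used above. The helper name check_dpo_columns is hypothetical (it is not part of TRL or Unsloth), and passing this check only rules out a malformed dict; it does not explain the traceback.

```python
# Hypothetical helper, not part of TRL or Unsloth: it only verifies the plain
# prompt/chosen/rejected layout used in the dataset above.
REQUIRED_COLUMNS = ("prompt", "chosen", "rejected")


def check_dpo_columns(data: dict) -> list[str]:
    """Return a list of problems; an empty list means the layout looks right."""
    problems = []
    for name in REQUIRED_COLUMNS:
        if name not in data:
            problems.append(f"missing column: {name}")
        elif not isinstance(data[name], list):
            problems.append(f"column {name} is not a list")
        elif not all(isinstance(item, str) for item in data[name]):
            problems.append(f"column {name} contains non-string entries")

    # Every column that is present as a list must have the same number of rows.
    lengths = {
        len(data[name])
        for name in REQUIRED_COLUMNS
        if isinstance(data.get(name), list)
    }
    if len(lengths) > 1:
        problems.append("columns have different lengths")
    return problems


dpo_dataset_dict = {
    "prompt": ["hello"],
    "chosen": ["hi nice to meet you"],
    "rejected": ["leave me alone"],
}
print(check_dpo_columns(dpo_dataset_dict))  # prints []
```

An empty list here suggests the dict itself is well-formed, so the problem more likely sits somewhere in the trainer or library setup than in the column names.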

buoyant nest
#

I am running into the same issue, but with Llama 3.1 8B, so I don't think this is model-specific!

#

@normal jewel did you happen to find a solution to this?

normal jewel
#

I did not.

upbeat tide
marble harness
buoyant nest
#

Awesome thanks!