Hi. I found out about unsloth from the Hugging Face TRL docs on DPO (https://huggingface.co/docs/trl/main/en/dpo_trainer), which suggest using the unsloth library for acceleration. I tried modifying the code from those docs to use the custom dataset
from datasets import Dataset

dpo_dataset_dict = {
    "prompt": [
        "hello",
    ],
    "chosen": [
        "hi nice to meet you",
    ],
    "rejected": [
        "leave me alone",
    ],
}
train_dataset = Dataset.from_dict(dpo_dataset_dict)
to fine-tune the Phi-3 model, which I load as per the notebook: https://colab.research.google.com/drive/1lN6hPQveB_mHSnTOYifygFcrO8C1bxq4?usp=sharing
However, I get the following error when running dpo_trainer.train():
Traceback (most recent call last):
    dpo_trainer.train()
  File "/usr/mambaforge-pypy3/envs/unsloth_env/lib/python3.10/site-packages/transformers/trainer.py", line 1948, in train
    return inner_training_loop(
  File "<string>", line 320, in _fast_inner_training_loop
  File "/usr/mambaforge-pypy3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/data_loader.py", line 463, in __iter__
    current_batch = send_to_device(current_batch, self.device, non_blocking=self._non_blocking)
  File "/usr/mambaforge-pypy3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/utils/operations.py", line 183, in send_to_device
    {
  File "/usr/mambaforge-pypy3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/utils/operations.py", line 184, in <dictcomp>
    k: t if k in skip_keys else send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys)
  File "<string>", line 42, in _fixed_send_to_device
  File "/usr/mambaforge-pypy3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/utils/operations.py", line 81, in honor_type
    return type(obj)(generator)
  File "<string>", line 43, in <genexpr>
TypeError: 'str' object is not callable
Does unsloth, or perhaps this particular model, require a different format for dpo_dataset_dict than the one shown in the Hugging Face docs?
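In case it helps narrow things down, here is a minimal sketch of a sanity check on the dict. It assumes the layout I understood from the TRL docs: three equal-length lists of strings under "prompt", "chosen" and "rejected". The helper name check_dpo_dict is my own and is not part of trl or unsloth, and passing this check does not rule out some other formatting requirement.

```python
# Hypothetical helper (not part of trl or unsloth): checks the layout I
# assumed from the TRL DPO docs, i.e. three equal-length lists of strings.
REQUIRED_KEYS = ("prompt", "chosen", "rejected")


def check_dpo_dict(d):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in d:
            problems.append(f"missing key: {key}")
        elif not isinstance(d[key], list):
            problems.append(f"{key} is not a list")
        elif not all(isinstance(x, str) for x in d[key]):
            problems.append(f"{key} contains non-string entries")
    # Compare lengths only for the columns that are actually lists.
    lengths = {len(d[k]) for k in REQUIRED_KEYS if isinstance(d.get(k), list)}
    if len(lengths) > 1:
        problems.append("columns have different lengths")
    return problems


dpo_dataset_dict = {
    "prompt": ["hello"],
    "chosen": ["hi nice to meet you"],
    "rejected": ["leave me alone"],
}
print(check_dpo_dict(dpo_dataset_dict))
```

For the dict above this reports no problems, so if the assumed layout is the right one, the dataset contents themselves do not look like the source of the TypeError.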