Hi, has anyone encountered the same problem when trying to use "indonlp/cendol-llama2-7b-chat"?
Here's my notebook:
https://colab.research.google.com/drive/1EMypQDmEeLjWYFVLBQuMCNAn7ld56rHQ?usp=sharing
And here is the stacktrace:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/torch/utils/_config_module.py in __getattr__(self, name)
141 try:
--> 142 return self._config[name]
143 except KeyError as e:
KeyError: 'vocab_size'
The above exception was the direct cause of the following exception:
AttributeError Traceback (most recent call last)
4 frames
<ipython-input-2-b0633927e3ce> in <cell line: 24>()
22 # ] # More models at https://huggingface.co/unsloth
23
---> 24 model, tokenizer = FastLanguageModel.from_pretrained(
25 model_name = "indonlp/cendol-llama2-7b-chat",
26 max_seq_length = max_seq_length,
/usr/local/lib/python3.10/dist-packages/unsloth/models/loader.py in from_pretrained(model_name, max_seq_length, dtype, load_in_4bit, token, device_map, rope_scaling, fix_tokenizer, trust_remote_code, use_gradient_checkpointing, resize_model_vocab, revision, *args, **kwargs)
270 pass
271
--> 272 model, tokenizer = dispatch_model.from_pretrained(
273 model_name = model_name,
274 max_seq_length = max_seq_length,
/usr/local/lib/python3.10/dist-packages/unsloth/models/llama.py in from_pretrained(model_name, max_seq_length, dtype, load_in_4bit, token, device_map, rope_scaling, fix_tokenizer, model_patcher, tokenizer_name, trust_remote_code, **kwargs)
1401 )
1402
-> 1403 model, tokenizer = patch_tokenizer(model, tokenizer)
1404 model = model_patcher.post_patch(model)
1405
/usr/local/lib/python3.10/dist-packages/unsloth/models/_utils.py in patch_tokenizer(model, tokenizer)
468 if len(check_pad_token) != 1:
469 possible_pad_token = None
--> 470 if check_pad_token[0] >= config.vocab_size:
471 possible_pad_token = None
472 pass
/usr/local/lib/python3.10/dist-packages/torch/utils/_config_module.py in __getattr__(self, name)
143 except KeyError as e:
144 # make hasattr() work properly
--> 145 raise AttributeError(f"{self.__name__}.{name} does not exist") from e
146
147 def __delattr__(self, name):
AttributeError: torch._dynamo.config.vocab_size does not exist
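Reading the trace, the `config` that `patch_tokenizer` looks up seems to resolve to `torch._dynamo.config` rather than the model's own config, which would explain why `vocab_size` is missing. I have not checked the unsloth source, so treat that as a guess from the error message. Here is a minimal sketch of the mechanism in plain Python; `ConfigModule`, `ModelConfig`, `pad_token_in_range` and `describe` are hypothetical stand-ins written for illustration, not unsloth or torch code:

```python
class ConfigModule:
    """Hypothetical stand-in for a config object that stores options in a dict."""

    def __init__(self, name, values):
        self._name = name
        self._config = dict(values)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self._config[name]
        except KeyError as e:
            # Same pattern as in the trace: KeyError re-raised as AttributeError.
            raise AttributeError(f"{self._name}.{name} does not exist") from e


class ModelConfig:
    """Hypothetical stand-in for a model config that does carry vocab_size."""

    def __init__(self, vocab_size):
        self.vocab_size = vocab_size


def pad_token_in_range(pad_token_id, config):
    # Mirrors the shape of the failing comparison against config.vocab_size.
    return pad_token_id < config.vocab_size


def describe(pad_token_id, config):
    # Return the result as text, or the error message if the lookup fails.
    try:
        return str(pad_token_in_range(pad_token_id, config))
    except AttributeError as err:
        return f"AttributeError: {err}"


model_config = ModelConfig(vocab_size=32000)  # illustrative value only
dynamo_config = ConfigModule("torch._dynamo.config", {"verbose": False})

print(describe(0, model_config))   # True
print(describe(0, dynamo_config))  # AttributeError: torch._dynamo.config.vocab_size does not exist
```

If that reading is right, the bug is that the wrong object is bound to the name `config` at that point, not anything in the model or the notebook.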
So, I just found a temporary fix: install specifically the commit right before commit 8001d30.
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git@bfe38e6ea8d3d7cf8ce9e37962de03c71c90cbe2"
!pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes
I just had the same issue, thank you for the temporary fix!
I just had the same issue with meta-llama/Meta-Llama-3.1-8B. It only happens with load_in_4bit = False; with load_in_4bit = True it works.
My configuration is:
==((====))== Unsloth 2024.8: Fast Llama patching. Transformers = 4.44.0.
\\ /| GPU: NVIDIA GeForce RTX 3090. Max memory: 23.593 GB. Platform = Linux.
O^O/ \_/ \ Pytorch: 2.3.0+cu121. CUDA = 8.6. CUDA Toolkit = 12.1.
\ / Bfloat16 = TRUE. FA [Xformers = 0.0.26.post1. FA2 = True]
The temporary fix above also worked for me.
This should be fixed now. Please upgrade unsloth!