#Gemma 2 issues on raw pretraining

11 messages · Page 1 of 1 (latest)

tiny temple
#

Using this notebook:https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing with gemma 2 results in the following error: (only tested 2b, both instruct and base)

Exception in thread Thread-12 (generate):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/peft/peft_model.py", line 1704, in generate
    outputs = self.base_model.generate(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2024, in generate
    result = self._sample(
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2982, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/unsloth/models/llama.py", line 919, in _CausalLM_fast_forward
    outputs = fast_forward_inference(
  File "/usr/local/lib/python3.10/dist-packages/unsloth/models/gemma2.py", line 396, in Gemma2Model_fast_forward_inference
    seq_len = past_key_values[0][0].shape[-2]
TypeError: 'HybridCache' object is not subscriptable

This error occurs when running the last step (inference). The same notebook works for llama 3.2 3B

tiny temple
#

Another issue is that i trained llama 3.2 3B base on this same notebook but when i try to do inference using VLLM, i get this error:
"As of transformers v4.44, default chat template is no longer allowed, so you must provide a chat template if the tokenizer does not define one."

Since we trained a base model with no chat template, what chat template am i supposed to use?

tiny temple
#

its not the same issue

split oriole
tiny temple
split oriole
#

Did you actually read the link telling you to update transformers?

#

I am aware base models do not have a chat template.

#

Good luck.

tiny temple
#

i did unless i missed reading something