Issues with inference on large input on llama 3 8B fine tuning | Unsloth AI | Page 1

clever shoal May 1, 2024, 10:54 PM

I'm trying to run the Llama 3 8B fine-tuning notebook (https://colab.research.google.com/drive/135ced7oHytdxu3N2DNe1Z0kqjyYIkDXp) with inputs longer than 8K tokens. I'm getting an error at the inference step:

Token indices sequence length is longer than the specified maximum sequence length for this model (10106 > 8192). Running this sequence through the model will result in indexing errors
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
Unsloth: Input IDs of length 10106 > the model's max sequence length of 8192.
We shall truncate it ourselves. It's imperative if you correct this issue first.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-33-6210034c4068> in <cell line: 12>()
     10 ], return_tensors = "pt").to("cuda")
     11 
---> 12 outputs = model.generate(**inputs, max_new_tokens = 2056, use_cache = True)
     13 tokenizer.batch_decode(outputs)

49 frames
/usr/local/lib/python3.10/dist-packages/unsloth/models/llama.py in LlamaModel_fast_forward(self, input_ids, causal_mask, attention_mask, position_ids, past_key_values, inputs_embeds, use_cache, output_attentions, output_hidden_states, return_dict, *args, **kwargs)
    581             inputs_embeds.requires_grad_(False)
    582         pass
--> 583         inputs_embeds *= attention_mask.unsqueeze(0).transpose(0, 1).transpose(1, 2)
    584         if inputs_requires_grad: inputs_embeds.requires_grad_(True)
    585     pass

RuntimeError: The size of tensor a (8192) must match the size of tensor b (10106) at non-singleton dimension 1

It seems to be complaining that the input is too long. Given that training on longer contexts worked fine, I'm not sure why it's complaining. Any guidance would be appreciated.
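
One possible reading of the traceback: the warning says the input IDs are truncated to 8192, while the failing multiplication still sees a 10106-long attention mask, so the two may be out of sync. A workaround that might help in the meantime is to trim every field of the tokenizer output to the same length before calling `model.generate`. The sketch below is only an illustration on plain Python lists; `truncate_encoding` and its parameters are hypothetical names, not part of Unsloth or transformers, and reserving room for `max_new_tokens` inside the 8192 limit is an assumption rather than documented behaviour.

```python
def truncate_encoding(encoding, max_seq_length, max_new_tokens=0, keep="right"):
    """Trim every field of a batched encoding to the same length.

    `encoding` maps field names (e.g. "input_ids", "attention_mask") to
    lists of per-sequence lists, the shape a tokenizer returns when
    `return_tensors` is not set. Hypothetical helper, for illustration only.
    """
    # Assumption: generated tokens also count against the model's limit.
    budget = max_seq_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")

    trimmed = {}
    for name, rows in encoding.items():
        if keep == "right":
            # Keep the most recent tokens (end of the prompt).
            trimmed[name] = [row[-budget:] for row in rows]
        else:
            # Keep the beginning of the prompt instead.
            trimmed[name] = [row[:budget] for row in rows]
    return trimmed


# Sizes taken from the error message above.
encoding = {
    "input_ids": [list(range(10106))],
    "attention_mask": [[1] * 10106],
}
trimmed = truncate_encoding(encoding, max_seq_length=8192, max_new_tokens=2056)

# Both fields now have the same length, so they can no longer disagree.
print(len(trimmed["input_ids"][0]), len(trimmed["attention_mask"][0]))  # 6136 6136
```

With `return_tensors = "pt"` the same idea would be a slice along the sequence dimension (for example `v[:, -budget:]` for each tensor in `inputs`). Whether dropping the start or the end of the prompt is acceptable depends on where the important context sits.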

stable dust May 2, 2024, 5:18 AM

Oh we'll check this. So sorry for the issue.

stable dust May 7, 2024, 4:22 PM

@past surge

past surge May 7, 2024, 6:48 PM

Oh yes, yes, will investigate again. Apologies.
