#Model does not stop generating during inference after fine-tuning.


west falcon

I have a problem: after fine-tuning, the model does not stop generating at inference time. It keeps producing further answers even after it has already answered the question. The model is based on Llama 2.
It looks like the model has a problem with the EOS token somehow. Here is my formatting function:

def create_conversation(sample) -> dict:
    # system_message is defined elsewhere in the training script
    strip_characters = "\"'"
    return {
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user",
             "content": f"{sample['instruction'].strip(strip_characters)} "
                        f"{sample['input'].strip(strip_characters)}"},
            {"role": "assistant",
             "content": f"{sample['output'].strip(strip_characters)}"}  # maybe add the EOS token here?
        ]
    }

Here is my chat template:

tokenizer.chat_template = "{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% elif false == true and not '<<SYS>>' in messages[0]['content'] %}{% set loop_messages = messages %}{% set system_message = '' %}{% else %}{% set loop_messages = messages %}{% set system_message = false %}{% endif %}{% for message in loop_messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if loop.index0 == 0 and system_message != false %}{% set content = '<<SYS>>\n' + system_message + '\n<</SYS>>\n\n' + message['content'] %}{% else %}{% set content = message['content'] %}{% endif %}{% if message['role'] == 'user' %}{{ bos_token + '[INST] ' + content.strip() + ' [/INST]' }}{% elif message['role'] == 'system' %}{{ '<<SYS>>\n' + content.strip() + '\n<</SYS>>\n\n' }}{% elif message['role'] == 'assistant' %}{{ ' '  + content.strip() + ' ' + eos_token }}{% endif %}{% endfor %}"
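
For reference, here is a minimal plain-Python sketch of what this template produces for a single system/user/assistant exchange. It is a hand-written approximation of the Jinja logic above, not the tokenizer's own renderer: `render_llama2` is a hypothetical helper name, and the `<s>` and `</s>` defaults are taken from the tokenizer dump in this thread. It suggests that the template itself does append the EOS token after each assistant turn.

```python
def render_llama2(messages, bos_token="<s>", eos_token="</s>"):
    """Hand-written approximation of the Llama-2 chat template above."""
    system_message = None
    if messages and messages[0]["role"] == "system":
        system_message = messages[0]["content"]
        messages = messages[1:]

    parts = []
    for i, message in enumerate(messages):
        # Roles must alternate user/assistant/user/assistant/...
        if (message["role"] == "user") != (i % 2 == 0):
            raise ValueError("Conversation roles must alternate user/assistant")
        content = message["content"]
        # The system prompt is folded into the first user turn
        if i == 0 and system_message is not None:
            content = f"<<SYS>>\n{system_message}\n<</SYS>>\n\n{content}"
        if message["role"] == "user":
            parts.append(f"{bos_token}[INST] {content.strip()} [/INST]")
        elif message["role"] == "assistant":
            # The assistant turn is closed with the EOS token
            parts.append(f" {content.strip()} {eos_token}")
    return "".join(parts)


example = render_llama2([
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello"},
])
print(repr(example))
```

If the rendered training text ends with `</s>` like this, the formatting step is probably not where the EOS token gets lost.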

Here is the tokenizer:
```
LlamaTokenizerFast(name_or_path='OPI-PG/Qra-7b', vocab_size=32000, model_max_length=4096, is_fast=True, padding_side='left', truncation_side='right', special_tokens={'bos_token': '<s>', 'eos_token': '</s>', 'unk_token': '<unk>', 'pad_token': '</s>'}, clean_up_tokenization_spaces=False), added_tokens_decoder={
0: AddedToken("<unk>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
1: AddedToken("<s>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
2: AddedToken("</s>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}
```


And here is the output response:
Ceremonia otwarcia Letnich Igrzysk Olimpijskich 2024 w Paryżu była kontrowersyjna ze względu na odtworzenie obrazu Leonarda da Vinci Ostatnia Wieczerza przez drag queens. \n\nCeremonia otwarcia Letnich Igrzysk Olimpijskich 2024 w Paryżu była kontrowersyjna ze względu na odtworzenie obrazu Leonarda da Vinci Ostatnia Wieczerza przez drag queens. \n\nCeremonia otwarcia Letnich Igrzysk Olimpijskich 2024 w Paryżu była kontrowersyjna ze względu na odtworzenie obrazu Leonarda da Vinci Ostatnia Wieczerza przez drag queens. \n\nCeremonia otwarcia Letnich Igrzysk Olimpijskich 2024 w Paryżu była kontrowersyjna ze względu na odtworzenie obrazu Leonarda da Vinci Ostatnia Wieczerza przez drag queens. \n\nCeremonia ot

The first sentence is already the answer to the question, but the model keeps generating.
craggy dove

Hi there, this usually means that something went wrong during the chat template step.

rocky ore
west falcon

The template was taken from the Llama-2 chat model.

west falcon

I think the problem was that I had set the pad token equal to the EOS token. I let Unsloth fix the tokenizer (fix_tokenizer = True) and set the pad token to <unk>.
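
A likely explanation for why this helps, sketched under assumptions: many causal-LM collators (for example, Hugging Face's `DataCollatorForLanguageModeling` with `mlm=False`) replace every label equal to `pad_token_id` with -100 so that padding is ignored by the loss. If the pad token is also the EOS token, the real EOS at the end of each answer is masked as well, and the model gets no training signal to stop. Whether this exact collator was used here is not stated in the thread. The toy function below imitates that masking in plain Python; `mask_pad_labels` is a hypothetical helper, and the token IDs other than 0 (`<unk>`), 1 (`<s>`), and 2 (`</s>`) are made up.

```python
IGNORE_INDEX = -100  # label value that the cross-entropy loss skips


def mask_pad_labels(input_ids, pad_token_id):
    """Copy input_ids to labels, ignoring every position equal to the pad ID."""
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]


UNK_ID, EOS_ID = 0, 2             # IDs from the tokenizer dump in this thread
answer = [1, 5000, 5001, EOS_ID]  # <s>, two made-up content tokens, </s>

# pad == eos: the left padding is masked, but so is the real EOS label
labels_bad = mask_pad_labels([EOS_ID, EOS_ID] + answer, pad_token_id=EOS_ID)

# pad == <unk>: only the padding is masked, the EOS label survives
labels_ok = mask_pad_labels([UNK_ID, UNK_ID] + answer, pad_token_id=UNK_ID)

print(labels_bad)
print(labels_ok)
```

In the first case the final label is -100, so the loss never sees the EOS token; in the second case the final label is 2 (`</s>`) and the model can learn to end its answer.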