# Understanding how logits work in GPT-2

devout prairie
Hello. I have the following task: given a string such as "This is a nice string", I have to tokenize it and compute the log prob of each token. I assume this means that, e.g., for "nice" I need to compute log p(nice | This is a).

This is the code I have written to achieve this:

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

model_name = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model.to(device)
model.eval()

text = "This is a nice string."
tokens = tokenizer.encode(text, return_tensors="pt").to(device)  # inputs must be on the same device as the model
tokenized_sequence = tokenizer.convert_ids_to_tokens(tokens[0])
print(f"Tokenized sequence: {tokenized_sequence} \n")

with torch.no_grad():  # inference only, no gradients needed
    outputs = model(tokens)
logits = outputs.logits  # shape: [1, sequence_length, vocab_size]

log_probs = torch.log_softmax(logits, dim=-1)

for idx, token in enumerate(tokenized_sequence):
    token_id = tokenizer.convert_tokens_to_ids(token)  
    log_prob = log_probs[0, idx, token_id].item()      
    print(f"token: {token},  log prob: {log_prob}")

There is something that I'm not sure about. In the line log_prob = log_probs[0, idx, token_id].item(), should I use log_prob = log_probs[0, idx-1, token_id].item() instead? In other words, do the logits at position i in the sequence give the prediction for the token at position i, using the previous tokens 0,...,i-1 as context, or do they predict the token at position i+1, using all tokens up to and including the i-th one? In the latter case, the shift idx -> idx-1 is necessary, and the initial token should also be dealt with separately (do I just skip it, or assign it an indeterminate value?).

granite oyster
You have to use idx-1.

That's because the model does next-token prediction: the logits at position i are computed from tokens 0..i only (the model never sees what comes after), so they are the prediction for the token at position i+1. The log prob of the token at idx therefore comes from the logits at idx-1.
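
The shift described above can be sketched as a small helper. This is only an illustration, not code from the thread: the function name shifted_token_log_probs and the toy tensors are hypothetical, and the hand-made logits stand in for a real model output so the alignment is easy to check by eye.

```python
import torch


def shifted_token_log_probs(logits: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    """Return log p(token_i | tokens_<i) for i = 1..T-1.

    logits:    [batch, T, vocab], as returned by a causal LM
    input_ids: [batch, T]
    result:    [batch, T-1]; entry j scores token j+1 using the logits at position j
    """
    log_probs = torch.log_softmax(logits, dim=-1)
    # Logits at position i predict token i+1, so drop the last position's
    # logits and the first token's id to line the two up.
    preds = log_probs[:, :-1, :]
    targets = input_ids[:, 1:]
    return preds.gather(-1, targets.unsqueeze(-1)).squeeze(-1)


# Toy check with hand-made logits (hypothetical data: vocab size 3, length 3).
toy_ids = torch.tensor([[0, 2, 1]])
toy_logits = torch.full((1, 3, 3), -1e4)
toy_logits[0, 0, 2] = 0.0  # position 0 puts almost all mass on token 2
toy_logits[0, 1, 1] = 0.0  # position 1 puts almost all mass on token 1
toy_logits[0, 2, 0] = 0.0  # position 2 predicts a token beyond the sequence; unused
toy_scores = shifted_token_log_probs(toy_logits, toy_ids)
print(toy_scores)  # two values, both close to 0
```

With the variables from the question, shifted_token_log_probs(outputs.logits, tokens)[0] would line up with tokenized_sequence[1:]. The first token has no context to condition on, so this sketch simply skips it. One possible alternative (an assumption, not something the thread settles) is to prepend tokenizer.bos_token_id, which for GPT-2 is <|endoftext|>, so that the first real token also gets a score.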