#Loss comes out to be NaN while fine-tuning BERT
Yea that would do
Please try this:
import tensorflow as tf

class BERTForClassification(tf.keras.Model):
    def __init__(self, bert_model, num_classes):
        super().__init__()
        self.bert = bert_model
        # Hidden layer on top of the 768-dim BERT output
        self.fc1 = tf.keras.layers.Dense(768, activation="relu")
        # Classification layer: one softmax probability per class
        self.fc2 = tf.keras.layers.Dense(num_classes, activation="softmax")

    def call(self, inputs):
        # Pooled [CLS] output, shape (batch, 768)
        x = self.bert(inputs)[1]
        x = self.fc1(x)
        output = self.fc2(x)
        return output
Also, you passed num_classes into fc1 too, and assigned the same name to both layers. What we're looking for is:
model output -> take only the CLS token output (size 768) -> pass it into a dense layer (size 768) -> pass it into the classification layer
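On picking the CLS vector, the slice matters. Here is a small shape check with NumPy stand-ins, assuming the BERT model returns the usual pair of outputs: per-token hidden states of shape (batch, seq_len, 768) at index 0 and a pooled output of shape (batch, 768) at index 1. The array names and sizes are made up for illustration.

```python
import numpy as np

# Hypothetical sizes; 768 is the hidden size mentioned above.
batch_size, seq_len, hidden = 4, 16, 768

# Stand-ins for the two BERT outputs (assumed layout, see lead-in).
sequence_output = np.zeros((batch_size, seq_len, hidden))
pooled_output = np.zeros((batch_size, hidden))

# First token ([CLS]) for every example in the batch.
cls_per_example = sequence_output[:, 0]

# Slicing the batch axis with [:1] keeps only the first example.
first_example_only = pooled_output[:1]

print(cls_per_example.shape)     # one 768-dim vector per example
print(first_example_only.shape)  # a single example, rest of the batch dropped
```

So either the pooled output as-is or `[:, 0]` on the per-token output gives one vector per example, while `[:1]` on the batch axis throws away most of the batch.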
got your point but using the new class gives this error
ok, I fixed that by passing training=False in the call function, and it now shows a loss as well as improving accuracy
ahhhh the train accuracy reached 75 but test accuracy is 7-8
Hm, you will have to tune the run. This is usually the challenge in training models; there are so many things to play with
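One of those knobs, given the gap between train and test accuracy, is to watch a validation loss after each epoch and stop once it stops improving. A minimal sketch of that logic in plain Python follows; the function name, the `patience` and `min_delta` parameters, and the loss values are all made up for illustration, not taken from the notebook.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for per-epoch validation losses.

    Stops once the loss has failed to improve by more than min_delta
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Never ran out of patience: training ran to the last epoch.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, for illustration only.
history = [0.90, 0.70, 0.65, 0.68, 0.72, 0.80]
stop_epoch, best_epoch = early_stopping_epoch(history, patience=2)
print(stop_epoch, best_epoch)
```

In Keras the same idea is available as `tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=2, restore_best_weights=True)`, passed to `model.fit` through `callbacks`, provided validation data is supplied.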
any idea what I can do about it? I tried increasing the learning rate a bit and it did train faster, but then the test accuracy is very low
also I noticed this one thing
during every epoch from 0 to 44 the accuracy increases quite a bit
but starts to decrease afterwards
losing 5 percent or more by the time the epoch ends
is it normal?
It really depends on the input data you have. Start by checking the data points. It's normal for accuracy to fluctuate. Also, precision and recall are a better way of evaluating a classification problem. Try looking at the precision and recall scores and you'll get an idea of the cases where your model is failing.
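For the precision and recall check, here is a rough sketch of computing both per class from true and predicted labels in plain Python. The function name and the toy labels are made up for illustration; in practice `y_pred` would be the argmax of the model's softmax output on the test set.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for integer or string labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but it was wrong
            fn[t] += 1  # true class t was missed
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        report[label] = (precision, recall)
    return report


# Toy labels, for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
report = per_class_precision_recall(y_true, y_pred)
for label, (precision, recall) in report.items():
    print(label, round(precision, 2), round(recall, 2))
```

A class with low recall is one the model keeps missing, and a class with low precision is one it predicts too often. If scikit-learn is installed, `sklearn.metrics.classification_report(y_true, y_pred)` prints the same kind of per-class table.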
Also, try tweaking the fully connected layer size; increase it to 1024 or something.
umm no luck, the accuracy rapidly rises to like 30 percent in the first epoch, and then in the next epoch it goes to like 50 in the first few seconds and by the end of 15 minutes goes down to 45
I am not that proficient in fine-tuning models so no idea what's going on here. Mind taking a look at the Colab notebook? Just need a start in the right direction
@strange geode heyy mind if you look into this