#CNN not good at testing new data

atomic wedge
#

why does my cnn model perform well on training and validation, but very badly when tested on new data?

what's my dataset?
Images of a person in a sleeping position: Supine, Left Lateral, or Right Lateral. The background of the image is black (to reduce complexity).

Does my model overfit?
I don't think so. please refer to the images (model accuracy and model loss)

Did I augment my data?
yes

What's my training and testing data?
I have approx. 4.5k images. I used ImageDataGenerator for augmenting, shuffling, and splitting into training and validation.

obsidian canyon
#

What you'd want to see is how different your test set is from your train+val set. It's actually possible to overfit to your train+val set when it is not representative of the test set or real-world data

atomic wedge
#

the test set is not very different from my train+val data. i have a black background across the whole dataset. not really sure why the model is not picking up the patterns on the test set

obsidian canyon
#

What does the classification report look like?
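
A classification report breaks the single accuracy number down per class, which shows whether one sleeping position is being confused with another. The sketch below computes per-class precision and recall by hand; the label lists are made up for illustration, and the helper name `per_class_report` is hypothetical. In practice scikit-learn's `classification_report(y_true, y_pred)` produces the same kind of table.

```python
from collections import Counter

def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, support)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # how many samples truly belong to each class
    pred_count = Counter(y_pred)   # how many samples were predicted as each class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    report = {}
    for label in labels:
        precision = correct[label] / pred_count[label] if pred_count[label] else 0.0
        recall = correct[label] / true_count[label] if true_count[label] else 0.0
        report[label] = (precision, recall, true_count[label])
    return report

# Hypothetical labels, invented for this example only.
y_true = ["supine", "supine", "left", "left", "right", "right"]
y_pred = ["supine", "left", "left", "left", "right", "supine"]

report = per_class_report(y_true, y_pred)
for label, (precision, recall, support) in report.items():
    print(f"{label:>6}  precision={precision:.2f}  recall={recall:.2f}  n={support}")
```

A class with high precision but low recall (like `right` here) is one the model rarely predicts, which is a different failure from general overfitting.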

atomic wedge
#

Oh, about that I forgot to save the output since I'm constantly tinkering with the model

atomic wedge
#

this is my latest classification report

#

this is the result from the training:

#

i know it overfits, but that's the best result so far based on the classification report

#

the model's evaluation on the testing set is [0.705134391784668, 0.7232142686843872]

obsidian canyon
#

When you say your train and test set are not much different, what's the distribution of each class? Do you have imbalanced data?
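
One quick way to answer that question is to count labels per split and compare the fractions. This is only a sketch: the label lists are invented, and `class_fractions` is a hypothetical helper, not something from the thread.

```python
from collections import Counter

def class_fractions(labels):
    """Return {class_name: fraction of samples} for one split."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in sorted(counts.items())}

# Hypothetical label lists, invented for this example only.
train_labels = ["supine"] * 6 + ["left"] * 2 + ["right"] * 2
test_labels = ["supine"] * 2 + ["left"] * 4 + ["right"] * 4

train_frac = class_fractions(train_labels)
test_frac = class_fractions(test_labels)

for cls in train_frac:
    # .get() guards against a class that is missing from the test split
    print(f"{cls:>6}  train={train_frac[cls]:.0%}  test={test_frac.get(cls, 0.0):.0%}")
```

If the fractions differ a lot between splits, as they do in this toy data, accuracy on train+val and accuracy on test are not measuring the same mix of classes.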

solar flare
#

for a little experiment you could check this yourself, but as you scale up, this is a great example of why tracking data provenance is so important.

solar flare
#

It’s like if you have multiple photos of the same person (especially frames from a video, for example): they should all be in the same set

#

Or else there is leakage the model can take advantage of and the metrics will not represent unseen data
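
A minimal sketch of a split that keeps each person's images together, so near-duplicate frames cannot land on both sides. The file names and the idea that a person ID can be read from the name are assumptions made for illustration; `split_by_group` is a hypothetical helper. scikit-learn's `GroupShuffleSplit` does the same job if it is already installed.

```python
import random
from collections import defaultdict

def split_by_group(samples, test_fraction=0.2, seed=0):
    """Split (path, group_id) pairs so every group lands entirely on one side."""
    groups = defaultdict(list)
    for path, group_id in samples:
        groups[group_id].append(path)

    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)  # shuffle whole groups, not images

    n_test = max(1, round(len(group_ids) * test_fraction))
    test_ids = set(group_ids[:n_test])

    train = [p for g in group_ids if g not in test_ids for p in groups[g]]
    test = [p for g in group_ids if g in test_ids for p in groups[g]]
    return train, test

# Hypothetical data: 5 people, 4 frames each; the naming scheme is invented.
samples = [
    (f"person{person:02d}_frame{frame:03d}.jpg", f"person{person:02d}")
    for person in range(1, 6)
    for frame in range(4)
]
group_of = dict(samples)  # path -> person ID, used to verify the split

train, test = split_by_group(samples, test_fraction=0.2)
print(len(train), "train images,", len(test), "test images")
print("people in test:", sorted({group_of[p] for p in test}))
```

Note that `ImageDataGenerator` with a validation split works on individual images, so it cannot enforce this by itself; the grouping has to happen before the images reach the generator.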

atomic wedge
#

but i'm confused. isn't it the cnn's job to identify the position in unseen data?

solar flare
#

Well yeah, but if your validation set contains frames from the same video, or anything else very similar to the train set, that metric will not reflect the accuracy on unseen data

#

It’s a much bigger problem if this accidentally leaks into your test set and you then go on to deploy the model without realizing how poorly it will generalize

atomic wedge
#

hmmm got it

#

but I tried training the model on the training set and validating on unseen data. will this method give the model the ability to generalize better to more unseen data?

solar flare
#

It won’t improve your model so much as improve your analysis of the model.

#

Again, since at least nothing is leaking into the test data, you might still get a better idea of how the model really performs

atomic wedge
#

hello, i have one last question. I trained a model earlier and it's by far the best model I have. on the test data it got a loss of 0.60 and 81% accuracy (based on model.evaluate). is this an acceptable result to continue with more real data?
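
For context on reading those two numbers: Keras `model.evaluate` returns the loss first and then the compiled metrics (`model.metrics_names` gives the order). Assuming the loss here is categorical cross-entropy, which the thread does not state, it is an average negative log-probability rather than a percentage. The sketch below uses invented probabilities to show the reference point for three classes: a model that guesses uniformly scores ln(3).

```python
import math

def mean_cross_entropy(true_indices, predicted_probs):
    """Average of -log(probability assigned to the true class)."""
    total = 0.0
    for idx, probs in zip(true_indices, predicted_probs):
        total += -math.log(probs[idx])
    return total / len(true_indices)

# Hypothetical predictions for 4 samples over 3 classes, invented for illustration.
true_indices = [0, 1, 2, 0]

uniform = [[1 / 3, 1 / 3, 1 / 3]] * 4          # a model that always guesses evenly
baseline = mean_cross_entropy(true_indices, uniform)

confident = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
    [0.90, 0.05, 0.05],
]                                               # 0.9 on the correct class each time
better = mean_cross_entropy(true_indices, confident)

print(f"uniform guessing: {baseline:.4f}")      # ln(3), about 1.0986
print(f"confident and correct: {better:.4f}")
```

Under that assumption, a test loss around 0.60 sits between the uniform-guess baseline and a confidently correct model.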

solar flare
#

This is where you need domain knowledge/context to answer. Do you have any similar model to compare it to? Do you know how they performed with a similar amount of data?

It is hard to say for sure, but since you are getting mid-to-high 90%s on training, the model is probably at least somewhat sound. So yes, I think the best way to close the gap on your test data could be gathering more training data. If adding more data does not help, then you should reassess

atomic wedge
#

maybe i'll just try more variations built on this base model, since it has the best performance so far, then compare all the models using the test data.

#

thanks!
