I am training a convolutional neural network to classify signal data. It performs very well during training, reaching 97% training and validation accuracy, with a training loss of 0.04 and a validation loss of 0.14. I am using the Keras binary cross-entropy loss function.
However, when I test my model manually on individual signals, it appears to classify everything as one class, basically achieving 0% accuracy.
Immediately after training the model, I built a confusion matrix from 100 test cases drawn from never-before-seen test data. There it does very well, with roughly 90% accuracy and around 5 false positives and 5 false negatives.
I'm not sure why my model performs so well when trained and tested in batches, but performs extremely poorly when tested on individual test cases.
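To make the batch-versus-individual comparison concrete, here is a minimal sketch of the kind of check that could isolate the discrepancy. It is not my actual code: `compare_batch_vs_single` and `toy_predict` are hypothetical names, the input shape `(n_samples, 128, 1)` is an assumption, and `toy_predict` is only a NumPy stand-in for what would presumably be `model.predict` in Keras. The idea is to run the same samples once as a batch and once one at a time (keeping the batch axis and the same preprocessing) and count how often the predicted labels disagree.

```python
import numpy as np


def compare_batch_vs_single(predict_fn, x_batch, threshold=0.5):
    """Return the fraction of samples whose predicted label differs
    between one batched call and separate per-sample calls."""
    # One call on the whole batch, flattened to shape (n_samples,).
    batch_probs = np.asarray(predict_fn(x_batch)).reshape(-1)

    # One call per sample. Slicing with i:i+1 keeps the leading batch
    # axis, so each input has shape (1, ...) rather than (...).
    single_probs = np.array([
        np.asarray(predict_fn(x_batch[i:i + 1])).reshape(-1)[0]
        for i in range(len(x_batch))
    ])

    batch_labels = batch_probs > threshold
    single_labels = single_probs > threshold
    return float(np.mean(batch_labels != single_labels))


# Hypothetical stand-in for a trained model: a fixed logistic unit.
rng = np.random.default_rng(0)
weights = rng.normal(size=(128,))


def toy_predict(x):
    # x is assumed to have shape (n_samples, 128, 1).
    logits = x[..., 0] @ weights
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs[:, None]  # shape (n_samples, 1), like a sigmoid output


x_test = rng.normal(size=(100, 128, 1))
mismatch = compare_batch_vs_single(toy_predict, x_test)
print(f"fraction of disagreeing labels: {mismatch:.2f}")
```

If a check like this reports disagreement for the real model, the prediction would seem to depend on what else is in the batch or on how the single signal is shaped and preprocessed, rather than on the signal alone.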
Any ideas?