I trained an ANN classifier on training data with oversampling and without oversampling. For each dataset, I searched for the smallest validation loss by trial and error over 18 models. Without oversampling, the best model was the 7th, with a validation loss of 0.2122; with oversampling, the best model was the 10th, with a validation loss of 0.0939. However, when checking classification accuracy, the model without oversampling (model 7) gives the higher accuracy, namely 88%, while the model with oversampling (model 10) gives 87%. Why does this happen? And which of models 7 and 10 should be considered the best?
2 Answers
This is an incorrect comparison.
The loss, probably cross-entropy loss, evaluates the raw predictions made by the model.
The accuracy evaluates a two-stage pipeline that first uses the neural network to make predictions and then uses some decision rule to classify those raw predictions into discrete categories, typically a threshold (at least in the binary case).
Especially if you go with a software-default decision rule of taking the category with the highest predicted probability, you could wind up with results that seem to contradict each other. In fact, the results do not contradict each other; they just concern different models (one of which has a decision rule on top of the neural network).
If you have a good loss value but a poor accuracy value, consider changing the decision rule for how you use the predictions made by that low-loss model. The easiest way to do this is to change the classification threshold (at least in a binary problem). A more sophisticated approach is to work with the raw model predictions directly, since they carry more information than hard class labels. Frank Harrell's blog has two good posts explaining why.
Damage Caused by Classification Accuracy and Other Discontinuous Improper Accuracy Scoring Rules
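To make the threshold point concrete, here is a minimal numpy sketch with made-up probabilities and labels (not your data): the model and its loss stay fixed, and only the decision rule changes, yet accuracy improves.

```python
import numpy as np

# Hypothetical raw probabilities from a fitted model, and the true labels.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
p_hat  = np.array([0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.80, 0.90])

def accuracy_at(threshold):
    """Accuracy of the two-stage pipeline: model probabilities + threshold rule."""
    return np.mean((p_hat >= threshold) == y_true)

print(accuracy_at(0.5))   # default decision rule: 0.875
print(accuracy_at(0.38))  # different threshold, same model: 1.0
```

The model's predictions (and hence its loss) are identical in both calls; only the rule on top of them differs.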
Specific to your setup: if you do any oversampling at all (it is not clear that you should), leave the natural class ratio in the test data and apply the oversampling only to the training data.
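As a sketch of the split-then-oversample order, here is a numpy-only example on a toy imbalanced dataset (all names and sizes are illustrative): the split happens first, random oversampling of the minority class is applied to the training portion only, and the test set keeps its natural class ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced dataset: 90 negatives, 10 positives (illustrative only).
X = rng.normal(size=(100, 3))
y = np.array([0] * 90 + [1] * 10)

# Split FIRST, then oversample only the training portion.
idx = rng.permutation(100)
train, test = idx[:80], idx[80:]
X_tr, y_tr = X[train], y[train]
X_te, y_te = X[test], y[test]   # test set keeps the natural class ratio

# Randomly duplicate minority-class training rows until classes balance.
minority = np.where(y_tr == 1)[0]
extra = rng.choice(minority, size=(y_tr == 0).sum() - minority.size, replace=True)
X_bal = np.vstack([X_tr, X_tr[extra]])
y_bal = np.concatenate([y_tr, y_tr[extra]])
```

Libraries such as imbalanced-learn offer ready-made oversamplers, but the ordering constraint is the same: the resampler must never see the test rows.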
Loss and accuracy tend to correlate negatively, but their rank correlation is usually not $-1$. So if one model has a better loss than another, it doesn't necessarily also have a better accuracy, nor vice versa.
To see that, you can make a scatter plot of accuracy vs. loss.
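Here is a small numpy illustration of such a rank reversal, using invented probabilities (not your models): model B achieves a lower cross-entropy loss than model A while having lower accuracy, exactly the pattern you observed between models 7 and 10.

```python
import numpy as np

def log_loss(y, p):
    """Binary cross-entropy of predicted probabilities p against labels y."""
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def accuracy(y, p):
    """Accuracy after thresholding the probabilities at 0.5."""
    return np.mean((p >= 0.5) == y)

y   = np.array([0, 0, 1, 1])
p_a = np.array([0.4, 0.4, 0.6, 0.6])  # model A: all correct at 0.5, but unconfident
p_b = np.array([0.1, 0.6, 0.9, 0.9])  # model B: one miss, but mostly confident

print(log_loss(y, p_a), accuracy(y, p_a))  # loss ≈ 0.511, accuracy 1.0
print(log_loss(y, p_b), accuracy(y, p_b))  # loss ≈ 0.308, accuracy 0.75
```

Model B's confident, mostly-correct probabilities earn it the better loss, while model A's timid but always-right probabilities earn it the better accuracy.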
Which model is "better" depends on your goals. Choose a quality metric (for example loss, accuracy, or something else) that matches your goals.
And ideally, train the neural network with a loss that matches your goals. For example, in medical diagnosis, you might want to penalize false negatives more than false positives, or penalize the misclassification of a dangerous disease more than that of a less dangerous one.
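One way to encode such asymmetric costs is a class-weighted cross-entropy. The sketch below is a minimal numpy version (the weight values are illustrative, not a recommendation); most deep learning frameworks expose the same idea through a class-weight or sample-weight argument on their built-in losses.

```python
import numpy as np

def weighted_log_loss(y, p, w_pos=5.0, w_neg=1.0):
    """Cross-entropy where errors on the positive class (false negatives)
    cost w_pos times more than errors on the negative class.
    The 5:1 weighting here is purely illustrative."""
    return -np.mean(w_pos * y * np.log(p) + w_neg * (1 - y) * np.log(1 - p))

# A confident miss on a positive is penalized far more heavily
# than an equally confident miss on a negative:
print(weighted_log_loss(np.array([1.0]), np.array([0.1])))  # missed positive
print(weighted_log_loss(np.array([0.0]), np.array([0.9])))  # missed negative
```

Training against a loss like this pushes the network to trade some false positives for fewer false negatives, which is often the right trade in medical diagnosis.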