I trained an ANN classifier on training data with oversampling and without oversampling. For each dataset, I searched for the smallest validation loss by trial and error over 18 models. Without oversampling, the best model was the 7th, with a validation loss of 0.2122; with oversampling, the best model was the 10th, with a validation loss of 0.0939. However, when checking classification accuracy, the model without oversampling (model 7) gives the higher accuracy, namely 88%, while the model with oversampling (model 10) gives 87%. Why does this happen? And which of models 7 and 10 should be considered the best?
2 Answers
This is an incorrect comparison.
The loss, probably cross-entropy loss, evaluates the raw predictions made by the model.
The accuracy evaluates a two-stage pipeline that first uses the neural network to make predictions and then uses some decision rule to classify those raw predictions into discrete categories, typically a threshold (at least in the binary case).
Especially if you go with a software-default decision rule of taking the category with the highest predicted probability, you could wind up with results that seem to contradict each other. In fact, the results do not contradict each other; they just concern different models (one of which has a decision rule on top of the neural network).
If you have a good loss value but a poor accuracy value, consider changing the decision rule for how you use the predictions made by that low-loss model. The easiest way to do this is to change the classification threshold (at least in a binary problem). A more sophisticated approach is to work with the raw model predictions directly, since they carry more information than hard class labels. Frank Harrell's blog has two good posts explaining why.
Damage Caused by Classification Accuracy and Other Discontinuous Improper Accuracy Scoring Rules
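To make the threshold point concrete, here is a minimal numpy sketch with made-up probabilities and labels (not your data): the model and its loss stay fixed, and only the decision rule changes, yet accuracy improves.

```python
import numpy as np

# Hypothetical raw probabilities from a fitted model, and the true labels.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
p_hat  = np.array([0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.80, 0.90])

def accuracy_at(threshold):
    """Accuracy of the two-stage pipeline: model probabilities + threshold rule."""
    return np.mean((p_hat >= threshold) == y_true)

print(accuracy_at(0.5))   # default decision rule: 0.875
print(accuracy_at(0.38))  # different threshold, same model: 1.0
```

The model's predictions (and hence its loss) are identical in both calls; only the rule on top of them differs.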
Specific to your setup: if you do any oversampling at all (it is not clear that you should), leave the natural class ratio in the test data and apply the oversampling only to the training data.
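As a sketch of the split-then-oversample order, here is a numpy-only example on a toy imbalanced dataset (all names and sizes are illustrative): the split happens first, random oversampling of the minority class is applied to the training portion only, and the test set keeps its natural class ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced dataset: 90 negatives, 10 positives (illustrative only).
X = rng.normal(size=(100, 3))
y = np.array([0] * 90 + [1] * 10)

# Split FIRST, then oversample only the training portion.
idx = rng.permutation(100)
train, test = idx[:80], idx[80:]
X_tr, y_tr = X[train], y[train]
X_te, y_te = X[test], y[test]   # test set keeps the natural class ratio

# Randomly duplicate minority-class training rows until classes balance.
minority = np.where(y_tr == 1)[0]
extra = rng.choice(minority, size=(y_tr == 0).sum() - minority.size, replace=True)
X_bal = np.vstack([X_tr, X_tr[extra]])
y_bal = np.concatenate([y_tr, y_tr[extra]])
```

Libraries such as imbalanced-learn offer ready-made oversamplers, but the ordering constraint is the same: the resampler must never see the test rows.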
Loss and accuracy tend to correlate negatively, but their rank correlation is usually not $-1$. So if one model has a better loss than another, it doesn't necessarily also have a better accuracy, nor vice versa.
To see that, you can make a scatter plot of accuracy vs. loss.
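Here is a small numpy illustration of such a rank reversal, using invented probabilities (not your models): model B achieves a lower cross-entropy loss than model A while having lower accuracy, exactly the pattern you observed between models 7 and 10.

```python
import numpy as np

def log_loss(y, p):
    """Binary cross-entropy of predicted probabilities p against labels y."""
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def accuracy(y, p):
    """Accuracy after thresholding the probabilities at 0.5."""
    return np.mean((p >= 0.5) == y)

y   = np.array([0, 0, 1, 1])
p_a = np.array([0.4, 0.4, 0.6, 0.6])  # model A: all correct at 0.5, but unconfident
p_b = np.array([0.1, 0.6, 0.9, 0.9])  # model B: one miss, but mostly confident

print(log_loss(y, p_a), accuracy(y, p_a))  # loss ≈ 0.511, accuracy 1.0
print(log_loss(y, p_b), accuracy(y, p_b))  # loss ≈ 0.308, accuracy 0.75
```

Model B's confident, mostly-correct probabilities earn it the better loss, while model A's timid but always-right probabilities earn it the better accuracy.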
Which model is "better" depends on your goals. Choose a quality metric (for example loss, accuracy, or something else) that matches your goals.
And ideally, train the neural network with a loss that matches your goals. For example, in medical diagnosis, you might want to penalize false negatives more than false positives, or penalize the misclassification of a dangerous disease more than that of a less dangerous one.
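One way to encode such asymmetric costs is a class-weighted cross-entropy. The sketch below is a minimal numpy version (the weight values are illustrative, not a recommendation); most deep learning frameworks expose the same idea through a class-weight or sample-weight argument on their built-in losses.

```python
import numpy as np

def weighted_log_loss(y, p, w_pos=5.0, w_neg=1.0):
    """Cross-entropy where errors on the positive class (false negatives)
    cost w_pos times more than errors on the negative class.
    The 5:1 weighting here is purely illustrative."""
    return -np.mean(w_pos * y * np.log(p) + w_neg * (1 - y) * np.log(1 - p))

# A confident miss on a positive is penalized far more heavily
# than an equally confident miss on a negative:
print(weighted_log_loss(np.array([1.0]), np.array([0.1])))  # missed positive
print(weighted_log_loss(np.array([0.0]), np.array([0.9])))  # missed negative
```

Training against a loss like this pushes the network to trade some false positives for fewer false negatives, which is often the right trade in medical diagnosis.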