Why, when training ML models, do we not take the model with the highest accuracy?


It's pretty common for an ML model to lose accuracy at various points during training. Presumably that means the later versions are worse, so why do we take the last version instead of the one that had the highest accuracy?


12 Answers

Anonymous 0 Comments

We assume that earlier points of higher accuracy are due to overfitting, which other people have already talked about. Basically, a model that is accurate on one dataset may not be accurate on another, because it may have learned the noise in that dataset rather than the underlying pattern. But another point is robustness. We tend to want models that stay accurate despite small disturbances. Imagine the model is standing on a hill, where height is accuracy. If the model “fell off” a peak during training, that probably means the peak was very steep and narrow, so it was not very robust. We want to end up in a “flat” area, where small changes barely affect accuracy.
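To make the hill picture concrete, here is a rough toy sketch, not a real training setup. It assumes accuracy depends on a single weight `w` and follows a made-up Gaussian bump; the heights, widths, and noise level are all invented for illustration. It compares a tall, narrow peak with a slightly lower, wide one when the weight gets nudged by small random noise.

```python
import math
import random


def peak_accuracy(w, center, height, width):
    """Toy accuracy curve (an assumption): a Gaussian bump around `center`."""
    return height * math.exp(-((w - center) ** 2) / (2 * width ** 2))


def mean_accuracy_under_noise(center, height, width, noise_std,
                              n_samples=10_000, seed=0):
    """Average accuracy when the weight is jittered around the peak."""
    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    total = 0.0
    for _ in range(n_samples):
        w = center + rng.gauss(0.0, noise_std)  # small disturbance
        total += peak_accuracy(w, center, height, width)
    return total / n_samples


# Hypothetical peaks: the sharp one is higher, the flat one is wider.
sharp = dict(center=0.0, height=0.95, width=0.05)
flat = dict(center=0.0, height=0.92, width=1.0)

# Accuracy exactly at the top of each peak (no disturbance).
sharp_at_peak = peak_accuracy(0.0, **sharp)
flat_at_peak = peak_accuracy(0.0, **flat)

# Average accuracy once the weight is disturbed a little.
sharp_noisy = mean_accuracy_under_noise(noise_std=0.2, **sharp)
flat_noisy = mean_accuracy_under_noise(noise_std=0.2, **flat)

print(f"at peak:    sharp={sharp_at_peak:.3f}  flat={flat_at_peak:.3f}")
print(f"with noise: sharp={sharp_noisy:.3f}  flat={flat_noisy:.3f}")
```

In this toy setup the sharp peak wins if you land exactly on top of it, but the flat peak wins on average as soon as the weight is disturbed, which is the sense in which the flat area is more robust.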
