Why when training ML models do we not take the model with the highest accuracy


It's pretty common for an ML model to lose accuracy at various points during training. But presumably that means those versions are worse, so why do we take the last version instead of the one that had the highest accuracy?


12 Answers

Anonymous 0 Comments

Imagine a mountain range. You're wandering around trying to find the tallest peak, but you can't survey the whole horizon to spot it. All you can do is keep walking and noting when you climb higher or lower. If you only ever followed the uphill direction, you could wind up scaling one random mountain and never discover a different mountain that might be higher.

AI/ML is basically this. The training data creates a kind of mountain range, and training is essentially a semi-random walk over those mountains until you find the one that is most likely the highest. If you only ever went uphill, you would get stuck on one mountain instead of checking the hundreds of others.
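Here's a toy sketch of that idea, not how real training works. The mountain range, the peak positions, and all the numbers below are made up for illustration: a climber that only ever steps uphill gets stuck on whatever peak it starts near, while scattering many climbers across the range lets at least one find the taller peak.

```python
import random

# Toy "mountain range": two peaks; the taller one (height 5) is at x = 4,
# a shorter one (height 3) is at x = 1.
def height(x):
    return max(3 - (x - 1) ** 2, 5 - (x - 4) ** 2)

def greedy_climb(x, steps=1000, step=0.05, seed=0):
    """Classic hill climbing: propose small random moves, accept only uphill."""
    rng = random.Random(seed)
    for _ in range(steps):
        nx = x + rng.uniform(-step, step)
        if height(nx) > height(x):
            x = nx
    return x

# One greedy climber starting near the short peak climbs it and gets stuck:
# it would have to walk downhill to reach the taller peak, and it never will.
stuck = greedy_climb(0.0)

# Twenty climbers scattered at random starting points: at least one lands in
# the tall peak's basin and climbs it (a crude stand-in for the randomness
# in real training).
rng = random.Random(42)
starts = [rng.uniform(-2, 7) for _ in range(20)]
best = max((greedy_climb(s, seed=i) for i, s in enumerate(starts)), key=height)

print(height(stuck))  # near 3: top of the short peak
print(height(best))   # near 5: top of the tall peak
```

The point of the sketch is only the contrast: pure "always go up" commits you to one mountain, while some randomness in where (or how) you search is what lets you compare mountains at all.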

It should be noted that this is why AI/ML is essentially applied statistics. The height of each mountain is basically the likelihood of a given generated “answer” being the “right” one, and the definition of answer/right is subject to interpretation by the humans looking at the results.
