What are Hyperparameters, and why are they so important to A.I.?
Hyperparameters are values that are set before training an AI model. They can have a major impact on how the model performs, so it’s important to set them carefully.
Hyperparameters are settings that control the behavior of an AI model. They are important because they determine how well the model performs and how quickly it learns.
The parameters of an AI model are the gargantuan tables of numbers that make up the model’s calculations and that are refined during training.
The hyperparameters of a model are the decisions made in advance of any training, like how many layers the model should have, how many numbers should be used per layer, how many numbers should be used to represent a word, how quickly the values of these numbers should change during each phase of training, and so on. They are chosen up front by humans, not refined through training.
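A minimal sketch of that distinction, using hypothetical names: the hyperparameters are fixed by a human before training starts, while the parameters (the big tables of numbers) are merely initialized and then refined by training.

```python
import random

# Hyperparameters: decided up front by humans, not learned.
hyperparams = {
    "num_layers": 4,        # how many layers the model should have
    "hidden_size": 256,     # how many numbers per layer
    "embedding_size": 128,  # how many numbers represent a word
    "learning_rate": 0.01,  # how quickly values change during training
}

# Parameters: the tables of numbers refined during training.
# Here just tiny random stand-ins, shaped by the hyperparameters.
params = [
    [random.gauss(0.0, 0.02) for _ in range(hyperparams["hidden_size"])]
    for _ in range(hyperparams["num_layers"])
]

print(len(params), len(params[0]))  # → 4 256
```

Note how the hyperparameters determine the *shape* of the parameter tables before a single training step has run; training only ever changes the values inside those tables.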
Hyperparameters are inputs to the algorithm that the programmer has to tune themselves. In other words, they are just “parameters” – the “hyper” is mainly there to distinguish them from the parameters that the AI learns by itself from the training data.
One of the most common hyperparameters is the learning rate. When you feed new data to the AI, it can calculate in which direction it has to change to better fit that data. It’s basically trying to reach the point of lowest error by checking which way is down.
But how far it should step in that direction is up to the programmer – there’s no knowing whether the slope continues for a while, or whether a “mountain” of high error lies just ahead. The learning rate is this “how far”, and adjusting it is basically trial and error. Set it too low, and training takes forever to reach a good result. Set it too high, and you’ll overshoot every good result you find.
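The trade-off above can be sketched with gradient descent on the simplest possible error surface, f(x) = x², whose slope at any point is 2x. This is a toy illustration, not any particular library’s training loop; the function name and starting point are made up.

```python
def descend(learning_rate, steps=50, x=5.0):
    """Repeatedly step downhill on the error surface f(x) = x**2."""
    for _ in range(steps):
        gradient = 2 * x                  # which way is down (and how steep)
        x = x - learning_rate * gradient  # "how far" is the learning rate
    return x

# The minimum error is at x = 0.
print(descend(0.1))    # reasonable rate: lands very close to 0
print(descend(0.001))  # too low: barely moves after 50 steps
print(descend(1.1))    # too high: overshoots 0 and diverges
```

With a rate of 0.1 each step shrinks x toward the minimum; at 0.001 progress is glacial; at 1.1 every step jumps past the minimum to an even higher point on the opposite slope, so the error grows without bound.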