Inputs to the algorithm that the programmer has to tune themselves. In other words, they’re just “parameters”. The “hyper” is mainly there to distinguish them from the parameters that the AI learns by itself from the training data.
One of the most common hyperparameters is the learning rate. When you feed new data to the AI, it can calculate in which direction its parameters have to change in order to better fit that data. It’s basically trying to get to the point of lowest error by checking which way is down.
But how far it should move in that direction is up to the programmer – there’s no knowing whether the slope continues for a while, or whether there’s a “mountain” of high error just ahead. The learning rate is this “how far”, and adjusting it is basically just trial and error. Set it too low and it takes forever to get a good result. Set it too high and you’ll overshoot every good result the model finds.
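To make that concrete, here’s a tiny sketch in Python (the error curve and numbers are made up for illustration, not taken from any real model) of how the learning rate controls each “downhill” step:

```python
# A toy "model" with a single parameter w; lowest error is at w = 3.
def error(w):
    return (w - 3) ** 2

def gradient(w):
    # Slope of the error curve at w – tells us which way is "down".
    return 2 * (w - 3)

def train(learning_rate, steps=20):
    w = 0.0  # the parameter the model learns by itself
    for _ in range(steps):
        w -= learning_rate * gradient(w)  # move "downhill" by learning_rate
    return w, error(w)

# The learning rate itself is picked by the programmer, by trial and error:
print(train(0.01))  # too low: after 20 steps w is still far from 3
print(train(0.1))   # reasonable: w ends up very close to 3
print(train(1.1))   # too high: every step overshoots and the error grows
```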
The parameters of an AI model are the gargantuan tables of numbers that make up the calculations of the model and which are refined during model training.
The hyperparameters of a model are the decisions made in advance of any training, like how many layers the model should have, how many numbers should be used per layer, how many numbers should be used to represent a word, how quickly the values of these numbers should change during each phase of training, and so on. They are chosen up front by humans, not refined through training.
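As a rough illustration (the names and sizes here are invented, not any particular library’s API), the split looks something like this in code:

```python
# Sketch of the split: hyperparameters are fixed choices made before
# training, parameters are the big tables of numbers training refines.
import numpy as np

# Hyperparameters: chosen up front by a human, never changed by training.
hyperparameters = {
    "num_layers": 4,         # how many layers the model has
    "layer_width": 256,      # how many numbers per layer
    "embedding_size": 64,    # how many numbers represent a word
    "learning_rate": 0.001,  # how quickly parameter values change per step
}

# Parameters: large tables of numbers, initialised randomly and then
# refined during training.
rng = np.random.default_rng(0)
parameters = [
    rng.normal(size=(hyperparameters["layer_width"],
                     hyperparameters["layer_width"]))
    for _ in range(hyperparameters["num_layers"])
]

# One (very simplified) training step: nudge every parameter table a little,
# scaled by the learning rate. A real model would compute gradients of its
# error instead of using these placeholder ones.
def training_step(parameters, gradients, learning_rate):
    return [p - learning_rate * g for p, g in zip(parameters, gradients)]

fake_gradients = [np.ones_like(p) for p in parameters]
parameters = training_step(parameters, fake_gradients,
                           hyperparameters["learning_rate"])
```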