Hyperparameters are inputs to the algorithm that the programmer has to tune themselves. In other words, just “parameters”. The “hyper” is mainly there to distinguish them from the parameters that the AI learns by itself from the training data.
One of the most common hyperparameters is the learning rate. When you feed new data to the AI, it can calculate in which direction it has to change in order to better fit that data. It’s basically trying to get to the point of lowest error by checking which way is down.
But how far it should move in that direction is up to the programmer: there’s no knowing whether the slope continues for a while, or whether there’s a “mountain” of high error just ahead. The learning rate is this “how far”, and adjusting it is basically just trial and error. Set it too low, and it takes forever to get a good result. Set it too high, and you’ll keep overshooting the good results.
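If it helps to see the “how far” in action, here is a minimal sketch in Python. It is a toy, not how any real AI library does it: the error is assumed to be simply x squared (so the lowest error is at x = 0), and the function name `gradient_descent`, the starting point and the three learning rates are all made up for illustration.

```python
def gradient_descent(learning_rate, steps=20, start=10.0):
    """Toy example: try to minimize error(x) = x**2 by stepping downhill."""
    x = start
    for _ in range(steps):
        slope = 2 * x                   # derivative of x**2: which way is "down"
        x = x - learning_rate * slope   # the learning rate decides how far to step
    return x


# Hypothetical learning rates: too low, reasonable, too high.
for lr in (0.001, 0.3, 1.1):
    distance = abs(gradient_descent(lr))
    print(f"learning rate {lr}: distance from the best point = {distance:.6f}")
```

With the tiny learning rate, 20 steps barely move away from the starting point of 10. With the middle one, the result lands practically on 0. With the large one, every step jumps over the lowest point and ends up further away than before, so the error grows instead of shrinking.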