Say I give you a brand new video game but don’t tell you how to play it, so you have to figure it out yourself. If “you” are a computer, that is machine learning.
The textbook definition of machine learning is designing programs that perform a task without being explicitly told how to do it. On the mathematical side, this, very loosely speaking, boils down to solving a complex data-fitting problem.
One relatively modern strategy in machine learning involves artificial neural networks (ANNs). The idea is to build a model out of layers, with each layer made up of artificial neurons that loosely mimic biological neurons in the brain. Deep learning is the use of ANNs that contain many layers. There’s no fixed number of layers that separates a deep network from a “shallow” one, but almost all uses of neural networks have at least three layers, so everyone started using the phrase deep learning.
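To make the "layers of artificial neurons" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the weights are made-up numbers, the sigmoid activation is just one common choice, and the helper names (neuron, layer, forward) are hypothetical rather than taken from any library.

```python
import math

def sigmoid(z):
    # Squashes any number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # One artificial neuron: weighted sum of its inputs, plus a bias,
    # passed through an activation function.
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)

def layer(inputs, weight_rows, biases):
    # A layer is several neurons that all look at the same inputs.
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def forward(inputs, layers):
    # Feed the outputs of each layer into the next one.
    values = inputs
    for weight_rows, biases in layers:
        values = layer(values, weight_rows, biases)
    return values

# Hypothetical network with made-up weights:
# 2 inputs -> hidden layer of 3 neurons -> 1 output neuron.
network = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]

output = forward([1.0, 2.0], network)
print(output)
```

With random or made-up weights the output is meaningless. The learning part is about adjusting those weights until the outputs become useful.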
Simply put, a computer makes a guess as to how a problem might be solved to get a desired answer from some input values. The computer checks to see if the answer is, in fact, correct and if not, changes are made to the equations that the computer is using to determine the answer from the inputs. The process continues until the computer can reliably obtain the correct answer.
The trick lies in the method whereby the equations are corrected little by little over time. If the computer is allowed to make enough guesses it will be able to figure out how to compute the answer from the input.
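Here is a rough sketch of that guess, check, and correct loop. It assumes a toy problem (the hidden rule is y = 2x + 1), a straight-line model, and gradient descent as the correction method. The data, the train function, and the step size are all made up for illustration, not a description of any particular system.

```python
# Hypothetical toy data: inputs and the "correct answers" from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

def train(xs, ys, steps=5000, learning_rate=0.01):
    # Start with a deliberately bad guess for the equation y = w*x + b.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Guess: compute answers with the current equation,
        # then check how far off each one is.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Work out which direction to nudge w and b to shrink the
        # mean squared error (this is its gradient).
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # Correct the equation a little bit.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = train(xs, ys)
print(w, b)  # should end up very close to 2 and 1
```

Nobody told the program that the rule was "double it and add one". It only saw inputs, answers, and how wrong each guess was.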
There is a YouTube channel called Welch Labs that has a good series of videos explaining how a very simple implementation of machine learning would work. I also recall that MIT has some math lectures on YouTube that explain the basic mathematical approaches. The math at the heart of these processes is relatively approachable.
Just like how your hard drive stores data as 1’s and 0’s, neural networks compress data into the form of probabilities. Because the data isn’t set in stone, but is stored as odds, you can actually get out more than you literally put in, at the cost of accuracy.
This makes it appear as if they’re learning, or mimicking language or art. When Stable Diffusion makes a photo of a dog, it’s really just combining the most likely idea of a dog and the most likely surroundings into a photo, rather than building anything new.