How does machine learning work? I’ve seen plenty of videos but after a certain point, I realize that I am not learning anything because they make a lot of references to weird mathematical symbols and concepts I never learned/forgot.





CGP Grey’s video (link below) is the best ELI5 out there.

If you want more than this, you really need a passing grade in both linear algebra and Calculus 3. For a good start on that, go with 3Blue1Brown’s neural networks series.

Note that this is more of an “explain it like I have a bachelor’s degree in math” answer.

Let’s take an example. Say you’re trying to teach an AI to recognize fire hydrants in random pictures.

Digital pictures are essentially just a sequence of 1s and 0s, so you can feed these to the software. You start off by manually feeding the AI a lot of pictures that either do or do not have fire hydrants, and you tell it which are which. As the AI is fed more and more pictures, it can narrow down a set of rules for which patterns of bits represent a fire hydrant, and so whether a picture contains one.

It’s comparable to me giving you 100 pictures, each with a yes or no label, and asking you to identify what is special about the “yes” images.

That’s what the computer is doing, but since a simple AI doesn’t understand that an image represents a three-dimensional world, it needs many times more pictures.

And at the end the software will still have no concept of what a fire hydrant is, it’ll just be really good at identifying them.
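To make that concrete, here is a rough sketch of learning from labeled examples. Everything in it is an assumption for illustration: the "pictures" are tiny 4-pixel images, the hidden labeling rule is made up, and the learner is a perceptron, which is just one of many possible methods and far simpler than what real image recognition uses.

```python
from itertools import product

# Toy stand-in for pictures: every possible 4-pixel black-and-white "image".
# Hypothetical labeling rule (hidden from the learner): an image "contains a
# hydrant" when its first two pixels are both on.
images = [list(bits) for bits in product([0, 1], repeat=4)]
labels = [1 if img[0] == 1 and img[1] == 1 else 0 for img in images]

def predict(weights, bias, image):
    """Answer 1 ("yes") when the weighted sum of the pixels is positive."""
    score = bias + sum(w * pixel for w, pixel in zip(weights, image))
    return 1 if score > 0 else 0

def train(images, labels, max_passes=200):
    """Perceptron rule: nudge the weights only when a guess is wrong."""
    weights = [0.0] * len(images[0])
    bias = 0.0
    for _ in range(max_passes):
        mistakes = 0
        for image, label in zip(images, labels):
            error = label - predict(weights, bias, image)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                bias += error
                weights = [w + error * p for w, p in zip(weights, image)]
        if mistakes == 0:  # every labeled example is now classified correctly
            break
    return weights, bias

weights, bias = train(images, labels)
correct = sum(predict(weights, bias, img) == lab for img, lab in zip(images, labels))
print(f"{correct} of {len(images)} images classified correctly")
```

The learner is never told the rule. It only sees the yes/no labels and ends up with numbers that happen to reproduce them, which is the sense in which it has "no concept" of a hydrant.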

Basically, you set up some boundary conditions. You define which variables can’t be changed, which variables can be changed, and which outcomes are favorable. Then you let a computer make random guesses at the changeable variables and look at what the outputs are. Depending on whether they’re “good” or “bad”, the computer adjusts its guesses and tries again. Repeat millions of times.

A simple example that gets used a lot is animal skeletons. Specifically, dinosaurs. You can input all the dimensions for the skeletal structure, some generic guesses for mass, and then leave the motion to the computer. You let the computer guess which parts of the body to move and in which order, and you “score” its guesses by how much forward progress it makes and by how well the walking motion minimizes impact forces and strain on the joints (in most cases, animals move in the way that expends the least energy).

Let the computer make millions of guesses and pretty soon you have an accurate computer simulation of how a massive dinosaur would have really walked, including using its tail as counterbalance. It’s pretty cool stuff.
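The guess, score, adjust loop described above can be sketched in a few lines. This is only a toy under stated assumptions: the score function, the target values, and the allowed range are all invented, and nothing here simulates a skeleton or any physics. It just shows the shape of the loop.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical toy problem, not a physics simulation: two changeable
# variables, and a made-up score that is highest (zero) at (3.0, -1.0).
def score(guess):
    x, y = guess
    return -((x - 3.0) ** 2) - ((y + 1.0) ** 2)

# Boundary condition: both variables must stay inside this fixed range.
LOW, HIGH = -10.0, 10.0

def clamp(value):
    return max(LOW, min(HIGH, value))

best = [rng.uniform(LOW, HIGH), rng.uniform(LOW, HIGH)]  # first random guess
first_score = best_score = score(best)

for _ in range(5000):
    # Adjust the current best guess by a small random amount.
    candidate = [clamp(v + rng.gauss(0, 0.5)) for v in best]
    candidate_score = score(candidate)
    if candidate_score > best_score:  # keep only "good" adjustments
        best, best_score = candidate, candidate_score

print(f"best guess: {best}, score: {best_score:.4f}")
```

In a real walking simulation the score would come from running the physics for each guess, which is why it takes so many tries.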

The first thing to understand is that “machine learning” is a collection of methods that have only their goal in common: to classify or predict new stuff based on known stuff. The actual method, and thus the math, differs vastly depending on what you want to do.

But from my experience learning ML, the most important concept to grasp is matrices. They’re just a very convenient way to represent calculations on massive amounts of multi-dimensional data in a compact mathematical formula.
So my advice, if you want to properly understand ML, is to start with a linear algebra course. It doesn’t need to be an overly detailed one either.
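As a small illustration of why matrices are convenient, here is a sketch in plain Python. The data and weights are made-up numbers, and `mat_vec` is a helper written for this example rather than a library function. The point is that one matrix-times-vector product (often written y = Xw) stands for "apply the same weighted sum to every row".

```python
# Each row is one data point; each column is one measured feature.
# The numbers are made up purely for illustration.
data = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]
weights = [0.5, -1.0, 2.0]  # one weight per feature

def mat_vec(matrix, vector):
    """Multiply a matrix by a vector: one weighted sum per row."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

# The long way: spell out the same weighted sum for every row by hand.
by_hand = [
    1.0 * 0.5 + 2.0 * -1.0 + 3.0 * 2.0,
    4.0 * 0.5 + 5.0 * -1.0 + 6.0 * 2.0,
]

print(mat_vec(data, weights))
print(by_hand)
```

With two rows the long way is tolerable. With millions of rows and thousands of features, the compact form is the only one anybody wants to read or write.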

Have you ever thought you could come up with an equation for the value of, say, a house? Like this:

Value = $50,000 + $10 × Square Feet + $4,000 × Number of Bedrooms

In this example, “Square Feet” and “Number of Bedrooms” are the inputs; $50,000, $10, and $4,000 are the weights. It’s the weights that need to be “learned”. Maybe $51,257, $11.25 and $3,986 will work better? I don’t know. Maybe there’s a procedure I can use to calculate the best weights automatically?
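One such procedure is a least-squares fit, sketched below in plain Python. The assumptions are loud ones: the six houses are invented, their prices are generated from the formula above with no noise, and `solve` and `fit` are helpers written for this example. It is one common way to pick weights, not the only one.

```python
# Made-up houses generated from the formula in the text, so a good fit
# should land close to 50,000, 10 and 4,000. Real prices would be noisy.
houses = [(1000, 2), (1500, 3), (2000, 3), (2500, 4), (1200, 3), (3000, 5)]
prices = [50_000 + 10 * sqft + 4_000 * beds for sqft, beds in houses]

# One row per house: a constant 1 (for the base value), then the inputs.
rows = [[1.0, float(sqft), float(beds)] for sqft, beds in houses]

def solve(a, b):
    """Solve a small linear system a @ x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: swap in the row with the largest entry.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit(rows, targets):
    """Least-squares weights via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)

base, per_sqft, per_bedroom = fit(rows, prices)
print(f"Value = ${base:,.2f} + ${per_sqft:,.2f} x Square Feet "
      f"+ ${per_bedroom:,.2f} x Number of Bedrooms")
```

With real sale prices the fit would not reproduce any "true" weights exactly. It would just find the weights that make the formula's errors as small as possible on the houses it was given.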

Machine learning is using computers to go nuts with this basic idea.

Computers determine if something is true by performing a calculation: basically putting data into a formula and computing the result. Normally humans program the formula and the computer just does the calculation. You can imagine that if the result of the formula is positive, the answer is “true”. Machine learning is having the computer modify its own formula to get more accurate answers.

Machine learning works by giving a computer a formula with variables that the computer can weight, meaning the computer can decide for each variable whether it affects the result and by how much. The computer also gets a set of data with known answers.

So imagine a formula for determining if a person will live. The formula has variables for height, weight, whether they are breathing, whether their heart is beating, and whether they have cancer.

The computer creates two versions of the formula, randomly setting the weights for the different variables, runs the data through each formula, and sees if the result matches the known answer for that data. So if it runs the formula for someone who died, does the formula show that the person dies?

The computer then sees which version of the formula was the most accurate. It takes that version of the formula, modifies it slightly but randomly, and tries again.

Each time it gets a slightly more accurate version of the formula. It may never get a 100% accurate version of the formula. It just gets the formula as accurate as it can given the data it has.

In the case of the “will they live” formula, it should eventually figure out that height and weight matter very little and that breathing and a beating heart are almost required for the person to live.
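Here is a rough sketch of that mutate-and-keep-the-best loop. The records are invented for illustration (they are not real medical data), the formula is a plain weighted sum where a positive result means "lives", and the mutation size and number of rounds are arbitrary choices. Real systems usually adjust weights in a more directed way than pure random changes.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Made-up records: (height in m, weight in 100 kg, breathing, heartbeat,
# cancer) and whether the person lived. Purely illustrative, not real data.
people = [
    ((1.80, 0.80, 1, 1, 0), True),
    ((1.60, 0.55, 1, 1, 1), True),
    ((1.75, 0.95, 1, 0, 0), False),
    ((1.55, 0.60, 0, 1, 0), False),
    ((1.90, 1.00, 0, 0, 1), False),
    ((1.70, 0.70, 1, 1, 0), True),
    ((1.65, 0.85, 0, 0, 0), False),
    ((1.85, 0.75, 1, 1, 1), True),
]

def lives(formula, features):
    """A positive result means the formula answers "true" (the person lives)."""
    bias, weights = formula[0], formula[1:]
    return bias + sum(w * f for w, f in zip(weights, features)) > 0

def accuracy(formula):
    """Fraction of records where the formula matches the known answer."""
    hits = sum(lives(formula, feats) == known for feats, known in people)
    return hits / len(people)

def random_formula():
    return [rng.uniform(-1, 1) for _ in range(6)]  # bias + 5 weights

def mutate(formula):
    """Copy the formula and change every number slightly but randomly."""
    return [value + rng.gauss(0, 0.2) for value in formula]

# Start with two random versions and keep the more accurate one.
best = max(random_formula(), random_formula(), key=accuracy)
first_accuracy = accuracy(best)

for _ in range(2000):
    challenger = mutate(best)
    if accuracy(challenger) >= accuracy(best):  # ties let the search drift
        best = challenger

print(f"accuracy went from {first_accuracy:.2f} to {accuracy(best):.2f}")
```

Because a challenger only replaces the current formula when it is at least as accurate, the accuracy can never go down from one round to the next. Whether it reaches 100% depends on the data and on luck.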

The key to machine learning is that a computer can modify the formula and run a test hundreds or thousands of times per minute. So it is not smart, it is just fast.