To understand deep learning, you need to understand a little about what deep learning and AI are *trying* to simulate: roughly, how the human brain solves problems.
When we normally program computers, we write down a series of steps the computer has to follow to achieve a result. But what happens when the problem is so complicated that we don’t really have a series of steps to solve it?
Imagine I hand you a picture. Could you determine whether it’s a picture of a cat? You could probably do that pretty easily, in less than a second, right? But amazingly, it doesn’t matter if I hand you a cartoon image of Garfield, a black and white photo, a 3D model of a cat, or a stone slab with a cat carved into it. It doesn’t matter if it’s a lion, a puma, or a house cat. You could easily identify all of these things as cats.
How did you do it? What are the steps you followed?
There isn’t an easy answer to this, and problems like this, which don’t have obvious steps to follow, are the kind we use AI to solve.
When you learned what a cat is and how to identify one, you learned by seeing cats and images of cats, having people tell you “that’s a cat,” and then having people show you things and ask whether each one is a cat. Each time you got it wrong (or right), your brain responded by strengthening or weakening certain connections between its neurons. Your brain built specific pathways for signals to travel along in response to particular images, and those pathways lead you to recognize a cat.
While we don’t understand exactly *how* you’re identifying a cat, we can still imitate the *process* by which you learned to do it by building a simplified version of the structure that does it: the pathway in your brain.
To do that, we make something called a “neural network.” Essentially, we’re building a mathematical model that represents a bunch of different neurons – like a slice of your brain, but just the part that identifies cats.
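To make "a mathematical model of a bunch of neurons" a little more concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the three "features" of the picture, the weights, and the function names are assumptions, and a real image network would work on raw pixels with far more neurons. Each artificial neuron just adds up its inputs, scaled by the strength of each connection, and squashes the total into a number between 0 and 1.

```python
import math

def sigmoid(x):
    # Squash any number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    # A neuron sums its inputs, each scaled by a connection strength (weight)
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)

def layer(inputs, weight_rows, biases):
    # A layer is several neurons that all look at the same inputs
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical features of one picture: [has_whiskers, has_pointy_ears, has_wheels]
picture = [1.0, 1.0, 0.0]

# Arbitrary, untrained connection strengths (illustrative values only)
hidden_weights = [[0.5, 0.4, -0.6], [-0.3, 0.2, 0.8]]
hidden_biases = [0.0, 0.0]
output_weights = [[1.2, -0.7]]
output_biases = [0.0]

# The signal flows through the hidden layer, then into one output neuron
hidden = layer(picture, hidden_weights, hidden_biases)
cat_score = layer(hidden, output_weights, output_biases)[0]
print(f"cat score: {cat_score:.3f}")
```

With untrained weights like these, the score is essentially a guess. The numbers only become meaningful once the network has been taught.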
Then, we have to teach this artificial slice of brain, just like you had to be taught. We teach it by showing it a picture and asking, “Is this a cat?” It gives an answer: “No, this is not a cat.”
Then, we provide feedback to the model. “Actually, this was a cat.” The model takes that feedback and determines which connections it needs to strengthen and which it needs to weaken so that the next time it sees this picture, it identifies it as a cat. Repeated over many, many pictures, this process is “deep learning.”
We use the term “deep” because that “slice” of your brain (the neural network we built on a computer) contains many layers, and the learning has to go “deep” through those layers.
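The show, guess, and correct loop can be sketched in a few lines of plain Python. This is a toy under loud assumptions, not how any particular library does it: the four "pictures" are hand-made feature lists rather than real images, the network has only one hidden layer (real "deep" networks stack many), and the starting weights, learning rate, and names like `train_step` are all invented for illustration. The feedback is pushed backward through the layers, a technique usually called backpropagation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical toy data: [has_whiskers, has_pointy_ears, has_wheels] -> 1.0 if cat
examples = [
    ([1.0, 1.0, 0.0], 1.0),
    ([1.0, 0.0, 0.0], 1.0),
    ([0.0, 0.0, 1.0], 0.0),
    ([0.0, 1.0, 1.0], 0.0),
]

# Connection strengths, starting at arbitrary small values (assumed)
hidden_w = [[0.1, -0.2, 0.3], [-0.3, 0.2, 0.1]]
hidden_b = [0.0, 0.0]
out_w = [0.2, -0.1]
out_b = 0.0

def forward(x):
    # Signal flows input -> hidden layer -> single output neuron
    hidden = [sigmoid(sum(xi * w for xi, w in zip(x, ws)) + b)
              for ws, b in zip(hidden_w, hidden_b)]
    guess = sigmoid(sum(h * w for h, w in zip(hidden, out_w)) + out_b)
    return hidden, guess

def average_error():
    # Mean squared gap between the guesses and the right answers
    return sum((forward(x)[1] - y) ** 2 for x, y in examples) / len(examples)

def train_step(x, target, rate=0.5):
    global out_b
    hidden, guess = forward(x)
    # The feedback: how far off was the guess? (cross-entropy gradient)
    out_delta = guess - target
    for j, h in enumerate(hidden):
        # Pass the feedback one layer deeper, using the not-yet-updated weight
        hidden_delta = out_delta * out_w[j] * h * (1.0 - h)
        # Strengthen or weaken the hidden -> output connection
        out_w[j] -= rate * out_delta * h
        # Strengthen or weaken the input -> hidden connections
        for i, xi in enumerate(x):
            hidden_w[j][i] -= rate * hidden_delta * xi
        hidden_b[j] -= rate * hidden_delta
    out_b -= rate * out_delta

error_before = average_error()
for _ in range(2000):  # show every example many times
    for x, y in examples:
        train_step(x, y)
error_after = average_error()

print(f"error before: {error_before:.4f}, after: {error_after:.4f}")
print(f"cat-like picture score: {forward([1.0, 1.0, 0.0])[1]:.3f}")
print(f"car-like picture score: {forward([0.0, 0.0, 1.0])[1]:.3f}")
```

Nobody ever tells the network that "whiskers mean cat." It only ever hears "you were wrong by this much," and the connection strengths drift toward values that make the guesses better.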