What are neural networks?


With AI becoming more and more prevalent, I am hearing the term “neural networks” being thrown around. What are they and what function do they serve?


Neural networks are a method of machine learning that tries to mimic how a brain works. A network is made up of nodes. A node just takes an input, performs some function, and produces an output. These nodes are arranged in layers: a layer of nodes can take several inputs and produce several outputs. You can then use those outputs as inputs to another layer, chaining many connected layers of nodes into a network.
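To make the node/layer picture concrete, here is a minimal sketch in Python. The sigmoid activation and the specific weights are illustrative choices of mine, not something the description above prescribes:

```python
import math

def node(inputs, weights, bias):
    """One node: a weighted sum of its inputs passed through an activation function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # sigmoid squashes the result into (0, 1)

def layer(inputs, weight_rows, biases):
    """A layer: several nodes that all read the same inputs, each producing one output."""
    return [node(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Two inputs feed a layer of three nodes; that layer's outputs feed a layer of one node.
hidden = layer([0.5, -1.0], [[0.1, 0.4], [-0.3, 0.8], [0.7, 0.2]], [0.0, 0.1, -0.1])
final = layer(hidden, [[0.5, -0.6, 0.9]], [0.05])
```

Chaining `layer` calls like this is all "many connected layers" means: the outputs of one list comprehension become the inputs of the next.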

A neural network doesn’t necessarily start out good at its assigned function. The way the nodes behave starts as random, so the network has to be trained. Basically, you give the network a task, a set of test data, and a way to score how well it did, and you let it adjust its own settings according to that score. It tries many different settings for the nodes, keeping what works and discarding what doesn’t, until you’re left with a network that can perform the task with no obvious explanation of how. People refer to neural networks as a “black box” because you can’t tell how one works just by looking at it; it’s just a bunch of seemingly random numbers and functions.
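That "try settings, keep what scores better" loop can be sketched with a toy one-node network and plain random search. In practice, gradient-based methods do the adjusting, and the dataset here is invented; this is only the simplest possible illustration of training against a score:

```python
import random

def score(settings, data):
    """Lower is better: mean squared error of a one-node network, output = w*x + b."""
    w, b = settings
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]            # secretly follows y = 2x + 1
best = [random.uniform(-1, 1), random.uniform(-1, 1)]  # settings start out random

for _ in range(5000):
    trial = [s + random.gauss(0, 0.1) for s in best]   # nudge the settings a little
    if score(trial, data) < score(best, data):         # keep what works
        best = trial
```

After enough trials, `best` lands near the hidden rule (w = 2, b = 1), even though nothing in the loop "understands" the data; all it ever sees is the score.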

Neurons in the brain send signals along based on a set of simple rules. When a signal is sent to them, if it overcomes some minimum threshold value then they send it on to other neurons, otherwise they don’t. By varying this threshold value and connecting neurons in different ways, the brain is able to do amazing things.
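That threshold rule is simple enough to write down directly. Here is a sketch of the all-or-nothing version (artificial neurons usually use weighted inputs and smoother activations, but the idea is the same):

```python
def neuron(signals, threshold):
    """Fire (pass the signal on) only if the combined input clears the threshold."""
    return 1 if sum(signals) >= threshold else 0

# Varying the threshold alone changes what the neuron computes:
# with two binary inputs, threshold 2 behaves like AND, threshold 1 like OR.
assert neuron([1, 1], 2) == 1 and neuron([1, 0], 2) == 0   # AND-like
assert neuron([1, 0], 1) == 1 and neuron([0, 0], 1) == 0   # OR-like
```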

Neural networks follow the same idea, but the neurons are bits of software or hardware instead of biological. The amazing thing about these structures is that by varying the thresholds and connections between neurons you can get them to handle nearly any sort of problem. You can also use methods (e.g. back propagation) to make neural networks “learn”, getting better and better at a task.

Neural networks are what your brain is made up of. A network of neurons and synapses.

Artificial neural networks (or ANNs) are a machine learning technique that attempts to replicate the learning process of the brain.

We often learn from positive/negative reinforcement. Our choices result in physiological changes in our brain that remember when something is right/wrong.

ANNs work in a similar way, but just use numbers instead (often called weights/biases).

ANNs have an input and an expected output. When the actual output is much different from the expected output, this is similar to negative reinforcement.

Through a process called back propagation, the network is adjusted slightly in a way that is mathematically expected to reduce the error value.

This is done repeatedly in a “learning” process with a large number of inputs/outputs until the error slowly drops to near zero.
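As a toy illustration of that adjust-to-reduce-error loop, here is gradient descent on a single weight. Real back propagation applies the same idea to every weight and bias in the network at once; the numbers below are made up:

```python
x, expected = 3.0, 6.0   # this "network" should learn output = 2 * input
w = 0.0                  # the weight starts wrong, so the error starts large
lr = 0.01                # how big each small adjustment is

for _ in range(1000):
    actual = w * x
    error = actual - expected
    # The gradient of the squared error (error**2) with respect to w is 2*error*x;
    # stepping against it is mathematically expected to reduce the error.
    w -= lr * 2 * error * x
```

After many repetitions the error drops to near zero and `w` settles near 2.0, which is exactly the "learning" process described above in miniature.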

At this point, the network is “trained” to do whatever its intended function is.

First of all, as a response to some other answers, neural networks are NOT an attempt to mimic the brain. They may have been inspired by the brain, but trying to talk about them this way is wrong and unhelpful.

[Here is the first video](https://www.youtube.com/watch?v=aircAruvnKk) in a fantastic series explaining how neural networks work. This should hopefully answer all of your questions, but I’ll summarize and elaborate a bit.

Machine learning, in general, is an effort to make computers solve problems that are so difficult they’re hard to even define quantitatively. In the video, the problem is handwriting recognition: determining which digit a handwritten image shows. But how would you even define what a “3” is? If you were to take a traditional programming approach, how would you even start? It’s really unclear. But we do know what the inputs are (images of handwriting) and what the correct outputs are (the correctly identified digits).

All of the various machine learning approaches are different contraptions we set up to try to make computers figure it out themselves. A neural network has a “neuron” for each input (each of the pixels of the images) and a neuron for each output (each of the 10 possible digits). A basic neural network just connects each input neuron to each output neuron, and so each input effectively votes directly for which output it thinks is right. For recognizing digits, a pixel right in the center being “on” could be a lot of things, but is almost certainly not a 0. So the connection between that pixel and 0 is weighted low, and is weighted something higher for the rest of the outputs. We could even imagine manually dialing in all of these values for the whole system, and maybe getting something that does slightly better than random guessing. But the real beauty to machine learning is that we let the computer figure it out. We give it an image, let it guess, and then slightly encourage it where it was right (turn those weighted votes up just a little bit) and correct it where it was wrong (turn those down). Repeat, and hopefully it gets better on its own.
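The "each input votes, and we nudge the votes up or down" scheme can be sketched as a tiny single-layer classifier. The four-pixel dataset and the 0.1 step size are invented for illustration; the digit example in the video works the same way at a much larger scale:

```python
import random

random.seed(0)
n_pixels, n_classes = 4, 2
# weights[i][c] is how strongly pixel i votes for class c; start small and random.
weights = [[random.uniform(-0.1, 0.1) for _ in range(n_classes)]
           for _ in range(n_pixels)]

def guess(pixels):
    votes = [sum(pixels[i] * weights[i][c] for i in range(n_pixels))
             for c in range(n_classes)]
    return votes.index(max(votes))

# Made-up data: class 0 lights the left pixels, class 1 lights the right pixels.
data = [([1, 1, 0, 0], 0), ([0, 0, 1, 1], 1), ([1, 0, 0, 0], 0), ([0, 0, 0, 1], 1)]

for _ in range(20):                    # repeat: let it guess, then adjust the votes
    for pixels, label in data:
        g = guess(pixels)
        if g != label:
            for i in range(n_pixels):
                weights[i][label] += 0.1 * pixels[i]  # turn the right votes up a bit
                weights[i][g] -= 0.1 * pixels[i]      # turn the wrong votes down a bit
```

After a few passes the network classifies every example correctly, and nobody ever told it which pixels matter; the nudges found that on their own.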

Directly connecting the inputs to the outputs isn’t very powerful and is probably going to be wrong in most situations, so we can make the system more powerful by adding “hidden layers”. With one hidden layer, rather than voting directly, the input neurons vote on some intermediary neurons, which in turn vote on the outputs. And then you could add more layers. And how many neurons in each layer? That’s also a parameter you can change.

There are several cool things about neural networks, as opposed to certain other machine learning approaches:

1. You can (try to) make them more powerful just by ramping up the number and size of hidden layers. In essence, you can throw computing resources at the problem.
2. You don’t have to understand how to solve the problem to set them up. In the video series, Grant explains how he guesses that his hidden layers might represent some pieces of the digits, like little line segments and curves and such, but really, the system is free to find whatever correlations it stumbles upon.
3. They are great for things that have really big input and output spaces, such as processing images. For a more complicated example, imagine we’re taking high-def medical images (MRIs, X-rays, etc.) as input, and want to highlight which pixels we think are diseased as output. Way more pixels than in the video example, and way more outputs. But not a problem for the neural net setup.

The main downside is that the solution they come up with is impossible for humans to understand. It’s like a giant matrix of numbers. You can’t look at it and see how it’s working. So this can lead someone with a problem they don’t understand to create a solution they don’t understand. This is ultimately where the dangers of AI come from.

**“Neural network” is just a name for a specific machine learning model/method.** In other words, it’s one specific “recipe” for telling a computer to “learn” something.

In terms of how that recipe works — well, that’s like asking “how does a regression work?” The answer is going to be a lot of very boring, detailed math. But the name “neural network” comes from the fact that the model is loosely inspired by the human brain (neurons).