eli5: How do you design a deep learning algorithm?


You take your inputs and multiply each one by its own weight. The weights start out random.

Inputs of [1, 0, 3]
Weights of [2, 5, 1]

[1, 0, 3] * [2, 5, 1] = [2, 0, 3]

Then you take the sum of those:
2 + 0 + 3 = 5

That sum goes through an activation function, which lets the model learn more complex patterns. A common one is ReLU. How it works: if the sum is greater than 0, return the sum; otherwise return 0.
5 is greater than 0, so the output is 5.
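
Here is a minimal Python sketch of the steps so far (multiply, sum, ReLU), using the same numbers. The function names `relu` and `perceptron_output` are just ones I made up for illustration.

```python
def relu(x):
    """Return x if it is greater than 0, otherwise return 0."""
    return x if x > 0 else 0


def perceptron_output(inputs, weights):
    """Multiply each input by its weight, sum the products, apply ReLU."""
    products = [i * w for i, w in zip(inputs, weights)]  # [2, 0, 3]
    total = sum(products)                                # 5
    return relu(total)


inputs = [1, 0, 3]
weights = [2, 5, 1]
print(perceptron_output(inputs, weights))  # 5
```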

Then the output, 5, is compared against the true value. Say the true value is 7.

On each loop, it takes the output (5) and compares it to the true value (7).

After the comparison, it calculates the loss (a number for how wrong the output was) and nudges the weights to reduce it. This repeats until the loss is under a certain threshold.

Then you’ve trained a simple perceptron.
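
A rough sketch of that training loop in Python, under some assumptions of mine that the answer above doesn't specify: the loss is the squared error, the weights are updated with plain gradient descent, and `learning_rate`, `loss_threshold` and `max_steps` are made-up values. To keep it reproducible it starts from the example weights [2, 5, 1] instead of truly random ones.

```python
def forward(inputs, weights):
    """Weighted sum followed by ReLU; returns both the sum and the output."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return total, max(0.0, total)


def train_perceptron(inputs, weights, target,
                     learning_rate=0.01, loss_threshold=1e-6, max_steps=10_000):
    weights = [float(w) for w in weights]
    for _ in range(max_steps):
        total, output = forward(inputs, weights)
        error = output - target
        loss = error ** 2                  # squared error (assumed loss)
        if loss < loss_threshold:
            break                          # close enough, stop training
        # Gradient of the loss for each weight. ReLU only passes the
        # gradient through when the sum is positive.
        slope = 1.0 if total > 0 else 0.0
        weights = [w - learning_rate * 2 * error * slope * i
                   for w, i in zip(weights, inputs)]
    # Re-measure with the final weights so the returned values match them.
    total, output = forward(inputs, weights)
    return weights, output, (output - target) ** 2


weights, output, loss = train_perceptron([1, 0, 3], [2, 5, 1], target=7)
print(weights, output, loss)
```

The output ends up very close to 7. Note that the middle weight never changes here, because its input is 0 and so it has no effect on the output.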

For a deep learning algorithm, you would combine many perceptrons, arranged in layers so that the outputs of one layer become the inputs of the next.
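
As a small illustration of combining perceptrons, here is a sketch with two perceptrons feeding a third one. All weights other than [2, 5, 1] are numbers I picked for the example, and the helper names `layer` and `network` are hypothetical.

```python
def relu(x):
    return x if x > 0 else 0


def perceptron(inputs, weights):
    """Weighted sum of the inputs, passed through ReLU."""
    return relu(sum(i * w for i, w in zip(inputs, weights)))


def layer(inputs, weight_rows):
    """One perceptron per row of weights, all reading the same inputs."""
    return [perceptron(inputs, row) for row in weight_rows]


def network(inputs, layers):
    """Feed the outputs of each layer into the next one."""
    values = inputs
    for weight_rows in layers:
        values = layer(values, weight_rows)
    return values


hidden_weights = [[2, 5, 1], [1, -1, 0]]  # two perceptrons, three inputs each
output_weights = [[1, 2]]                 # one perceptron, two inputs
print(network([1, 0, 3], [hidden_weights, output_weights]))  # [7]
```

Training works the same way in principle: compare the final output to the true value and adjust every weight in every layer a little.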

(Then there are more advanced methods like convolutions and recurrent nodes, but I won’t go into that.)

Deep learning algorithms belong to the “black box” family. This means there is a magical black box where you feed in inputs and get outputs. You have no idea why or how it produces those results; it just does. This is the real ELI5 answer.

If you open the box, you will see a bunch of wires of different widths that are randomly connected to each other. Each time you put an input in, the widths of the wires change a little bit, and new connections might appear (or, more specifically, a wire of width 0 changes to something else). When the output comes out, you say how good it was and whether the box should use the same wires next time. Bit by bit it learns, but because everything inside the box is random, there is no way of saying how it works.

A practical example is social media feeds. They take content as input and give you the posts they think you will watch the most. Now we have woken up in a world where feeds are full of angry, racist stuff, but that’s because people view that stuff and engage with it. Sometimes the connections are more complicated. Sometimes they show you a post from your local restaurant because a friend of yours liked a post about a recipe from the same cuisine. There is no way of knowing why or how they suggested that place to you, because the wires are too complicated to analyze. This also means you cannot tweak the algorithm and give it new rules, because that would mess up the whole interconnected thing.

The basic idea with machine learning is that you tell the computer the “rules” of the game (for game, read whatever problem you’re trying to solve – possibly an actual game), give it some way to score its performance, then let it play using random behaviour thousands of times. If it does well, it gets a reward that encourages that play in the future. If it does poorly, it gets some kind of punishment that makes it less likely to do that again.

One example of this I like is the matchbox computer, [Menace](https://www.youtube.com/watch?v=R9c-_neaxeU), that can learn to play noughts and crosses (tic-tac-toe). You give each of the 9 squares on the grid its own colour. Then, you need a box for each possible configuration of the grid. In each box, you put a bunch of beads in the colours of the moves that are possible from that point. To start a game, you pick the first box, shake it, take out a random bead, then go in that square. Then the opponent makes their move. Then you find the next relevant box, shake it, take out a bead, make that move, and continue. Eventually, if the computer wins, you reinforce the behaviour by adding more beads of those colours to the boxes you used. If it loses, you take out those beads. Gradually, over a lot of games, the machine becomes more likely to pick the moves that have been successful in the past.
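
A simplified sketch of the matchbox idea in Python, not a faithful copy of the real machine. Several details are my own assumptions: boxes are created on demand, each new box starts with three beads per possible move, the opponent plays randomly, a box always keeps at least one bead, and draws change nothing. Squares are numbered 0 to 8 instead of coloured.

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def free_squares(board):
    return [i for i, mark in enumerate(board) if mark == " "]


def play_game(boxes, rng):
    """Play one game: the matchbox player is X, a random opponent is O."""
    board = [" "] * 9
    history = []  # (box, bead drawn) for every move X made
    player = "X"
    while winner(board) is None and free_squares(board):
        if player == "X":
            key = "".join(board)  # the board layout identifies the box
            # A new box gets three beads per possible move (assumed count).
            box = boxes.setdefault(key, free_squares(board) * 3)
            move = rng.choice(box)  # shake the box, take out a bead
            history.append((key, move))
        else:
            move = rng.choice(free_squares(board))
        board[move] = player
        player = "O" if player == "X" else "X"
    return winner(board), history


def reinforce(boxes, result, history):
    for key, move in history:
        box = boxes[key]
        if result == "X":
            box.append(move)   # win: add a bead of that colour
        elif result == "O" and len(box) > 1:
            box.remove(move)   # loss: take that bead out
        # Draws leave the boxes unchanged (assumption).


def train(games, seed=0):
    rng = random.Random(seed)
    boxes, results = {}, []
    for _ in range(games):
        result, history = play_game(boxes, rng)
        reinforce(boxes, result, history)
        results.append(result)
    return boxes, results


boxes, results = train(5000)
print("wins in first 500 games:", results[:500].count("X"))
print("wins in last 500 games:", results[-500:].count("X"))
```

Comparing the two printed counts gives a rough feel for whether the boxes have drifted toward better moves.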

Another approach might be to have two systems battling against each other. This is how some of the Deepfake software works. One machine is trying to generate pictures of a person, and another is trying to work out which picture is real and which is generated. If the guesser gets it right, it wins. If not, the generator wins. Whichever wins gets a “reward” that makes it more likely to do something similar. Repeat for a long time and you have one machine that’s really good at making pictures of faces and another that’s good at spotting fakes.