I get the bit about giving it information that it learns from, and connecting its neural pathways and whatever (I’m bad at explaining things), but how does it learn? How are you supposed to provide it with information, and how is it even supposed to begin learning in the first place?

I know that you input data, and the machine receives that data and tries to find patterns in it, but how does it do this?


This is a really hard topic to explain, partly because even the experts don’t fully understand it, and partly because the bits they can explain involve MS/Ph.D.-level math.

But basically, a machine learning model is a whole bunch of “weights”. You take the input to the model, convert it to a list of numbers, and then multiply the numbers in that list by the weights. Which numbers get multiplied where, and how the newly created numbers get combined, is defined by the architecture of the ML model, a separate, generally hard-coded, human-selected thing (though some approaches will “breed” various architectures together and pick the best at each generation).

The output from all of this is the model output. So you start with a known input with a known output, put it through the system, compare the actual output to the expected output, and tweak all the weights slightly so that the output is closer to what was expected.

Then you do this for thousands of inputs, over and over, repeating each one thousands of times.

By the time you are done, the weights in the model have been tuned to produce the expected outputs from the inputs. Now you test on inputs you didn’t train on, and if the model gets those correct too, you deploy it on unknown inputs and just hope it’s right.

When you put enough weights together, you can model basically anything, and since the difference between the expected and actual output decreases during “training”, it looks like the model “learned” what is expected. But really, there is just a bunch of weights set to very specific values that mathematically replicate the expected results.
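The compare-and-tweak loop described above can be sketched in a few lines of Python. Everything here is made up for illustration: the “model” has a single weight, and the made-up data follows the rule “output is 3 times the input”, so this is a toy sketch of the idea, not a real neural network:

```python
# Made-up training data: each expected output is 3x the input.
inputs = [1.0, 2.0, 3.0, 4.0]
expected = [3.0, 6.0, 9.0, 12.0]

w = 0.5             # the single "weight", starting at an arbitrary value
learning_rate = 0.01

for epoch in range(1000):              # repeat over the data many times
    for x, target in zip(inputs, expected):
        output = w * x                 # the "model": multiply input by the weight
        error = output - target        # compare to the expected output
        w -= learning_rate * error * x # tweak the weight slightly toward the target

print(round(w, 3))  # the weight settles at (essentially) 3.0
```

After enough passes the weight ends up at the value that reproduces the expected outputs. A real model does the same thing, just with millions of weights tweaked at once.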

Let’s say you have a neural network with three parameters (A, B, C). You set them to random values (A0, B0, C0). Then you feed it an input, like an image of a cat, and see if it outputs “cat”. If it doesn’t, you change the parameters to new values (A1, B1, C1) and check again. Eventually, it will say “cat”. Then you feed it a picture of a dog and repeat until it says “dog” for the dog and “cat” for the cat. Then you get another cat picture and repeat the process, then another dog picture, and so on.

After a million cat pics and a million dog pics your neural network will be trained. By the way, three parameters won’t do it; you need millions.
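Here is a rough sketch of that guess-and-check process in Python. The two three-number “images”, the labels, and the trivial scoring rule are all made up to stand in for a real network; it just keeps drawing new random values for (A, B, C) until both pictures are labelled correctly:

```python
import random

random.seed(0)

# Two made-up "images", each flattened into three numbers. (Real images
# have thousands of pixels, which is part of why three parameters won't do it.)
examples = [([1.0, 0.2, 0.7], "cat"),
            ([0.1, 0.9, 0.3], "dog")]

def predict(image, params):
    # A stand-in "network": multiply each pixel by a parameter and sum.
    score = sum(p * x for p, x in zip(params, image))
    return "cat" if score > 0 else "dog"

def num_correct(params):
    return sum(predict(img, params) == label for img, label in examples)

# Start with random values (A0, B0, C0), then keep trying new values
# until the cat picture says "cat" and the dog picture says "dog".
params = [random.uniform(-1, 1) for _ in range(3)]
while num_correct(params) < len(examples):
    params = [random.uniform(-1, 1) for _ in range(3)]  # new values (A1, B1, C1)

print(num_correct(params))  # 2: both pictures are now labelled correctly
```

Blind random guessing like this only works because there are two pictures and three parameters; real training tweaks the existing values in a smart direction instead of re-rolling them.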

It learns by changing every variable a bit and checking whether the outcome is more accurate.

Don’t try to think about it in human terms; “learning” is something humans do. The model just optimizes based on a mathematical function the developers provide to it (a loss function). It’s just that no one knows the resulting function, and it has billions of parameters.

But the idea is that if you find the correct parameters, you have a function that takes a row of pixels as input and outputs the chance that the picture shows a cat. So you just wiggle all the parameters a bit and check whether the output is better than before. Do that billions of times and you get a near-optimal function.
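That wiggle-and-check idea can be sketched in Python. The two-parameter model, the made-up data (which follows y = 2x + 1), and the squared-error loss below are all assumptions for illustration; a real loss function sits over millions or billions of parameters:

```python
import random

random.seed(1)

# Made-up training data: (input, expected output) pairs from y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

def loss(params):
    # The loss function the developers provide: average squared
    # difference between the model's output and the expected output.
    a, b = params
    return sum((a * x + b - y) ** 2 for x, y in data) / len(data)

params = [0.0, 0.0]  # start from arbitrary values

for _ in range(20000):
    # Wiggle every parameter a little...
    candidate = [p + random.uniform(-0.05, 0.05) for p in params]
    # ...and keep the wiggle only if the loss got smaller.
    if loss(candidate) < loss(params):
        params = candidate

print(params)  # both parameters drift toward 2 and 1
```

Real training doesn’t wiggle at random; it uses calculus (gradients) to work out which direction to nudge each parameter, but the “make the loss smaller, billions of times” idea is the same.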
