If you think of what you want your model to do as a desired function F that takes an input x and gives the correct output y, you can think of a neural network as a function R that learns to “mimic” F as closely as it can with its many layers of neurons. The idea is that instead of explicitly defining this function R, you only define its architecture, i.e., the number of neurons and layers, and the network learns the weights and biases through what is basically trial and error (it's a bit more complex than that).
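To make that concrete, here's a minimal sketch in Python (using numpy; the sizes and names like `layer_sizes` are purely illustrative, not any standard API): you pick the shape of the network, the weights start random, and training would be what nudges them toward mimicking F.

```python
import numpy as np

# "Defining the architecture" just means choosing the shape:
# 3 inputs -> 4 hidden neurons -> 1 output. The weights and biases
# start out random; training is what slowly adjusts them.
layer_sizes = [3, 4, 1]  # purely illustrative sizes

weights = [np.random.randn(n_in, n_out)
           for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
biases = [np.random.randn(n_out) for n_out in layer_sizes[1:]]

def R(x):
    """The network's function R(x): repeated weighted sums + activations."""
    for W, b in zip(weights, biases):
        x = np.maximum(0, x @ W + b)  # ReLU activation (explained below)
    return x

print(R(np.array([1.0, 2.0, 3.0])))  # nonsense output until trained
```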
A “neuron” takes all of the outputs of the previous layer (which could be your input), weighs them with its learned weights, sums them, adds a learned bias, and applies an “activation function” — so you can think of a neuron as learning what's called a “non-linear function”. As you might notice, though, the weights and biases on their own only make a linear function, w1x1 + w2x2 + … + wnxn + b (see the similarity to mx + b?), and this is where the activation function comes in handy: applying it to that linear function makes it non-linear (for example, by setting the value to 0 whenever it is negative).

This is central to the idea of machine learning with neural networks: they're based on the theory that we can approximate essentially any function F by stacking a lot of non-linear functions on top of each other, and that is exactly what a neural network does.
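In code, a single neuron is just that weighted sum plus bias, fed through an activation. Here's a tiny sketch (the values are made up; ReLU is the “make it 0 if negative” activation mentioned above):

```python
import numpy as np

def neuron(inputs, weights, bias):
    # Linear part: w1*x1 + w2*x2 + ... + wn*xn + b (compare m*x + b)
    linear = np.dot(weights, inputs) + bias
    # The activation (here ReLU) makes the result non-linear:
    # anything negative becomes 0.
    return max(0.0, linear)

# Toy values; in a real network the weights and bias would be learned.
print(neuron(np.array([1.0, 2.0]), np.array([0.5, -0.4]), bias=0.1))  # -0.2 -> 0.0
print(neuron(np.array([1.0, 2.0]), np.array([0.5, 0.4]), bias=0.1))   #  1.4 -> 1.4
```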
Sorry for the horrible formatting, I'm on a phone. Feel free to ask any clarifying questions; if I know the answer I'll definitely try to answer.