I’m reading Richard Powers’ *Galatea 2.2*, >!which tells the story of an engineer and a writer developing a computer-based neural network that takes a piece of English literature (e.g. a sonnet) as input and outputs analysis at the level of a 22-year-old!<. **In the book,** ***weight/ed vectors*** **get multiple mentions, and the concept has me stumped.**
A machine learning model these days is often understood as a function that takes an input and produces an output. For example, f(x,y)=5x+2y is a function that takes in inputs and produces an output. So if it took in the inputs x=1,y=3, it’d produce 5(1)+2(3)=11. Machine learning functions often work in stages, with lots of values at each stage. Finding a value at each stage involves executing some rule. For example, it might be something like “each value of this stage is equal to (weight 1)*first input + (weight 2)*second input + (weight 3)*third input”, etc. This happens a few times until there is some final output, which could be interpreted as words or image pixels or anything else. Machine “learning” happens by picking random values for all of those “weight” values, then tuning them by comparing the function’s outputs to known “correct” answers until the function reliably produces matching answers.
So, in short, the overall shape of the function is fixed in advance, and the result of training is a collection of numbers that make the function produce “good” answers instead of random junk. Those numbers are the “weights.” For the example function above, you could think of the “5” and “2” as the weights. Weights are often described and thought of as groups of numbers called vectors.
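To make the “start random, then tune” idea concrete, here is a minimal Python sketch. It assumes a very simple tuning rule (nudge each weight a little against the error, one example at a time), and the learning rate, seed, and example grid are values I made up for illustration. Real systems use fancier versions of the same idea.

```python
import random

def predict(weights, inputs):
    """Weighted sum: weights[0]*inputs[0] + weights[1]*inputs[1] + ..."""
    return sum(w * x for w, x in zip(weights, inputs))

# The example function f(x, y) = 5x + 2y, written as a weight vector.
true_weights = [5.0, 2.0]

# "Known correct answers": inputs paired with what f should produce.
examples = [([x, y], predict(true_weights, [x, y]))
            for x in range(-2, 3) for y in range(-2, 3)]

# Start from random weights (fixed seed so the run is repeatable).
rng = random.Random(0)
weights = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
learning_rate = 0.01  # assumed step size, chosen for this toy problem

for _ in range(200):
    for inputs, target in examples:
        error = predict(weights, inputs) - target
        # Nudge each weight against its contribution to the error.
        weights = [w - learning_rate * error * x
                   for w, x in zip(weights, inputs)]

print([round(w, 3) for w in weights])
```

After the loop, the weights should have drifted from their random starting values to very nearly 5 and 2, which is all “training” means in this toy case.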
There are two relevant things here.
First: what’s a vector? It’s a mathematical object that can have multiple independent components.
For a physical example, take something’s position in space. That can be described by the vector <X,Y,Z>.
But that’s not the only thing that can be described as a vector. The input to a neural network can be represented as a vector.
It’ll be easier to understand this if we look at an image instead of text. Particularly, a black-and-white image.
To describe this in a way a computer can work with, we can break that picture down into an array of pixels. Each pixel has a color value (here, a shade of gray), and each color value can vary completely independently of any other color value. That makes the whole list of pixel values a vector.
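As a rough illustration, here is a made-up 3x3 grayscale image turned into a vector in Python. The pixel values are invented, and the 0-to-255 scale (0 for black, 255 for white) is just one common convention that I’m assuming here.

```python
# A tiny 3x3 grayscale image: 0 is black, 255 is white (made-up values).
image = [
    [0, 128, 255],
    [64, 192, 32],
    [255, 0, 100],
]

# Flatten row by row into one list: this is the input vector.
input_vector = [pixel for row in image for pixel in row]

# An assumed preprocessing step: scale every component into the 0..1 range.
scaled = [pixel / 255 for pixel in input_vector]

print(len(input_vector))  # one component per pixel
```

A real photo works the same way, just with far more components: one per pixel.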
That leads into the second thing: how a neural network actually works. Neural networks (at least, some types) are formed of “layers” of “neurons.” A neuron is a node of the network that can hold a particular numerical value. Anywhere from several to many of these are arranged logically into layers; typically, every node of one layer connects to every node of the next layer. Each node has something called a “bias,” and each connection between nodes has something called a “weight.” The nodes also have some threshold value that I’ll explain below.
A “bias” is like a buff or debuff. It adds or removes a certain amount from the numerical value the node holds. A “weight” is like a scale factor. Something with a weight of 2, for example, will typically be twice as influential as something with a weight of 1.
When a signal reaches a layer of a network, the weights and biases are used to calculate the signal each node sends to the next layer. If the combined total of the inputs to a given node (accounting for all the buffs, debuffs, and scale factors) surpasses that node’s threshold value, the signal is transmitted along to the next layer.
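Here is a small Python sketch of that calculation for a single layer. The function name `layer_output`, the numbers, and the choice to send 0 when a node doesn’t clear its threshold are all assumptions for illustration; with a threshold of 0 this behaves like the common “ReLU” activation, but real networks use a variety of rules here.

```python
def layer_output(inputs, weights, biases, threshold=0.0):
    """Compute the signals one layer sends onward.

    weights[j][i] is the weight on the connection from input i to node j.
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        # Scale each input by its weight, add them up, then apply the bias.
        total = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        # Pass the signal along only if it clears the threshold.
        outputs.append(total if total > threshold else 0.0)
    return outputs

inputs = [1.0, 0.5]
weights = [[2.0, 1.0],    # connections into node 0
           [-1.0, 0.5]]   # connections into node 1
biases = [0.5, -0.25]

signals = layer_output(inputs, weights, biases)
print(signals)  # [3.0, 0.0]
```

Node 0 totals 2(1) + 1(0.5) + 0.5 = 3, which clears the threshold, so it passes 3 along. Node 1 totals -1, which doesn’t, so it stays silent. Feeding `signals` into another call of `layer_output` with a different set of weights would give you the next layer.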
Eventually, there’s an output layer that doesn’t connect to a further layer. Back to our example of a black-and-white photo… say the neural net in question is designed to apply a Picasso style to a photograph. The input vector would be the color values of all of the pixels in the input image; the output vector would be the color values of all the pixels in the output image. And for every layer along the way, there’s a “weighted vector” that describes the state of the network as the signal propagates through.
In math, a “weight” is usually just a factor of some sort. For example, if you and I decide to mow lawns together, we might decide that your “weight” is 0.6 and mine is 0.4, which means you get 60% and I get 40% of the money we earn as a mowing team.
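In code, the lawn-mowing split is just a multiplication by each weight. The $100 total below is a made-up figure for illustration.

```python
# Each person's weight is their share of the team's earnings.
weights = {"you": 0.6, "me": 0.4}
earnings = 100  # dollars earned by the team (made-up figure)

payout = {name: weight * earnings for name, weight in weights.items()}
print(payout)
```

The weights here add up to 1, so all of the money gets handed out. Neural network weights have no such restriction and can be any numbers, including negative ones.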
Weights in artificial neural networks are just values that represent the “knowledge” of the neural network.
[Here’s a comment](https://www.reddit.com/r/askscience/comments/10aqkjj/what_exactly_is_the_process_when_someone_trains/j47s2cb/?context=3) I wrote recently about NN learning.
Mathematically, a neural network takes a bunch of input data, the data goes through some mathematical algorithms and functions, then the results of those algorithms / functions get multiplied by weights, and that gives you your result.
And these weights represent part of the “knowledge” of a neural network. They’re simply a bunch of numbers that get determined during the training phase of setting up a neural network, when answering the question “*what kind of math should I do on this set of input data so I get that set of output data?*”