In Math what are Tensors, as in TensorFlow?


3 Answers

Anonymous 0 Comments

In math, we like to make things easy. Probably the easiest things we can work with that still actually do something are lines. Even easier still are lines that go through the origin. These have the equation y=mx. Turning things into lines is basically the idea behind derivatives in calculus, and it’s why differential equations are so useful. So lines are pretty simple, but also really useful.
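
As a toy sketch of that idea (the function and numbers here are made up just for illustration): the derivative turns a curve into a line near a point.

```python
import numpy as np

# The simplest useful function: a line through the origin, y = m*x.
m = 3.0
x = np.linspace(0.5, 1.5, 5)
y = m * x  # every input just gets scaled by the same factor m

# Derivatives turn a curve into a line near a point:
# f(x) ~ f(a) + f'(a)*(x - a).  For f(x) = x**2 near a = 1,
# the slope is f'(1) = 2, so the tangent line is 1 + 2*(x - 1).
f = lambda t: t**2
a = 1.0
tangent = f(a) + 2 * a * (x - a)
print(f(x) - tangent)  # differences shrink to 0 as x approaches 1
```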

But the line y=mx is only in one dimension. One dimension in, one dimension out. If your input is 2D, 3D, or bigger than 80 dimensions – not unheard of when doing data analysis – then you need something that can work in these dimensions. This is what Linear Algebra is for. Matrices are how we do the equation y=mx in higher dimensions, as Ax=b. You can input an N-dimensional piece of data and get out an M-dimensional piece of data. Consequently, Linear Algebra is probably the most important modern field of math out there. You can analyze signals, do image processing, facial recognition, and a whole lot of other very important things with computers using little more than Linear Algebra.
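
Here’s that idea as a tiny NumPy sketch (the matrix entries are made up for illustration):

```python
import numpy as np

# A matrix is the higher-dimensional version of y = m*x.
# A 2x3 matrix A maps 3-dimensional inputs to 2-dimensional outputs.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 1.0]])  # shape (2, 3): M=2 out, N=3 in
x = np.array([1.0, 1.0, 1.0])    # a 3-dimensional input
b = A @ x                        # b = Ax, a 2-dimensional output
print(b)                         # [3. 4.]
```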

An important application of Linear Algebra – the Ax=b equation – is Linear Regression. Here you have a set of data whose inputs and outputs you think should be related linearly. Linear regression finds the best possible matrix A so that whenever x is an input from your data, Ax will be as close – on average – as it can be to the output corresponding to x. This is the basis for much of Machine Learning. Neural networks and the like are, at their core, many linked instances of Linear Regression: a web of finding the right matrices to fit your data in the way you think is best. There’s a lot of complexity that you can control, which is why it works so well.
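
A minimal sketch of linear regression in NumPy (synthetic data with made-up coefficients):

```python
import numpy as np

# Linear regression: given paired inputs X and outputs Y, find the
# matrix (here just a vector of slopes) that makes X @ A as close to Y
# as possible on average, in the least-squares sense.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # 100 samples, 3 features each
true_A = np.array([2.0, -1.0, 0.5])           # made-up "true" coefficients
Y = X @ true_A + 0.1 * rng.normal(size=100)   # noisy linear data

A_fit, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(A_fit)  # close to [2.0, -1.0, 0.5]
```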

But matrices are limited because you can only input one thing at a time. One vector in, one vector out. Tensors expand this. Instead of Ax=b, you have T(x,y,z)=S, where x, y, z are vectors of whatever dimension you want (and you can extend well beyond z if you like) and, in a way, S can be thought of as multiple vectors being output. This gives a lot of flexibility and control and can make machine learning much more fluid. It goes beyond basic linear algebra into what we call “Multilinear Algebra”.
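
A toy sketch of a multilinear map in NumPy, assuming we represent T as a 3-way array and feed it two vectors at once:

```python
import numpy as np

# A multilinear map: T takes two vectors at the same time and is linear
# in each of them. Feeding it x and y contracts two of its axes and
# leaves one, so the result S is a vector.
rng = np.random.default_rng(0)
T = rng.normal(size=(4, 3, 2))  # one axis per input/output slot
x = rng.normal(size=3)
y = rng.normal(size=2)

S = np.einsum('ijk,j,k->i', T, x, y)  # contract x and y into T
print(S.shape)                        # (4,) -- a 4-dimensional output
```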

Now, in the simple y=mx, m is a single number. For Ax=b, A is a matrix, which is a 2D array of numbers, and the size of this array tells us the dimensions of the inputs and outputs. For T(x,y,z,…)=S, T is a multidimensional array whose dimensions tell you the number and dimensions of the possible inputs and outputs. Derivatives find the right “m” to fit a curve. Linear Regression finds the right A to fit the data. And multilinear regression finds the right T.
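
In array terms (a small illustration with made-up shapes, nothing TensorFlow-specific):

```python
import numpy as np

m = np.float64(3.0)     # y = m*x: a single number, a 0-dimensional array
A = np.zeros((2, 3))    # Ax = b: a 2D array; shape is (output dim, input dim)
T = np.zeros((4, 3, 2)) # T(x, y) = S: one axis per input/output slot

print(m.ndim, A.ndim, T.ndim)  # 0 2 3
```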

But the word “tensor” is way overused. Different mathematicians, depending on the field, use it differently. Physicists use it differently. Engineers and computer scientists use it differently. These meanings are all related but have wildly different connotations. This answer is specific to how the word is used in TensorFlow.

Anonymous 0 Comments

In linear algebra, tensors are generalizations of scalars and vectors.

Scalars are just numbers, like 1, 2.5, Pi and so on.

Vectors can be thought of as arrows with a given direction. They can be given in relation to a certain reference system called a basis. For example, we can say a compass has two axes, one pointing north and one pointing east, and I will define here that each axis is 1 mile long from the origin. Say I wanted to give you directions to go somewhere: I tell you to go 1 mile north and 1 mile east. In relation to our compass axes, we could write that down as (1, 1), i.e. go 1 times the north axis and 1 times the east axis.
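
A small sketch of “components relative to a basis” in NumPy, using the compass example above:

```python
import numpy as np

# Columns of B are the basis vectors, written in (east, north) components:
# column 0 is the north axis, column 1 is the east axis, each 1 mile long.
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

v = np.array([1.0, 1.0])        # the trip itself: 1 mile east, 1 mile north
coords = np.linalg.solve(B, v)  # components of v relative to the basis
print(coords)                   # [1. 1.] -- 1 times north plus 1 times east
```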

We might now ask ourselves what happens when we change our reference system. For example, if I made each axis twice as long, the representation for going 1 mile north and 1 mile east would have to become (1/2, 1/2), as the 1 mile we want to go is now only half as long as each of our reference axes. If I rotate the reference system 90 degrees clockwise, you can imagine we have to rotate the representation 90 degrees counterclockwise for it to point the same way. The representations (1, 1), (1/2, 1/2), … are vectors, and whichever way we transform the reference system, we have to transform the vector the exact opposite way for it to still point in the same direction with the same length. Thus, vectors are said to be contravariant, i.e. they change according to the inverse of the transformation applied to the reference system, or basis.
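
Here is the contravariance in a few lines of NumPy, using the doubled-axes example from above:

```python
import numpy as np

# Contravariance: transform the basis by M, and the components must
# transform by the inverse of M so the arrow itself stays the same.
B = np.eye(2)                  # original basis: unit north/east axes (columns)
coords = np.array([1.0, 1.0])  # components of our displacement

M = 2 * np.eye(2)              # make each axis twice as long
B_new = B @ M                  # the transformed basis
coords_new = np.linalg.inv(M) @ coords
print(coords_new)              # [0.5 0.5]

# Same arrow in both descriptions:
print(B @ coords, B_new @ coords_new)  # both [1. 1.]
```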

I don’t want to go into too much detail here, but there are also objects in linear algebra that transform in different ways. For example, covectors transform covariantly, meaning that when you change the reference system a certain way, they change along with it, in the same way. Scalars are independent of the reference system and don’t change at all when it is transformed; they are said to be invariant.

Tensors are a generalization that lets us better understand how objects transform. A tensor has a type that tells us how contravariant and covariant it is. A scalar is a (0,0)-tensor: it is neither contravariant nor covariant, because it is invariant. A vector is contravariant and thus is a (1,0)-tensor; a covector is covariant and is a (0,1)-tensor. But tensors can be more complex. They can be co- or contravariant in more than one way, so a (2,1)-tensor would, in a way, transform twice contravariantly and once covariantly.
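
A concrete illustration: a matrix can be viewed as a (1,1)-tensor, and under a change of basis M it picks up one factor of the inverse of M (the contravariant part) and one factor of M (the covariant part). A sketch in NumPy, with a random made-up basis change:

```python
import numpy as np

# A matrix as a (1,1)-tensor: under a change of basis M it transforms
# once contravariantly (inv(M) on the left) and once covariantly
# (M on the right): A_new = inv(M) @ A @ M.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
M = rng.normal(size=(3, 3))  # a random basis change (almost surely invertible)

A_new = np.linalg.inv(M) @ A @ M

# Invariants survive the transformation, e.g. the trace:
print(np.trace(A), np.trace(A_new))  # equal up to rounding error
```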

Systems of tensors form interesting structures that can be studied in mathematics. Also, since you specifically asked about TensorFlow: the tensors in TensorFlow have basically nothing to do with this; they are just multidimensional arrays.

Anonymous 0 Comments

Tensor as in TensorFlow is literally just a table of numbers. Here a “table” is like a 2-dimensional grid of numbers, except it can have more than 2 dimensions (or fewer, but a 1-dimensional tensor is normally just called a vector). You can think of a matrix as a tensor (in this sense of the word).
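
For example (a minimal TensorFlow sketch with made-up values):

```python
import tensorflow as tf

scalar = tf.constant(3.0)              # a 0-dimensional "table"
vector = tf.constant([1.0, 2.0, 3.0])  # 1-dimensional
matrix = tf.constant([[1.0, 2.0],
                      [3.0, 4.0]])     # 2-dimensional
cube = tf.zeros([2, 3, 4])             # 3-dimensional

print(scalar.shape, vector.shape, matrix.shape, cube.shape)
# () (3,) (2, 2) (2, 3, 4)
```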

Tensor in TensorFlow is mostly unrelated to the tensor used in math or physics. You could say they are *inspired* by math’s tensors, but conceptually the similarities are superficial.

The main reason for the word “tensor” in TensorFlow comes down to computer architecture. We have specially designed hardware that can do calculations on a whole table of numbers at once, much faster than operating on each number individually.
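
A rough illustration of why that matters (plain NumPy; exact timings will vary by machine):

```python
import numpy as np
import time

# Operating on a whole table at once lets the hardware use vectorized
# instructions; a Python loop touches each number individually.
x = np.random.default_rng(0).normal(size=1_000_000)

t0 = time.perf_counter()
y_loop = np.array([v * 2.0 + 1.0 for v in x])  # one number at a time
t1 = time.perf_counter()
y_vec = x * 2.0 + 1.0                          # the whole table at once
t2 = time.perf_counter()

print(f"loop: {t1 - t0:.3f}s, vectorized: {t2 - t1:.3f}s")
```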