What about GPU architecture makes GPUs superior to CPUs for training neural networks?

In ML/AI, GPUs are used to train neural networks of all sizes, and training on a GPU is vastly faster than training on a CPU. Why is this?

26 Answers

Anonymous

CPUs are very versatile and complex. A CPU machine instruction isn’t necessarily one operation: some instructions decompose into many simple operations and take many cycles. GPU instructions, on the other hand, are very straightforward, mostly along the lines of “fetch this, add this”, and GPUs don’t like branching (“if this, do that, otherwise do this”), which CPUs handle with ease. By avoiding that complexity, GPUs can do a huge number of operations per cycle. Each CPU core is big (in die area) and smart, while GPU cores are small and dumb, but you can fit literally thousands of them in the same silicon area as one CPU core.

Mathematical neurons are simple in principle too, so they are much easier to simulate on simple processing cores. In fact, even GPU cores are too “smart” for neurons, which basically need only three or four kinds of operations: addition, subtraction, multiplication, and comparison, all on the same type of value. A GPU can compute with single-precision floats, double-precision floats, 4-byte integers, sometimes 8-byte integers, and so on; neural networks don’t need that variety. They don’t even need much precision, since they are inherently imprecise: maybe 1 byte per value is enough. For this reason there are neural accelerator chips that use even simpler cores than a GPU’s, designed specifically to simulate neurons, and they are blazingly fast, even compared to GPUs.
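To make the “thousands of small, dumb cores” point concrete, here is a minimal CUDA sketch (the kernel name `denseLayer` and the sizes are my own illustration, nothing standard). Each thread independently computes one output neuron as a plain chain of multiply-adds, with essentially no data-dependent branching:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One thread per output neuron: out[i] = sum_j(weights[i][j] * in[j]) + bias[i]
__global__ void denseLayer(const float *in, const float *weights,
                           const float *bias, float *out,
                           int nIn, int nOut) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nOut) return;            // the only real branch: a bounds check
    float acc = bias[i];
    for (int j = 0; j < nIn; ++j)     // pure multiply-add, no branching
        acc += weights[i * nIn + j] * in[j];
    out[i] = acc > 0.0f ? acc : 0.0f; // ReLU: a comparison, compiled branch-free
}

int main() {
    const int nIn = 1024, nOut = 4096;
    float *in, *w, *b, *out;
    cudaMallocManaged(&in,  nIn * sizeof(float));
    cudaMallocManaged(&w,   nIn * nOut * sizeof(float));
    cudaMallocManaged(&b,   nOut * sizeof(float));
    cudaMallocManaged(&out, nOut * sizeof(float));
    for (int j = 0; j < nIn; ++j)        in[j] = 0.001f * j;
    for (int k = 0; k < nIn * nOut; ++k) w[k]  = 0.0001f;
    for (int i = 0; i < nOut; ++i)       b[i]  = 0.1f;

    // 4096 threads launch at once, one per neuron; a CPU would walk
    // these outputs a handful at a time.
    denseLayer<<<(nOut + 255) / 256, 256>>>(in, w, b, out, nIn, nOut);
    cudaDeviceSynchronize();
    printf("out[0] = %f\n", out[0]);
    return 0;
}
```

This is exactly the shape of work a neural-network layer produces, which is why it maps so well onto many simple cores.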

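The low-precision point can be sketched too. NVIDIA GPUs of compute capability 6.1 and newer expose the `__dp4a` intrinsic, which multiply-accumulates four packed 8-bit integers in a single instruction; quantized networks lean on exactly this kind of cheap arithmetic. The packing and sizes below are just for illustration:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each 32-bit int holds four packed signed 8-bit values.
// Build with: nvcc -arch=sm_61 (or newer)
__global__ void int8DotKernel(const int *a, const int *b, int n4, int *out) {
    int acc = 0;
    for (int k = 0; k < n4; ++k)
        acc = __dp4a(a[k], b[k], acc); // 4 int8 multiplies + 4 adds per call
    *out = acc;
}

int main() {
    const int n4 = 2; // 2 ints = 8 packed int8 values per vector
    int *a, *b, *out;
    cudaMallocManaged(&a, n4 * sizeof(int));
    cudaMallocManaged(&b, n4 * sizeof(int));
    cudaMallocManaged(&out, sizeof(int));
    // Pack {1,2,3,4} and {5,6,7,8} into 32-bit words, repeated twice.
    a[0] = (4 << 24) | (3 << 16) | (2 << 8) | 1;
    b[0] = (8 << 24) | (7 << 16) | (6 << 8) | 5;
    a[1] = a[0]; b[1] = b[0];
    int8DotKernel<<<1, 1>>>(a, b, n4, out);
    cudaDeviceSynchronize();
    printf("dot = %d\n", *out); // 2 * (1*5 + 2*6 + 3*7 + 4*8) = 140
    return 0;
}
```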