What about GPU Architecture makes them superior for training neural networks over CPUs?

In ML/AI, GPUs are used to train neural networks of various sizes. They are vastly superior to training on CPUs. Why is this?

26 Answers

Anonymous 0 Comments

CPUs are like Swiss Army knives: they can do many different types of computation reasonably well. GPUs are like filleting knives: very good at one specific thing, filleting fish. You can fillet a fish with a Swiss Army knife, but you are much better off doing it with a filleting knife.

In the case of GPUs, the "filleting" is massively parallel number crunching, which happens to be exactly what ML applications need.

Anonymous 0 Comments

ML/AI is basically just a lot of complicated calculations and operations.

GPUs can do a lot of math in parallel, all at the same time. A GPU is not 'smart'; think of it as the "nerd" kid in the class.
The CPU, on the other hand, is the "street-smart" kid: it handles many other tasks (deciding what to send to the monitor/display, what data to fetch from storage, etc.) alongside some complicated math. As a result, it takes longer to solve the math, but it does solve it eventually; while it is not *that* nerdy, it is still studious and capable when needed.

Anonymous 0 Comments

They’re super fast at matrix multiplication. That’s where you multiply an entire table of numbers by another table. This is because modern GPUs are designed to apply special effects, called *pixel shaders*, to entire images in a single pass. Effectively, a GPU can multiply a whole picture by another whole picture (and pixels by their surrounding pixels) to produce a whole new picture, all at once.
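To make "multiplying a whole table by another table" concrete, here is a minimal NumPy sketch (not part of the original answer, and the numbers are arbitrary). The key point is that every output cell is an independent multiply-and-add, so they can all be computed at the same time:

```python
import numpy as np

# Two small "tables" of numbers (values chosen arbitrarily for the demo).
a = np.arange(6, dtype=np.float32).reshape(2, 3)   # 2x3
b = np.ones((3, 4), dtype=np.float32)              # 3x4

# A matrix product is just many independent multiply-adds: each output
# cell can be computed in parallel, which is the kind of work a GPU's
# many small cores excel at.
c = a @ b                                          # 2x4 result

# Each cell c[i, j] is the dot product of row i of a with column j of b.
manual = np.array([[np.dot(a[i], b[:, j]) for j in range(4)]
                   for i in range(2)])
assert np.allclose(c, manual)
```

On a GPU, a library such as CuPy or PyTorch would run the same `@` operation across thousands of cores at once; the math is identical.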

It used to be that the pixel shaders were pre-programmed, baked into the hardware, to apply common effects like light bloom, deferred lighting or depth of field blur. But then they started having *programmable* pixel shaders, meaning developers could go in and write their own algorithms for their own special effects.

It was when AI researchers got hold of these newfangled programmable GPUs that they realized what they could do with them. Instead of multiplying images by special-effect layers, they multiply images by other images using their own formulas. For example, they’ll take thousands of pictures of bikes, then use the matrix-multiplication power of GPUs to combine them into a “map” of what bikes should look like.

Modern GPUs aren’t limited to multiplying 2D images; they can multiply 3D “clouds” of numbers and beyond.

Anonymous 0 Comments

AI & ML build out neural networks and train them on data.

A neural network is like your brain: each cell is connected to other cells, so when an input arrives, a bunch of cells fire off and they eventually decide whether something is a traffic light or not.

The math involved is very simple: you blast inputs at the NN and look at the result; if it’s right, you increase the strength of the links that fired, and if it’s wrong, you decrease their strength.

The hard part for AI/ML is that you need to do these simple operations an enormous number of times: once for every connection between nodes, every time you show the network a training example (and training requires a lot of examples).
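The "strengthen the links that fired" idea can be sketched as a toy, perceptron-style update in plain Python with NumPy. This is a deliberately simplified illustration (one node, made-up data, not any particular framework):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "network": one node with 3 input links, each with a strength (weight).
weights = rng.normal(size=3)

# Tiny made-up training set: inputs, and whether the right answer is 1 or 0.
inputs = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
targets = np.array([1, 0])

for _ in range(20):                       # show the data many times
    for x, t in zip(inputs, targets):
        fired = 1 if weights @ x > 0 else 0
        # Right answer: leave or strengthen the links that fired.
        # Wrong answer: weaken them. (t - fired) is 0, +1, or -1.
        weights += 0.1 * (t - fired) * x
```

A real network repeats this kind of update across millions of connections and examples, which is why the sheer volume of simple math dominates.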

Graphics cards do this simple math many times to decide what exact color pixels should be.

CPUs these days are set up for more complex processing, so instead of the 2 or even 32 cores of a CPU, a GPU gives you thousands of simpler cores and far more parallelism.

Anonymous 0 Comments

GPUs are hard to program generically but easy to program to process lots of things in parallel (graphics/pixels), which suits NNs and can speed up training by 100x or more. I’ve had training go from days to minutes.

Anonymous 0 Comments

GPUs, or graphics processing units, are specialized computer chips that are designed to handle the complex calculations needed for rendering graphics and video. They are able to perform these calculations much faster than a regular CPU, or central processing unit, which is the main chip in a computer that handles most of its tasks.

One of the things that makes GPUs so good at handling complex calculations is their architecture, or the way that they are built and organized inside the chip. GPUs are designed with many small, simple processors that can work together to perform calculations in parallel, or at the same time. This makes them much faster than CPUs, which usually have just a few larger processors that can only work on one task at a time.

Neural networks are a type of computer program that are designed to learn and make decisions like a human brain. Training a neural network involves running many complex calculations to adjust the parameters of the network so that it can learn to recognize patterns and make predictions. Because GPUs are so good at handling complex calculations, they are much faster at training neural networks than CPUs. This is why GPUs are often used for training neural networks in machine learning and artificial intelligence applications.

Anonymous 0 Comments

This isn’t purely a GPU-vs-CPU story: NVIDIA in particular ships cards with dedicated cores for neural-network math.
No wonder it is a big player in AI car development.

Anonymous 0 Comments

CPUs are very versatile and complex. A CPU’s machine instructions aren’t one operation per instruction; some instructions break down into many simple operations and take many cycles. GPUs, on the other hand, are very straightforward: their instructions are mostly “get this, add this”, and GPUs don’t like branching (“if this, do that; otherwise do this”), unlike CPUs, which handle branching with ease. By avoiding that complexity, GPUs can do a whole lot of operations per cycle. Each CPU core is big (on the die) and smart, while GPU cores are small and dumb, but you can fit literally thousands of them in the same area as one CPU core.

Mathematical neurons are simple in principle too, so they’re much easier to simulate on simple processing cores. In fact, even GPU cores are too “smart” for neurons, which basically need only three or four kinds of operations: addition, subtraction, multiplication, and comparison, all on the same kind of value. GPUs can compute with single-precision floats, double-precision floats, 4-byte integers, maybe 8-byte integers, and so on, but neural networks don’t need all that. They don’t even need much precision, since they’re inherently imprecise; maybe 1 byte per value is enough. For this reason, there are neural chips that use even simpler cores than GPUs, designed specifically to simulate neurons, and they’re blazingly fast, even faster than GPUs.
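The "1 byte per value is enough" point is the idea behind low-precision (int8) quantization. Here is a minimal NumPy sketch of symmetric 8-bit quantization, with made-up weight values for illustration:

```python
import numpy as np

# Pretend these are a layer's trained weights (float32, values made up).
weights = np.array([-0.42, 0.07, 0.91, -1.30, 0.55], dtype=np.float32)

# Symmetric 8-bit quantization: map the float range onto int8 [-127, 127]
# with a single scale factor per tensor.
scale = np.abs(weights).max() / 127.0
q = np.round(weights / scale).astype(np.int8)      # 1 byte per weight

# Dequantize to see how little precision was actually lost.
restored = q.astype(np.float32) * scale
max_err = np.abs(weights - restored).max()
```

The rounding error is at most half a quantization step, and in practice networks tolerate it well, which is why AI accelerators can get away with much simpler, lower-precision arithmetic units than a general-purpose CPU or even a GPU.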

Anonymous 0 Comments

A CPU is a fleet of trucks, a GPU a swarm of a thousand delivery bikes.

AI and neural networks generally involve combining and analyzing a ridiculous number of small elements simultaneously across an extensive dataset, which GPU architectures are better suited for.

Anonymous 0 Comments

GPUs have dedicated circuitry for graphics math, and recent ones also include circuitry dedicated to AI math. CPUs do this math with general-purpose circuitry, which makes them slower at it.

In addition, GPUs have higher total computing power than CPUs. But most tasks are very difficult or impossible to program to run on a GPU or fully utilize it because of the design of GPUs compared to CPUs. Other comments have explained those differences.

AI training and execution happen to take advantage of GPUs well.
