What about GPU Architecture makes them superior for training neural networks over CPUs?


In ML/AI, GPUs are used to train neural networks of various sizes. They are vastly superior to training on CPUs. Why is this?


26 Answers

Anonymous 0 Comments

They’re super fast at matrix multiplication. That’s where you multiply an entire table of numbers by another table. This is because modern GPUs are designed to apply special effects, called *pixel shaders*, to entire images in a single pass. Effectively, a GPU can multiply a whole picture by another whole picture (and pixels by their surrounding pixels) to produce a whole new picture, all at once.
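To make the “table times table” idea concrete, here’s a minimal sketch in plain Python (my own illustration, not from the answer) of what a matrix multiply actually computes, and why it parallelizes so well:

```python
# Illustrative sketch: multiplying one "table of numbers" by another.
# A CPU works through these cells largely one at a time; a GPU can
# compute thousands of them simultaneously.

def matmul(a, b):
    """Naive matrix multiply: each output cell is one dot product."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]

# Every output cell is independent of every other cell -- that
# independence is exactly what a GPU's many small cores exploit.
result = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The key point is in the comment: no output cell depends on any other, so the work can be split across as many cores as the hardware has.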

It used to be that the pixel shaders were pre-programmed, baked into the hardware, to apply common effects like light bloom, deferred lighting or depth of field blur. But then they started having *programmable* pixel shaders, meaning developers could go in and write their own algorithms for their own special effects.

It’s when AI researchers got hold of these newfangled programmable GPUs that they realized what they could do with them. Instead of just multiplying images with special-effect layers, they multiply images with other images using their own formulas. For example, they’ll take thousands of pictures of bikes, then use the matrix multiplication power of GPUs to combine them into a “map” of what bikes should look like.

Modern GPUs aren’t limited to multiplying 2D images, either; they can multiply 3D “clouds” of numbers and even higher-dimensional arrays beyond that.
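A quick sketch of that higher-dimensional idea, using NumPy as a stand-in for what GPU libraries do in bulk (my example, not the answer’s):

```python
import numpy as np

# Illustrative sketch: the same multiply generalizes past flat 2D
# tables. Here a "stack" of 4 matrices (a 3D array) is multiplied
# against another stack in a single call -- the kind of batched
# operation that neural-network training is built on.
batch_a = np.ones((4, 2, 3))   # 4 matrices, each 2x3
batch_b = np.ones((4, 3, 5))   # 4 matrices, each 3x5
batch_c = batch_a @ batch_b    # 4 results, each 2x5, all at once
```

On a GPU, all four multiplies (and every cell within them) run in parallel rather than one after another.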
