What about GPU architecture makes GPUs superior to CPUs for training neural networks?

In ML/AI, GPUs are used to train neural networks of various sizes. Training on GPUs is vastly faster than training on CPUs. Why is this?

26 Answers

Anonymous 0 Comments

CPUs are generalists. They can do many things, but are not necessarily specialized in any particular area.

GPUs are specialists. They cannot do most of the things a CPU can do, or if they can, they do them much more slowly. However, there are a few things that GPUs can do a lot of at the same time (i.e. in parallel), which makes them far faster than CPUs at those things.

CPUs are way better for some things, in the same way that a person on foot is much better suited than a bicycle to getting through thick jungle.

GPUs are way better than CPUs for NNs, in the same way that airplanes are way better than bicycles for intercontinental travel.

Anonymous 0 Comments

Imagine a CPU as one person who is really good at any math you can throw at them, but who can only do one task at a time. A GPU is a whole high school full of kids doing simple math tasks. A CPU might have a few cores, each of them a person who can do hard math. A GPU has thousands of smaller cores that each do simpler math tasks.

The math done in machine learning is actually rather simple. It is mostly simple vector and matrix calculations, which come down to multiplication and addition. The issue is that there is A LOT of it. Just an absurd amount of it. ML/AI neural networks are essentially large n-dimensional arrays arranged in multiple layers. This is very close to what computer graphics are too: calculating transformations of triangles in 2D or 3D space (2- or 3-dimensional arrays). Simple calculations; just a lot of them.
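To make the "simple math, but a lot of it" point concrete, here is a minimal sketch in plain Python of what one fully connected layer computes. The function name `dense_layer` and the numbers are made up for illustration; real frameworks do the same arithmetic on the GPU rather than in a loop like this.

```python
def dense_layer(weights, inputs):
    """Return one output per weight row: the sum of weight * input products."""
    outputs = []
    for row in weights:              # one pass per output value
        total = 0.0
        for w, x in zip(row, inputs):
            total += w * x           # nothing fancier than multiply, then add
        outputs.append(total)
    return outputs


# Hypothetical toy layer: 3 outputs, 2 inputs.
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
inputs = [10.0, 1.0]

result = dense_layer(weights, inputs)
print(result)
```

Each pass of the outer loop depends only on its own row of weights, so no output has to wait for another. That independence is what lets a GPU hand each one to a different core.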

So you can imagine AI/ML calculations as graphics without the graphics. Instead of calculating the path of light being reflected off the armor of a game character, you calculate the path of information within the AI model’s “mind”. Just as white light turns red when it passes through a shader or a reflection, you change the path of the information depending on which path gives the most desired value. All of this is done with basic matrix calculations.

Anonymous 0 Comments

I’ve read through all of this but here’s a real simple example from my actual work experience years ago.

I started out on Wang 2200s, which were fast little things that engineering people especially loved to use because they did math fast. The reason was they had specialized chips for matrix arithmetic.

Before these chips, if I had to init an array of 10 X 10 cells, I’d have to loop through and set each one to zero and then get started on what I wanted to do. When the first machine with these chips came in, all I had to do was say “Mat Y = Zer” where Y was the 10 X 10 array I was looking to init. It was instantaneous. It meant I could spit out reports at multiples of the speed I could before.

That’s the difference between a CPU and a GPU for math stuff.
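A rough modern analogue of that before/after, sketched in plain Python. The helper names are hypothetical, and this only illustrates the programming model: the interpreter still loops internally for the one-line version, whereas hardware with matrix support can work on many cells in the same step.

```python
ROWS, COLS = 10, 10


def zero_by_loop(matrix):
    """Old style: visit every cell and set it to zero, one at a time."""
    for r in range(len(matrix)):
        for c in range(len(matrix[r])):
            matrix[r][c] = 0
    return matrix


def zero_whole_array(rows, cols):
    """'Mat Y = Zer' style: ask for a zeroed array in one statement."""
    return [[0] * cols for _ in range(rows)]


# Start from an array full of leftover values, then clear it cell by cell.
y_loop = zero_by_loop([[7] * COLS for _ in range(ROWS)])

# Same result, expressed as a single whole-array operation.
y_bulk = zero_whole_array(ROWS, COLS)

print(y_loop == y_bulk)
```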

Anonymous 0 Comments

GPUs are tightly focused, super efficient machines, versus a CPU that is more of a jack of all trades.

What a video card can do, it can do something like 100x faster than a CPU can.

That’s why there is so much effort directed toward breaking things down into chunks that can be offloaded onto video cards for applications like curing cancer or bitcoin mining. You want the processor to be relied on as little as possible and the video card to be relied on as much as possible.

Anonymous 0 Comments

Because GPUs are designed specifically to process graphics, they are REALLY good at manipulating a mathematical object called a “matrix”, which we can think of as a box of numbers. CPUs are designed for general purpose calculations, and are thus not specialised.

The majority of neural nets are built in such a way that they may be written down in terms of these matrices (plural of matrix), which makes GPUs much better than CPUs at the calculations involved.
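As a sketch of what "written down in terms of matrices" means, here is a tiny two-layer network in plain Python. The helpers `matmul` and `relu`, the layer sizes, and all the numbers are assumptions chosen for illustration, not taken from any particular library.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m); both are lists of rows."""
    cols_b = list(zip(*b))  # columns of b, so each entry is a row-times-column sum
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def relu(m):
    """Replace every negative entry with zero (a common activation function)."""
    return [[max(0.0, v) for v in row] for row in m]


# Hypothetical data: a batch of 2 examples with 2 features each.
batch = [[1.0, 2.0], [0.0, -1.0]]
w1 = [[1.0, -1.0, 0.5], [0.0, 2.0, 1.0]]   # first layer: 2 inputs -> 3 hidden units
w2 = [[1.0], [1.0], [2.0]]                 # second layer: 3 hidden units -> 1 output

hidden = relu(matmul(batch, w1))
output = matmul(hidden, w2)
print(output)
```

The whole forward pass is two matrix multiplications with a simple element-wise step in between, which is the kind of workload GPUs are built for.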

Source: I’m a mathematician with an interest in machine learning.

Anonymous 0 Comments

CPUs are generalists and can do a lot of things. Most of the “stuff” in a CPU is not for doing math but is there to perform complex tasks. For example, the reason you can interact with your computer in real time (when you press your mouse button to open a web browser while using a text editor in the background) is because the CPU can pause a task anytime and resume it later when needed.

GPUs cannot do most things that CPUs can, but nearly everything in a GPU is dedicated to performing math operations. Because neural networks need a lot of math, using a GPU is much more efficient than using a CPU.
