It’s a function of FLOPS: floating point operations per second. Almost all AI works, essentially, by fine-tuning massive quantities of numbers with decimal points. These are floats.
There is specialized hardware built just for this, and there are certainly cases where companies have used it.
But there is also consumer-facing hardware that is already exceptionally good at it: graphics cards.
The benefit is that they are mature, which means both lower cost and software that has been finely tuned over time for efficiency.
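To make that concrete, here’s a tiny Python sketch (with made-up numbers) of the kind of arithmetic that dominates AI workloads: multiplying and adding long lists of floats, where each multiply and each add counts as one floating point operation.

# A tiny dot product, the basic building block of most AI math.
# Each loop iteration does one multiply and one add: 2 floating point operations (FLOPs).
weights = [0.25, -1.5, 3.0, 0.75]   # decimal numbers, i.e. floats
inputs  = [1.0,  2.0, -0.5, 4.0]

total = 0.0
for w, x in zip(weights, inputs):
    total += w * x                  # 2 FLOPs per pass through the loop

print(total)                        # -1.25, computed with 8 FLOPs in total
# A real model repeats this sort of thing billions of times, which is why
# FLOPS (how many of these the hardware can do per second) is the number that matters.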
Compared to CPUs, GPUs are much better at doing a lot of calculations in parallel. CPUs are designed to handle the full range of instructions and calculations a computer might need, and to execute them all quickly in order. GPUs, by contrast, are designed primarily for rendering 3D models, which doesn’t require many different types of calculations, but does require a huge number of calculations to be done at the same time. So GPUs are designed for that kind of parallelization: not the best individual core speed or flexibility, but a lot of cores that can all work at once.
AI “deep learning” is basically repeating the same computations for the model over and over again, so it makes sense that parallelization has a lot of benefits there. Companies could, in theory, design bespoke hardware that does this even better, but strapping a couple thousand GPUs to a couple hundred CPUs is usually far more cost-effective.
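As a rough illustration in Python (NumPy stands in for the parallel hardware here; on an actual GPU you’d use CUDA or a GPU-backed framework, but the idea is the same): the serial version visits one element at a time the way a single CPU core would, while the data-parallel version expresses the operation once for the whole batch.

import numpy as np

a = np.random.rand(10_000)
b = np.random.rand(10_000)

# "CPU-style" serial loop: one element after another.
out_serial = [a[i] * 2.0 + b[i] for i in range(len(a))]

# "GPU-style" data-parallel form: the same operation written once and
# applied to every element, so parallel hardware can do them all at once.
out_parallel = a * 2.0 + b

print(np.allclose(out_serial, out_parallel))   # True: same answers, different style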
GPUs are built for matrix operations. CPUs are built for serial tasks.
If you need to solve (A+B)/C + D, then a CPU will do it faster than a GPU; that’s what it’s built for. But if A, B, C, and D are each arrays with 10,000 values and you want to perform the same operation on each element and produce an array E with the 10,000 answers, then you want a GPU. The GPU will load one value into each core, and in only a little longer than it takes the CPU to solve the one-off case, the GPU will have produced an answer for every core it has, and it can have thousands of cores.
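Here’s that exact example as a rough Python/NumPy sketch (made-up values; on a real GPU the same expression would simply be spread across thousands of cores):

import numpy as np

# Scalar case: one answer, and a single CPU core does this almost instantly.
A, B, C, D = 3.0, 5.0, 2.0, 1.0
print((A + B) / C + D)               # 5.0

# Array case: 10,000 answers, every one using the exact same formula.
A = np.random.rand(10_000)
B = np.random.rand(10_000)
C = np.random.rand(10_000) + 1.0     # keep C away from zero so the division is safe
D = np.random.rand(10_000)

E = (A + B) / C + D                  # one expression, 10,000 independent results
print(E.shape)                       # (10000,)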
Machine learning these days is mostly neural networks, which are lots of little interconnected weights that determine how the network processes an input. The more weights, the more accurate the network can be, but the more computationally heavy it is. When you’re training it, there are thousands or tens of thousands of little weights that all need a bit of math done on them, then their values updated, then a bit more math, and so on until you’ve made it through the training set. Since that’s thousands of parallel operations, it’s perfect for a GPU.
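A hedged sketch of that "bit of math on every weight" step (a plain gradient-descent update with invented numbers, not any particular framework’s API): the same small formula is applied to every weight, so all of them can be updated in parallel.

import numpy as np

weights = np.random.randn(10_000)        # ten thousand little weights
learning_rate = 0.01

for step in range(100):                  # pretend pass over the training set
    # Made-up gradients for one batch; a real network would compute
    # these with backpropagation.
    gradients = np.random.randn(10_000)

    # The same tiny update applied to every weight at once,
    # exactly the kind of work a GPU parallelizes well.
    weights = weights - learning_rate * gradients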
If you made a processor built to handle thousands of similar operations in parallel, we wouldn’t call it a CPU, because it’d be pretty bad at normal CPU tasks. We’d call it a general-purpose graphics processing unit, even if it couldn’t connect to a screen at all, like Nvidia’s Tesla cards.