That is done, and it plays a large role in recent performance increases.
For example, since the original Pentium (earlier if you count exotic processors), desktop CPUs have been able to execute more than one instruction per cycle (normally you need more than one ALU for this). That is not as effective as a 2x speed increase would be, because it has some limitations. For example, if you had `a + b + c`, then at 2x speed you could compute it in half the time. With 2 simultaneous instructions you can’t, because the second addition needs the result of the first, so the second instruction has to wait for that result.
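To make the `a + b + c` point concrete, here is a rough sketch of a toy CPU model. Everything in it is made up for illustration: `cycles_needed` is a hypothetical helper, every instruction is assumed to take exactly one cycle, and instructions are issued strictly in order, which is far simpler than what a real processor does.

```python
def cycles_needed(instructions, width):
    """Count cycles for a toy in-order CPU that can start up to `width`
    instructions per cycle. Each instruction is (dest, sources) and is
    assumed to take exactly one cycle."""
    dests = {dest for dest, _ in instructions}
    # Plain inputs (never produced by an instruction) are ready from the start.
    ready = {s for _, srcs in instructions for s in srcs if s not in dests}
    pending = list(instructions)
    cycles = 0
    while pending:
        issued = 0
        for dest, srcs in pending:
            if issued == width:
                break
            if not all(s in ready for s in srcs):
                break  # in-order: a waiting instruction blocks those behind it
            issued += 1
        if issued == 0:
            raise ValueError("an instruction depends on a value that is never produced")
        # Results only become usable once the cycle is over.
        for dest, _ in pending[:issued]:
            ready.add(dest)
        pending = pending[issued:]
        cycles += 1
    return cycles


# a + b + c: the second add needs the result of the first.
chain = [("t", ("a", "b")), ("r", ("t", "c"))]
# a + b and c + d: two adds that have nothing to do with each other.
independent = [("x", ("a", "b")), ("y", ("c", "d"))]

print(cycles_needed(chain, width=1), cycles_needed(chain, width=2))
print(cycles_needed(independent, width=1), cycles_needed(independent, width=2))
```

In this toy model the dependent chain takes 2 cycles whether the CPU can start one or two instructions per cycle, while the two independent additions drop from 2 cycles to 1.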
For another example, you can make special new instructions that do a lot of work in one go, such as “add these 4 numbers to these other 4 numbers”. You can use multiple ALUs to make operations like that fast. This is what SIMD is. It has even more limitations, on top of only working for independent operations. For example, maybe you can do 4 additions, or 4 multiplications, but you can’t mix and match, and if you only needed 3 additions then the 4th still happens but is wasted. As an analogy, it’s like having a bunch of clones of you to help you out, and they exactly copy your movements. You can’t use them to do different chores for you at the same time, and they don’t take out the trash any faster than you would on your own, but they would be great for raking the lawn.
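As a sketch of what “add these 4 numbers to these other 4 numbers” means, here is a toy model of a 4-lane SIMD instruction. The names `LANES` and `simd` are invented for this example, and the Python loop only imitates the behavior: real hardware does all four lanes at once, which this code does not.

```python
import operator

LANES = 4  # assumed width of the toy SIMD unit


def simd(op, xs, ys):
    """Model one SIMD instruction: the SAME operation on every lane."""
    if len(xs) != LANES or len(ys) != LANES:
        raise ValueError("a SIMD instruction always works on all lanes")
    return [op(x, y) for x, y in zip(xs, ys)]


# Four additions in one "instruction".
sums = simd(operator.add, [1, 2, 3, 4], [10, 20, 30, 40])

# Only three additions needed: the 4th lane is padded with zeros,
# still gets computed, and its result is thrown away.
padded = simd(operator.add, [1, 2, 3, 0], [10, 20, 30, 0])
three_sums = padded[:3]

print(sums)        # [11, 22, 33, 44]
print(three_sums)  # [11, 22, 33]
```

Note that `simd` takes a single `op` for all lanes, so there is no way to ask for an addition in one lane and a multiplication in another, which is the “can’t mix and match” limitation.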