CPU performance is an art, not a science. Many factors affect performance, including memory speed, processing speed, parallel processing, compiler quality, and many others, in addition to how wide the architecture is. To get any benefit from an architecture that is 128 bits wide, compilers have to be able to fill that width with useful work in parallel on enough cycles to make it worthwhile. Holding that back is the reality that useful processing includes branches and sequential dependencies. Sometimes the CPU guesses, guesses wrong, and the speculative calculations are discarded. And whenever work is done but discarded, the power dissipation still takes place. Power dissipation is ultimately the limit on CPU performance. It may be that future algorithms will be well suited to wider architectures (maybe neural nets, or other AI), but currently it is too difficult to take advantage of 128 bits or wider.
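To make the dependency point concrete, here's a rough C sketch (my own illustration, with made-up function names): the first loop has no dependency between iterations, so a compiler can pack many elements into wide (say, 128-bit SIMD) operations, while the second is a sequential chain where extra width just sits idle.

```c
#include <stddef.h>

/* Data-parallel: every iteration is independent, so a compiler can
   spread the work across wide SIMD lanes. */
void scale(float *a, float s, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] *= s;              /* no dependency between iterations */
}

/* Sequentially dependent: each load needs the previous pointer, so
   the work forms a chain that no amount of width can parallelize. */
struct node { struct node *next; int value; };

long sum_list(const struct node *p) {
    long total = 0;
    while (p) {
        total += p->value;      /* must wait for each 'next' load */
        p = p->next;
    }
    return total;
}
```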
Edit: I should say, too difficult generally, meaning that programs in general don't benefit from a wider architecture. Programs with wide data objects certainly can benefit. For example, video screen memory is wide. But to calculate what to display on it, programs optimally use at most 32-bit-wide quantities.
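A quick sketch of that distinction (again my own illustration, with made-up names): the framebuffer as a whole is wide, and moving it benefits from a wide data path, but each pixel is computed with at-most-32-bit arithmetic.

```c
#include <stdint.h>
#include <stddef.h>

/* Each pixel is a 32-bit RGBA value; an approximate 50/50 blend per
   channel needs only 32-bit arithmetic. */
static uint32_t blend(uint32_t src, uint32_t dst) {
    return ((src >> 1) & 0x7F7F7F7Fu) + ((dst >> 1) & 0x7F7F7F7Fu);
}

/* The framebuffer itself is wide: these independent per-pixel stores
   are the kind of loop a compiler can auto-vectorize into wide
   (e.g. 128-bit) writes, even though each computation is 32-bit. */
void blend_row(uint32_t *fb, const uint32_t *src, size_t n) {
    for (size_t i = 0; i < n; i++)
        fb[i] = blend(src[i], fb[i]);
}
```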