Why do programs have to be manually optimized for multiple CPU cores? Why is single-core performance such a bottleneck?

1.13K views

For a long time, single-core performance has been the most important CPU feature for gaming. Though multi-threaded games are getting better, we are still pushing for maximum single-core performance instead of more cores. Why can’t 16 × 2 GHz cores do roughly as good a job as 8 × 4 GHz cores (of the same CPU architecture), especially in gaming?

They say that software programmers have to manually split jobs across multiple cores. Why? Why does the OS even need to manage multiple cores in the first place? To me this sounds like bad workplace management, where the results depend on pushing the limits of the same few people (cores) instead of splitting the work. I feel like making a big bunch of cheap cores would give better performance for the money than doing tons of research into the best possible elite cores. This works for encoding jobs, but not for snappy game performance.

Now, one limitation that comes to mind is sequential jobs: things where the steps need to be done in a certain order, each depending on the results of the previous step. In that case, higher clock speed has the advantage, and you wouldn’t even be able to utilize multiple cores. However, I still feel like a clock speed of 4,000,000,000 cycles per second can’t be the limiting factor for running a game at over 150 frames per second. Or is it? Are the CPU jobs in game programming really that sequential? Is there any way to increase the speed of simple sequential jobs with the help of more cores?

Bonus question: How do multiple instructions per cycle work if a job is sequential?

Bonus question 2: GPUs have tons of low power cores. Why is this okay? Is it just because the job of rendering is not sequential at all?

In: Engineering

6 Answers

Anonymous 0 Comments

Short answer: because multithreaded programming is *extremely* hard and often complicates things.

You can have threads that overlap on the same job, duplicating each other’s work.

You can have threads that each wait for the other to finish a task before continuing, forever, so neither makes progress (known as a deadlock).
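A minimal Python sketch of the classic deadlock pattern and its standard fix (this is illustrative, not from the answer above): if one thread takes lock A then B while another takes B then A, each can end up holding one lock and waiting forever for the other. Acquiring locks in a single agreed order prevents it.

```python
import threading

# Two shared locks. A deadlock needs each thread to hold one lock while
# waiting for the other; a fixed global lock order makes that impossible.
lock_a = threading.Lock()
lock_b = threading.Lock()

def safe_worker(results, name):
    # Both threads acquire in the same order (A, then B). If one of them
    # instead did "with lock_b: with lock_a:", they could deadlock.
    with lock_a:
        with lock_b:
            results.append(name)

results = []
threads = [threading.Thread(target=safe_worker, args=(results, n))
           for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)

print(sorted(results))  # → ['t1', 't2'] — both threads finished
```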

You can have threads that access the same data and try to modify it at the same time, leading to a “race condition”.
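The textbook race condition is an unprotected counter: `counter += 1` is really read-modify-write, so two threads can read the same old value and one increment gets lost. A hedged Python sketch of the lock-protected version (illustrative, not from the answer):

```python
import threading

# Without the lock, two threads can read the same old value of "counter"
# and one of their increments silently disappears (a race condition).
counter = 0
counter_lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with counter_lock:   # serialize the read-modify-write
            counter += 1

threads = [threading.Thread(target=increment, args=(50_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # → 200000 — without the lock this can come up short
```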

You can have starved threads, where there are far more threads than available resources, so some threads rarely get a turn.
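One common way to model "more threads than resources" in Python is a semaphore: only a fixed number of permits exist, and the remaining threads must wait their turn. A minimal sketch (illustrative, not from the answer) that also tracks how many threads were ever active at once:

```python
import threading

# Six worker threads, but only 2 "stations": a semaphore caps how many
# run at once; the rest block until a permit frees up.
stations = threading.Semaphore(2)
active = 0
peak = 0
state_lock = threading.Lock()

def worker():
    global active, peak
    with stations:              # blocks until a station is free
        with state_lock:
            active += 1
            peak = max(peak, active)
        # ... do some work here ...
        with state_lock:
            active -= 1

threads = [threading.Thread(target=worker) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(peak)  # never more than 2 workers held a station at once
```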

Predictability is another problem with multithreaded programs. You have little control over the order in which threads run, so you can’t know in advance how they will interact with shared data, or if/when they will hit the issues above.

All of these problems require extra algorithms and machinery to resolve. Done correctly, multithreading is faster overall, but it is easy to get wrong, so it’s often more desirable to have faster cores and limit the need for multithreading.

ELI5:

You are building a house. You have 2 builders, who are experienced and work well together. The house is taking too long, so why not hire 4 more to speed it up?

These builders don’t know who is working on what. Two of them try to take the same brick at the same time, so now the building is halted until one of them concedes the brick.

The builders start to wait on one person to finish the frame, but the person building the frame is waiting for the other builders to finish their part.

And now, with more builders, you have 2 builders sitting around doing nothing because there aren’t enough bricks anymore.

To solve it, you hire a manager to run the site. He keeps tabs on the builders, decides who does what, and adds systems to prevent conflicts.

The builders are threads. The bricks are the shared data and resources. The manager is the scheduler and the synchronization logic.
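The "manager" in the analogy maps naturally to a thread-safe work queue: it hands each brick (task) to exactly one builder (worker thread), so no two workers ever grab the same one. A minimal Python sketch under that interpretation (the queue-based design is my illustration, not the answerer's):

```python
import queue
import threading

# The "manager" as code: queue.Queue hands each item to exactly one
# worker, so workers never fight over the same brick.
work = queue.Queue()
for brick in range(100):
    work.put(brick)

done = []
done_lock = threading.Lock()

def builder():
    while True:
        try:
            brick = work.get_nowait()  # each item is handed out once
        except queue.Empty:
            return                     # no bricks left: go home
        with done_lock:
            done.append(brick)

threads = [threading.Thread(target=builder) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(done))  # → 100 — every brick laid exactly once
```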
