What Exactly Are Cores and Threads in a CPU?


As an Electrical Engineering major, I have taken several introductory computer engineering courses. I have studied ARMv8 for a long time now. I know about registers, instruction fetch, arithmetic instructions, branching instructions, pipelining, and data forwarding. I know that some of these are specific to ARM. ARMv8 is the only architecture I know.

However, I am curious: what exactly are cores and threads? Specifically for cores, how are instructions distributed to each core? And if a dependency exists between instructions on different cores, is there such a thing as data forwarding from one core to another?

Lastly, kinda unrelated, but what is a Graphics Card and what are the differences between GPU architecture and the ARMv8 architecture that I have studied?

If someone could please answer these three questions, I would greatly appreciate it.


5 Answers

Anonymous

I’ll go a bit more complex than ELI5 since you seem to know quite a bit.

> what exactly are cores

TLDR: Cores are multiple CPUs on the same chip.

From about 1970 to 2005, technology kept improving, and we were able to make the transistors inside a chip smaller and faster every year. For example, a CPU from 1990 ran at about 10 MHz, while a CPU from 2000 ran at about 1000 MHz.

Then around 2005 or so, we started to hit physical limits on speed, but not on size. Transistors kept getting smaller, but didn't get much faster. So manufacturers started putting multiple CPUs on a single chip.

Then the marketing and distribution people said, “Wait a minute. You’re telling me you’re putting multiple CPUs on a CPU? How does that even make any sense?”

After a bit of confusion they realized that it was a language problem. To an engineer, a CPU is “a thing that executes programs.” To a distributor, a CPU is “a physical chip we can put in a box and sell to customers.”

So they solved the vocabulary problem by deciding CPU == chip, and core == thing that executes programs.

Sometimes you still see this vocabulary issue pop up. For example, some OSes or programs might report “4 CPUs” when running on a system that has a single 4-core CPU.
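You can see this vocabulary overlap from inside a program. A minimal Python sketch (the number printed depends on your machine; `os.cpu_count()` reports logical processors, which may also include hardware threads):

```python
import os

# os.cpu_count() reports how many logical processors the OS sees.
# On a machine with a single 4-core chip, this typically prints 4 "CPUs" --
# the OS is counting cores, not physical chips.
logical_cpus = os.cpu_count()
print(f"The OS reports {logical_cpus} CPUs")
```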

> threads

A CPU runs a list of instructions. To run the list of instructions, it needs to keep track of:

– (a) What is the next instruction to execute (instruction pointer / program counter)
– (b) Temporary storage (general purpose registers / stack)

A thread is basically an independent copy of (a) and (b).

Threads are mostly a software concept implemented by operating system software. But the OS does use some hardware features: for example, it uses a timer interrupt to stop the currently running thread and jump to OS code, which saves the current thread’s registers, figures out which thread to switch to, restores the new thread’s registers, and then jumps into the new thread. (The OS logic for figuring out which thread to execute next is called the *scheduler*.)

If you run multiple programs, each one runs in a different thread. But a single program can also have multiple threads.
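To make the “single program with multiple threads” case concrete, here is a small Python sketch (the `worker` function and names are my own invention). One process spawns two threads; each has its own independent flow of control, but both share the process’s memory:

```python
import threading

results = {}  # shared process memory, visible to all threads

def worker(name, upper):
    # Each thread gets its own program counter and stack, so these two
    # calls run independently, possibly interleaved by the scheduler.
    results[name] = sum(range(upper))

t1 = threading.Thread(target=worker, args=("a", 10))
t2 = threading.Thread(target=worker, args=("b", 100))
t1.start()
t2.start()
t1.join()  # wait for both threads to finish
t2.join()
print(results)
```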

Each core can execute only one thread at a time. So on a 4-core system, at most 4 threads can execute simultaneously.

A typical modern PC will have 100 threads or more. This seemingly far exceeds the capability of a typical processor, but it works out for a couple of reasons:

– The OS scheduler frequently switches threads. So for 0.01 seconds it executes threads A, B, C, D, then after 0.01 seconds it switches to threads E, F, G, H, then after 0.01 seconds it switches to threads I, J, K, L, and so on…
– A lot of programs use threads that spend most of their time waiting for something to happen (timer, I/O, or something happens in another thread). The OS usually removes these threads from consideration by the scheduler until the thing they’re waiting for happens.
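The second point can be demonstrated in Python (a hypothetical two-thread sketch): a thread blocked in `Event.wait()` is parked by the OS and consumes essentially no CPU time until another thread wakes it.

```python
import threading
import time

evt = threading.Event()
log = []

def waiter():
    # evt.wait() blocks this thread; the OS scheduler skips it entirely
    # until another thread calls evt.set(), so it burns no CPU while parked.
    evt.wait()
    log.append("woken")

t = threading.Thread(target=waiter)
t.start()
time.sleep(0.05)  # during this pause the waiter is blocked, not spinning
evt.set()         # the thing it was waiting for happens
t.join()
print(log)
```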

> how are instructions distributed to each core?

Each core has its own instruction pointer (program counter). Each core talks to memory over the bus to fetch instructions and data. Typically all the cores in a physical chip share one set of bus lines and the outermost level(s) of the CPU cache (L2 / L3 cache).

> if a dependency exists from one core to another core’s instructions, is there such a thing as data forwarding from one core to another?

Each core has its own general purpose registers. So core A has no way to access registers on core B.

On the other hand, in theory core A and core B share memory. *But* they have their own L1 caches. And there are multi-CPU systems that have multiple CPU chips in their own motherboard sockets; those CPUs don’t even share L2 / L3 caches.

So if you want to share memory between threads, software needs to use special instructions to make sure the memory’s adequately synchronized.
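In a high-level language you rarely issue those synchronization instructions yourself; you use primitives that wrap them. A minimal Python sketch (names are my own) where two threads increment a shared counter under a `threading.Lock`, which relies on the CPU’s atomic instructions and memory barriers under the hood:

```python
import threading

counter = 0
lock = threading.Lock()

def bump(n):
    global counter
    for _ in range(n):
        # The lock makes this read-modify-write atomic with respect to the
        # other thread. Without it, the two threads can interleave their
        # reads and writes and lose updates.
        with lock:
            counter += 1

threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 200000
```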

(Multithreaded programming has a reputation among software people for being difficult and mind-bending, partly because there are a lot of possibilities for subtle bugs that are hard to test for or reproduce, since they depend on the exact timing and sequencing of how different threads interact.

The best way to handle this is to design the program, or even the [entire programming language](https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html), from the beginning to only use provably safe inter-thread communication patterns.

For example, well-designed multithreaded software often doesn’t share memory directly; instead, it passes messages through queues and lets the queue library handle low-level issues like memory correctness.)
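The queue pattern above can be sketched in Python (the sentinel convention and names are my own choices). The producer never touches the consumer’s state directly; `queue.Queue` handles all the locking and memory-visibility details internally:

```python
import queue
import threading

q = queue.Queue()
totals = []

def consumer():
    # The consumer only ever sees data that arrives through the queue,
    # so there is no directly shared mutable state to get wrong.
    total = 0
    while True:
        msg = q.get()
        if msg is None:  # sentinel value: producer is done
            break
        total += msg
    totals.append(total)

t = threading.Thread(target=consumer)
t.start()
for i in range(1, 6):
    q.put(i)      # send messages instead of writing shared variables
q.put(None)       # tell the consumer to stop
t.join()
print(totals)  # [15]
```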
