There are a lot of comments on here already, but I really think most of them have missed several key points… Most of these answers were definitely not written by C programmers or hardware engineers. I am both, thankfully, so let’s get started:
I saw one comment touch on this already, so I’ll be brief: Assembly is not *necessarily* fast. It is just a list of steps for the CPU to execute. These are called “instructions”, and modern CPUs have hundreds of instructions to choose from. They can do simple things like “add”, “divide”, “load”, etc. They can even do advanced things, like “encrypt”, or “multiply 8 numbers together at the same time, then add them all to one value”.
Not all instructions are created equal. Some instructions can be executed multiple times in a single “timestep”, called a *cycle* – as in, a processor may be able to execute 4 ADD instructions simultaneously – whereas other instructions, like DIVIDE, may take several cycles to complete.
Thus, the speed of a program depends on the kind of instructions it executes. 10,000 ADD instructions would complete a lot faster than 10,000 DIVIDE instructions.
What an instruction means in the context of surrounding instructions has an impact too. If one instruction depends on the answer of a previous one, the processor cannot execute them simultaneously (*), as it has to wait for the answer to be ready before it can do the next one. So, adding 10,000 distinct number pairs for 10,000 answers is faster than summing every number from 1 to 10,000 for a single answer.
This is only scratching the surface of how you can write assembly that runs fast. A skilled assembly programmer has deep knowledge of the internal design of the CPU and its available instructions, how they relate to each other, and how long they take to execute.
I hope this makes it clear that assembly is not inherently fast – it’s how you write it that matters. This should be immediately clear if you realize that everything you run *eventually* runs assembly instructions. If assembly were always fast, it wouldn’t be possible to have a slow program written in Python.
Intro done, now let’s get to C. What do C and other higher level programming languages have to do with assembly?
Programming languages can broadly be separated into two categories – they either compile directly to “machine code” (assembly), or they don’t. Languages like C, C++, Fortran, Rust, and others are part of the first camp. Java, Python, Javascript, C#, and more are part of the second camp.
There is absolutely nothing that requires C to compile down to *good assembly*. But there are many things that encourage it:
1. There is no automatic safety checking for many things. Note that checking something takes assembly instructions, and not doing something is always faster than doing it.
2. There are no things “running in the background” when you write C. Many languages feature these systems built-in to make the programmer’s life easier. In C, you can still have those systems, but they don’t exist unless you write them. If you were to write those same systems, you would end up at a comparable speed to those other languages.
3. C is statically typed, so compilers know exactly what is going on at all times before the program ever runs. This helps the optimizer perform deductions that significantly improve the generated assembly.
The last point is particularly important. Languages in the C camp would be nothing without a **powerful optimizer** that analyzes the high-level, human-readable code and turns it into super fast assembly. Without it, languages like Java and Javascript could regularly beat C/C++/Rust due to their runtime optimizers.
In fact, optimizers in general are so powerful that Fortran/C++/Rust can very often be faster than C because of the concepts those languages let you express. These languages let you directly express things like a sorting operation or a set operation, for example, so the optimizer knows exactly what you’re doing. Without these higher level concepts, the optimizer has to guess what you’re doing in C based on common patterns.
This also applies to Java and Javascript. They have very powerful runtime optimizers that actually analyze what is happening as the code runs, and thus can make even better logical deductions than what could be attained statically. In rare cases, this can even result in code that is faster than an optimized but generic C equivalent. However, this is only evident on smaller scales. Whole programs in these languages are typically significantly slower due to a combination of the 3 points above.
**C is not fast. Optimizers make it fast.**
PS: C shares its optimizer with other languages like C++, Rust, and a few others (one widely shared backend is LLVM). So equivalent programs written in these languages are usually the exact same speed, give or take a few percent.
(*) Processors can actually execute dependent instructions somewhat simultaneously. This is done by splitting an instruction into multiple distinct parts, and only executing the non-dependent sub-tasks simultaneously. This is called “pipelining”.
TLDR: C is not fast. Optimizers make it fast, and optimizers exist in multiple languages, so the question and many other answers start off with wrong premises.