A compiler did need a compiler to compile it – a lot of compilers can actually compile themselves. But once that’s done, they exist as a binary and no longer need a compiler.
Essentially, your code does get turned into binary, but unless you change the code, you can keep reusing the same binary. That’s what you do with most anything that runs on your computer, including compilers.
A compiler compiles your program into machine code (or any other language, but let’s ignore that for now). The compiled program can then be run on your machine (computer) because it’s already in the machine code the machine can run. So yes, someone had to compile the compiler at some point, but then it’s already compiled, so it doesn’t need the compiler anymore.
What you’re thinking about is probably the interpreter, which is a program that takes a program in some language, and actually performs the actions the program was supposed to perform. Therefore it needs to be present any time you want to run the interpreted program. You could say that the processor is an interpreter for machine code.
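To make that concrete, here’s a minimal sketch in Go of what an interpreter does. The three-instruction stack language here is made up purely for illustration – the point is that the interpreter reads each instruction and performs the action on the spot, which is why it has to be around whenever you run the program.

```go
package main

import "fmt"

// A made-up toy instruction set, just to illustrate the idea:
// the interpreter reads each instruction and performs it immediately.
type instr struct {
	op  string // "push", "add", or "print"
	arg int
}

func run(program []instr) {
	var stack []int
	for _, in := range program {
		switch in.op {
		case "push":
			stack = append(stack, in.arg)
		case "add":
			a, b := stack[len(stack)-2], stack[len(stack)-1]
			stack = append(stack[:len(stack)-2], a+b)
		case "print":
			fmt.Println(stack[len(stack)-1])
		}
	}
}

func main() {
	// "Source code" for the toy language: compute 2 + 3 and print it.
	run([]instr{{"push", 2}, {"push", 3}, {"add", 0}, {"print", 0}})
}
```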
Compilers don’t run code, they just convert it into a form that processors can work with.
Processors are fundamentally circuits that are hard-wired to do certain things when they are given a certain combination of 1s and 0s, and if you string enough of these small functions together you can do any computational task. Compilers produce this code from more human-readable code, though it is possible to program something directly in machine language if you really want to. That’s how things worked before compilers were invented, and how the first compiler was made.
The code that the computer actually runs is called machine code. And if you have a very simple computer, you can program directly in machine code.
But for anything other than the simplest kind of computer or the simplest kind of program, it’s simply not efficient.
So you can create a language that makes it easier. And this is what we did, and it was called Assembly. It basically created what are known as “mnemonics”, which are more easily intelligible to people and map to machine code instructions.
To take a set of assembly instructions and convert them into machine code, you need a program called an “assembler.” But the first assembler had to be written in machine code.
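As a rough illustration of what that mnemonic-to-machine-code mapping looks like, here’s a toy assembler in Go. The mnemonics and opcode bytes below are invented for a made-up machine, not any real CPU – conceptually, an assembler is little more than a translation table plus some text handling.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Invented opcodes for a made-up machine, just to show the mapping:
// each mnemonic corresponds to one machine-code byte.
var opcodes = map[string]byte{
	"LOAD": 0x01, // load a constant into the accumulator
	"ADD":  0x02, // add a constant to the accumulator
	"HALT": 0xFF, // stop
}

// assemble turns lines like "LOAD 5" into raw machine-code bytes.
func assemble(source string) []byte {
	var machineCode []byte
	for _, line := range strings.Split(strings.TrimSpace(source), "\n") {
		fields := strings.Fields(line)
		machineCode = append(machineCode, opcodes[fields[0]])
		if len(fields) > 1 {
			n, _ := strconv.Atoi(fields[1])
			machineCode = append(machineCode, byte(n))
		}
	}
	return machineCode
}

func main() {
	fmt.Printf("% x\n", assemble("LOAD 5\nADD 3\nHALT")) // prints: 01 05 02 03 ff
}
```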
But even Assembly is fairly low level and only really suitable for simple programs. The more complex the program we want to make, the higher-level the language we want.
So people started making higher-level languages. But these languages need to be converted into machine code, which requires a compiler. The first compilers were built using the low-level languages of the time (such as Assembly). Once they were made, you could use existing higher-level languages and compilers to make new ones.
This is the compiler bootstrapping problem. Nowadays compilers are usually just written in another language for which a compiler already exists. Once the program has been compiled there is no need for a compiler any longer, so you can run it without having the original compiler around. This is how you can create an independent compiler. Some programming languages can also be interpreted, where you need the interpreter to run the program. So you can write a compiler, run it in the already existing interpreter, and have it compile itself.
In the old days, compilers were still quite rare. You could write machine code directly, or with partial help from existing tools. There are several known projects where a compiler was first translated to machine code by hand. Once that first hand-written copy was done, you could use it to compile the code properly.
The compiled software is also just data stored on the computer, with the difference that the computer can treat it as something executable. So the simple answer is – nothing stops you from creating such data without a compiler. Hence you could create a simple compiler (for a simple language) “on paper” and feed it into memory one way or another as-is. You can then use that language/compiler to write a better compiler, possibly for a new language that enables a human to write programs more efficiently.
And this is pretty much exactly what happened.
Say you have an idea for MyLang, a brand-new programming language.
You write the first version of the MyLang compiler in an already-existing programming language, like Go for example. You compile it with the Go compiler.
Then the *second* version of the MyLang compiler can be written in MyLang and compiled with version 1 of the MyLang compiler.
The first version of the MyLang compiler doesn’t need to support all of MyLang. If you can write a MyLang compiler in MyLang without using all of MyLang’s features, you don’t need to implement those unused features in Go in the version 1 compiler — you can save them for later, when you have the ability to write the MyLang compiler in MyLang.
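As a sketch of what that version 1 compiler might look like (the `print <number>` subset and the `compileMyLang` function are made up here for illustration), it’s written in Go, understands only a tiny slice of MyLang, and emits Go source that the Go compiler then turns into machine code.

```go
package main

import (
	"fmt"
	"strings"
)

// A deliberately tiny "version 1" compiler, written in Go (the host language).
// It only understands a made-up MyLang subset: lines of the form "print <number>".
// That's the bootstrapping trick: version 1 only has to support enough of MyLang
// to let you write the version 2 compiler in MyLang itself.
func compileMyLang(source string) string {
	var out strings.Builder
	out.WriteString("package main\n\nimport \"fmt\"\n\nfunc main() {\n")
	for _, line := range strings.Split(strings.TrimSpace(source), "\n") {
		var n int
		fmt.Sscanf(line, "print %d", &n)
		fmt.Fprintf(&out, "\tfmt.Println(%d)\n", n)
	}
	out.WriteString("}\n")
	return out.String()
}

func main() {
	// MyLang "source code" goes in, Go source code comes out;
	// the Go compiler then turns that into machine code.
	fmt.Print(compileMyLang("print 1\nprint 2"))
}
```

Once version 1 can compile enough of MyLang, you rewrite the same translation logic in MyLang itself and compile it with version 1 – that’s the bootstrap.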
“Okay, that’s fine for 2022, when we have Go and Java and Rust and C++ and all these high-level languages we could write a compiler in. But how did they do it back in the day, in the 50s / 60s / 70s, when you were making the first high-level language for an early computer and there *was* no previous language / compiler you could use?”
The answer to that is you can always program a computer in its native language, machine code. Programmers usually don’t do that today since it’s super tedious, but you can definitely write a compiler for a simple high-level language in machine code. (Especially if you first write tools to help you work with machine code, like an assembler and a debugger.)
Lots of great answers – I’m going to give a really simple ELI5 example.
How was the first hammer made? First we used a stick to break up a rock. Then the rock was a tool we could use to make a pointier rock made of harder rock stuff. Over time we figured out how to make bronze, which was amazing because it was way way harder than any rock and we could make it into any shape.
Thousands of years later we have factories which produce thousands of steel hammers with silicone grips, and other factories which produce big yellow tractors with hydraulic jackhammers.
Programming languages were made this way as well. Except it took less than 100 years.
You are correct: compilers are programs, and to become programs, they need to be compiled by another compiler. The very first compilers were written directly in machine code to avoid needing a compiler.
There are plenty of existing compilers that one can use to build a new compiler, and at some point the compiler can become “self-hosting” in the sense that an existing compiled version of that compiler can be used to compile the next version.