Eli5: If a compiler is a program that converts your code into binary form for the computer, unless my understanding is incorrect and it isn’t just a program, wouldn’t the compiler also need a compiler to run it since how do you run a program without a compiler?

25 Answers

Anonymous 0 Comments

It doesn’t just convert your code into “binary”; it converts it into machine code that can be executed directly on the hardware. That assumes you are using a compiled language like C++. Other languages work differently: Java is compiled to Java bytecode, which a just-in-time compiler then turns into machine code while the program runs, and languages like Python are interpreted.

Say you are testing your code in Visual Studio and you choose “compile and run”. It compiles the program into some temp directory, then runs it as if it were installed on your computer, handing the compiled code to the operating system. All of that is hidden from you, but that is roughly what is happening under the covers.

Anonymous 0 Comments

A computer runs on what’s called “machine language”, which is basically what you are calling “binary form”. Machine language is not very friendly from a human’s perspective, but it is possible for a human to write a program directly in it, and a compiler is just another program you can write that way. As u/Gnonthgol has pointed out, now that other compilers already exist, it’s easier to write a new compiler in another language. You can also use a cross-compiler. This is the case where you develop the compiler for your new machine on a different type of computer that already has a compiler, and it outputs a binary/machine language program that works on your new machine.

Anonymous 0 Comments

People can write programs in binary form for the computer without a compiler. This is called machine code. It’s much easier for people to write programs in programming languages, but it is possible for people to write machine code by hand.

The first tools that convert programming languages into machine code were written by people using machine code.

Anonymous 0 Comments

So the “binary form” is also a programming language itself. It’s just a much “dumber” language that is very hard to use, but can be understood by the machine it’s supposed to run on (namely the CPU).

A compiler is a program that takes a more abstract and easier to use programming language and translates it into that binary executable form.

You don’t need a compiler to run a program. A compiler is basically taking your abstract program and converting it to a binary program that you can run on the machine/operating system itself.

Now there are interpreted languages, but these are usually called *scripts*. An interpreted program (script) does need another program in order to run, that program is called an interpreter and it’s usually written and compiled in a different language.
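To make the idea of an interpreter concrete, here is a minimal sketch of one. The three-command script language (`set`, `add`, `print`) is made up purely for illustration and is not any real language. The point is only that the script never becomes machine code: another program reads it line by line and carries out each command.

```python
def run_script(source):
    """Interpret a tiny made-up script language, one line at a time."""
    variables = {}
    output = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        command = parts[0]
        if command == "set":      # set NAME NUMBER
            variables[parts[1]] = int(parts[2])
        elif command == "add":    # add NAME NUMBER
            variables[parts[1]] += int(parts[2])
        elif command == "print":  # print NAME
            output.append(variables[parts[1]])
        else:
            raise ValueError(f"unknown command: {command}")
    return output


script = """
set x 3
add x 5
print x
"""
print(run_script(script))  # [8]
```

Here `run_script` plays the role of the interpreter, and the text in `script` is the program that cannot run without it.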

One more bit of nuance is that most programs today are compiled to run within a certain environment, namely an operating system. The OS provides a lot of existing code that you can take advantage of when running your program, so the compiler will build your program to run within that environment. The operating system is technically a program itself, and has a lot of control over loading and executing binary programs.

Anonymous 0 Comments

Yup.

The first programs had to be written in machine code. Just a sequence of numbers. They’d write in Assembly language, and convert by hand. Then they got computers to do the conversion.

Once you have computers to do that, you can write a compiler or an interpreter for a more complex language. Eventually someone will write their own compiler in the higher level language.

Once you have at least one high level language, it makes things a lot easier.

Anonymous 0 Comments

This was one of the questions that pushed me into studying software development: how does the computer understand what I’m writing?

There’s a main brain inside the computer that only understands “low level” instructions. These instructions are very simple: some let you move data between memory locations, some do arithmetic, and some perform other basic actions. You can research MIPS32 and x86 to see more about these instruction sets.

So when we write code like this:

console.log("hello world" + anyVariable);

A compiler’s job is to translate that line of code into something the main brain can understand, using its set of instructions, so the main brain can actually execute it. Usually one line is translated into many small instructions.
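As a rough illustration of one line turning into many small instructions, here is a sketch of a toy compiler for arithmetic expressions. It leans on Python’s built-in `ast` module to do the parsing, and the target is a hypothetical stack machine: the instruction names `PUSH`, `LOAD`, `ADD`, `SUB`, `MUL` and `DIV` are invented for this example and are not MIPS32 or x86 instructions.

```python
import ast


def compile_expression(source):
    """Translate one arithmetic expression into instructions for a
    hypothetical stack machine (the instruction names are made up)."""
    operations = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}
    instructions = []

    def emit(node):
        if isinstance(node, ast.Constant):
            instructions.append(f"PUSH {node.value}")   # literal number
        elif isinstance(node, ast.Name):
            instructions.append(f"LOAD {node.id}")      # read a variable
        elif isinstance(node, ast.BinOp) and type(node.op) in operations:
            emit(node.left)                             # operands first...
            emit(node.right)
            instructions.append(operations[type(node.op)])  # ...then the operator
        else:
            raise ValueError("unsupported expression")

    emit(ast.parse(source, mode="eval").body)
    return instructions


for instruction in compile_expression("price * 2 + tax"):
    print(instruction)
# LOAD price
# PUSH 2
# MUL
# LOAD tax
# ADD
```

A real compiler does far more than this (type checking, register allocation, optimization), but the shape is the same: one readable line in, several very simple steps out.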

I built 2 compilers in college and that was the most fun I’ve ever had in my life.

Anonymous 0 Comments

Binary files are directly executable by the processor itself – they have a short header that tells the operating system where to load them into memory and where to start but are otherwise ready to run.

Here’s an example pulled from Microsoft Edge’s .text section (encoded as hexadecimal to aid readability):

41 57 41 56 41 55 41 54 56 57 55 53 48 81 EC 08
01 00 00 48 8B 05 0E 10 2F 00 48 31 E0 48 89 84
24 00 01 00 00 48 8B 02 48 85 C0

And here’s how it breaks down into instructions:

00007FF600271000 | 41:57 | push r15
00007FF600271002 | 41:56 | push r14
00007FF600271004 | 41:55 | push r13
00007FF600271006 | 41:54 | push r12
00007FF600271008 | 56 | push rsi
00007FF600271009 | 57 | push rdi
00007FF60027100A | 55 | push rbp
00007FF60027100B | 53 | push rbx
00007FF60027100C | 48:81EC 08010000 | sub rsp,108
00007FF600271013 | 48:8B05 0E102F00 | mov rax,qword …
00007FF60027101A | 48:31E0 | xor rax,rsp
00007FF60027101D | 48:898424 00010000 | mov qword ptr…
00007FF600271025 | 48:8B02 | mov rax,qword…
00007FF600271028 | 48:85C0 | test rax,rax

The first column is the address in memory, typically used to identify where the program is during execution. The second column is the byte sequence: it identifies what action the processor should take for that instruction and any inputs it should act on. The last column is the human readable name of the instruction and its arguments.
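The lookup from bytes to names can be sketched in a few lines. This is a minimal sketch that only understands the `push` instructions at the start of the listing above; the function name `decode_pushes` is made up for illustration. It relies on the x86-64 encoding in which bytes 50 through 57 mean “push” one of eight registers, and a preceding 41 byte switches to the second set of eight (r8 to r15).

```python
# The low three bits of a push opcode select the register.
LOW_REGISTERS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
HIGH_REGISTERS = ["r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_pushes(hex_text):
    """Decode leading push instructions from a string of hex bytes."""
    data = bytes.fromhex(hex_text)
    result = []
    i = 0
    while i < len(data):
        registers = LOW_REGISTERS
        if data[i] == 0x41 and i + 1 < len(data):  # prefix byte: use r8..r15
            registers = HIGH_REGISTERS
            i += 1
        opcode = data[i]
        if not 0x50 <= opcode <= 0x57:
            break  # stop at the first instruction this sketch doesn't know
        result.append(f"push {registers[opcode - 0x50]}")
        i += 1
    return result


first_row = "41 57 41 56 41 55 41 54 56 57 55 53 48 81 EC 08"
for line in decode_pushes(first_row):
    print(line)
```

Run on the first row of bytes, this prints the same eight `push` lines as the listing and stops at the `48 81 EC` sequence, which needs decoding rules this sketch leaves out.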

In very early computers, you would start by writing down a set of instructions, then looking up each byte sequence and carefully “filling in the bubbles” similar to a scantron used at school (or in the very earliest computers by connecting wires on a plug board).

Once hard drives became commonplace, the byte codes could instead be stored directly on disk – usually by building on already working computers to get started.

Next someone wrote a program that looks up the byte codes automatically – this is called an assembler. (Not sure if assemblers were written in the punch card era or not.)

Gradually assemblers started to include more features to aid in reducing errors and increasing productivity – shorthand for common sequences of operations. This is where programming languages start to become distinct from the set of operations provided by the hardware directly – there is no longer a 1:1 translation from raw bytes to commands. Similarly, the program is no longer called an assembler – but by the more generic term compiler.

(There are other parts to a compiler – notably linkers, preprocessors, etc – but this is enough to give a good idea of where things started)

Anonymous 0 Comments

Refinements on a process.

You can write in machine code. It’s a nightmare but doable.

You can use machine code to create a super basic language like Assembly.

You can then use Assembly to make a more advanced language like Fortran.

You can then use Fortran to write a more nuanced and powerful language like C.

You can then use C to write languages like Java.

Languages like Python are not just created out of nothing. There is a long history of refinements and advancements that it’s built on top of.

Anonymous 0 Comments

What you are really asking is how the first compiler was created. It was written in binary, to translate code into binary.

In the beginning, adding 3 and 5 together looked something like this: 0101 1010 0000 0011 0000 0101. After the first compiler, we could write ADD 3, 5 and it would be translated into the binary string written above.

Now, the program that translates “ADD 3, 5” into “0101 1010 0000 0011 0000 0101” had to be written with 0s and 1s.

That program would be something like:
“If the first character is A and the second is D and the third is D, output 0101 1010”
“If a character is 1, output 0000 0001”
“If a character is 2, output 0000 0010”

Of course, each of these rules could take hundreds of segments of binary code, but it is achievable. Once you are able to translate addition (ADD), subtraction (SUB), division (DIV), multiplication (MUL), and so on, you can use that language to write a new compiler in code that translates code to binary, instead of relying on the compiler written in binary.
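The translation rules above can be sketched as a short program. This is only an illustration of the logic, written in Python for readability; the original would have been written in 0s and 1s. The only opcode used is the one given above (ADD as 0101 1010), and the function name `assemble` is made up for the example.

```python
OPCODES = {"ADD": 0b01011010}  # the only opcode given in the answer above


def assemble(line):
    """Translate a line like 'ADD 3, 5' into groups of four bits."""
    mnemonic, _, operand_text = line.strip().partition(" ")
    operands = [int(part) for part in operand_text.split(",")]
    values = [OPCODES[mnemonic]] + operands
    groups = []
    for value in values:
        bits = format(value, "08b")           # each value becomes 8 bits
        groups.extend([bits[:4], bits[4:]])   # shown as two groups of four
    return " ".join(groups)


print(assemble("ADD 3, 5"))  # 0101 1010 0000 0011 0000 0101
```

Supporting SUB, DIV or MUL would just mean adding more entries to the `OPCODES` table, with whatever bit patterns the machine actually uses.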

Anonymous 0 Comments

Going to keep my answer as short as possible.

The first compilers were written directly in machine code (1s and 0s) or assembly (basic English-like instructions such as ‘mov’ or ‘jmp’, which are translated 1:1 into machine code instructions).

As compilers got better and more common, and we had a few lying around, it became a lot easier to write the “new and improved” compiler using an older compiler.