What does the code that makes up programming languages look like?


Take a language like Java. How was it originally created? I can’t wrap my head around how someone invented a computer language without having some “prior” language that allows the first lines to function. Is it just Java all the way down, like someone wrote a single line of Java and then every other line was built on that?

What about the first computer language? What was the basis that that functioned on?

Thanks for any help, I hope that was phrased in a mildly intelligible way.

Edit: I’m trying to think of it like human language: at some point there was a first “word” spoken by someone and understood by another, and from there the structure started to be born. What were the first “words” on a computer that led to where we are now?


36 Answers

Anonymous 0 Comments

The CPU only talks in numbers. Every number is an instruction. You have to know the “codebook” of what number, in what order, does what. Instruction 43 might be “multiply these two numbers”, for instance. So 43 2 3 gives an answer 6.
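As a rough sketch of that codebook idea, here is a toy "CPU" in Python. The opcode numbers (43 for multiply, 7 for add), the `CODEBOOK` table and the `run` function are all invented for illustration; every real CPU defines its own codebook.

```python
# Toy codebook: opcode number -> operation on two operands.
# These numbers are made up for illustration, not taken from any real CPU.
CODEBOOK = {
    43: lambda a, b: a * b,  # "multiply these two numbers"
    7: lambda a, b: a + b,   # hypothetical "add" instruction
}

def run(program):
    """Execute a flat list of numbers laid out as: opcode, operand, operand, ..."""
    results = []
    for i in range(0, len(program), 3):
        opcode, a, b = program[i:i + 3]
        results.append(CODEBOOK[opcode](a, b))
    return results

print(run([43, 2, 3]))  # [6]
```

The program is nothing but a list of numbers; the meaning comes entirely from the lookup table.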

That number is literally represented in binary on the CPU’s input pins when you want it to do something. You set those pins, they correspond to the binary for 43, and the CPU knows what to do next.
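To make the "pins" picture concrete, this small Python sketch shows which pins would be high (1) or low (0) for the number 43. The 8-pin width and the `to_pins` helper are assumptions for illustration; real chips vary in how wide their inputs are.

```python
def to_pins(number, width=8):
    """Return the state of each pin (1 = high, 0 = low), most significant bit first."""
    return [int(bit) for bit in format(number, f"0{width}b")]

print(to_pins(43))  # [0, 0, 1, 0, 1, 0, 1, 1]
```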

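But that codebook is a pain to program in. Ask the Apollo astronauts, who had to key in codes like “Verb 23, Noun 15” on their computer system to get it to do anything. Strictly speaking, those verb and noun codes were commands to the software already running on the guidance computer rather than raw CPU instructions, but the experience was similar: all they had to communicate with was numbers. Programming a CPU directly in its instruction numbers is called writing “machine code”.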

But those numbers were assigned those tasks by the human CPU designer. So somewhere there’s a codebook that tells you that 43 is multiply, for instance. So… why not let the human just use a shorthand? Say, “MUL”. And computers are designed to do all the boring legwork, that’s their entire purpose, so why not get the computer to take the text “MUL” and output “43”? Congratulations, you just made an assembler: a program that takes the text instructions and converts them to machine code.
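Here is a minimal sketch of such an assembler in Python. The `OPCODES` table reuses the made-up number 43 for “MUL” from the example above and adds a hypothetical “ADD”; a real assembler also handles registers, labels, and memory addresses, which this one ignores.

```python
# Hypothetical codebook mapping text mnemonics to opcode numbers.
OPCODES = {"MUL": 43, "ADD": 7}

def assemble(source):
    """Turn lines like 'MUL 2 3' into a flat list of machine-code numbers."""
    machine_code = []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        mnemonic, *operands = line.split()
        machine_code.append(OPCODES[mnemonic])       # "MUL" -> 43
        machine_code.extend(int(op) for op in operands)
    return machine_code

print(assemble("MUL 2 3"))  # [43, 2, 3]
```

All the assembler does is look words up in the codebook, which is exactly the boring legwork a computer is good at.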

The first one would have been written in “machine code” by someone. Tricky, but you only had to write the assembler and then everything got easier. You obviously don’t sit and write all your programs in machine code once you have a working assembler.

But even “assembly language” (the codebook language that has “MUL”) is a bit tricky to program in. So you make the computer do the work again. Using assembly language, you write a program that takes more complex text and converts it into assembly language for you. So it might take something like “A = B * C”, and work out that it has to get B and C from memory, run the MUL instruction on them, and put the result into some part of memory called A. A program that does that is called a compiler.
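A toy version of that translation step might look like the Python sketch below. The assembly mnemonics (`LOAD`, `STORE`), the register names `R1` and `R2`, and the `compile_statement` function are invented for illustration, and it only understands statements of the exact form `X = Y * Z` or `X = Y + Z`; a real compiler parses a whole language.

```python
import re

# Invented mnemonics for an imaginary machine with two registers.
OPERATORS = {"*": "MUL", "+": "ADD"}

def compile_statement(statement):
    """Translate a statement like 'A = B * C' into assembly-language lines."""
    match = re.fullmatch(r"\s*(\w+)\s*=\s*(\w+)\s*([*+])\s*(\w+)\s*", statement)
    if match is None:
        raise ValueError(f"cannot compile: {statement!r}")
    target, left, operator, right = match.groups()
    return [
        f"LOAD R1, {left}",               # fetch the first value from memory
        f"LOAD R2, {right}",              # fetch the second value from memory
        f"{OPERATORS[operator]} R1, R2",  # combine them, result in R1
        f"STORE R1, {target}",            # write the result back to memory
    ]

print("\n".join(compile_statement("A = B * C")))
```

The output is assembly text, which an assembler could then turn into numbers for the CPU.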

There is a programming language called C. A C compiler is often one of the first compilers that anyone gets working on a new type of computer, because it’s relatively easy (as compilers go) to write a C compiler in assembly language, and relatively easy for a human to write a program in C. That C compiler takes your C code (which looks a lot like Java) and converts it into assembly language or machine code.

Now that you have a C compiler you find that you can compile most things! Parts of Java itself (such as the virtual machine that runs Java programs) are written in C and C++.

So when you have an entirely new type of machine, someone (who knows the “codebook” well) writes an assembler for it. The next (or even the same!) person then finds or makes a C compiler that itself can be written in assembler (there are lots of them already, but sometimes we have to make new ones!). Then the person after that? They have a C compiler and a whole raft of operating systems, kernels, programming languages, applications, etc. that are already written in C.

Notice, though, that all it took was two programs – some way to get assembly language into the computer, and then some way to convert C code down to assembler. Those are probably the most difficult types of programs to write, and sometimes you have to write parts of them from scratch (e.g. if a chip has never been seen before and is different to everything that existed before), but the assembler you can literally write by hand, and the C compiler is just a case of tweaking an existing C compiler. And only those two programs are needed (a slight exaggeration, but once you have them, everything else can be compiled from C!) to get everything else working.

Computers in the old days (e.g. the Apollo missions, and home computers right into the ’80s) were often programmed directly in machine code, and home computers often had machine code tutorials in their “starter manuals”. One of the programs they often got you to write was an assembler! C and other languages came later. Nowadays nobody bothers because it’s all done for you, but still someone, somewhere, sometimes has to write an assembler or a compiler (or modify an existing one to work on a new chip).

It’s like having to learn the alphabet, then learning how to form that into words, then words into sentences, then sentences into paragraphs, then paragraphs into chapters, and so on. It all starts at the bottom. And I could probably teach someone who knew nothing about computers how to write a very basic program in machine code, and a basic assembler to help them, in a few days. It would take far longer to write a C compiler but you probably wouldn’t need to.

Even today, when starting on a new chip, the people at the chip manufacturer will do that part for you. They often start with machine code to write an assembler, then use that assembler to build a “miniature” C compiler (e.g. tcc), then use that mini C compiler to compile a full compiler (e.g. gcc), and then give that to you as a download. After that nobody ever has to do that part again, and we can all just take C code and compile it straight to the new chip without having anything to do with machine code or assembler.
