What does the code that makes up programming languages look like?



Take a language like Java. How was it originally created? I can’t wrap my head around how someone invented a computer language without having some “prior” language that allows the first lines to function. Is it just Java all the way down, like someone wrote a single line of Java and then every other line was built on that?

What about the first computer language? What was the basis that that functioned on?

Thanks for any help, I hope that was phrased in a mildly intelligible way.

Edit: I’m trying to think of it like human language: at some point there was a first “word” spoken by someone and understood by another, and from there the structure started to be born. What were the first “words” on a computer that led to where we are now?


It’s easy to build up a small sand castle with just a single small bucket, and this is like binary. If you want to build something amazing, though, you’ll need more tools to make the job doable. You can technically use binary to create whatever you like, but it is much easier to first write simple programs that handle basic tasks like adding or subtracting. Then, using such a program as a ‘frame’ for a new program, you can layer on complexity (or remove complexity, depending on how you look at it) to accomplish more complex tasks. This is the difference between coding “bare metal” and using a high-level language like Java.

There was a first word, but instead of thinking of it as a baby learning English, think of it as how humanity learned to communicate – we’re talking ancient runic text instead of “mom” or “dad”. Compared to programming, English itself is like a high-level language.

Java is actually run by a “virtual machine” that is itself written mostly in C++, so it’s a pretty bad example.

But C++ is compiled by compilers that are themselves written in C++. How do you compile such a compiler? Well, with an older C++ compiler!

But, but, what about the first compiler? It was written in assembly, a long time ago. People do like to bootstrap C++ from time to time, i.e. start from scratch with a smaller language that is easier to compile, but that’s just for kicks.

By the way, did you know we need a mill to make a mill?

Lots of stuff in the world is built on previously built stuff; it’s kind of a fun chicken-and-egg problem. Almost all of it originally started from painstaking manual work.

Edit: It’s even more fun when you realize C++ compilers have bugs, yet produce newer C++ compilers with (hopefully) fewer bugs.

The very first “words” on a computer were some poor, patient soul literally plugging in every 1 and 0 by hand. They’d do this by baking it straight into the circuitry rather than programming it with a keyboard, because, well, how would a keyboard even work if programming doesn’t exist?

People still do this all the time, by the way. A common tool that lets you experiment rapidly is called a “breadboard”, which lets you plug and unplug wires and simple chips to create complex circuits.

After the first literal hard-coded computers were in place, they were extended to be more modular and accept arbitrary input from users via input peripherals, that could then be run as new code. Everything snowballed from there.

Many prior languages existed, and some were similar to Java or C++.

I am a retired mainframe assembly language developer; I did that in the ’80s and ’90s. Assembly was the first mainframe language, and it’s one step above the machine code executed by the chip. Assembly language is what many mainframe language compilers generate, as it’s native to the chipset.

Assembly is crude and rude: you’d better know WTF you’re doing or bad things happen. Sometimes you have to read and understand the machine code to debug your code. Principles of Operation is the reference manual for the language.

ETA: One of our programs for class was to be written in machine code to demonstrate our knowledge of the language.

ELI5 answer:
It looks very, very simple.
The code that makes up programming languages is itself very simple.

The language can then be quite complex. Java is one of the worst examples you could pick, because its code is run by another program (the Java virtual machine, which acts as an interpreter) rather than directly by the chip.

But the code that builds it is simple.

It starts with basic instructions:
I want to add data. Instruction: add
I want to store my result: store

You might want to create types: a byte might be an integer, or a character.

Then, you need to repeat the same instructions in your language.
You start to put a label at the start of your instruction: label my_little_routine
Then you say that you want to go to your label again: goto my_little_routine

Then you say that this set of instructions should be usable everywhere, so you create a function. This is a mechanism where you store your current data somewhere, then go to your label, then retrieve your data from storage once you finish.
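To make the label and goto idea concrete, here is a rough Python sketch of a toy interpreter. The instruction names (set, add, store, label, goto_if_less) are invented for this illustration and do not belong to any real machine; a conditional goto is used so the loop eventually stops.

```python
# Toy interpreter for a made-up instruction set (all names are hypothetical).
def run(program):
    labels = {}
    # First pass: remember where each label sits so a goto can find it.
    for position, (op, *args) in enumerate(program):
        if op == "label":
            labels[args[0]] = position

    acc = 0       # the single working value
    memory = {}   # named storage slots
    pc = 0        # index of the instruction being executed
    while pc < len(program):
        op, *args = program[pc]
        if op == "set":
            acc = args[0]
        elif op == "add":
            acc += args[0]
        elif op == "store":
            memory[args[0]] = acc
        elif op == "goto_if_less":
            # Jump back to the label while acc is still below the limit.
            limit, target = args
            if acc < limit:
                pc = labels[target]
                continue
        # "label" itself does nothing when executed; it is only a marker.
        pc += 1
    return memory


program = [
    ("set", 0),
    ("label", "my_little_routine"),
    ("add", 2),
    ("store", "result"),
    ("goto_if_less", 10, "my_little_routine"),
]
print(run(program))  # {'result': 10}
```

A function call would add one more step: saving the current position before the jump so the program can come back to it afterwards.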

You can use a compiler-compiler to create an advanced programming language out of these basic operations: [YACC](https://en.wikipedia.org/wiki/Yacc) is one of them.

Modern languages natively implement many high-level mechanisms, often called design patterns, which is why you are confused: templates in C++, interfaces in Java, iterators in Python…
These design patterns are common solutions to frequent problems.

So we grow programming languages, from simple, obvious instructions to a complex toolbox.

[Bonus track](https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/generalprogramming/ie_prog_4lex_yacc.html)

You can think of it as layers of an onion. The lowest layer is machine code – functions of the 0’s and 1’s that computer data is stored as.

The next layer is assembly language, which is just easier-to-read machine language: the simple functions of 0’s and 1’s are given names and grouped. Like machine language, it is specific to the computer chip that will run the program.

The next layer toward Java is C++ (which derived from the C programming language), and now it is no longer computer-chip specific; it is a general language used to write code for any computer. You run this code through a “compiler” that converts the C++ into the machine-specific assembly language.

The outer layer is Java, which is an easier language to use than C++. Think of it like this: a paragraph of C++ does what you want, but you can give that paragraph a Java name, like “A”. Instead of writing all that C++ to do the function you want, you just say in Java “do A”.

The first general purpose computers were one-off machines. Their languages were the lowest-level opcodes, the binary bits that selected which circuit paths the data inputs would go through. There’s an opcode to add two values, to multiply, to load from a location, to store to a location, etc…

With that, programs were very simple and it was easy to think of programs in terms of CPU instructions alone. These programs were written on punch cards or punch tape. The devices to do this already existed thanks to the telegraph system. Indeed, Linux can still be driven from a telegraph teletype machine from around 1930; you can see some guys on YouTube log in to a Linux terminal on one.

Of course, that doesn’t scale. Eventually, especially by the later 1950s, computers already got large and powerful enough, not that programs HAD to get more sophisticated, but that they COULD. Then it became more valuable to manage the complexity by raising the level of abstraction from raw machine instructions to the first programming languages.

They existed in private earlier, but the late 1950s were when the first widely used high-level languages debuted. FORTRAN came first, in 1957, and Lisp followed in 1958. Both are still used today, and both represent nearly polar opposites of how to approach computation. FORTRAN is an abstraction of hardware, and Lisp is an abstraction of *a* calculus notation that can express computation, called Lambda Calculus.

The first compilers and interpreters were written on punch cards or punch tape in machine instructions. Once the compiler was loaded, source code in that language was text encoded on punch cards. Again, this sort of thing already existed in the telegraph industry. ASCII grew out of telegraph codes, and Unicode is backward compatible with ASCII, which is why those early telegraph devices can still work with modern computers.

If you had a nice keypunch, it would even print, like a typewriter, the corresponding character on the card in ribbon ink. Each card would be a line of code, and you would try to cram as much code into a single card as possible. Brevity was your friend when you had to keep thousands of cards in order. So instead of nice variable names like “velocity”, you would just use “v”. It’s a habit we haven’t needed since the 80s, and we’ve been trying to shed it as an industry ever since.

Well, you get yourself a shiny new mainframe, and when you turn the hulk on, New York dims. Ok, how do you write programs for the damn thing? Well, you already have that other mainframe over there, with compilers and programs… Use IT to “transcode”, to “cross-compile” a program for that new machine. Programs are just a form of data, and compilers merely translate source code into *a* machine code; it doesn’t have to be for the machine the compiler is currently running on. It just means the program that’s output won’t run on *this* machine.

Ok, so then take your paper tape from that machine, bring it over to the new one, and feed it in. Bingo bango, you just compiled your compiler from source code to machine instructions for the new architecture. Now you can reuse all your source code from the other computers.
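Here is a small Python sketch of that cross-compiling idea. Both “machines” and their numeric codebooks are made up for illustration; the point is only that the translator is an ordinary program whose output can target a machine other than the one it runs on.

```python
# Two hypothetical machines with different numbers for the same operations.
CODEBOOKS = {
    "old_machine": {"LOAD": 1, "ADD": 2, "STORE": 3},
    "new_machine": {"LOAD": 0x10, "ADD": 0x20, "STORE": 0x30},
}


def cross_assemble(source, target):
    """Translate 'OP operand' lines into numbers for the chosen target."""
    codebook = CODEBOOKS[target]
    output = []
    for line in source.strip().splitlines():
        op, operand = line.split()
        output += [codebook[op], int(operand)]
    return output


source = """
LOAD 5
ADD 7
STORE 9
"""
print(cross_assemble(source, "old_machine"))  # [1, 5, 2, 7, 3, 9]
print(cross_assemble(source, "new_machine"))  # [16, 5, 32, 7, 48, 9]
```

The same source text produces two different number streams, and whichever one you carry over to the new machine is the one that will run there.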

Oh, you have a new magnetic storage device? Cool, skip the punch cards and store your data on that. Now you can just copy floppies and sell them.

This whole process was repeated from scratch many times, because often enough it was easy enough to do in the early days. When the micro-computer was invented and people had their first Commodore 64s and ZX Spectrums, they even had interpreter firmware built into them. You could just boot them up with nothing and go straight into writing code, typically a form of BASIC or Forth.

Java is a language that is compiled into a universal byte code. These are machine level instructions, but for no machine that actually exists. The machine is virtual, a fiction that exists as a spec document. Java compilers conform to that spec. When you run a Java program, the instructions are first interpreted, so the program can start running all the sooner. Meanwhile, the byte code gets compiled again (Just In Time(tm), aka JIT) into the actual hardware native instructions. The next time that code is executed, it switches over to the machine native form.
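As a loose illustration of instructions for a machine that exists only as a spec, here is a tiny stack-machine interpreter in Python. The opcodes PUSH, ADD and MUL are invented for this sketch and are not real Java bytecode, and there is no JIT step here, only the interpreting part.

```python
def run_bytecode(code):
    """Interpret a list of (opcode, argument) pairs on a small stack machine."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# Hypothetical bytecode for the expression (2 + 3) * 4.
code = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
print(run_bytecode(code))  # 20
```

Any real computer that has a program like run_bytecode can execute the same instruction list, which is roughly why the same compiled Java program can run on different hardware.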

The first Java compiler and interpreter predate the JIT. As far as I know, they were written in C, and the compiler was later rewritten in Java itself.

There are many different “levels” to programming. (Warning, lots of words, but pretty simple)

The lowest level of programming is transistors. We can build up simple logic gates such as AND and OR directly in the circuitry, and by turning certain switches on and off, we can get out certain behavior such as switching between different components, storing information, or doing simple math. These on and off switches can be represented as 1s and 0s or “binary.”

**01010101 01001111 01101011** -> Would give the instruction “store number 79 in memory slot 107.” The first block (or “byte”) would be the “store,” instruction, followed by the binary number for “79,” and finally the binary number for “107.” (the actual binary would differ depending on CPU architecture, so this is just pretend).

Writing complex programs in nothing but 1s and 0s would be a nightmare, so instead we wrote just enough 1s and 0s to build a program that translates a more readable language called “assembly,” directly into the 1s and 0s we need. So instead of binary, we now write:

**mov 79, 107**-> When run through our new assembler program, gets directly translated to the binary 1s and 0s from the first example. It basically just looks up what “mov,” translates to in a large dictionary and swaps it out for the needed binary. Basically just simple cut and paste.

An assembler is a *very* simple program, but allows us to start thinking closer to human language, and thus allows us to develop more complex software. From there we can write a “compiler,” which is a program that can read over a text file in whatever language we want to come up with and translate that into the binary we need.
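A minimal Python sketch of that dictionary lookup, using the pretend encoding from the binary example above. The opcode table is made up, just like the original bytes, and a real assembler handles many more instructions and operand formats.

```python
# Pretend codebook: only "mov" is known, with the made-up opcode 01010101.
OPCODES = {"mov": 0b01010101}


def assemble(line):
    """Turn one line such as 'mov 79, 107' into a string of binary bytes."""
    mnemonic, operands = line.split(maxsplit=1)
    values = [int(part) for part in operands.split(",")]
    machine_code = [OPCODES[mnemonic]] + values
    # Render every byte as eight binary digits.
    return " ".join(format(byte, "08b") for byte in machine_code)


print(assemble("mov 79, 107"))  # 01010101 01001111 01101011
```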

**int myVariable = 79;** -> This gets read over by our compiler and translated into binary to be executed. This is how languages like C and C++ work.

From there it’s a self contained loop. A compiled language can be used to write new/better compilers, which in turn can be used to write new/better compilers, etc.

Languages like Java and C# are one level above this. They aren’t compiled into binary, but instead into “bytecode,” which is like binary but instead of being run by the CPU, it’s run using either the Java Virtual Machine or .NET framework, which are programs on the user’s machine designed to read this special bytecode. This allows the individual software developer (like you or me) to write a Java/C# program once, and it will work on any computer system someone has programmed a virtual machine for (most likely programmed in C or C++) which is designed to read these instructions.

Finally we have “Interpreted,” languages like Python and JavaScript which are the highest level. With these languages the actual text the programmer typed is what is sent to the end user, and the actual conversion to binary happens as each line is run on the user’s machine. This is why you can press “F12,” right now and see all of Reddit’s code, since HTML5/JavaScript is interpreted.

Don’t think of it as a language, think of it as the evolution of a tool and its uses. Digging, for example. Originally, to dig one would use sharp objects. Eventually, the spade was invented, along with various sizes to accommodate different tasks. Larger tools for digging were invented until machinery allowed for even larger digging equipment and thus more tasks could be accomplished with this greater digging power. In the same way, programming started as soldering circuit boards, eventually moved to punch cards and tape reels. Then, with the advent of monitors and keyboards, people could do things like data entry and complex calculations and it just kept going from there.

Like teaching a kid. First the alphabet. Then combining them into words. Then combining them into sentences.

Computers are built with a language in hardware. Put a value in the part telling it what to do, put a memory location to tell it what number to do that command to.

Command 1: load accumulator with value at location 1000. Command 2: add to it the value of location 2000.

Then you write a whole program like that. Next step is to use that to write an assembler to make this easier. Instead of punching numbers into locations and executing, you say LDA 1000 and ADC 2000. Assembly language is just easier-to-remember machine language, and more complex assemblers handle variables and such.
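Here is a rough Python simulation of those two commands. The opcode numbers are made up for this sketch, the carry part of ADC is ignored, and HALT is added only so the loop knows when to stop.

```python
# Made-up opcode numbers for a tiny accumulator machine.
HALT, LDA, ADC = 0, 1, 2


def run(program, memory):
    """Execute [opcode, address, opcode, address, ...] and return the accumulator."""
    acc = 0   # the accumulator register
    pc = 0    # position of the next instruction
    while program[pc] != HALT:
        opcode, address = program[pc], program[pc + 1]
        if opcode == LDA:
            acc = memory[address]      # load accumulator from a location
        elif opcode == ADC:
            acc += memory[address]     # add the value at a location
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        pc += 2
    return acc


memory = {1000: 40, 2000: 2}
program = [LDA, 1000, ADC, 2000, HALT]  # i.e. "LDA 1000" then "ADC 2000"
print(run(program, memory))  # 42
```

The list of bare numbers is what the machine sees; the names LDA and ADC exist only to make it readable for the human.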

Then you use the assembler to write a compiler for a higher level language like C to make it even easier to program.

And then you use C to write Java.

The CPU only talks in numbers. Every number is an instruction. You have to know the “codebook” of what number, in what order, does what. Instruction 43 might be “multiply these two numbers”, for instance. So 43 2 3 gives an answer 6.

That number is literally represented in binary on the CPU’s input pins when you want it to do something. You set those pins, they correspond to the binary for 43, and the CPU knows what to do next.

But that codebook is a pain to program in. Ask the Apollo astronauts who had to calculate their trajectories, etc. by setting “Verb 23, Noun 15” on their computer system to get it to do, say, a basic multiplication instruction. That’s all they had to communicate. Numbers. That’s “machine code”.

But those numbers were assigned those tasks by the human CPU designer. So somewhere there’s a codebook that tells you that 43 is multiply, for instance. So… why not let the human just use a shorthand. Say, “MUL”. And computers are designed to do all the boring legwork, that’s their entire purpose, so why not get the computer to take the text “MUL” and output “43”? Congratulations, you just made an assembler. A program that takes the text instructions and converts them to machine code.

The first one would have been written in “machine code” by someone. Tricky, but you only had to write the assembler and then everything got easier. You obviously don’t sit and write all your programs in machine code once you have a working assembler.

But even “assembly language” (the codebook language that has “MUL”) is a bit tricky to program in. So you make the computer do the work again. Using assembly language, you make a program that takes more complex text, and converts it into assembly language for you. So it might take something like “A = B * C”. And it works out that it has to get B and C from memory, run the MUL instruction on them, and put the result into some part of memory called A. That program that does that might be called a compiler.
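As a very small sketch of that idea in Python, the function below turns one statement of the form “A = B * C” into assembly-like text. The mnemonics (LOAD, MUL, ADD, STORE) and register names R1 and R2 are hypothetical, and a real compiler handles far more than a single fixed statement shape.

```python
def compile_assignment(statement):
    """Translate 'target = left op right' into made-up assembly lines."""
    target, expression = (part.strip() for part in statement.split("="))
    left, operator, right = expression.split()
    mnemonic = {"*": "MUL", "+": "ADD"}[operator]
    return [
        f"LOAD R1, {left}",      # fetch the first operand from memory
        f"LOAD R2, {right}",     # fetch the second operand from memory
        f"{mnemonic} R1, R2",    # combine them, leaving the result in R1
        f"STORE R1, {target}",   # write the result back to memory
    ]


print("\n".join(compile_assignment("A = B * C")))
# LOAD R1, B
# LOAD R2, C
# MUL R1, R2
# STORE R1, A
```

Feeding that output into an assembler would be the remaining step to reach machine code.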

There is a programming language called C. Generally its compiler is the first one that anyone writes for a new type of computer, because it’s relatively easy to write a C compiler in assembly language, and relatively easy for a human to write a program in C. That C compiler takes your C code (which looks a lot like Java) and simply converts it to assembler or machine code.

Now that you have a C compiler you find that you can compile most things! Parts of Java itself are written in C.

So when you have an entirely new type of machine, someone (who knows the “codebook” well) writes an assembler for it. The next (or even the same!) person then finds or makes a C compiler that itself can be written in assembler (there are lots of them already, but sometimes we have to make new ones!). Then the person after that? They have a C compiler and a whole raft of operating systems, kernels, programming languages, applications, etc. that are already written in C.

Notice, though, that all it took was two programs – some way to get assembly language into the computer, and then some way to convert C code down to assembler. Those are probably the most difficult types of programs to write, and sometimes you have to write parts of them from scratch (e.g. if a chip has never been seen before and is different to everything that existed before), but the assembler you can literally write by hand, and the C compiler is just a case of tweaking an existing C compiler. And only those two programs are needed (a slight exaggeration, but once you have them, everything else can be compiled from C!) to get everything else working.

Computers in the old days (e.g. Apollo missions, and home computers right into the 80’s) used machine code only and often had machine code tutorials in their “starter manuals”. One of the programs they often got you to write was an assembler! And then C and other languages came later. Nowadays nobody bothers because it’s all done for you, but still someone, somewhere, sometimes has to write an assembler (or modify an existing one to work on a new chip) or a compiler.

It’s like having to learn the alphabet, then learning how to form that into words, then words into sentences, then sentences into paragraphs, then paragraphs into chapters, and so on. It all starts at the bottom. And I could probably teach someone who knew nothing about computers how to write a very basic program in machine code, and a basic assembler to help them, in a few days. It would take far longer to write a C compiler but you probably wouldn’t need to.

Even today, when starting on a new chip, the people at the chip manufacturers will do that part for you – and they often start with the machine code to write an assembler, then using that assembler to compile a “miniature” C compiler (e.g. tcc), then that mini-C compiler to compile a full compiler (e.g. gcc), and then give that to you in a download so that nobody ever has to do that part again and we can all just take C code and compile straight to the new chip without having to have anything to do with machine code or assembler.

It is hard to wrap the head around it because there are layers and layers and layers of abstraction. When taken one at a time, each one of them is much easier to understand, at least on the level of ELI5.

I highly recommend this youtube series: https://youtube.com/playlist?list=PL8dPuuaLjXtNlUrzyH5r6jN9ulIgZBpdo They explain everything more or less like eli5, starting from the bottom and going up that ladder of abstractions. Very entertaining, too.

The very first “languages” weren’t anything you would consider to be a language. For example, index cards with holes punched in them to represent 0 or 1. A “programmer” would write 0001001001110101010 in a very precise way for that particular computer to move data around in memory, perform simple operations on it (ex: add, subtract), or what have you. They were literally manipulating individual bits on the hardware.

The truth is, all computers still actually work this way! Down at the level of your actual CPU, the only language it can understand is still 0010010100101010. But what has happened since is what’s called “abstraction.”

For example, someone invents another language around the binary that contains larger “ideas” that would normally be expressed as longer binary. Instead of 0010101010010101, maybe the language defines it as “move 0010 to register 4” or “add the contents of register 2 and 6.” They write in that language, but the language is then translated back into its original 00101010 for the CPU to actually run.

Then someone else comes along and goes “hey so, having to remember all these registers and where things are in memory is a huge pain in the ass. What if we let people just define simple variables (x = 4, y = 3, z = x+y) and then we keep track of where everything is for them, and first we translate into ‘move 0010 to register 4’ and then we translate THAT into 00101010 for the CPU to actually run.”

Keep doing this 30-40 times over half a century, and you get to the modern languages we use today. But realize *nothing* *can actually run these languages as written*. They have no meaning to even a modern CPU. They have to be lexed, parsed, compiled, and assembled all the way down (through multiple intermediate steps) back into the same 00001010101010 that people were printing on punch cards 60 years ago.

Modern languages are really just window dressing. No matter what you write in, it all gets compiled/baked down into the same 0s and 1s that your particular CPU needs to run, because that’s all it can run. Languages are just layers of shortcuts and decorative fluff that we’ve built up on top. And all of the arguments over modern languages are mostly about the tradeoffs of one cosmetic change or shortcut or another. The CPU doesn’t care.

I’m a third year computer engineering major, so the following is all my personal dumbed down view of the answer to your question.

The basic hierarchy looks like this, from highest level (farthest from hardware) to lowest level (closest to the hardware):

– object oriented languages like Java, C++

– C

– Assembly language

– Machine code

Java compiles into bytecode that runs on a virtual machine (itself written in C or C++), C compiles into assembly language, and assembly language is turned into machine code. These different levels of abstraction generally come from developments in programming over time. You can Google image search each of those names to see examples of them. As you get closer to the hardware, things generally get simpler in complexity but much more tedious. Generally, the farther you get from the hardware, the easier it is to quickly write powerful programs that do a lot of stuff.

In assembly language, doing something like taking an average of 10 numbers then adding that to a running total would probably take more than 100 lines of code (guessing). In C and Java it would take 1 line. In Java it might even happen without your knowledge when invoking some operator or function. I’m not familiar with Java since I mostly use C.

There are two important ideas you’re hitting here, and I want to talk about both of them.

First, there’s history. Each early language was usually made the same way: lazy programmers. Ugh, who wants to write all this binary when I could make a program that converts words to the binary patterns I want for me! <fast forward a few years> Ugh, who wants to write all these assembly instructions when I could just write a program that converts higher level abstract code into assembly for me!

Then later: Ugh, people keep making mistakes in this code- I’ll write a program that forces people to follow good patterns, a ‘computer language’. This is followed by decades of research into the very concept of ‘computer languages’ and some wonderfully rich and interesting history. Each compiler was written in some lower language compiler, typically.

But the second concept is, I think, more interesting: self-hosted compilers. Rust (a pretty new language) originally had a compiler that was written in OCaml (a very hard language to learn, believe me). Then one day, they wrote a Rust compiler in Rust, used the OCaml compiler to build that program, and then threw away the OCaml compiler.

The Rust compiler is written in Rust. When they improve the compiler, they use the old one to make the new one.

There’s no magic here. Any programming language that is ‘Turing Complete’ can compute any function that any other language can. Compiling is just a function from text (code) to a binary (executable program). In theory, you can write any compiler in any language, including itself.

Source: I knew this Computer Science degree would come in handy someday.

You don’t need a computer to create a programming language. A programming language is at its most basic just a set of rules defining the syntax and behaviour of the code.

What you need the computer for is to write the compiler (and/or in the case of an interpreted language, the interpreter/runtime environment) and libraries/API. Once you have the compiler, the libraries *can* be written in the same language, but often they are written in another language (usually a lower-level language or even partly assembly).

Compilers/Interpreters are very often written in the same language they compile (self-hosting), but obviously that’s a cyclic dependency – the first/earliest versions would have been written in prior existing programming languages.

Taking Java as an example: the official Java API is mostly written in Java, with parts of it in C or C++. The official compiler (Javac), which converts Java source code to bytecode, is today written in Java but originally was written in C and C++. The official Java Virtual Machine (JVM), which runs the bytecode on the various platforms that Java supports, is as far as I know written mostly in C++. Note that I keep saying official: anyone is able to write their own version. Oracle’s licensing shenanigans aside, nothing stops you writing the JVM in C#, or Javac in JavaScript.

C++ is a special case: the first C++ compiler was written in C++, but my understanding is there were some ugly hacks and conversions to actually get it compiled with a C compiler. The C language itself evolved from B, which in turn came from a language called BCPL. Go up the chain far enough, and eventually you reach someone who was programming a computer by writing assembly language, pulling wires, flipping switches, or punching cards.

It’s a chain.

The first program was very basic, and could only understand on and off. But a human could use it in a clever way to make a second program which could understand a set of different on and off settings in a pattern, like when you see someone hold up two fingers instead of one, and that means something.

The next program after that could understand where one pattern like that ended and where another began. At this point, it’s like words, only more basic – just lots of “on” and “off.” But see, here it gets useful, because these are like symbols that the computer understands. It can move electricity in patterns defined by these sets of on-off.

At this point programs were just sets of cards with holes in them that a computer would read. There were no monitors or keyboards.

Well, that program was used to make a new program that not only understood these patterns, but could show a human on a screen letters and numbers that represented those patterns. And with those letters and numbers – stuff like “PUSH 1” and so on – it was a lot easier to write more complicated programs.

This is where it really took off. Now people were telling the computer how to understand even more words and turn those into long, long sets of on-off that it could use to direct electricity in _really complicated_ ways.

These “compilers” are what form the basis of all programming languages today. Programs that teach computers how to understand other programs.

Sorry if that was too long.

Lots of them are written in C, or (like Java) run in VMs written in C or C++. The C/C++ compilers (GCC, Clang, etc.) are also written in C and C++ (i.e., they are self-hosting). The early C compilers were bootstrapped with a minimal version in a hardware-dependent assembly language that could compile the C version. C is still heavily used.

The code is assembly, where you manually instruct the computer to set values, move values, and perform mathematical operations on those values. The values are 1’s and 0’s, or sets of those. Everything that a computer does is represented by these values. However, programmers created layers of code to abstract away from this hard stuff. They created simple commands to perform dozens of assembly actions at once. Over the years, it got easier to write code as the layers were added, perfected, and standardized.

Programming languages are higher layers that interact with lower layers in different ways. Some are more efficient at performing tasks. Some are easier to understand. Some make code that can be reused more effectively and shared. Which one is best is subjective.

In the end, it is all about talking to the various parts of a computer and moving values from point A to point B. Everything else is about aggregation of combinations of those actions.