When I was studying computer engineering in college, one of my professors had a saying: “bits are bits.” To the CPU of a computer, everything is just high/low and nothing else.
But as others have said, those combinations and strings of high/low can be assigned meaning. The CPU doesn’t need to know or care about the meaning as long as we program it to do the right thing with the data.
Here’s an example. Music inside a computer is represented by a series of numbers that tell the sound card how high or low to make the voltage on the output at a specific time. When you click the little slider and drag it to the right to “turn up the volume”, this is telling the CPU to take the current audio data and multiply it by some factor related to how high you set the volume. In the context of making the sound louder, this simple math actually makes sense.
The CPU doesn’t know or care that the data it’s multiplying is “music” or “volume level”; it just does the math and then sticks the result in whatever memory location it was instructed to. In the case of music, that “memory location” is actually the sound card.
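To see the math, here’s a tiny made-up sketch in Python, treating the audio as just a list of sample numbers and the volume slider as a multiply-by factor (the numbers are arbitrary, real sound cards and drivers are more involved):

```
# A toy sketch of "turning up the volume": audio is just a list of numbers,
# and the slider becomes a factor we multiply every number by.
samples = [0, 1200, 2400, 1200, 0, -1200, -2400, -1200]  # one tiny wave

volume = 2.0                                # drag the slider up: multiply by 2
louder = [int(s * volume) for s in samples]

print(louder)   # [0, 2400, 4800, 2400, 0, -2400, -4800, -2400]
```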
Another example is graphic design. Let’s say you are making a flyer for an event at your local library. You have a picture of your friend Ellie, who has green eyes. You decide you want to match the background to her eyes. In the program, you can use a mouse tool to click on her eyes, and it stores that color so you can make it the background.
The CPU doesn’t know or care that you’re making a flyer. It just asks the mouse where it is, then it asks the program what color is on the screen where the mouse is and then it stores that color in system memory. But all that data looks the same to the CPU, and basically it’s all numbers. It gets the data from the mouse by looking in a memory location that is actually the mouse. When it goes to ask the application for the color, it doesn’t know the application’s name, just its ID number. To get that color data from the application the CPU is actually just looking in another memory location. When it stores the color it’s just putting it back to a specific memory location. So all the data the CPU works on is literally just high/low for *everything.*
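A rough sketch of that color-picking step, with the screen as just a grid of (red, green, blue) numbers and the mouse position as two more numbers (all values made up for illustration):

```
# The "screen" as a tiny grid of (red, green, blue) numbers, and the mouse
# position as a pair of numbers. Picking the eye color is just a lookup.
screen = [
    [(200, 200, 200), (200, 200, 200)],
    [(200, 200, 200), (30, 140, 60)],   # a green pixel, bottom right
]

mouse_x, mouse_y = 1, 1                 # where you clicked
picked_color = screen[mouse_y][mouse_x]

background = picked_color               # store it to reuse as the background
print(background)                       # (30, 140, 60)
```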
This concept is actually part of the reason why computers can crash or lock up. The programmers have to be careful to tell the CPU to get and work on the proper data, because the CPU can’t tell the difference between music, an essay, an email, printer settings, etc. It will happily try to “add” a picture of your dog to the setting for how long it takes before your screen saver turns on, and then spit out the “result” even though that makes no sense.
Every single application you’ve ever run can be broken down to 5 things:
1. Reading and writing data
2. Operations on this data, such as addition, subtraction, and a couple of other arithmetic and logical operations
3. Branches. If one value is “true” then do this instruction. If it isn’t true, do some other instruction.
4. Jump. Jump to a specific location in the instructions and start executing from there.
5. Hardware IO. This is like making a pixel a certain color or writing to a file or reading information from the network / web.
How do we do each one?
1. All data can be stored as some number. We can assign every letter / character to some value and represent this value in binary. All numbers can obviously also be represented in binary. Etc. Reading and writing is also quite simple. Let’s say that the number for a read instruction is 1 and writing is 2. You could tell the computer “2 <location, which is a number> <data, which is also a number>” and there ya go you just wrote arbitrary data to an arbitrary location.
2. Operations can also just be a number. Let’s say addition is 3. So now we can tell the computer “3 2 5” and it knows to add up 2+5.
4. Jump. I’m going to do Jump first because it helps explain branching too. Let’s say the operation number for Jump is 4. Now we just have to tell it a location in the code to jump to. Let’s say we just represent this as the line number in the code. So “4 3” will move where we are executing to the 3rd line of code.
3. Branching. Guess what, we can also represent this instruction as a number. Let’s say it’s 5. Now we also have to give it some value that is true or false, and then tell it what to execute if it’s true and if it’s false. Let’s use the jump statement to tell the code where to go if the statement is true/false. So “5 <true/false> <jump statement if true> <jump statement if false>” (there’s a little sketch after this list that strings these made-up numbers together).
5. Hardware IO is a little out of my realm because I’m not an electrical / computer engineer, but essentially, if you give a pixel on a screen a high voltage level, it’s going to light up. If it’s a low voltage, then it doesn’t light up. So this translates quite easily to binary.
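Here’s that little sketch: a toy Python “CPU” that only understands the made-up instruction numbers above (write = 2, add = 3, jump = 4, branch = 5). I left out read (1) and gave it a scratch spot for the last addition result to keep it short; it’s purely for illustration, not how a real CPU encodes anything:

```
# A toy machine for the made-up instruction numbers:
#   2 <location> <data>  = write <data> into memory slot <location>
#   3 <a> <b>            = add a + b (result goes in a scratch spot)
#   4 <line>             = jump to that line of the program
#   5 <flag> <line if true> <line if false> = branch
memory = [0] * 8        # memory slots are just numbers
scratch = 0             # somewhere to keep the last addition result
line = 0

program = [
    (3, 2, 5),          # line 0: add 2 + 5
    (2, 0, 7),          # line 1: write the number 7 into slot 0
    (5, 1, 4, 3),       # line 2: flag is 1 ("true"), so jump to line 4
    (2, 1, 99),         # line 3: (skipped) write 99 into slot 1
    (4, 5),             # line 4: jump past the end, which stops the machine
]

while line < len(program):
    instr = program[line]
    if instr[0] == 2:                     # write
        memory[instr[1]] = instr[2]
        line += 1
    elif instr[0] == 3:                   # add
        scratch = instr[1] + instr[2]
        line += 1
    elif instr[0] == 4:                   # jump
        line = instr[1]
    elif instr[0] == 5:                   # branch
        line = instr[2] if instr[1] else instr[3]

print(scratch, memory)                    # 7 [7, 0, 0, 0, 0, 0, 0, 0]
```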
People don’t normally code things directly in binary anymore. A few do, usually hackers, modders, and software pirates, but it’s very uncommon.
You say that all binary can store is numbers, and you’re pretty much right: as far as a computer sees, all the data stored on it is (with apologies to the BBC) a big ball of wibbly-wobbly, numbery-wumbery… *stuff*. The trick is that if you can come up with standard notations for writing down numbers in place of other things, then you can store those things as numbers. And since binary can store numbers, you can use this trick to make it store other things.
Let’s say I make up a code for letters: A = 1, B = 2, C = 3, and so on. Now I can store letters, and if I add some more numbers for spaces and punctuation, I can store nicer text.
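Here’s that made-up letter code as a little Python sketch (not the real encoding computers use, which is ASCII/Unicode; just the A = 1, B = 2 toy from above, with 27 standing in for a space):

```
# A made-up letter code: A = 1, B = 2, ..., Z = 26, and 27 for a space.
def encode(text):
    numbers = []
    for ch in text.upper():
        if ch == " ":
            numbers.append(27)
        else:
            numbers.append(ord(ch) - ord("A") + 1)
    return numbers

def decode(numbers):
    letters = [" " if n == 27 else chr(n - 1 + ord("A")) for n in numbers]
    return "".join(letters)

nums = encode("HI THERE")
print(nums)            # [8, 9, 27, 20, 8, 5, 18, 5]
print(decode(nums))    # HI THERE
```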
Let’s do pictures. Let’s say I break a picture into a number of small squares, like a tile mosaic. Then I use numbers to describe the color of each of those tiles: say, maybe one number each to control how much red, green, or blue is in the tile. Now I can store pictures.
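As a made-up sketch, here’s a 2x2 “picture” where every tile is just three numbers for how much red, green, and blue it has (0 to 255):

```
# A 2x2 "picture": each tile is three numbers (red, green, blue, 0-255).
picture = [
    [(255, 0, 0), (0, 255, 0)],    # top row: a red tile, then a green tile
    [(0, 0, 255), (80, 200, 120)], # bottom row: blue, then a greenish color
]

red, green, blue = picture[1][1]   # the bottom-right tile
print(red, green, blue)            # 80 200 120
```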
How about sound? Consider that sound is really just vibrations in air or some other medium. I can make sound by telling a speaker to vibrate. If I use a number to control how fast it vibrates, I can vary the sound. Now let’s use a bunch of numbers, so that the vibration can vary quickly: now I can reproduce sounds.
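To see the “bunch of numbers” idea, here’s a rough sketch that describes a simple tone as a list of numbers (the sample rate and loudness are arbitrary picks, just for illustration):

```
import math

# One second of a 440 Hz tone described as 8000 numbers per second.
# Each number tells the speaker where to be at that moment.
sample_rate = 8000
samples = [
    int(100 * math.sin(2 * math.pi * 440 * n / sample_rate))
    for n in range(sample_rate)
]

print(samples[:8])   # the first handful of numbers the speaker would follow
```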
Video? We’ve got pictures, we’ve got sound, so we just need to store sound alongside a string of pictures.
And that’s the basics. You’re right that binary is just numbers. The trick is that anything that can be turned into numbers (in other words, *digitized*) can be turned into binary, and then we can store it. All you need to do is agree upon the ways in which you *do* that.
It’s just like another language. English has 26 letters. Binary has 2 letters. It means the words have to be longer in binary, but if the computer has been taught to read that language then it can read it quite quickly.
DNA, which is the language that cells speak, only has 4 letters and it can make a whole person!
(Dont shout at me about RNA, this is for a 5 year old)
Everybody is explaining to you what binary is and how to encode stuff.
But to answer your question: the vast majority of people do not code in binary. They’re not sitting there typing ones and zeros into a notepad.
Programmers these days use high-level languages that abstract away from the machine code. Languages like C and C++ can manipulate binary objects, but it’s up to the compiler to actually transform that into machine code that your PC actually understands.
The closest you can get to programming in binary is called assembly, but even then you’re just giving the CPU instructions, not actually writing out ones and zeros.
So, all computer chips, aka CPUs, actually come with a list of valid instructions that the CPU understands. These are very, very simple commands like:
– ADD – add two numbers together
– LEA – calculate an address in memory
– MOV – move data from memory into the CPU, or from the CPU to memory
– CMP – compare two numbers
– JMP – go to another spot in memory and start reading the instructions from there
– CALL – start a procedure
– RET – go back to where you were before the last CALL
And lots, lots more.
Now, each of these instructions has an “opcode”, basically, a number that represents the operation. So ADD can be operation 0, LEA can be operation 141, and so on.
Now finally, each instruction can also have arguments. So the people who make CPUs also document the format to specify the arguments that tell the CPU how to do that instruction.
On a 64-bit computer like the one you have, the CPU works with data in 64-bit chunks (real instructions actually vary in length, but for this made-up example let’s say each instruction is 64 bits long). Aka each instruction is a string of 64 1s and 0s. This is also 16 hexadecimal digits.
This is totally made up, but let’s say we want to LEA 32 + the address in register 2 into register 1. Registers are basically like scratch paper your CPU gets to do math with. That instruction may look like:
8D 00 00 02 00 00 01 20
where 8D is the opcode, “00 00 02” is register 2, “00 00 01” is register 1, and “20” is 32 in hex. In binary that’s:
1000110100000000000000000000001000000000000000000000000100100000
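If you want to double-check that hex-to-binary conversion yourself, a couple of lines of Python will do it (the instruction is still the made-up one from above, not a real x86 encoding):

```
# Convert the made-up 16-hex-digit instruction into its 64-bit binary form.
instruction = "8D00000200000120"
bits = bin(int(instruction, 16))[2:].zfill(64)
print(bits, len(bits))   # the 64 ones and zeros, and the length (64)
```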
When you run the program, these 0s and 1s will be loaded into the CPU. It’ll read it and go “oh okay, calculate a memory address by taking 32, adding it to...”
Now if you just stack operations like that together, you’ll be writing entire programs in binary.
Back in the SUPER olden times, people used to do this by hand. They’d get cards of rows of squares, where each row has 64 squares (okay, back then it would’ve been, like, 8). They would write their code by hand, calculate the opcode and arguments, then manually punch 10110010 or whatever into the card. And they would end up with a big stack of cards and shove them through the computer. If you mispunched a card, or got them out of order, or an actual bug flew into your computer and jammed the holes (that’s supposedly where we get the word bug from), the program would break.
Then they started making assembly languages. Basically you typed MOV, LEA, CMP, JMP, ADD into a file on a computer, and an *assembler* would turn it into the numbers automatically.
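At its core, an assembler is basically a lookup table from mnemonics to opcode numbers. Here’s a heavily simplified sketch that reuses the made-up opcode numbers from earlier (0 for ADD, 141 for LEA), not real encodings:

```
# A drastically simplified "assembler": swap mnemonics for opcode numbers.
opcodes = {"ADD": 0, "LEA": 141}

def assemble(line):
    mnemonic, *args = line.split()
    return [opcodes[mnemonic]] + [int(arg) for arg in args]

print(assemble("LEA 1 2 32"))   # [141, 1, 2, 32]
```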
Then, they started making programming languages like FORTRAN, Lisp, and C, that would take code you could write once, and compile it to the assembly language for your specific machine.
Then they used those programs to make virtual machines. And they made languages like Java or C# or Python which are programming languages that compile to a fake, universal assembly language for a virtual machine, and then another program runs the virtual assembly.
And then they made languages like Typescript, which compiles to another language called Javascript, which runs in a sandboxed engine in your web browser, which is itself a virtual program run by a program run by your operating system run by your CPU.
So we have built a lot of layers over the years, but at the end of the day it’s binary that’s running.