Ok, so I know that the alphabet of computers consists of only two symbols, or states: zero and one.
I also seem to understand how computers count beyond one even though they don’t have symbols for anything above one.
What I do NOT understand is how a computer knows* that a particular string of ones and zeros refers to a number, or a letter, or a pixel, or an RGB color, and all the other types of data that computers are able to render.
*EDIT: A lot of you guys are hung up on the word “know”, emphasizing that a computer does not know anything. Of course, I do not attribute any real awareness or understanding to a computer. I’m using the verb “know” only figuratively, folks ;).
I think that somewhere under the hood there must be a physical element–like a table, a maze, a system of levers, a punchcard, etc.–that breaks up the single, continuous stream of ones and zeros into rivulets and routes them into–for lack of a better word–different tunnels? One for letters, another for numbers, yet another for pixels, and so on?
I can’t make do with just the information that computers speak in ones and zeros, because it’s like dumbing down the process of human communication to the mere fact of relying on an alphabet.
Many of these answers miss the point of OP’s question (as I understand it), and definitely are not ELI5 level.
OP, the binary strings (and their hexadecimal equivalent) for functions, characters of text, etc., are defined in standards. The simplest reference would be ASCII https://www.rapidtables.com/code/text/ascii-table.html so you can see what that looks like.
Data is structured in defined block sizes and sequences to let the system “know” what a segment of code is for (this next bit is for a character in a word doc), and the value passed to the system then has meaning and instructions (type an “A”).
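If it helps to see that concretely, here is a minimal Python sketch (not any real word processor’s code, just an illustration) of how the “meaning” of the bit pattern 01000001 comes entirely from the ASCII table a program agrees to use:

```python
# The bit pattern 01000001 is just the number 65.
bits = 0b01000001
print(bits)        # 65

# The ASCII standard *defines* 65 to be the letter "A",
# so a program expecting text looks it up that way.
print(chr(bits))   # 'A'

# The same pattern could just as well be treated as a plain number,
# e.g. a quantity in a spreadsheet cell.
print(bits + 1)    # 66
```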
Computers set extra bits that describe how to interpret the other bits (metadata). For example, data stored as a “signed integer” uses its first bit to represent negative or positive. Each computation step will already be informed what type of data it’s about to use. A practical example is a SQL database, where each field is set as a type, like text or bool.
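As a rough illustration of that “same bits, different type” idea, Python’s struct module can reinterpret one and the same byte pattern either as an unsigned or a signed integer:

```python
import struct

# One byte pattern: 11111111
raw = bytes([0b11111111])

# Interpreted as an unsigned 8-bit integer:
print(struct.unpack("B", raw)[0])   # 255

# Interpreted as a signed 8-bit integer (the top bit marks the sign):
print(struct.unpack("b", raw)[0])   # -1
```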
Programmers have standards like Unicode and ASCII and JPEG that translate 1’s and 0’s into meaningful information like characters and emojis and pixels. When you open a file, your computer reads the file format and other metadata, and based on that, it knows which standard to use to decode the 1’s and 0’s.
This is why, if you change the file extension of a JPEG file to .txt and then try to open it in Notepad, you just get a bunch of gibberish: your computer is applying the wrong decoding format to the 1’s and 0’s.
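A small sketch of that mismatch in Python, assuming you have some image at a hypothetical path photo.jpg: the same bytes make sense only when the right standard is applied.

```python
# Hypothetical path -- substitute any JPEG you have lying around.
with open("photo.jpg", "rb") as f:
    data = f.read(16)

# A JPEG decoder first checks the "magic number" FF D8 that the
# JPEG standard puts at the start of every file.
print(data[:2].hex())                        # 'ffd8' for a real JPEG

# Forcing the same bytes through a *text* decoding gives gibberish,
# which is exactly what Notepad shows you.
print(data.decode("utf-8", errors="replace"))
```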
As for the physical process of converting, there’s a table full of the possible pixel outputs, and the 1’s and 0’s control which cell of the table gets sent to the monitor, by electrically turning on the electrical connections between that cell and the monitor, and electrically turning off the connections between all the other cells and the monitor. This works because computers are made with transistors, which are voltage-controlled switches. The 1’s and 0’s are just high or low voltage, and the transistors are arranged so that a particular pattern of 1’s and 0’s will turn the right pattern of transistors on and off to access a particular cell in the table.
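You can model that lookup table in software. This is only a toy sketch, nothing like how a real display pipeline is wired, but it shows the idea: the address bits switch exactly one “cell” through to the output.

```python
# A toy 4-cell "table" selected by two address bits, standing in for
# the transistor switches that connect one cell to the output.
table = {
    (0, 0): "black pixel",
    (0, 1): "red pixel",
    (1, 0): "green pixel",
    (1, 1): "blue pixel",
}

def read_cell(bit1, bit0):
    # In hardware, this selection is done by transistors turning
    # connections on and off; here a dictionary lookup plays that role.
    return table[(bit1, bit0)]

print(read_cell(0, 1))   # 'red pixel'
```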
>*EDIT: A lot of you guys are hung up on the word “know”, emphasizing that a computer does not know anything. Of course, I do not attribute any real awareness or understanding to a computer. I’m using the verb “know” only figuratively, folks ;).
The issue is, once you get down to the level of detail required to understand what’s going on, using the word “know”, even figuratively, doesn’t make sense. It’s more like a very complex and large combination of wires and switches that are interconnected in a way that allows for a chain reaction of switches (either on or off) to happen. I guess you could say the “knowing” lies in the fact that these chain reactions can happen, where one pattern of 1’s and 0’s can turn into another pattern of 1’s and 0’s.
I’ll explain in my own way; maybe someone else already has as well, but there are too many responses to read.
The computer only “knows” what a string of ones and zeroes (called a byte; the individual ones and zeroes are called bits) refers to because, at that point, it expects something in particular.
Say you are using a word processor. You press a key. The keyboard creates the string associated with that character. For instance, the letter A is the following 8-bit string: 01000001. The letter B is 01000010 and so forth. There’s another signal that is sent, called an interrupt signal. It tells the computer something external is going on; in this case, a key was pressed.

When that happens, it goes to the program that handles the keyboard interrupt. This program is part of the Operating System. The value in binary (the 01000001 string for A) is stored somewhere in memory and control is passed back to the program running, at the point it left off.

In a word processing program, most of its time is spent waiting for a key to be pressed. It’s a small routine that cycles, basically checking if there is new data in the memory location that holds the value of a key that was pressed. If that location is not empty, it acts on the value stored. If you pressed A, it then goes off to the display routine that actually puts an A on your screen. If it was Alt-S, it checks to see what that code means (Save) and goes to the routine that saves your work on a file and then comes back, resetting the value in memory, ready and waiting for the next key to be pressed.
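Here is a deliberately simplified Python sketch of that “wait for a key, then act on it” loop. The key codes and handler names are made up for illustration; a real word processor and operating system are far more involved.

```python
# Made-up key codes: plain ASCII values for letters, a marker for Alt-S.
KEY_A = 0b01000001          # 65, the ASCII code for "A"
KEY_ALT_S = "ALT_S"

def display_character(code):
    print(f"drawing '{chr(code)}' on the screen")

def save_document():
    print("saving your work to a file")

def handle_key(code):
    # The program doesn't "know" what the bits are in any deep sense;
    # it simply checks the value and jumps to the routine it was
    # programmed to run for that value.
    if code == KEY_ALT_S:
        save_document()
    else:
        display_character(code)

# Stand-in for the memory location the keyboard interrupt fills in.
pending_keys = [KEY_A, 0b01000010, KEY_ALT_S]   # A, B, Alt-S

for key in pending_keys:
    handle_key(key)
```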
Another piece of software uses the strings differently, because that’s how it is programmed. It may also be 01000001, but in this case the string means something different, and the program does whatever it was told to do with that string. A spreadsheet sees that string and, at that point in the program, it may be told to add it to another string. It doesn’t “know” it’s a number, it just does what it’s told to do with it. That same string of bits in another area of the memory may mean to the program that this is the color red to show on your screen.
The table or maze you allude to is the memory. Each program (application) is assigned some memory to run in and use for its data. The programs are told to look in their specific block of memory only; that’s where their data will be. The program controlling your screen knows that all the data needed to actually create what you see on the screen is in a certain memory area. The bits and bytes there represent the data to do so, from the color to the brightness of each pixel. If another program accesses that memory location, it would read it and do what it is told to do with the byte, but the result may not make any sense; it may even crash the computer.
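A rough sketch of that idea, with one made-up block of bytes standing in for a region of memory: a “text program” and a “graphics program” read the very same bytes and come away with completely different results.

```python
# Twelve bytes sitting in "memory".
memory = bytes([72, 101, 108, 108, 111, 33, 255, 0, 0, 0, 255, 0])

# A program expecting text reads the first six bytes as ASCII characters.
print(memory[:6].decode("ascii"))            # 'Hello!'

# A program expecting pixels reads the last six bytes as two RGB triples.
pixels = [tuple(memory[i:i + 3]) for i in range(6, 12, 3)]
print(pixels)                                # [(255, 0, 0), (0, 255, 0)] -> red, green

# Neither reading is "wrong"; each program just does what it was told
# to do with the bytes in the region it was handed.
```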
Does that clear things up?
So you seem to understand that there are items made up of combinations of 0 and 1 that represent different things — a color, a letter, etc.
There are sets of such combinations that define a computer’s ‘instructions’; a modern-day computer is a machine that executes sets of such instructions. These instructions are things that the computer can do with other combinations of 0s and 1s; for instance, a computer instruction can indicate that the computer is to add the 1s and 0s from one memory location to a set in another location, and store the result in a third location.
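As a hedged sketch of such an “add these two memory locations” instruction, here is a toy machine in Python. Real CPUs do this with transistor circuits and binary opcodes rather than lists and functions, but the shape of the operation is the same.

```python
# A tiny made-up memory: eight cells, each holding one number.
memory = [0, 0, 0, 5, 9, 0, 0, 0]

# A made-up instruction: "add the contents of cell a and cell b,
# store the result in cell c" -- analogous to a CPU's ADD instruction.
def add_instruction(a, b, c):
    memory[c] = memory[a] + memory[b]

add_instruction(3, 4, 5)
print(memory[5])   # 14
```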
From such basic instructions, computer programs are built to do things with large sets of 0s and 1s. Some computer instructions read 1s and 0s from ‘ports’ that are part of the computer hardware; in order to make any sense out of what is read, the computer must be programmed to expect a certain set of 1s and 0s to come in on that port. For instance, some old-time computers used to communicate with “dumb terminals” which sent codes for letters on the ports to which they were attached; ‘A’ was 0100 0001, ‘B’ was 0100 0010, and so on. This particular set of codes is named ASCII; there are others that represent letters as well.
If someone had connected some other machine to that port, and the other machine had transmitted the SAME code, the computer could have read it; but if that machine had transmitted some other code, the computer would still have attempted to read it as ASCII, and it wouldn’t have worked out well, because what was being input was not ASCII code.
This illustrates the basic answer to your question — in order to interpret a set of codes as colors, letters, numbers, etc., the computer needs to have some designation of what the codes are. Although, in limited circumstances, the computer could try different interpretations and perhaps come up with one if the circumstances were right, mostly the computer has to ‘know’ (be programmed) to expect particular sets of codes in particular situations.
I’m happy to expand on this if it’s helpful; let me know if you want further information.