# how do computers know how to differentiate binary without there being spaces or separations?

I know the gist of binary, but something that I have never understood is how the computer knows where one “word” stops and another starts.

So if 9 is 1001 and 57 is 111001, how does the computer know that 111001 isn’t 11 (whatever number that is) followed by the number 9, 1001? Having only two digits, 1 and 0, seems like not enough to differentiate.

If you want to have the word “apple” written in binary, do you take the binary code of each letter and smoosh them together or is there a separate specific code for “apple”?

You are thinking of it backwards. We translate binary into our numbers because it is easier for us to understand and read.

A computer sees on-on-on-off-off-on

It doesn’t care if it’s 3 and 9 or 57, it just has a series of switches which are on or off. We translate it into numbers and words, the same way we save a name in our contacts instead of memorizing a telephone number, because it is easier for us to work with and remember.

Computers don’t need spaces or separation, they just need what switches are involved and whether they are on (1) or off (0).
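To make that concrete, here is a small Python sketch (the variable names are just for illustration) that reads the same six switches two different ways. The bits never change; only the reading we choose to apply does.

```python
bits = "111001"  # on-on-on-off-off-on

# Read all six bits as a single number.
as_one_number = int(bits, 2)

# Read the same bits as a 2-bit number followed by a 4-bit number.
as_two_numbers = (int(bits[:2], 2), int(bits[2:], 2))

print(as_one_number)   # 57
print(as_two_numbers)  # (3, 9)
```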

At the level of machine code and binary instructions, there are no ‘spaces’. Instead, the computer just starts reading the instruction list. Each instruction has a known length (fixed on many CPUs, or worked out from the start of the instruction on others), and once that length is read the instruction is over and the next one starts. Getting the CPU to start in the middle of an instruction is possible, but in practice the code will usually crash quickly because the misread instructions make no sense. This is actually a potential source of bugs in programs. *Edit*: an additional option is a [NOP](https://en.wikipedia.org/wiki/NOP_(code)) instruction, which acts a bit like the ‘ ’ character in English: it tells the computer to do nothing at this spot, and it can be used to pad out or separate other more useful instructions.
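A rough Python sketch of the fixed-length idea, assuming a made-up machine whose instructions are all 8 bits long (`split_fixed` and the bit pattern are hypothetical, not any real instruction set). Starting at the right place gives two sensible chunks; starting four bits late gives a different chunk entirely.

```python
def split_fixed(bits: str, width: int, start: int = 0) -> list[str]:
    """Cut a bit string into consecutive chunks of `width` bits, beginning at `start`.

    Leftover bits that do not fill a whole chunk are dropped.
    """
    return [bits[i:i + width] for i in range(start, len(bits) - width + 1, width)]

# Two hypothetical 8-bit instructions stored back to back, with no separator.
stream = "0000100100111001"

aligned = split_fixed(stream, 8)              # start at the true beginning
misaligned = split_fixed(stream, 8, start=4)  # start in the middle of an instruction

print(aligned)     # ['00001001', '00111001']
print(misaligned)  # ['10010011']
```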

And as for data (which seems to be more what you are asking about, since you mention turning binary into numbers), it’s very similar. A chunk of bits is set aside when a variable is ‘declared’, and that chunk is always read as one specific piece of data. So if you ‘declare’ a byte integer next to another byte integer, you will always know the first 8 bits belong to the first number and the next 8 bits to the second number.
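Here is a minimal Python sketch of two byte-sized numbers sitting next to each other, plus the “apple” case from the question. It assumes each letter is stored as one 8-bit ASCII code, which is a common choice but not the only possible encoding.

```python
# Two one-byte integers stored side by side.
numbers = bytes([9, 57])
bits = "".join(f"{byte:08b}" for byte in numbers)
print(bits)  # 0000100100111001

# Because each number is known to be 8 bits wide, splitting is unambiguous.
first, second = int(bits[0:8], 2), int(bits[8:16], 2)
print(first, second)  # 9 57

# "apple": one 8-bit code per letter, placed one after another.
word = "apple".encode("ascii")
word_bits = "".join(f"{byte:08b}" for byte in word)

# Reading the letters back is just cutting every 8 bits.
letters = [chr(int(word_bits[i:i + 8], 2)) for i in range(0, len(word_bits), 8)]
print("".join(letters))  # apple
```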

When there’s something that needs to be read without a preknown end point, there will be an end marker, something like an [EOF](https://en.wikipedia.org/wiki/End-of-file) marker or the zero byte that ends a string in C. It’s a value that the program agrees to treat as “this is the end of the previous set”, which is how words like “pie” and “bookkeeper” can work with the same instructions.
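As a sketch of the end-marker idea, the Python below stores “pie” and “bookkeeper” one after the other, each followed by a zero byte. The helper `read_terminated` and the choice of zero as the marker are assumptions for illustration (zero is what C strings use).

```python
def read_terminated(memory: bytes, start: int = 0, terminator: int = 0) -> tuple[str, int]:
    """Read ASCII text from `start` up to the terminator byte.

    Returns the text and the position just past the terminator.
    """
    end = memory.index(terminator, start)  # find the next end marker
    return memory[start:end].decode("ascii"), end + 1

# Two words of different lengths, each closed by a zero byte.
memory = b"pie\x00bookkeeper\x00"

first, next_start = read_terminated(memory)
second, _ = read_terminated(memory, next_start)

print(first)   # pie
print(second)  # bookkeeper
```

The same reading code handles both words because it stops at the marker instead of relying on a fixed length.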

I’m not a computer expert, so maybe I’ll get a “Well actually”, but I did build an 8-bit computer on breadboards. The bus is 8 bits wide, so while you only need 1001 to indicate a 9, the computer sends the full 8 bits for each data transfer, so it puts 00001001 on the bus when transferring a 9 between modules. Since every data transfer between 8-bit modules uses the full 8 bits, there is no confusion as to where a value starts and stops. The memory only uses 4 bits, so only the first 4 bits are read from the bus and the last 4 bits are ignored. I would expect a more modern computer using a 32- or 64-bit bus to operate in a similar manner.
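The padding described above can be sketched in Python like this. `BUS_WIDTH` and `on_bus` are made-up names, and the sketch only models the idea that every value is padded out to the full bus width, not any real hardware.

```python
BUS_WIDTH = 8  # assumed width of the bus, in bits

def on_bus(value: int, width: int = BUS_WIDTH) -> str:
    """Return the bit pattern placed on the bus: the value padded with leading zeros."""
    if not 0 <= value < 2 ** width:
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value, f"0{width}b")

print(on_bus(9))   # 00001001
print(on_bus(57))  # 00111001
```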

Simple: you program it with either everything having a fixed length and position, or you reserve a space in a fixed place with a fixed length that tells you how long the data is and/or where it is.
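The second option, a fixed-size slot that says how long the data is, might look like the following Python sketch. The layout (one length byte in front of each item) and the names `pack` and `unpack` are assumptions chosen for illustration.

```python
def pack(items: list[bytes]) -> bytes:
    """Store each item as one length byte followed by the item's bytes."""
    out = bytearray()
    for item in items:
        if len(item) > 255:
            raise ValueError("item too long for a one-byte length field")
        out.append(len(item))  # fixed-size slot holding the length
        out += item            # the variable-length data itself
    return bytes(out)

def unpack(blob: bytes) -> list[bytes]:
    """Walk the blob, using each length byte to find where the item ends."""
    items = []
    pos = 0
    while pos < len(blob):
        length = blob[pos]
        items.append(blob[pos + 1:pos + 1 + length])
        pos += 1 + length
    return items

blob = pack([b"pie", b"bookkeeper"])
print(blob)          # b'\x03pie\nbookkeeper'  (\n is byte 10, the length of "bookkeeper")
print(unpack(blob))  # [b'pie', b'bookkeeper']
```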

Computers have a ‘word size’. Think of a computer program as a long list of instructions, where each instruction is the same size. A CPU will then have an instruction set. So, hypothetically, you might have a 3-bit computer. That means the computer will read 3 bits at a time and can have at most 8 instructions. So let’s say:

000 means put the next value into register A

001 means put the next value into register B

010 means add register A to B and store it in A

011 means subtract A from B and store it in A

100 means print register A’s value to the screen

So then a program might look like :

000001001001010100

The computer reads 000, which says to store the next value in register A, so it reads 001 and stores it. Then it reads 001, which says to read the next value and store it in register B (001 again). Then it reads 010, which adds the registers, so now register A is 001 + 001 (which is 010). Finally it reads 100 and prints register A to the screen, so the program will print the number 2. This is also why programs compiled for one chip won’t work on another, and why a program compiled for 64 bit won’t run on a 32-bit chip. Also, the reason AMD and Intel chips both run the same programs is that they both follow the same x64 standard (and the x86 standard for 32 bit), so the programs are consistent. They might be different internally, but the same instruction on both chips should produce the same results.
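The hypothetical 3-bit machine above can be sketched as a tiny Python interpreter. Two details are my own assumptions rather than part of the description: the registers wrap around at 8 (since they only hold 3 bits), and an unknown instruction raises an error.

```python
def run(program: str) -> list[int]:
    """Run a program for the made-up 3-bit machine and return what it prints."""
    # Cut the bit string into 3-bit words; there are no separators to look for.
    words = [int(program[i:i + 3], 2) for i in range(0, len(program), 3)]
    a = b = 0
    printed = []
    pc = 0  # index of the next word to read
    while pc < len(words):
        op = words[pc]
        if op == 0b000:    # put the next value into register A
            a = words[pc + 1]
            pc += 2
        elif op == 0b001:  # put the next value into register B
            b = words[pc + 1]
            pc += 2
        elif op == 0b010:  # add A and B, store in A (wraps at 8: assumption)
            a = (a + b) % 8
            pc += 1
        elif op == 0b011:  # subtract A from B, store in A (wraps at 8: assumption)
            a = (b - a) % 8
            pc += 1
        elif op == 0b100:  # print register A
            printed.append(a)
            pc += 1
        else:
            raise ValueError(f"unknown instruction {op:03b}")
    return printed

print(run("000001001001010100"))  # [2]
```

Note that 001 appears in the program both as an instruction and as a value. The machine tells them apart purely by position: whatever follows a load instruction is data.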