If digital data is stored in 0s & 1s, how does the reader know how many of the digits to take into consideration?

This must be a very basic, dumb question, but ‘1001’ can be 9, and it can also be 2 and 1 if ‘10’ and ‘01’ are taken separately. I’m confused.

48 Answers

Anonymous 0 Comments

Bytes are usually organized into words, which can be multiple bytes and are the basic unit handled by computers (usually the width of the CPU’s main registers). The computer itself just performs the requested operation on the word, whether that is arithmetic, a logical operation, a store, a rotation or a shift. The computer does NOT care what the data represents; it just does what it’s told.

Interpretation of the data is left up to the software. I (or my compiler) will frequently stuff multiple items within a single word. I do a lot of microcontroller stuff and we are very limited on the amount of program and data memory available. My code will know that my data is located in bits 4 through 7 of the word, because I wrote the code and designed it that way. To access this data I need to do extra operations: shift the word 4 bits to the right, then mask (set to zero) bit 4 and everything above it. This leaves me with the four bits that were originally bits 4 through 7.
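The shift-and-mask step described above can be sketched in a few lines of Python. This is only an illustration, not the poster’s actual microcontroller code: the field position (a 4-bit field starting at bit 4, counting from bit 0 as the least significant bit) and the names `FIELD_SHIFT`, `FIELD_MASK`, `extract_field` and `store_field` are assumptions made up for the example.

```python
# Hypothetical layout: a 4-bit field that starts at bit 4 of a word
# (bit 0 is the least significant bit).
FIELD_SHIFT = 4
FIELD_MASK = 0b1111  # keeps the low 4 bits, zeroes bit 4 and above


def extract_field(word: int) -> int:
    """Shift the field down to bit 0, then mask off everything above it."""
    return (word >> FIELD_SHIFT) & FIELD_MASK


def store_field(word: int, value: int) -> int:
    """Return word with the field replaced by value; other bits are untouched."""
    cleared = word & ~(FIELD_MASK << FIELD_SHIFT)
    return cleared | ((value & FIELD_MASK) << FIELD_SHIFT)


word = 0b0000_0011_1001_0110
print(extract_field(word))                # 9, i.e. the bits 1001
print(bin(store_field(word, 0b0101)))     # 0b1101010110
```

Nothing in the word itself says where the field is. The two constants are the “context” that only the code knows.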

In the example above I’ve reduced the required data memory by packing the data into just the required bits; however, I’ve slowed down my code, because it requires extra operations to access the data. On modern computers memory is plentiful enough that you’d rarely bother to pack simple data. Speed is more important, so you’d just put your 4 bits of data in its own word and waste the unused bits. (I’m talking about simple program data and variables; if you’re doing movies or something you will likely compress the hell out of it.)

Anonymous 0 Comments

Modern computers, in terms of data storage and processing, basically only operate on bytes (groups of 8 binary digits [bits]). So at least in most cases you can assume that 00001001 should be treated as a single value.

Beyond that, it’s really up to the software interacting with that data to determine how to process it. This is where file formats come into play. The file format is a specification that clearly defines how to interpret the data in a file. So it will tell you what each byte in a file means.

Sometimes the rules are very strict, like a format will say “Every byte of the file represents a character of the alphabet, here’s an ANSI table that maps binary numbers to characters”. Or it might be less rigid, like “The first section of the audio file is free text ANSI metadata, which ends when the null byte (00000000) is encountered. The next section…”
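As a rough sketch of the “less rigid” kind of rule, here is how a reader could split a blob into text metadata and the rest of the file at the first null byte. The format itself is invented for illustration, the name `split_metadata` is hypothetical, and the text is decoded as ASCII as a stand-in for the “ANSI” table mentioned above.

```python
def split_metadata(blob: bytes) -> tuple[str, bytes]:
    """Read text up to the first null byte; return (metadata, remaining bytes)."""
    end = blob.index(b"\x00")            # raises ValueError if no terminator exists
    metadata = blob[:end].decode("ascii")
    payload = blob[end + 1:]             # everything after the null byte
    return metadata, payload


blob = b"title=Demo\x00\x10\x20\x30"
metadata, payload = split_metadata(blob)
print(metadata)       # title=Demo
print(list(payload))  # [16, 32, 48]
```

The reader only knows to stop at the null byte because the (made-up) format says so. A different format could use a length field or a fixed-size header instead.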

Without some context as to what the data represents, it’s meaningless. Often the context is conveyed by the file extension, the part of the file name after the last dot (e.g. .txt is conventionally treated as text encoded with the ANSI or Unicode standards). Often there is also a specific pattern of data at the very beginning of the file (a magic number) that indicates what type of file it is. The file is stored in a file system, which is a particular arrangement of data on a storage device following file system standards. Programs are stored in standard executable formats that the operating system knows how to load, and the instructions inside them follow the CPU’s standard instruction set. It’s standards all the way down.
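A magic-number check can be sketched like this. The signatures listed are the well-known ones for PNG, PDF and GIF; the table is deliberately tiny, and the names `MAGIC_NUMBERS` and `sniff_type` are made up for the example rather than taken from any real library.

```python
# A few well-known file signatures (the first bytes of the file).
MAGIC_NUMBERS = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF-": "PDF document",
    b"GIF87a": "GIF image",
    b"GIF89a": "GIF image",
}


def sniff_type(header: bytes) -> str:
    """Guess a file type from its leading bytes, ignoring the file name."""
    for magic, name in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return name
    return "unknown"


print(sniff_type(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))  # PNG image
print(sniff_type(b"%PDF-1.7\n"))                           # PDF document
print(sniff_type(b"hello world"))                          # unknown
```

In practice you would pass in the first few bytes read from a file opened in binary mode, e.g. `open(path, "rb").read(8)`.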

Binary data is ultimately just a series of binary digits – an abstract representation of on/off electrical signals – that the program (by way of the programmer and/or user) has to figure out what to do with. If your friend came to you and blurted out “Eleven! Seventy four! Two! Five thousand, nine hundred and sixty six!” it’s not going to mean anything without context.
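To see that the bits carry no meaning of their own, here is a small sketch that reads the same four bytes as text, as a list of small numbers, as one 32-bit integer and as a 32-bit float. The byte values and the little-endian byte order are arbitrary choices for the example.

```python
import struct

raw = b"ABCD"  # four bytes: 0x41 0x42 0x43 0x44

as_text = raw.decode("ascii")             # read as four ASCII characters
as_bytes = list(raw)                      # read as four separate 8-bit numbers
(as_uint32,) = struct.unpack("<I", raw)   # read as one little-endian 32-bit unsigned integer
(as_float32,) = struct.unpack("<f", raw)  # read as one little-endian 32-bit float

print(as_text)     # ABCD
print(as_bytes)    # [65, 66, 67, 68]
print(as_uint32)   # 1145258561
print(as_float32)  # some unrelated-looking floating point value
```

All four results come from identical bits. The only thing that changed is what the program was told to expect.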

Anonymous 0 Comments

It’s predefined. There is a fascinating (or not) history to and technical justification for how technology developers settled on an “8 bit byte”, which was then grouped into 16, 32, 64, 128 bit or bigger words, but in every case the answer is the same: it’s predefined how many digits the reader will consider.
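A quick sketch of what “predefined” means in practice: the same four bytes give different numbers depending on the width the reader was told to use. The helper name `read_values` and the little-endian byte order are assumptions for the example.

```python
def read_values(data: bytes, width: int) -> list[int]:
    """Read data as consecutive unsigned integers that are width bytes each."""
    if len(data) % width != 0:
        raise ValueError("data length is not a multiple of the width")
    return [
        int.from_bytes(data[i:i + width], "little")
        for i in range(0, len(data), width)
    ]


data = bytes([1, 0, 2, 0])
print(read_values(data, 1))  # [1, 0, 2, 0]   four 8-bit values
print(read_values(data, 2))  # [1, 2]         two 16-bit values
print(read_values(data, 4))  # [131073]       one 32-bit value
```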

Anonymous 0 Comments

That problem is not specific to computers. 123 456 can be either one single number or 123 & 456 taken separately. Heck, negro can be dark if you read in Portuguese or black if you read in Spanish. You will know which one is the right one by using context.

In the digital world it’s up to the software to decide what 1001 means, based on context. That’s why a PNG file opened in MS Paint shows a picture, while the same file opened in Notepad shows gibberish.

Anonymous 0 Comments

The actual ELI5 answer: the same reason you understand 4803 as 4803 and not 48 and 3.

1. Separators: in this case, spaces or special sequences between words

2. Conventions: we use byte-sized words, so each 8 bits is a separate word

Of course, these have to be agreed upon by the sender and receiver.
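Both ideas can be tried on the ‘1001’ from the question. In this sketch the function names `split_fixed` and `split_separated` are made up, and a space stands in for whatever separator the sender and receiver have agreed on.

```python
def split_fixed(bits: str, width: int) -> list[int]:
    """Convention: every group is exactly width bits long."""
    return [int(bits[i:i + width], 2) for i in range(0, len(bits), width)]


def split_separated(text: str, sep: str = " ") -> list[int]:
    """Separator: groups can be any length, a marker shows where each one ends."""
    return [int(group, 2) for group in text.split(sep)]


print(split_fixed("1001", 4))     # [9]
print(split_fixed("1001", 2))     # [2, 1]
print(split_separated("10 01"))   # [2, 1]
print(split_separated("1001"))    # [9]
```

The digits are the same in every call. Whether you get 9 or 2 and 1 depends entirely on the rule both sides agreed to.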
