eli5 How are pictures and images literally written on a drive

I know this is ultra general, but I’ve always wondered how pictures or sound are written to a drive. I understand slightly that data is stored as little sequences of on or off switches, but if you were to write out a drive in English, what would that look like for pictures? Does each pixel have a specific code like the #F000F4 or however you see them online? Hope there can be some sort of minor explanation. Thanks.

6 Answers

Anonymous 0 Comments

Yes, each pixel has a code. Let’s say the image is 1080p, so 1920 pixels wide and 1080 pixels high. That’s about 2 million pixels. Now, if this was a black and white (greyscale) image, you could save it as two million numbers. Say 0-100, with zero being black and 100 white. A string of two million numbers between 0 and 100 would give you a 1080p black and white image. You’d just need to agree on an order, like say left to right, top to bottom. Now you can share a long string of numbers with someone, and they can decode a picture. Could do it by hand even, though a computer is going to be a lot faster at it.
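To make that concrete, here is a minimal sketch of a tiny image stored as one long list of numbers. The 4x3 size, the pixel values and the `pixel_at` helper are made up for illustration, and it assumes the agreed order is left to right within a row, rows top to bottom (any agreed order would work).

```python
# A tiny 4x3 "black and white" image stored as one flat list of numbers.
# 0 = black, 100 = white, the same 0-100 scale as in the answer.
WIDTH, HEIGHT = 4, 3
pixels = [
    0,   0,   100, 100,   # top row
    0,   50,  50,  100,   # middle row
    100, 100, 0,   0,     # bottom row
]

def pixel_at(x, y):
    """Look up one pixel, assuming rows are stored top to bottom
    and each row is stored left to right."""
    return pixels[y * WIDTH + x]

print(pixel_at(1, 1))   # 50, the second pixel of the middle row
print(1920 * 1080)      # 2073600 numbers for a full 1080p image
```

The file on the drive is basically just that list, plus a little extra info like the width and height so the reader knows where each row ends.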

If you want a colour image, we use RGB. Red green blue. So now we need three numbers for each pixel. One for the brightness of red, one for green, and one for blue. So 05-78-100 would give you some sort of colour. High on the blue and green, low on the red. So a cyan or aqua or something. A string of two million of those triplets gives a 1080p colour image.

But computers work in binary. 0-100 is a bad scale. 0 to 1 would work, but that’s going to give us a very poor range. Only two levels. If we use two binary digits, that’s four levels. 00, 01, 10, 11. Aka, 0, 1, 2, 3. If we use three binary digits, that’s eight levels. If we use 8 binary digits, that’s 256 levels. That’s a decent scale. So every pixel is now 01010101-11110000-11101011 or something like that, and there’s still two million of them.
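Here is a small sketch of that counting, plus one pixel packed into three 8-bit numbers. The variable names are just for illustration, and the last line assumes a plain uncompressed image with 3 bytes per pixel and nothing else in the file.

```python
# n binary digits give 2 ** n different levels.
for bits in (1, 2, 3, 8):
    print(bits, "bits ->", 2 ** bits, "levels")

# One colour pixel: three 8-bit numbers (red, green, blue), each 0-255.
red, green, blue = 0b01010101, 0b11110000, 0b11101011
pixel = bytes([red, green, blue])

print(list(pixel))                                  # [85, 240, 235]
print("-".join(format(b, "08b") for b in pixel))    # 01010101-11110000-11101011

# Size of an uncompressed 1080p image at 3 bytes per pixel.
size_bytes = 1920 * 1080 * 3
print(size_bytes)                                   # 6220800, roughly 6 MB
```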

Except that’s hard as hell to write out. Enter hexadecimal. As four binary digits have 16 options, a number system with 16 digits would sure make that a lot shorter to write out. We have decimal, but 10 is not a power of two, so it doesn’t play nice. 0001 is 1 in binary, so in hexadecimal that would be 1. 0100 is 4, so that would be 4 in hexadecimal. 1001 is 9, so that would be 9. But 1010 would be 10, and that’s a problem. 10 doesn’t have its own single digit in decimal (what we use), so let’s call that A. B is 11. F is 15.

Two hexadecimal digits next to each other give 16×16 options. So two of these hexadecimal digits is 256 options. So two hex digits can replace 8 binary digits, it’s just a shorter way to write it. FF is 255. 00 is 0. So your example of F000F4 is the colour 15×16 + 0, 0×16 + 0, 15×16 + 4. So 240, 0, 244. In other words, 240/255 red brightness, 0/255 green brightness, and 244/255 blue brightness. In other words, that’s a bright purple (magenta) pixel. String two million of those together, and you’ve got a 1080p image, stored in binary and just written out in hexadecimal.
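The same decoding can be done in a few lines. This is just a sketch, and `hex_to_rgb` is a made-up helper name; the only built-ins used are `int(text, 16)` and `bytes.fromhex`.

```python
def hex_to_rgb(code):
    """Split a colour code like '#F000F4' into (red, green, blue), each 0-255."""
    code = code.lstrip("#")
    # Each pair of hex digits is one 8-bit number.
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

print(hex_to_rgb("#F000F4"))            # (240, 0, 244)
print(15 * 16 + 0, 0 * 16 + 0, 15 * 16 + 4)   # 240 0 244, same thing by hand
print(int("FF", 16), int("00", 16))     # 255 0

# The three bytes that would actually sit on the drive for this pixel:
print(list(bytes.fromhex("F000F4")))    # [240, 0, 244]
```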

As for audio, what is sound? Well, it’s air pressure changing fast over time as a wave. We can measure that air pressure, and we can just measure it at regular points in time and pull off some numbers. Let’s use those same 8 binary digits again that give us 0 to 255. 255 is high air pressure, 0 is low. Now let’s make a string of those numbers, say 44,000 long. And say every 44 of these, it goes from 0 to 255 and back to 0. Now, let’s say we hooked this up to a speaker, and made the speaker’s position run through this list of numbers, 44,000 per second. Every 44 numbers the speaker is going from max in to max out and back, and we’re going through 44,000 numbers per second. What’s that mean? Well, the speaker position goes in and out 1000 times per second. What’s that mean? We just made a 1000 Hz tone. Why 44,000? That’s roughly what a CD uses (44,100 to be exact), and it was chosen as it is slightly over double the 20,000 Hz human hearing range, meaning it can do sounds up to the limit of human hearing. You need numbers, or samples, at at least twice the sound frequency to store it. So that’s how sound is stored in binary, just a lot of numbers recording the sound wave’s height thousands of times per second.
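Here is a rough sketch of that one second of tone as a list of 8-bit numbers. The smooth sine shape and the names `SAMPLE_RATE`, `PERIOD` and `sample` are assumptions for illustration; real audio files also carry a small header and usually use more than 8 bits per sample.

```python
import math

SAMPLE_RATE = 44_000   # numbers per second, the round figure used above
PERIOD = 44            # numbers per full in-and-out cycle

def sample(n):
    """8-bit sample number n of a pure tone: 0 = lowest pressure, 255 = highest."""
    wave = math.sin(2 * math.pi * n / PERIOD)   # swings between -1.0 and 1.0
    return round((wave + 1) / 2 * 255)          # rescaled to 0..255

one_second = bytes(sample(n) for n in range(SAMPLE_RATE))

print(len(one_second))              # 44000 bytes for one second of sound
print(SAMPLE_RATE // PERIOD)        # 1000 cycles per second, so a 1000 Hz tone
print(min(one_second), max(one_second))   # 0 255
print(list(one_second[:12]))        # the first few numbers, climbing towards 255
```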

This all sounds like a lot of numbers, especially for something like a movie. Yes, it is. It would be absolutely massive. An uncompressed HD movie could fill up your entire drive. But it doesn’t, thanks to compression. You can take everything I’ve said, and then make it smaller through tricks. A small mp3 file might cut out high frequencies you don’t care about. A video file doesn’t need a new complete image every frame, it just needs to know what pixels to change from the last frame. An image that is all white doesn’t really need 2 million pixels, it just needs one and the instructions to repeat it for all of them. A JPEG works in a similar spirit, simplifying regions of similar colour into blocks to save space, and you can see this quite easily on a heavily compressed JPEG image.
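The all-white image trick can be sketched as run-length encoding: store each value once along with how many times it repeats. To be clear, this is a much simpler scheme than what JPEG or mp3 actually use, and `rle_encode` / `rle_decode` are made-up names for illustration.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse runs of identical values into [value, run_length] pairs."""
    return [[value, len(list(run))] for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand [value, run_length] pairs back into the full list."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out

# An all-white 1080p greyscale image: two million copies of 255.
white_image = [255] * (1920 * 1080)
encoded = rle_encode(white_image)

print(encoded)                               # [[255, 2073600]]
print(rle_decode(encoded) == white_image)    # True, nothing was lost

# A less uniform row still shrinks a bit.
print(rle_encode([0, 0, 0, 255, 255, 0]))    # [[0, 3], [255, 2], [0, 1]]
```

This only pays off when there are long runs of identical values. A noisy photo has very few of those, which is part of why real formats use cleverer tricks.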
