Why do audio and video take so much storage?

So for example, video games are super heavy on disk space. And apparently most of that space is just the sounds, textures, models, etc. But the code takes very little space.

Why can something as complex as a physics system weigh less than a bunch of images?

Code takes very little space, media takes more. But an image is just code that tells the computer how to draw something (I think).

So how come some code gets to be so small in size and some code doesn’t?

18 Answers

Anonymous 0 Comments

I’ll start in reverse order:

Text (code) doesn’t take up a lot of space, compressed or uncompressed, because there isn’t much detail that needs to be kept. Compressed text is especially small: during compression you can replace repeated patterns with single letters, say “wherever I see axy, change it to p.” During decompression you know that p=axy. (Real compressors tend to use special codes rather than ordinary letters, but I just want to get the point across.)
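To make the “axy becomes p” idea concrete, here is a minimal sketch in Python. The function names and the substitution table are made up for illustration, and real compressors are far more sophisticated, but the round trip works the same way in spirit.

```python
def compress(text: str, table: dict) -> str:
    """Replace each repeated pattern with its one-character stand-in."""
    for pattern, token in table.items():
        text = text.replace(pattern, token)
    return text


def decompress(text: str, table: dict) -> str:
    """Reverse the substitution: expand each stand-in back to its pattern."""
    for pattern, token in table.items():
        text = text.replace(token, pattern)
    return text


# Hypothetical table: "axy" is stored as "p", as in the example above.
# This only works if "p" never appears in the original text.
TABLE = {"axy": "p"}

original = "axy-axy-axy-axy"
packed = compress(original, TABLE)
restored = decompress(packed, TABLE)

print(len(original), len(packed))  # the packed text is shorter
print(restored == original)        # nothing was lost
```

The catch noted in the comment is why real formats reserve special codes instead of reusing ordinary letters.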

Before code can be executed it needs to be compiled down into instructions that the computer understands. Compilers do an amazing job of optimizing the number of instructions needed.

So why do audio and video take up a lot of space? Well that depends.

For clarification, videos store information about frames, where each frame has information on what the individual pixels must look like to recreate some scene.

The higher the resolution, the more pixels are used for the same scene. (Meaning more data needs to be stored.)

Audio stores information about what signal must be reproduced to create some sound. The higher the quality, the more samples of the signal are stored OR the more levels each sample can take. (This is a completely separate topic that will make this too long if I go into details.)

The files used to store them are just large instruction sets that essentially say “if you want to recreate me, these are the pixel/signal values that you need to use.” The higher the quality, the more instructions each file contains. The program you use to open them is the one that has the code to interpret the instructions.

Now that we’ve talked about what the instructions say, we can discuss why the amount of data differs so much between low and high quality. And it’s as simple as: the amount of detail.

High-quality files store a tremendous amount of data about what was recorded. They tend to use lossless, or nearly lossless, compression, meaning the file you get back after decompression should differ from the original as little as possible.

Low-quality images, on the other hand, tend to use lossy compression, as they can sacrifice a good amount of data and still get their point across. (I.e. fine details don’t matter.)

Uncompressed raw video takes up a ton of space, because the amount of detail it initially records is staggering. Say you had a grey table, and to the naked eye the table seems to be entirely uniform in color.

When recording this table, the camera may produce a file that shows that each individual pixel on the table is a slightly different shade of grey. While that information may be useful, from a player experience standpoint it’s entirely useless, since the difference is completely indiscernible.

So rather than storing those different pixel colors in the file, the file that is sent to players would just have all the pixels set to the same color. I.e. we maximize the amount of detail that users can perceive, but minimize the amount of data spent on imperceptible details.
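Here is a rough sketch of the grey table idea in Python. The region size, the grey values, and the use of `zlib` as the compressor are all assumptions for illustration. Real video codecs work very differently, but the effect is similar: once the invisible variation is thrown away, the data compresses far better.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

WIDTH, HEIGHT = 256, 256  # hypothetical size of the table region, in pixels

# "Raw" recording: every pixel is a slightly different grey (values 126..130).
raw = bytes(rng.randint(126, 130) for _ in range(WIDTH * HEIGHT))

# Simplified version: every pixel is set to the same grey.
flattened = bytes([128]) * (WIDTH * HEIGHT)

# zlib itself is lossless; the lossy step is the flattening above.
raw_packed = len(zlib.compress(raw))
flat_packed = len(zlib.compress(flattened))

print(len(raw), raw_packed)         # noisy pixels barely shrink
print(len(flattened), flat_packed)  # uniform pixels shrink to almost nothing
```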

Anonymous 0 Comments

Uncompressed audio, images, and video take up a surprising amount of space.

Let’s start with audio. The human ear hears frequencies up to roughly 20,000 Hz. To recreate a frequency you need to sample at least twice as fast as that frequency. Let’s say you have 44,100 samples a second, a common sampling rate. CD quality gives you 16 bits, or 2 bytes, per sample. And there are 2 channels of audio for stereo, one for each ear.

Now for each second of uncompressed stereo audio, this is 2 bytes x 44,100 samples x 2 channels = 176.4 kilobytes per second. A song 3 minutes long is roughly 32 megabytes! Now add up all the hours of spoken dialogue, sound effects, and music in a game and it gets large, fast.
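If you want to check the audio arithmetic yourself, it is a couple of multiplications. This is only a sketch with the CD-quality numbers from above; the variable names are mine.

```python
BYTES_PER_SAMPLE = 2         # 16-bit CD-quality samples
SAMPLES_PER_SECOND = 44_100  # common sampling rate
CHANNELS = 2                 # stereo

bytes_per_second = BYTES_PER_SAMPLE * SAMPLES_PER_SECOND * CHANNELS
song_bytes = bytes_per_second * 3 * 60  # a 3-minute song

print(bytes_per_second)        # 176400
print(song_bytes / 1_000_000)  # 31.752
```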

Many games don’t compress audio to save the CPU from having to decompress it. This can lead to huge game install sizes.

Images get ridiculously large in uncompressed form too. Let’s say we use 4K resolution (3840 x 2160 = 8,294,400 pixels). Each pixel has 8 bits for each of its red, green, and blue values, so 3 bytes per pixel. Each uncompressed 4K image is at least 3 bytes x 8,294,400 pixels = ~25 megabytes.

Now let’s make a video of 30 images per second. Each second of uncompressed video is 25 megabytes x 30 images/second = 750 megabytes/second… This is why video is almost always compressed: to avoid dealing with these massive uncompressed sizes.
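The same kind of check works for the 4K numbers. Again, just a sketch using the figures above; note the exact result comes out slightly under the rounded 750 megabytes.

```python
WIDTH, HEIGHT = 3840, 2160  # 4K resolution
BYTES_PER_PIXEL = 3         # one byte each for red, green, blue
FRAMES_PER_SECOND = 30

pixels = WIDTH * HEIGHT
frame_bytes = pixels * BYTES_PER_PIXEL
second_bytes = frame_bytes * FRAMES_PER_SECOND

print(pixels)        # 8294400
print(frame_bytes)   # 24883200, about 25 megabytes
print(second_bytes)  # 746496000, about 750 megabytes
```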

Anonymous 0 Comments

Think about it as the difference between an art gallery versus a booklet of instructions. The art gallery takes up an entire building, more and more space if you want to display more and more paintings. Meanwhile the instructions will hardly ever be bigger than a book and if they are truly massive they *might* fill up a few shelves in a room.

Images/video are the paintings, and they take a set amount of space that increases depending on how high definition they are and how many of them are desired. Meanwhile, the code for the game is the instruction booklet. While it might be complex, it just takes less space to keep words/instructions easy to read than it takes to keep pictures easy to view (and audio is almost as demanding as pictures).

Anonymous 0 Comments

Shakespeare’s complete works amount to about 900k words, roughly a 5MB plain-text file (if I recall correctly). The first Harry Potter book is around 77k words, for comparison. A letter takes 1 to 4 bytes to encode.

An uncompressed 8192×8192 (“8K”) texture has 67,108,864 pixels. Each pixel is encoded with one byte each for red, green, blue, and transparency, for 4 bytes per pixel and a total of 268,435,456 bytes, or 256MB.
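A quick sketch to back up both numbers. I am assuming the “1 to 4 bytes” refers to UTF-8, the most common text encoding, and the sample characters are just ones I picked.

```python
# Bytes needed per character, assuming UTF-8 encoding.
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]  # a, e-acute, euro sign, emoji
char_sizes = [len(ch.encode("utf-8")) for ch in samples]
print(char_sizes)  # [1, 2, 3, 4]

# One uncompressed 8192x8192 texture with 4 bytes per pixel (RGBA).
SIDE = 8192
BYTES_PER_PIXEL = 4
texture_bytes = SIDE * SIDE * BYTES_PER_PIXEL
print(texture_bytes)           # 268435456
print(texture_bytes // 2**20)  # 256
```

So a single large texture outweighs an entire library shelf of plain text many times over.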

Anonymous 0 Comments

Imagine a color by numbers game.

You have one 20th of a page with the rules, which state: Color each section according to the number it contains. 1 is red, 2 is yellow, 3 is green, 4 is blue, 5 black.

And then you have pages and pages of shapes, which you’re supposed to turn into pictures by coloring them in.

That’s how digital games work too. They have the code, which is the set of rules the program follows, and then they have all the actual material (textures, audio, videos, images…) the code can work with.

These materials are storage intensive, since they often can’t be stored as simplified code or as an algorithm, but have to be stored pixel by pixel or sound wave by sound wave. (It’s a bit more complicated, since compression can store it pixel group by pixel group etc., but you get the gist.)

Anonymous 0 Comments

Easiest explanation

Write down the directions to drive or walk somewhere.

Now draw almost photorealistic pictures of the places in your directions.

Anonymous 0 Comments

>But an image is just code that tells the computer how to draw something (I think)

That describes vector image formats, like SVG, but bitmap images just store the actual value for each pixel. In a game, vectors might be used for something like text, but textures and such will generally be bitmaps, or some compressed bitmap format.

Game physics engines are simplified versions of real world physics. They’re storing the rules by which the objects behave. The rules are pretty simple; it just takes a whole lot of computation when you have many objects interacting with each other.
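To show the difference in size, here is a small sketch comparing a vector description of a red circle with the raw bitmap needed for the same canvas. The 512×512 canvas and the circle are made-up examples, and the bitmap figure assumes 3 bytes per pixel with no compression.

```python
# Vector version: a short text description of the shape (SVG markup).
svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="512" height="512">'
    '<circle cx="256" cy="256" r="200" fill="red"/>'
    "</svg>"
)
vector_bytes = len(svg.encode("utf-8"))

# Bitmap version: one RGB value stored for every pixel of the canvas.
WIDTH, HEIGHT, BYTES_PER_PIXEL = 512, 512, 3
bitmap_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL

print(vector_bytes)  # a bit over a hundred bytes
print(bitmap_bytes)  # 786432
```

The vector version stays tiny only because a circle is easy to describe. A photo-like texture has no such short description, which is why it ends up as a bitmap.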

Anonymous 0 Comments

I’ll try to structure this like I would a kids book about this, so enjoy my presentation of

#Abstraction and Me:

We store things in something called binary, it’s a bunch of ones and zeros. One “bit” is one space that we can store a one or a zero, and our computers actually physically store all information by using these spaces for ones and zeros. 8 bits is commonly called a byte, and a million of those is a megabyte.

First we have text. To store text, we store every individual letter, and every letter takes 8 bits to store. If you opened your computer and wrote up a document, its size in bytes would be about the number of letters in it.

Next we have pictures. We store these by using a big grid of pixels. In a simple greyscale picture each pixel takes about 8 bits, and a color picture typically uses 8 bits for each of red, green, and blue, so 24 bits per pixel. To make pictures look better, we can add more pixels, and make the grid have more spaces for its size, but that means more bits and more space.

To make videos, all we need to do is store a bunch of pictures, because a video is just a bunch of pictures one after the other. The frame rate is how smooth the video looks, usually we store 24-60 pictures of a video per second. The higher the frame rate, the more pictures used, and the more bits used, so the more space it takes up.

To make audio, we take a sound and slice it up! It’s like taking a video and creating pictures of each frame. Just like with how many pixels we want in a picture, or how many pictures we want in a video, the number of cuts we take per second of audio is called the sample rate. Higher sample rates capture more detail, but take more cuts, so more bits! (The bitrate you see on music files is the total number of bits used per second.) Actually storing audio is a bit complicated, so for the sake of the question, I’ll leave it at this.
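Here is a minimal sketch of the slicing in Python. The 440 Hz tone, the 44,100 slices per second, and the 16-bit slice size are assumptions I picked for illustration (the usual name for “slices per second” is the sample rate).

```python
import math
import struct

SAMPLE_RATE = 44_100  # slices per second (assumed, CD-style)
FREQUENCY = 440.0     # hypothetical test tone, in Hz
DURATION = 1.0        # seconds of sound

# Measure the height of the sound wave at each slice.
samples = [
    int(32767 * math.sin(2 * math.pi * FREQUENCY * n / SAMPLE_RATE))
    for n in range(int(SAMPLE_RATE * DURATION))
]

# Pack each slice as a signed 16-bit little-endian integer (2 bytes).
pcm = struct.pack(f"<{len(samples)}h", *samples)

print(len(samples))  # 44100 slices for one second
print(len(pcm))      # 88200 bytes for one channel
```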

Now, I kind of lied a little. I told you about some common ways we store data, but in fact for every type of information we store, we have multiple ways of doing it. That’s actually what file formats are. The difference between a .jpg and a .png is the method of how they store the information of the picture.

All of this is possible due to something called Abstraction! It’s the ability to take a very small piece of information, and by using math and logic, we can conveniently store very complicated ideas by building a “factory” to do so. Bits turn into pixels turn into pictures turn into videos, but the video is using the same code for the bits themselves!

#Afterthought
Videos and sounds take up so much space these days partly because we can afford it. In the past we had to worry about space, since we didn’t have the technology to store that much on a computer.

These days, storage is not a problem for most devices, so instead of worrying about how big things like games or videos are, we care about how complex things look, how detailed graphics are, and how good music sounds. We have the space for it, so people like game devs care less about optimization of space and more about trying to make the game as good looking as possible, with as many unique assets as possible, which is the tradeoff.

Anonymous 0 Comments

Because they’re a LOT of data.

This is a simplified example.

Imagine for example that you have a single pixel. A dot.

For this single dot we need to know the color. So you have the Red, Green, and Blue (RGB) values, each one on a scale of 0-255. A single byte can hold a number up to 255. So for each single dot, we are using three bytes of space.

Now let’s look at a single picture. The standard video size is 1920×1080 (1080p.)

1,920*1,080*3 = 6,220,800, or around 6 megabytes for a single frame of video.

Now consider that a video is generally 24 frames per second:
6,220,800*24 = 149,299,200, or roughly 150MB per second for HD video.
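Wrapped up as a small function, the calculation looks like this. It is only a sketch: the function name and parameters are mine, and it assumes 3 bytes per pixel with no compression at all.

```python
def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel=3):
    """Size of uncompressed video: every pixel of every frame is stored."""
    return width * height * bytes_per_pixel * fps * seconds


frame = raw_video_bytes(1920, 1080, fps=1, seconds=1)    # a single frame
second = raw_video_bytes(1920, 1080, fps=24, seconds=1)  # one second
minute = raw_video_bytes(1920, 1080, fps=24, seconds=60)

print(frame)   # 6220800
print(second)  # 149299200
print(minute)  # 8957952000, nearly 9 gigabytes per minute
```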

We make these much smaller of course with compression, which would probably be a whole other ELI5.

Anonymous 0 Comments

You don’t have to tell a computer how to draw every ray, or place every plane, or calculate every collision.

You just have to tell it one time. It can repeat the same process for every similar action.

But you do have to tell it the color of every pixel in an image, every vector in a model, every sound in a recording. Those items are all individually unique and harder to generalize.
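A sketch of what “tell it one time” means in practice. The checkerboard is a made-up example of something with a simple rule: a few lines of code can produce a million bytes of image. A photo or a hand-painted texture has no rule like this, so every pixel has to be stored.

```python
def checkerboard(size: int, tile: int) -> bytes:
    """Build a size x size greyscale checkerboard, one byte per pixel."""
    return bytes(
        255 if ((x // tile) + (y // tile)) % 2 == 0 else 0
        for y in range(size)
        for x in range(size)
    )


# The rule above is a handful of lines, yet it describes this much data:
image = checkerboard(1024, 64)
print(len(image))  # 1048576 bytes of pixels from a tiny rule
```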

Computers and gaming consoles also have a lot of code already. The game code can piggyback off the existing functionality of the device code, in place of having all new code running entirely from scratch.

It’s like how a lot of online games and such will require you to download or update Java. There’s a lot of generic code not actually included in the game, but it is still used to make the game work.