what is a video file format? and is it any different than a video container format?


hello, i’m trying to understand what a video file format is and if it’s any different than a video container format, for example









are these all “video file formats” or video container formats? if these are container formats then what is the video file format of each of them and the difference between the video file format and the container format?

thank you


The file type (the extension) tells you how the data is packaged and which programs can open it; the format tells you what compression was used to produce the data inside.

Think about how Microsoft, Google, and Apple all offer their own word processors, slideshow programs, and spreadsheets. The content can be near identical, but the saved bytes will look different and the file will have a different extension.

GIF is a bit different, as it more or less saves each frame as a picture, but the others each compress differently and produce different file sizes. You can rename a file’s extension after the fact, but that does not convert the data; it only works when the player looks at the actual contents rather than the extension.
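To illustrate why renaming rarely matters, here is a rough sketch of how a program might guess a format from the first bytes of a file instead of trusting the extension. The function name `sniff_format` is made up for this example, and the list of signatures is deliberately short; real players check many more.

```python
def sniff_format(data: bytes) -> str:
    """Guess a file's real format from its first bytes, ignoring the extension."""
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if data[:4] == b"\x1a\x45\xdf\xa3":      # EBML header used by Matroska/WebM
        return "matroska/webm"
    if data[:4] == b"RIFF" and data[8:12] == b"AVI ":
        return "avi"
    if data[:3] == b"FLV":
        return "flv"
    if data[4:8] == b"ftyp":                 # MP4-family files (mp4, mov, m4a, ...)
        return "mp4-family"
    return "unknown"


# A GIF renamed to "clip.mp4" is still a GIF on the inside.
renamed_gif = b"GIF89a" + bytes(10)
print(sniff_format(renamed_gif))
```

The sample bytes are synthetic; only the leading signature matters to the check.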

Video container = video stream + audio stream + optional extras, such as subtitle tracks.

Video containers can be more or less advanced in features: for example, mkv is quite advanced since you can have chapters, subtitles and several audio tracks.

Common video containers are mkv, mp4, avi, flv…

For the audio the most common are mp3, aac and oga (unsure about the last one)

Regarding the video part, we usually call it a codec rather than a format. Here is a list of codecs (maybe a bit outdated though): https://helpdeskgeek.com/windows-xp-tips/the-most-common-video-formats-and-codecs-explained/

Take video, at its core. It’s a rapid slideshow of images, 30 per second or so usually, and also often sound that goes with it.

In computers, we’ve found lots of ways of digitizing images and audio. That applies to video too. Video is ripe for compression: because most of the time there’s not much difference between one frame and the next, most video codecs (coder-decoders, or compressor-decompressors) work by storing one full image (a key frame) and then only the differences for the next several frames. There are lots of variants on that, but it’s the general idea.
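As a toy illustration of the key frame idea (not how any real codec is implemented), the sketch below treats each frame as a short list of pixel values. It stores a full frame every few frames and otherwise only the pixels that changed since the preceding frame, which is one of the variants mentioned. The names `encode`, `decode` and `keyframe_interval` are invented for this example.

```python
def encode(frames, keyframe_interval=4):
    """Store a full frame every `keyframe_interval` frames, else only changed pixels."""
    encoded = []
    previous = None
    for index, frame in enumerate(frames):
        if index % keyframe_interval == 0:
            encoded.append(("key", list(frame)))
        else:
            # Record only the positions whose value differs from the previous frame.
            changes = {i: new for i, (old, new) in enumerate(zip(previous, frame)) if old != new}
            encoded.append(("delta", changes))
        previous = frame
    return encoded


def decode(encoded):
    """Rebuild every frame by applying each delta to the frame before it."""
    frames = []
    current = None
    for kind, payload in encoded:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)
            for i, pixel in payload.items():
                current[i] = pixel
        frames.append(current)
    return frames


frames = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 9, 9, 0],
]
encoded = encode(frames)
print(encoded)
```

Note that a delta frame is useless on its own: the decoder needs the key frame before it, which is why players can only start cleanly at a key frame.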

Audio is then compressed separately with an audio codec such as MP3, AAC, or Vorbis (or left uncompressed, as is typical in WAV).

That means you have a compressed video stream, and an audio stream, and you have to tie them together in one file. That’s what the container does. It packages the video and audio stream, along with metadata like telling the decoder what codec to use for the video and audio, what the title of the video is, etc. The container is usually what determines the file extension.

It gets confusing because the video codec and the container are often part of the same standard, and sometimes they are not.

Take MKV, for example: MKV is just a container. Its benefit is that it can hold multiple video and audio streams, like a DVD where you can switch audio tracks. It’s also royalty free.

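MP4 itself is a container (MPEG-4 Part 14), but the MPEG-4 family of standards also defines video codecs, which is where the confusion comes from. An MP4 file will most often contain H.264 (MPEG-4 Part 10) video and AAC audio.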

There are also times when a file will contain raw video data without a container. For example, .m4v is sometimes a raw MPEG-4 video bitstream, although the .m4v files produced by Apple software are ordinary MP4 containers. .m4a is an MP4 container holding only audio, usually in AAC format. Etc etc.

Video file formats are based on containers; containers are not always video file formats.

MPEG-2 TS is intended to be sent over a network, and can be played from the middle of a stream, so it’s considered a container. When written to disk (typically as a .ts file), it is also a file format.

Some containers (MP4) are made to be saved to a disk. They store the information about what codec was used and how to feed the decoder in an index (the moov atom), which many encoders write at the end of the file by default, and are thus considered file formats.
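To make the moov atom a bit more concrete: an MP4 file is a sequence of "boxes" (atoms), each starting with a 4-byte big-endian size and a 4-byte type. A minimal sketch that lists the top-level boxes could look like this. The helper `make_box` and the sample bytes are made up for the demo and do not form a playable file.

```python
import struct


def top_level_boxes(data: bytes):
    """Return (type, offset, size) for each top-level box in MP4-style data."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:                       # a 64-bit size follows the type field
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:                     # box runs to the end of the file
            size = len(data) - offset
        if size < header:                   # corrupt header; stop rather than loop forever
            break
        boxes.append((box_type.decode("latin-1"), offset, size))
        offset += size
    return boxes


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a 32-bit size header (hypothetical helper for the demo)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic layout: media data (mdat) first, index (moov) last.
sample = make_box(b"ftyp", b"isom") + make_box(b"mdat", bytes(32)) + make_box(b"moov", bytes(16))
order = [name for name, _, _ in top_level_boxes(sample)]
print(order)
```

When moov comes after mdat, as in this sample, a player has to reach the end of the file before it can start decoding, which is why files meant for web playback are often rewritten with moov at the front.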

In more detail:

Video and audio are a lot of data, and they have to be made smaller (encoded).

There are lots of ways to make it smaller (codecs).

Sometimes people want to send video to other people over an antenna or the internet, so that it can be played without being downloaded entirely.

Sometimes they want to save their video to a disk, to play later.

When video and audio are played, they have to be made bigger again (decoded), and they have to be played together correctly (synchronized). Containers store the information on how to perform both of these things, as well as how to synchronize other data (subtitles, for instance).
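Synchronization usually comes down to timestamps: each piece of video, audio, or subtitle data carries the time at which it should be presented. A very rough sketch, with made-up timestamps and labels standing in for real data, merges three streams into one playback timeline:

```python
import heapq

# (presentation time in seconds, stream name, payload label); all values are made up.
video = [(0.00, "video", "frame 0"), (0.04, "video", "frame 1"), (0.08, "video", "frame 2")]
audio = [(0.00, "audio", "chunk 0"), (0.05, "audio", "chunk 1")]
subtitles = [(0.04, "subs", "Hello")]

# Each stream is already in time order, so a merge yields one playback timeline.
timeline = list(heapq.merge(video, audio, subtitles))
for timestamp, stream, payload in timeline:
    print(f"{timestamp:.2f}s {stream}: {payload}")
```

A real player also has to deal with clocks drifting and data arriving late, which this sketch ignores.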

Networks send data in small units, files are read in chunks and not completely into memory, and decoders need that data on very specific boundaries depending on the codec used.

Some containers (MPEG-2 TS) are made to send video and audio over a network in tiny packets, and to allow people who start listening to those packets anywhere in the middle to put those packets together into the right sizes, to synchronize them, and to know how that data was compressed (codec), so that they can decode them without having seen the whole stream.
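Here is a small sketch of how a receiver that joins late might find its footing. MPEG-2 TS packets are 188 bytes long and each starts with the sync byte 0x47, so the receiver can look for that byte repeating every 188 bytes. The helper names and the synthetic packets below are assumptions for the demo; real streams carry far more in each header.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def find_sync(data: bytes, packets_to_check: int = 3) -> int:
    """Return the offset where a run of aligned packets starts, or -1 if none."""
    last_start = len(data) - TS_PACKET_SIZE * packets_to_check
    for offset in range(max(last_start + 1, 0)):
        if all(data[offset + i * TS_PACKET_SIZE] == SYNC_BYTE for i in range(packets_to_check)):
            return offset
    return -1


def packet_pid(packet: bytes) -> int:
    """Extract the 13-bit packet identifier (PID), which says which stream this is."""
    return ((packet[1] & 0x1F) << 8) | packet[2]


def make_packet(pid: int) -> bytes:
    """Build a synthetic packet with only the sync byte and PID set (demo helper)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))


stream = b"".join(make_packet(pid) for pid in (0x100, 0x101, 0x100, 0x101))
joined_late = stream[50:]            # pretend we tuned in mid-packet
start = find_sync(joined_late)
pids = [packet_pid(joined_late[i:i + TS_PACKET_SIZE])
        for i in range(start, len(joined_late) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE)]
print(start, [hex(p) for p in pids])
```

Checking several packets in a row guards against a stray 0x47 byte in the payload being mistaken for the start of a packet.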

Some (MP4) are made to be played directly from a disk, and require the player to seek (potentially across the entire file) for the information it needs to decode and play the encoded streams the file contains.

A “file format” is an agreement about how to represent some sort of information as a file so that someone else can know how to get the information back out of the file later. A “video file format” is a file format where the information stored represents a video.

A “container format” is a file format that arranges information in “chunks”, each with some additional description of the chunk. If you know what a zip file is, it’s a container where the “chunks” represent other files. A container has a way of describing chunks of different types. A “video container format” is a container format where the chunks contain video information, and stuff that goes along with it (sound, subtitles, a list of cast and crew, release date, copyright, an icon for the file, etc.).
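To show the "chunks with descriptions" idea in miniature, here is a sketch of a completely invented container layout (it is not MP4, MKV, or any real format). Each chunk is a 4-byte kind, the lengths of the description and payload, then the description and payload themselves.

```python
import struct

HEADER = ">4sHI"   # kind, description length, payload length (invented layout)


def write_chunk(kind: bytes, description: str, payload: bytes) -> bytes:
    """Serialize one chunk: fixed header, then the description, then the payload."""
    desc = description.encode("utf-8")
    return struct.pack(HEADER, kind, len(desc), len(payload)) + desc + payload


def read_chunks(data: bytes):
    """Walk the container and return (kind, description, payload) tuples."""
    chunks = []
    offset = 0
    header_size = struct.calcsize(HEADER)
    while offset < len(data):
        kind, desc_len, payload_len = struct.unpack_from(HEADER, data, offset)
        offset += header_size
        description = data[offset:offset + desc_len].decode("utf-8")
        offset += desc_len
        payload = data[offset:offset + payload_len]
        offset += payload_len
        chunks.append((kind.decode("ascii"), description, payload))
    return chunks


container = (
    write_chunk(b"vide", "codec=h264", b"\x00\x01\x02")
    + write_chunk(b"soun", "codec=aac lang=en", b"\x10\x11")
    + write_chunk(b"text", "subtitles lang=es", "Hola".encode("utf-8"))
)
for kind, description, payload in read_chunks(container):
    print(kind, description, len(payload))
```

A reader that does not understand a chunk’s kind can still skip over it using the lengths, which is a large part of why containers are easy to extend.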

In the case of video, the information has to be compressed, otherwise it’s too big to be useful. There are all sorts of ways of doing that: reducing the number of colors, saving only the bits that change from frame to frame, etc. In a simple file format, you’d expect the video to be represented a certain way every time. In a container, you expect to find a chunk of “video” that has a description of how the video was compressed. The creator of the video has a lot of flexibility using a container on how they shrink their video down, and they can use any of the methods that the container can describe. A video player will use the description to decide which method it will use to play back the video.

So, a .gif, which is not a container, always stores video using the same approach. It doesn’t include sound or subtitles, and you can’t add those things.

An .mp4 file is a container; it can store many chunks of audio and video using a whole bunch of different methods. It can have multiple audio chunks so the same file can hold multiple languages, or commentary. It can have subtitles, text, a preview icon, etc. The container not only contains the information (in chunks) but also a description of it all (“audio track 1 is English stereo, track 2 is Spanish stereo”, etc.).