Why are there so many file formats for audio, video, and photos?



I know there are some formats that are for raw files, but most others are compressed. Why are there so many of those?


[Relevant xkcd](https://xkcd.com/927/)

Sometimes they serve different purposes. For example, GIF and PNG use lossless compression, so they suit graphics with sharp edges and flat colours, while JPEG is lossy, so it’s better for things like photographs. Sometimes they’re just improved versions of earlier formats: PNG, for example, was designed as the successor to GIF, with better compression and support for far more colours. Sometimes they are just competing formats developed by different companies.

NB: lossless and lossy describe how the raw information is compressed. Lossless compression lets you get the original data back exactly. Lossy formats, like JPEG, actually throw some of it away, so you get smudgy artifacts around sharp lines and text. JPEG also doesn’t support transparency at all.
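
A rough way to see the difference in code, as a sketch rather than how any real image codec works: Python’s built-in `zlib` stands in for lossless compression, and a made-up `lossy_quantize` helper stands in for the lossy step by rounding away the fine detail in each value. The pixel values are invented for the example.

```python
import zlib

# Pretend these are 8-bit brightness values from part of an image.
pixels = bytes([12, 13, 13, 14, 200, 201, 199, 200] * 32)

# Lossless: compress, decompress, and get back exactly what went in.
packed = zlib.compress(pixels)
restored = zlib.decompress(packed)
lossless_ok = restored == pixels


def lossy_quantize(data: bytes, step: int = 16) -> bytes:
    """Hypothetical lossy step (NOT how JPEG works): round each value
    down to a multiple of `step`, discarding fine detail for good."""
    return bytes((value // step) * step for value in data)


approx = lossy_quantize(pixels)
lossy_packed = zlib.compress(approx)

print("lossless round trip identical:", lossless_ok)
print("lossy version identical:", approx == pixels)
print("sizes (raw, lossless, lossy):", len(pixels), len(packed), len(lossy_packed))
```

Once the values have been rounded there is no way to tell a 12 from a 14 any more, which is the whole trade-off: fewer distinct values tend to compress better, but the original is gone.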

From a photo perspective, raw files are determined by the camera manufacturer, mostly because each camera company has its own way of processing image data. Keep in mind that raw files are actually just tables of raw sensor data that another program has to turn into an image. Camera companies used to saddle their devices with their own software, but now, with Adobe being the de facto industry standard, companies often just develop plugins that can be installed to make it easier to process the data.

Other formats like PNG (Portable Network Graphics) are used in graphics because they are lossless, meaning that no matter how many times the file is opened and re-saved, the image data isn’t degraded.
So it comes in handy when working with graphics and illustrations. It’s not great for photos headed to print, since its colour model is strictly RGB (red, green, blue), which suits screens but isn’t good enough for print; that is why TIFF (Tagged Image File Format) is used there.
TIFF can store images with lossless compression, so there is no degradation, and it is the format most print shops and photographers work with, since colours can be handled in either RGB (screen stuff) or CMYK (printing and paper stuff).
After the work is done, images are often exported to JPEG, which is a lossy format. Since it’s mostly destined for the web, the compression puts more emphasis on size than on quality. So to get the best JPEG you have to start with the best possible version of everything else.

These formats were all invented at different times in computer tech history, under different constraints.

A lot of early image formats were devised to go with a particular OS or computer hardware, and the format’s colour scheme matched up with the colours that computer was capable of producing, etc. An early frontrunner as a standard was .bmp or “bitmap”, which was supported by a wide variety of systems including Windows, Macintosh and Amiga. It wasn’t until the Internet started to become important that interoperability and data compression began to matter – and, critically, intellectual property.

In the early Internet, .gif and .jpeg were the two ruling image formats, and they used quite different types of image compression and had quite different features. For instance, .gif could only encode 256 colours per image, but it could animate. .jpeg could encode millions of colours, but its lossy compression smeared fine detail and could make text look bad.

And parts of these formats were covered by patents held by different entities. Most famously, the LZW compression used in .gif was patented by Unisys, which demanded a licensing fee if you wanted to make and sell software that used it, and .jpeg later faced patent claims of its own.

And as time went by, the math and computing theory behind data compression was advancing, so people were exploiting newer techniques to get more usable info into less storage space the whole time too.

It was the same with video formats. .mpeg and .avi and .mov (and RealMedia, and some others…) all came with different licensing expenses if you wanted to release software to play them. And as for the art and science of getting high-resolution video to compress well and look nice, there were a lot of approaches with quite different strengths and weaknesses. Around the early 2000s there was a big explosion of different software vendors each releasing their own slightly improved, optimized codec as its own uniquely branded and licensed thing, and it became an interoperability nightmare. DivX and XviD were just two out of dozens of mutually incompatible and confusingly named codecs out there. Various [‘Codec Packs’](https://en.wikipedia.org/wiki/K-Lite_Codec_Pack) started popping up, to try and give users an easy “this should play most videos you ever find online” bundle.

Each of these formats had a company behind it fighting for market dominance, whereas people writing video player software for end-users just wanted to be able to support everything without the user having to worry about it.

So the end result is that we have a great big installed base, out there in the world, of video and image editing software that can all read and write a pretty big handful of well-known file formats, and a lot of those format wars are now old enough that the original patents have expired anyway. But that’s the historical reason for having so many. It’s just kind of the messy leftovers of a patent gold-rush.

File formats are, according to my understanding, simply “markers” that applications can use to understand files.

Extensions are signposts that tell applications how to approach the reading of a file.
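
In practice the extension is only a hint, and many programs also peek at the first few bytes of a file (its “magic number”) to decide what it really is. Here is a minimal sketch of that idea. The `sniff_format` helper and the `MAGIC_NUMBERS` table are made up for illustration, and the table covers only a handful of image formats.

```python
# Well-known leading bytes ("magic numbers") for a few image formats.
MAGIC_NUMBERS = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"\xff\xd8\xff": "jpeg",
    b"BM": "bmp",
}


def sniff_format(data: bytes) -> str:
    """Guess a format from the start of a file (hypothetical helper)."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return "unknown"


# Imagine a file named "photo.jpg" whose contents actually start like a PNG:
header = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
print(sniff_format(header))         # png
print(sniff_format(b"GIF89a..."))   # gif
print(sniff_format(b"hello world")) # unknown
```

Renaming a file changes the signpost, not the bytes, which is why a PNG renamed to `.jpg` will often still open fine in software that checks the contents.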

There’s nothing special about mp4, png, or gif. It’s just that companies, governments, and standards bodies have decided that they are standards, and have therefore built support for them into mainstream applications and software.

You could make your own file format, and developers do it all the time. It’s just that nobody would really care, and you couldn’t expect, say, Adobe Illustrator to recognize and read your file.
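
To show how little it takes, here is a sketch of an entirely made-up image format. Everything about it is an assumption for the sake of the example: the name “TINY”, the header layout, and the `encode_tiny` / `decode_tiny` helpers. It stores a 4-byte marker, the width and height, and then one grayscale byte per pixel.

```python
import struct

# Invented layout: 4-byte magic "TINY", then width and height as
# unsigned 16-bit big-endian integers, then width * height pixel bytes.
MAGIC = b"TINY"
HEADER = struct.Struct(">4sHH")


def encode_tiny(width: int, height: int, pixels: bytes) -> bytes:
    """Pack a grayscale image into the made-up TINY layout."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    return HEADER.pack(MAGIC, width, height) + pixels


def decode_tiny(blob: bytes) -> tuple[int, int, bytes]:
    """Read a TINY blob back into (width, height, pixels)."""
    magic, width, height = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("not a TINY file")
    pixels = blob[HEADER.size:HEADER.size + width * height]
    if len(pixels) != width * height:
        raise ValueError("file is truncated")
    return width, height, pixels


blob = encode_tiny(2, 2, bytes([0, 255, 255, 0]))
print(decode_tiny(blob))
```

That is a working file format, but only these two functions know how to read it, which is exactly the problem: a format is only useful once other people’s software agrees on the same layout.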

Because technology goes on.

Consider the venerable JPEG, for instance.
When it was released in 1992, it was really groundbreaking stuff.
It was, for years (arguably a decade or more), the best or near the best we could achieve for general-purpose compression of photographs.
Everybody scrambled to add support to their graphical software to compress (save) or decompress (open) JPEG images.

But time goes on.
Mathematicians keep doing research.
Computer scientists keep doing research.
Hardware becomes more capable.
JPEG, once the pinnacle achievement of graphical compression, is now, frankly, dog shit.

We can do much better now.

But, we have all this software around that’s already written to use JPEG.
And we have billions of people around the world who have saved all of their life’s memories in JPEG format.
So we can’t just *get rid of* JPEG.
And we can’t just change it, either, since that would cause confusion with older software.

So we introduce new file formats like WebP or HEIC or AVIF (the image format built on the AV1 video codec).
Some newer software will support these new (and much superior) formats in addition to JPEG, while some older software will only support JPEG.
As time goes on, you may see JPEG very slowly and gradually decrease in prevalence.

But it won’t disappear completely.
People are very stubborn and there are a lot of people out there who have been using JPEG for 25 years and will fight tooth and nail to keep using JPEG for another 25.

So it goes.
Just in the world of graphics (let’s not get started on audio and video) we have TIFF, XPM, BMP, GIF, JPEG, PNG, MNG, JPEG2000, JPEG XL, FLIF, WebP, HEIC, AVIF, BPG, etc.
There are a few situations where different file formats have different advantages (most famously PNG vs JPEG, where PNG only does lossless compression and JPEG is, in practice, only used for lossy compression, and so both co-exist side-by-side).
But for the most part, the newer formats are just flat-out superior to the old ones.
AVIF is, for most purposes, just flat-out superior to both JPEG and PNG.
But JPEG and PNG will both survive for probably at least a couple more decades, just due to people being used to them and being resistant to change.

Ask this question again in a few years and there’ll probably be another few acronyms to add on to the list.
I doubt the list will *ever* get shorter.
Every time we make a new technical breakthrough and get a better file format, we still have to support the old ones for legacy purposes.