what is a file format?


Hello, I’m trying to understand what a file format is, what it means, and what the surrounding context is.

when people talk about a file format, are they really just referring to any type of file?

mp4 is a file format?

mp3 is a file format?

pdf is a file format?

txt is a file format?

doc is a file format?

is that what people are talking about when they say “file formats”? just different types of files that can or should be opened by different programs?

thank you


7 Answers

Anonymous 0 Comments

Data is just a bunch of bits, `0`s and `1`s. When I make a file, the only real rule for those bits is that they come in groups of 8 called bytes. As far as the file is concerned, nothing else really matters.
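
To see those groups of 8 for yourself, here is a tiny Python sketch. The two bytes in `data` are just an example I picked, not anything special:

```python
# Two example bytes (chosen only for illustration).
data = b"Hi"

# Show each byte as its 8 individual bits.
bits = " ".join(format(byte, "08b") for byte in data)
print(bits)
```

Each group of eight digits is one byte; on its own the file says nothing about what those bytes mean.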

So what makes a “format”? What makes a picture a picture vs what makes text just text? We have to give those bits some structure. They have to mean something. The computer must understand that meaning, and we call that the “format” of a file. Just some examples:

* Plain old text (TXT) traditionally adheres to the ASCII standard (these days it is often UTF-8, which is a superset of ASCII). You can check out an [ASCII table](https://www.asciitable.com/) to see how to turn bytes into English text, and how a byte like #10 (LF) represents pressing Enter.
* DOC is Microsoft’s document format, adding more information about the text like formatting (**bold**, *italics*) and information about the page it’s going to be printed on.
* MP4 contains both audio and video, so it needs to specify how to separate them and provide other data like a framerate. Arguably the audio and video are also their own formats with sample sizes, resolutions, etc.
* A PDF is meant to be a digital version of a printed page, so the page has a size and things get drawn on it, sometimes freehand and sometimes text with a font.
* An MP3 is just audio. In fact you could put it into an MP4 file as the audio track under the rules of MP4 formatting.

And so on and so forth.
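
To make the plain text example concrete, here is a minimal Python sketch of reading bytes under the ASCII rules. The six byte values are made up for the example; the point is only that the same numbers become text once you apply a format to them:

```python
# Six raw bytes, written as the numbers a file would actually store.
# (Example values picked for illustration.)
raw = bytes([72, 105, 33, 10, 79, 75])

# Interpreting those bytes under the ASCII rules turns them into text.
text = raw.decode("ascii")

print(repr(text))         # repr shows the line feed explicitly as \n
print(text.splitlines())  # byte 10 (LF) is where one line ends and the next begins
```

Read with a different set of rules (say, as pixel values), those same six bytes would mean something else entirely.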

Some formats are well documented, available as a manual telling you how to read and understand them. You could write your own program to use them if you have the skill. Others are just used internally by the software and not meant to be understood, like the data files of a video game containing the maps; only the game and its developers need to understand that.

When you hear the file’s “format” is “MP4”, you usually think the file’s name ends with `.mp4`, and that’s a convention so that humans and software have expectations. When your video player sees a file with such a name, it goes straight to the MP4 file format reader which will try to understand what the file actually contains.
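
Roughly speaking, that name-based choice could look like the sketch below. The `READERS` table and `pick_reader` function are hypothetical names invented for this example, not how any particular video player actually works:

```python
from pathlib import Path

# Hypothetical table: file name ending -> the reader a program would try first.
READERS = {
    ".mp4": "MP4 reader",
    ".mp3": "MP3 reader",
    ".pdf": "PDF reader",
    ".txt": "plain text reader",
}

def pick_reader(filename: str) -> str:
    """Choose a reader from the file name alone; the contents are not inspected."""
    suffix = Path(filename).suffix.lower()
    return READERS.get(suffix, "unknown format")

print(pick_reader("holiday.MP4"))
print(pick_reader("notes"))
```

Notice that nothing here looks inside the file. Rename a PDF to `holiday.mp4` and this code would still hand it to the MP4 reader, which then has to discover that the contents don't match.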

Guessing formats can work, but it is error-prone. Many formats intentionally start a file with a few special bytes just to say “I am a ZIP file” or “I am a PNG”, so that if guesswork is done there is a very obvious hint right at the start of the file, which makes the snap judgement of “I am probably right” or “I obviously guessed wrong” easier.
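
Here is a small Python sketch of checking those starting bytes. The three signatures are the commonly documented ones for PNG, ZIP and PDF; the function names `sniff_format` and `sniff_file` are made up for this example, and a real tool would know many more signatures:

```python
# Commonly documented signatures found at the very start of some file types.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"PK\x03\x04": "ZIP",   # a normal ZIP archive with at least one entry
    b"%PDF-": "PDF",
}

def sniff_format(data: bytes) -> str:
    """Guess a format from the first few bytes of a file's contents."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"

def sniff_file(path: str) -> str:
    """Read only the start of the file at `path` and guess its format."""
    with open(path, "rb") as f:
        return sniff_format(f.read(8))  # 8 bytes covers the longest signature above

print(sniff_format(b"PK\x03\x04rest of the archive"))
print(sniff_format(b"just some text"))
```

A result of "unknown" only means none of these three hints matched; plain text, for instance, has no special starting bytes at all.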
