How does a computer know whether a file is text, video, or an image, given that it only understands binary at the end of the day?

I tried going through various articles and videos online, but couldn’t really understand it properly.

9 Answers

Anonymous

The simplest way to tell is to give the file a name that says what it is. This is the file extension, like the `.jpg` in `somecoolphoto.jpg` or the `.docx` in `FinalThesis.docx`. These extensions aren’t any kind of magic; they’re literally just part of the name. You can rename `FinalThesis.docx` to `FinalThesis.jpg` and still open it in a word processor just fine (probably, as long as the program isn’t actively checking the name), because renaming a file doesn’t change its contents. You can even leave the extension off entirely. But some systems, particularly Windows, use the file extension almost exclusively to decide what type a file is. By default, Windows even hides the more common extensions in File Explorer unless you explicitly tell it to show them.
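To make the "it’s literally just a name" point concrete, here is a minimal Python sketch that guesses a type from the filename alone, using the standard library’s `mimetypes` module. The helper name `guess_from_name` is made up for illustration, and this is not how Windows itself does the lookup; it only shows that the file’s contents never get read.

```python
import mimetypes
from typing import Optional

def guess_from_name(filename: str) -> Optional[str]:
    """Guess a MIME type from the filename only; the file is never opened."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime

# The guess follows the extension, not the contents:
print(guess_from_name("somecoolphoto.jpg"))  # a JPEG type, based on ".jpg"
print(guess_from_name("FinalThesis.jpg"))    # same guess, even if it is really a Word file
print(guess_from_name("README"))             # None: no extension, so nothing to go on
```

Renaming the thesis changes the guess without changing a single byte of the file, which is exactly why extension-only detection is easy to fool.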

When that fails (or isn’t used), many (but not all) file types contain headers. These are chunks of data at the very beginning of the file that tell the program reading it what’s inside, like a tiny included user manual. A header usually begins with a short sequence of bytes that is unique to that kind of file. These sequences are called “file signatures”, [and you can see a list of the more well-known ones here.](https://en.wikipedia.org/wiki/List_of_file_signatures) For example, PDF files start with the bytes `25 50 44 46 2D`, which is the text `%PDF-` in ASCII. So if your program starts reading a file and the first five bytes are this sequence, it can be pretty confident that what it is reading is *probably* a PDF file. Unix-like systems such as Linux often use this kind of type identification. It’s not uncommon to see files with no extension in their name at all, and yet the computer can still tell what kind of file it is from the header information.
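A signature check like the one described above can be sketched in a few lines of Python. The table below is a tiny hand-picked sample (the PDF bytes are the ones quoted above; the PNG, JPEG, and ZIP signatures are their well-known values), and the names `SIGNATURES`, `sniff`, and `sniff_file` are made up for this example. Real tools use much larger tables.

```python
from typing import Optional

# A few well-known file signatures: leading bytes -> human-readable label.
SIGNATURES = {
    b"%PDF-": "PDF document",            # 25 50 44 46 2D
    b"\x89PNG\r\n\x1a\n": "PNG image",   # 89 50 4E 47 0D 0A 1A 0A
    b"\xff\xd8\xff": "JPEG image",       # FF D8 FF
    b"PK\x03\x04": "ZIP archive",        # 50 4B 03 04
}

def sniff(data: bytes) -> Optional[str]:
    """Return a label if the data starts with a known signature, else None."""
    for magic, label in SIGNATURES.items():
        if data.startswith(magic):
            return label
    return None

def sniff_file(path: str) -> Optional[str]:
    """Read only the first few bytes of a file and check them."""
    with open(path, "rb") as f:
        return sniff(f.read(16))

print(sniff(bytes.fromhex("255044462D") + b"1.7"))  # PDF document
print(sniff(b"just some plain text"))               # None
```

Note that the filename never enters into it: `sniff_file` gives the same answer whether the file is called `report.pdf`, `report.jpg`, or just `report`.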

If you’re specifically downloading files off the Internet through a browser or sending attachments over email, there are also MIME types. These are essentially the server outright telling the downloader what kind of file it is, e.g. “I’m going to send you a PNG image file (`image/png`), please treat it like one.”
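Over HTTP and in email, that announcement travels in a `Content-Type` header. As a rough sketch of the receiving side, Python’s standard `email.message.Message` class can pull the MIME type out of such a header value. The helper name `content_type_of` and the sample header strings are made up for illustration.

```python
from email.message import Message

def content_type_of(header_value: str) -> str:
    """Extract the bare MIME type from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type()

# The sender states the type up front; extra parameters are ignored here.
print(content_type_of("image/png"))                 # image/png
print(content_type_of("text/html; charset=UTF-8"))  # text/html
```

The receiver is trusting the sender here, so a server that labels a file incorrectly can still mislead the program on the other end.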

If all of these fail, the computer basically gives up and says “lmao I don’t know, it’s a binary file ¯\\\_(ツ)\_/¯”.
