[ELI5] Why does Notepad display a wall of garbled text when opening a non-.txt file?


Every so often, if I open a non-text-based document in either Microsoft Word or Notepad, it opens a massive file with an endless wall of completely garbled, gibberish text, with most of the characters being either rectangular boxes or characters that can’t normally be typed. What does each of these characters represent? What happens if I insert or delete these characters?

Usually programs refuse to open files in an incompatible format. How do these text-processing programs somehow manage to open virtually any file?


13 Answers

Anonymous 0 Comments

Most programs that refuse to open incompatible formats have sanity checks and similar safeguards built in. They expect to find a specific set of data (1s and 0s) at a certain point in the file or in its metadata (the data about the data, such as when it was made, its name, and its file extension). If the program doesn't find that data, it assumes the file is either A) incompatible or B) potentially damaged. Or C) it tries to read the file anyway, but what the program then does with the data causes the program itself to fail and crash.
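To make the "sanity check" idea concrete, here is a minimal Python sketch of how a program might look at the first few bytes of a file before agreeing to open it. The four signatures are the well-known ones for these formats, but the table and the function name `sniff_format` are made up for illustration; real programs check far more than this.

```python
# Well-known leading byte signatures ("magic numbers") for a few formats.
# The table is illustrative and far from complete.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF-": "PDF document",
    b"PK\x03\x04": "ZIP archive",
    b"MZ": "Windows executable",
}

def sniff_format(data: bytes):
    """Return a format name if data starts with a known signature, else None."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None

print(sniff_format(b"%PDF-1.7\n"))    # PDF document
print(sniff_format(b"hello world"))   # None
```

A picky program would stop and show an error when the result is `None`. Notepad has no step like this, so it never refuses.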

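Notepad has far fewer checks built in, and it does one thing that (unless you specifically try to exploit bugs etc.) is almost impossible to “fail”: it takes whatever data is in the file and displays it as text, usually using UTF-8 (or ANSI/ASCII) to decode it. In ASCII and ANSI each byte (a set of eight 1s and 0s) represents one character; in UTF-8 a character takes between one and four bytes. Byte values that don't correspond to a printable character are what show up as boxes or odd symbols.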
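A small Python sketch of that "just decode whatever is there" behaviour. It takes the 8-byte signature that every PNG file starts with and decodes it two ways. This only mimics the idea; it is not how Notepad is actually implemented, and `cp1252` is used here as a stand-in for "ANSI" on a Western-European Windows setup.

```python
# The 8 bytes that every PNG file starts with.
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# errors="replace" swaps undecodable bytes for U+FFFD (the "unknown" symbol)
# instead of raising an error, so decoding never fails.
as_utf8 = raw.decode("utf-8", errors="replace")
as_cp1252 = raw.decode("cp1252", errors="replace")

# ascii() escapes non-ASCII characters so the output is safe on any console.
print(ascii(as_utf8))    # '\ufffdPNG\r\n\x1a\n'
print(ascii(as_cp1252))  # '\u2030PNG\r\n\x1a\n'

# One character is not always one byte: e-acute is two bytes in UTF-8,
# and reading those two bytes as cp1252 gives two wrong characters.
e_acute = "\u00e9".encode("utf-8")
print(len(e_acute))                     # 2
print(ascii(e_acute.decode("cp1252")))  # '\xc3\xa9'
```

The same first byte (0x89) becomes a replacement symbol under UTF-8 and a per-mille sign under cp1252, which is why the gibberish can look different from one program to another.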

Notepad gets something “readable” out of it, even if it's garbled and not in the right format, because at its core every file is the same: a collection of 0s and 1s. Notepad just takes this data, assumes it's all text, and parses it accordingly.

You can see this pretty easily if you open almost any (if not every) modern executable (.exe file) in Notepad. The first 80 or so symbols are pure gibberish: they are a header plus a tiny stub program for DOS, stored as raw binary data rather than text. HOWEVER, right after that you have the line “This program cannot be run in DOS mode”. It's clear text because it was meant to be displayed as clear text if you try to execute the file under DOS.
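You can reproduce the "gibberish, then one readable line" effect without Notepad. The sketch below pulls runs of printable ASCII out of a blob of bytes, roughly what the Unix `strings` tool does. The `sample` bytes are hand-made to resemble the start of an .exe (a shortened header, a typical DOS stub, then the message); they are not copied from a real file, and `printable_runs` is a made-up helper name.

```python
import re

def printable_runs(data: bytes, min_len: int = 8):
    """Return runs of at least min_len printable ASCII bytes, decoded to str."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Hand-made sample, NOT a real executable.
sample = (
    b"MZ\x90\x00\x03\x00\x00\x00\x04\x00\xff\xff"                 # header bytes
    b"\x0e\x1f\xba\x0e\x00\xb4\x09\xcd\x21\xb8\x01\x4c\xcd\x21"  # DOS stub code
    b"This program cannot be run in DOS mode.\r\r\n$"            # the message
    b"\x00\x00\x00"
)

print(printable_runs(sample))
# ['!This program cannot be run in DOS mode.']
```

The stray `!` in front is the last byte of the stub code (0x21), which just happens to be a printable character. That is the whole trick: binary data looks like text wherever its byte values happen to land on letters and symbols.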

Why don't you see more “clear text”? Because most of the actual text of the program usually isn't located in the executable but in other files. Taking Chrome as an example, you can find a good chunk of clear text between stretches of binary data (gibberish) in the locales file for your language.
