ELI5: Why are PDF files “madness inside”?


I made a passing comment asking how hard it would be to write a Discord bot that converts a PDF file to another file format (for our TTRPG game), and one of the players said, “Hell, because PDFs are madness inside.”

Can someone explain to me why PDFs are so weird?

Edit: a typo


12 Answers

Anonymous 0 Comments

In engineering everything is a tradeoff to achieve a stated goal.

What are the stated design goals of PDF?

1. It should be easy to send to printers.
2. It should render the same on any machine (regardless of fonts, OS, graphics adapters, locales, etc.).
3. It should stay small even for large documents (hundreds of pages).

You see how there is no goal “It should be easy to extract meaningful information from a document”?

PDF documents (and the programs that create them) are concerned only with how the page looks, not with whether the content makes sense semantically.

For example, if you have 5 paragraphs on a page, there is no guarantee that they are stored in the file in the same order they appear on the page. The only thing that matters is how it looks.
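To make the ordering problem concrete, here is a minimal Python sketch. It does not parse a real PDF: the `Fragment` type, the coordinates, and the sample text are all made up for illustration. It only assumes what PDF's default coordinate system gives you, namely that each piece of text is placed at an (x, y) position with y growing upward from the bottom of the page.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    """A hypothetical piece of positioned text, as a PDF might store it."""
    x: float  # distance from the left edge of the page, in points
    y: float  # distance from the bottom edge of the page, in points
    text: str


# Hypothetical data: fragments in the order they happen to sit in the file.
# Nothing forces this to match the order a human would read them in.
stored_order = [
    Fragment(72, 500, "Third paragraph."),
    Fragment(72, 700, "First paragraph."),
    Fragment(72, 600, "Second paragraph."),
]


def naive_file_order(fragments):
    """Join the text exactly in the order it is stored."""
    return " ".join(f.text for f in fragments)


def guess_reading_order(fragments):
    """Guess the reading order: top of page first, then left to right.

    Larger y means higher on the page, so y is sorted in descending order.
    This is only a heuristic; it breaks on multi-column layouts, tables,
    rotated text, and many other real-world cases.
    """
    ordered = sorted(fragments, key=lambda f: (-f.y, f.x))
    return " ".join(f.text for f in ordered)


print(naive_file_order(stored_order))
print(guess_reading_order(stored_order))
```

Even this toy version has to guess, and real extractors pile up many more such guesses (columns, hyphenation, headers and footers, text split into individual characters).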

For this reason, a PDF is almost as hard for a program to read as a picture. The programs that do read PDFs manage it because their authors coded hundreds and hundreds of hacks for real-world PDFs into them.
