The Trouble with PDFs

Adobe's Portable Document Format is so advanced it makes you wonder why anyone bothers with primitive HTML. It's a completely vector-based layout format, both display and resolution independent. With PDF, you sacrifice almost nothing compared to traditional book and magazine layouts except the obvious limitation of resolution. Here's Kevin Kelly extolling the virtues of PDFs:


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2008/01/the-trouble-with-pdfs.html

PDFs make sense in some circumstances.

PDF is a packaging format. HTML is not. That’s the reason people use PDF, not because of the better layout.

Really? Then why is this “packaged” as a PDF?

http://www.si.umich.edu/~pne/PDF/howtoread.pdf

How does this “packaging” help me, the reader?

Not hating or anything, but today is the 3rd and it says January 1st.
Anyway Jeff, I have to agree with you: PDFs are great, but unnecessary. There’s no reason why a competent developer couldn’t just build the HTML properly.

I mainly use PDF for read-only email attachments.

On websites it is useful to be able to save a PDF locally for viewing later (which is harder for web pages), but yet again I generally agree with this blog - good work.

My main problem with PDF files is their forced page size. HTML/CSS is very good at resizing content to the available window size for good screen readability, whereas PDF’s almost perfect replication of paper layout invariably results in either viewing complete pages at a smaller than comfortable zoom level, or scrolling awkwardly in two dimensions to read the content.

What about

http://www.kk.org/cooltools/archives/002538.php

as an example for packaging?

i do not think the problem is the pdf itself - as stephan points out its self-contained structure is a huge plus in my eyes. the problem is the poor integration of pdfs in current browsers

you can watch movies, etc directly in your browser window. why not do this for pdfs too?

the fixed layout of a pdf IS the huge plus compared to html! have you ever assembled a report in an html table (let’s say some statistics) and printed it out from different browsers to different printers? no chance. line breaks are different, page breaks are different – the whole structure of the document is lost. using pdfs is the only solution you have.

– as said before: the pdf itself is not the problem. on the contrary, only the poor integration is the problem

One of the main reasons I disliked PDF used to be what you mentioned: on “PDF pages”, Ctrl+W didn’t have the same functionality as in normal pages.

The absolute best use of PDF files is, in my opinion, cross platform compatibility. I used OpenOffice on Ubuntu in college and my professor used Microsoft Word.

Needless to say, despite saving files in the “compatibility” mode, they would not render correctly when opened with MS Word. Tables were screwed up and the formatting was weird.

PDF files were the only way to go. That way I could be sure that the professor was seeing the exact same thing as I was.

hi,

i do not want to start a flamewar here … but i think the problem you have with pdfs is a pure windows problem. have you ever looked at the neat integration in OS X and also linux desktops nowadays? there a pdf is handled just as well as any html page or pure text file. with full text search etc etc

-john

you will always find examples like the one jeff linked - there it absolutely makes no sense to use a pdf as a normal html file does the job too. but i am sure there are also examples out there where a pdf would have been the right choice …

PDFs are good for reference, and because they are contained within a single file. They are good for manuals, ebooks, articles people may refer to for medical research, etc.
They are easy to send around, especially since they display pretty much identically on all platforms.

The inconvenience of PDFs is justified when the content is static and someone may want to refer back to it often, without having to worry about a web server vanishing off the internet.

I’d think of it more as a book than a web-page.

@john - I’m not sure what you’re getting at. Adobe’s Windows PDF reader also has “full text search etc etc”.

The problem I have with PDFs is that they are restricted as if they were pages from a book. HTML isn’t like that; HTML pages accept that the web is not a book and that things work differently. As others have mentioned, it is tedious to have to navigate a PDF by moving left and right as well as up and down.

I often browse the web on my Wii or my mobile. HTML is by and large resizable (some sites better than others, some browsers better than others, but fundamentally things tend to work well). PDF simply isn’t. It isn’t designed to be. This is for good reason: the intent was to produce something that looks the same everywhere, which is fine for many applications but not for the general provision of information…

How do you embed a font in an HTML page? You can’t. Default browser fonts suck. Times, Arial, even Georgia all look terrible at print resolution. And more specialized fonts for mathematical typesetting are usually not even available. Never mind that HTML rendering doesn’t do kerning or properly adjusted lines.

How do you put footnotes and sidebars on HTML pages so that they stay in the visual vicinity of the text they are referring to? You can’t, not without fixing the layout in such a way that you might as well use PDF in the first place.

I’ve encountered an employer’s website where they hyperlink between PDFs. That’s normally fine, except they insist on using relative links, which means it’ll only work if Adobe Reader is configured to run as a browser plugin. As soon as I reset things to my preferred mode of working (PDFs open outside of the browser), all the links broke. The employer’s response was not to fix the problem, but to force me into the ‘one work mode suits all’ model. Bah!

PDF is great for read-only documents, that for whatever reason may not necessarily be resident on a web accessible location.

Especially if they’re going to be printed out. The HTML printing story is still quite commonly dreadful, I’m afraid.

I also have a sneaking suspicion the typical information worker who isn’t in a programming related field finds the concept of sending one PDF file to someone via email easier, if the source is something like a Word document or something that can be printed to the PDF printer.

The Adobe PDF tools on Windows are uniformly crap, though. My biggest annoyance is the entire browser instance locking up while loading the PDF. Hello, 1998 called: they want you to stop keeping the UI thread busy and preventing me from doing anything else with the browser while it loads, and to stop randomly ignoring paint requests for whatever reason.

OS X does this so much better. (Evince on Linux still has some rendering speed “issues”, shall we say).

But let’s face it: to produce HTML, you need to be an advanced techie user. You can’t just create nice-looking HTML (which is viewable in Safari, Firefox, IE, …) with no knowledge. There are very few applications around that help with producing HTML. With PDF, on the other hand, you can design and write your content with basically ANY application out there (Word, PowerPoint, Pages, OpenOffice, …), and the output does not need to be “optimized” for one browser or the other.

I agree CSS and stuff is getting more standardized across all browsers, but there are still plenty of tweaks needed (e.g. setting IE to strict, otherwise it will fall back to its poor old rendering mode). Embedding fonts (fonts are the other 50% of a good design!) is also not that easy with HTML, and neither is hyphenation.

So to speak, the good old Bible that Gutenberg printed a long, long time ago offers more sophistication than HTML, but could easily be transferred to PDF.

just my 2ct :)