PDFs on the Web (Part 1): Problems

A crying mask with a red-on-white PDF logo overlaps a laughing mask with a white-on-red PDF logo.

It’s happened to all of us. You’re on a new restaurant’s website. You click on “Our Menu”…or you’re on a local nonprofit’s website. “I wonder what’s in their ‘2010 Annual Report’…”

*Click* And… your… computer…… loc… ks………… up. It’s a PDF on the web!

PDFs (“Portable Document Format” for the acronymically-inclined) are all over the web. They’re created using programs intended for word processing, desktop publishing, and scanning, but they find their way onto the web by the thousands.

There are appropriate times to put PDFs online, but there are a lot of issues to consider before doing so. In this first part of a two-part series, I’ll examine the problems that arise when putting PDFs online. In part two, I’ll review appropriate uses of PDFs on websites and best practices for creating and distributing them.

Why People Put PDFs on Their Websites

I’ve worked with small nonprofits. I’ve worked with consultants. I’ve even worked on a 6,000+ page website for a small college. No matter the client, PDFs creep into websites. Let’s think about reasons why people post PDFs:

  • “This looks great!” Example: Event invitation
  • It’s time-consuming to format certain content on the web. Example: Restaurant menu
  • It’s technically challenging to make a web version. Example: Map or Fillable Form
  • “Someone already wrote this; why write a second version?” Example: Brochure
  • It’s faster to put up a PDF than copy and paste the content into separate pages. Example: Newsletter

PDF Problems

All these reasons are well-intentioned and understandable. But with each PDF comes a slew of risks and problems.

Slow to Load

PDFs are often slow to load for two reasons:

  1. Large file size
  2. Opening a new program

The file size usually results from a combination of document length, number of images, and saving at print quality. On top of file size, a standard PDF link usually opens in a new program by default or in a sluggish browser plugin.

“Power users” can work around these issues with program configurations and powerful hardware, but people using older computers or slower internet connections cannot. The same problems affect another set of devices with slower processors and internet connections: mobile phones.

Any interruption in a user’s experience on your website is a reason for them to leave, so avoiding long pauses should be a priority for any website owner.
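
To put rough numbers on the file size problem, here is a minimal sketch in Python. The file sizes and connection speeds are hypothetical figures chosen only for illustration, and the estimate ignores latency and the time needed to launch a PDF viewer, so real waits will be longer.

```python
def download_seconds(size_mb: float, speed_mbps: float) -> float:
    """Estimate transfer time in seconds.

    size_mb is in megabytes and speed_mbps is in megabits per second,
    so the size is multiplied by 8 to convert bytes to bits.
    """
    if speed_mbps <= 0:
        raise ValueError("speed_mbps must be positive")
    return size_mb * 8 / speed_mbps


# Hypothetical numbers, chosen only for illustration.
FILES_MB = {"print-quality PDF": 12.0, "equivalent web page": 0.5}
CONNECTIONS_MBPS = {"slow mobile": 1.5, "home broadband": 25.0}

for file_label, size_mb in FILES_MB.items():
    for conn_label, speed in CONNECTIONS_MBPS.items():
        seconds = download_seconds(size_mb, speed)
        print(f"{file_label} over {conn_label}: {seconds:.1f} s")
```

Under these assumptions, the PDF takes about a minute on the slow connection, while the web page takes under three seconds.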

Bad for Accessibility & Search

I’m combining two problems that stem from the same technical issue. Unless PDFs are formatted and created by someone who knows what they’re doing, they can be unreadable by people relying on screen readers and by search engines like Google, Yahoo!, and Bing. Some PDFs made from scans don’t even technically contain text (just an image of text)!

To state the obvious, any technical issue that makes your content impossible to read for a segment of your audience (not to mention Google) should be avoided at all costs.

Low Granularity

“Granularity” is a useful concept when thinking about information on the web. In this context, “granularity” means the size of content chunks (“granules”) that can be individually accessed.

For example, many nonprofit organizations post their newsletters in PDFs on their websites. The lowest level of granularity available in this format is the issue. To read any article, you have to open an entire newsletter. A better level of granularity for a newsletter on the web is by article.

There are many advantages to increasing granularity:

  • Let your users only read the content they want.
  • Let your users share individual pieces of content on social media sites such as Facebook and Twitter.
  • Promote only the best pieces of content on your front page or newsletter.
  • Improve the ability of search engines to show relevant content. (A whole newsletter contains so many words that it’s hard for search engines to work out what’s most important in it. In a single article with fewer words, search engines have better luck. That translates to more relevant search hits.)

With a low level of granularity, the bad goes in with all the good. To publish the exciting fresh content in a PDF, you have to include the boring and outdated stuff too.

Not Written for the Web

Among the reasons for posting PDFs, I mentioned the example of a brochure. This is a prime example of content posted on the web but intended for print. In the “real world,” brochures are held, folded, unfolded, refolded, passed around, and sometimes even cut up.

When a brochure is posted as a PDF, the panels appear out of order (see below) and contain language intended for someone holding the brochure in their hands rather than reading on a computer screen. Good web writing requires shorter paragraphs, different page and sentence structures, and strong keyword use, considerations that differ from those for a brochure.

The 6 panels of a brochure, ordered from left to right: 2, 6, 1, 3, 4, 5.
A common two-page layout for printing a double-sided, two-fold brochure. Note that when viewed as two pages rather than six panels, the order is wrong.
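
The ordering problem in the figure can be sketched in a few lines of Python. The panel numbers come from the caption above. The split into two pages of three panels each is an assumption based on the two-page layout, and the function names are hypothetical.

```python
# Panels as they sit on the two printed pages, left to right.
# Assumed split: the first page holds panels 2, 6, 1; the second holds 3, 4, 5.
PRINT_LAYOUT = [[2, 6, 1], [3, 4, 5]]


def screen_order(layout: list[list[int]]) -> list[int]:
    """Order a reader meets the panels when scrolling the PDF page by page."""
    return [panel for page in layout for panel in page]


def intended_order(layout: list[list[int]]) -> list[int]:
    """Order the panels are meant to be read once the brochure is folded."""
    return sorted(screen_order(layout))


on_screen = screen_order(PRINT_LAYOUT)
print("On screen:", on_screen)
print("Intended: ", intended_order(PRINT_LAYOUT))
print("Matches reading order:", on_screen == intended_order(PRINT_LAYOUT))
```

A web version of the same content would present the panels in the intended order, which the print layout cannot do on a screen.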

A Little Time Gained. A Lot Lost.

The reasons people put PDFs online boil down to a lack of time or capacity (here, “capacity” means the time required to post non-PDF content or the money to pay someone with technical expertise). I can’t help but think that this is often short-sighted. The time gained by putting up a PDF rather than a web page could be as little as 20 minutes or as much as a couple of hours. But the time gained by the page author is time lost by the reader:

  • Each reader loses time opening the file.
  • Once open, the reader sometimes has to sift through irrelevant content.
  • Some visitors may decide to not even open the file or may leave sooner than if the content were posted in a more web-friendly way.
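
As a back-of-the-envelope check on that trade-off, the sketch below computes how many readers it takes before their combined lost time exceeds the author’s saved time. The 20-minute and two-hour figures come from the text above. The 30 seconds lost per reader is an assumption for illustration only, not a measurement.

```python
import math


def break_even_readers(author_minutes_saved: float, reader_seconds_lost: float) -> int:
    """Number of readers whose combined lost time equals the author's saved time."""
    if reader_seconds_lost <= 0:
        raise ValueError("reader_seconds_lost must be positive")
    return math.ceil(author_minutes_saved * 60 / reader_seconds_lost)


# Assumed: each reader loses 30 seconds opening and sifting through the PDF.
SECONDS_LOST_PER_READER = 30

for minutes_saved in (20, 120):
    readers = break_even_readers(minutes_saved, SECONDS_LOST_PER_READER)
    print(f"{minutes_saved} minutes saved is cancelled out after {readers} readers")
```

With that assumption, a 20-minute shortcut is cancelled out after 40 readers, and a two-hour shortcut after 240.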

Looking Forward

In Part 2 of this series, I’ll review ways to responsibly post PDFs online and how to get back that reader’s lost time.

Talk Back

Share your own reasons for posting PDFs online or other problems PDFs have caused you.
