Many major search engines now have the capability to index PDF files
created by Adobe Acrobat and return them in search results. If you are a
Web site owner with PDF files on your site, this is good news.
What you may not know is that this capability presents potential
usability problems, especially for searchers. What is the big deal?
Let's find out.
Searching for "blessing of a Christmas tree" on Google returns a link to
a PDF file in the results (www.usccb.org/publishing/advent2003/XMASTREE.PDF).
If searchers click on this listing, the link automatically opens a PDF
file with no navigation to the main site. Users are trapped! They have
no way to explore other pages of the site for more information.
So, what's going on and, more importantly, how do we fix it?
Essentially, the PDF format is not the culprit; the real problem is the
author's failure to create the files with Web users in mind. This is not
unusual, since PDF files are often documents created for other media and
not specifically for the Web.
PDF authoring software, such as Adobe Acrobat 5.0, offers the ability to
include both a navigational structure and hyperlinks on a PDF page. This
will allow users who land on this page from a search engine to continue
to navigate the site. Whenever possible, use the built-in capability of
the software to add navigational elements before publishing the
document.
Ideally, create your pages in HTML rather than PDF format. If the
information contained in the PDF files is very popular or highly
requested, consider making the effort to convert them to HTML for best
results.
Depending on the purpose of the document, the PDF format can be
preferable. For example, PDF files offer better functionality for pages
that are highly structured and commonly printed, such as application
forms and price lists.
To fix the PDF user trap, you will have to republish your files, adding
some type of navigation structure and/or a link back to your main Web
site. An easy way to accomplish this is to add a footer to the bottom of
each page that includes a link back to the home page.
Often the pages published as .pdf files were never intended to receive
traffic from the search engines in the first place. If you have .pdf
files on your site that you do not want search engines to index, the
best solution is to place all of those PDF files in a single folder and
exclude that folder with a robots.txt file, following the robots
exclusion standard (http://www.robotstxt.org/).
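As a rough illustration of the robots exclusion approach, the sketch
below holds a two-line robots.txt in a string and uses Python's standard
urllib.robotparser module to check which paths it blocks. The folder
name /pdfs/, the file names, and the crawler name ExampleBot are
assumptions made up for this example; substitute whatever your site
actually uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The /pdfs/ folder name is an
# assumption; use the folder where your own PDF files live.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdfs/
"""


def is_crawlable(path, robots_txt=ROBOTS_TXT, agent="ExampleBot"):
    """Return True if a crawler honoring robots_txt may fetch path."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access
    # is needed to test the rules.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths, used only to exercise the rules above.
    for path in ("/pdfs/price-list.pdf", "/index.html"):
        print(path, "crawlable:", is_crawlable(path))
```

The two lines in ROBOTS_TXT are what would go in a robots.txt file at
the root of the site. Keep in mind that the file is only a request:
well-behaved crawlers honor it, but it does not password-protect or
otherwise hide the files.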
Don't overlook the potential traffic from your .pdf files. Take a few
extra steps to help users continue on to your site and you may be
surprised and pleased by the results.
Craig Geis is the search engine specialist for The Karcher Group
(http://www.thekarchergroup.com), a full-service web design and
marketing company based in Canton, OH.