Optimizing PDF Documents for Search Engines

From my experience, PDFs are one of the most widely used, yet most under-optimized, elements on a huge portion of websites. All too often, I see websites using PDFs in situations that would be much better served by taking the extra time to create a new web page. The situation is exacerbated by the fact that these documents are rarely, if ever, optimized.

My first recommendation is almost always “don’t use them”. Take the time to create web pages and a stylesheet for printing instead. If you absolutely must have them, there are a few things to consider:

PDFs are crawled and indexed in much the same way as a regular web page, and many of the elements are optimized in exactly the same way.

Basic Text Crawl

The biggest blunder you can make is outlining the fonts in the document, that is, converting the text into vector shapes. While this is pretty rare these days, it still happens when someone wants to retain a specific graphical feel. These documents are basically blank as far as search engines go, and all the value of the content is lost (not to mention that you will annoy users with the larger file size and the inability to copy and paste information from the document).

File Name

Just like a regular web page, the actual file name of a PDF should be human readable and should tell you what the file is generally about. Having a file name like “pdf-optimization.pdf” will not only help you rank for the words in the name, “pdf optimization”, but is also friendlier to the end user (and to your own sanity when you end up looking for this file a month later). If the document starts ranking in the SERPs, a user-friendly file name also has a distinct advantage when it comes to click-through rates.
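If you generate a lot of PDFs, it can help to derive the file name from the document title automatically. Here is a minimal sketch of that idea in Python. The function name `pdf_file_name` and the fallback name `document` are my own inventions for illustration, not part of any tool.

```python
import re
import unicodedata


def pdf_file_name(title):
    """Turn a document title into a lowercase, hyphen-separated PDF file name."""
    # Strip accents so the result stays plain ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    # Hypothetical fallback for titles that contain no usable characters.
    return f"{slug or 'document'}.pdf"


print(pdf_file_name("PDF Optimization"))  # pdf-optimization.pdf
```

Hyphens are used as the word separator here because that is the convention in the example file name above.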

Title and Description

PDFs have metadata fields available for title, description, keywords, author, etc. Most of these are a waste of time (including keywords), but the title and description are extremely important in two ways. First, they are most definitely used in the ranking algorithm for most search engines. Second, and arguably more important, together they make up the document’s listing in a search engine results page. This is, effectively, your ad to a potential visitor. It needs to contain target keywords, give a good gist of what the document is about, and entice the potential visitor to click, all in a very small amount of space.
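To make the “very small amount of space” concrete, here is a rough sketch that prepares the title and description before they get written into the file. A few assumptions to flag: the character limits are placeholders I picked, since the real cut-off varies by search engine; the function name `build_pdf_metadata` is hypothetical; and I am assuming the description maps to the `/Subject` entry of the PDF document information dictionary, alongside `/Title` and `/Author`.

```python
# Hypothetical limits: search engines truncate long listings, but the exact
# cut-off varies, so treat these numbers as adjustable assumptions.
MAX_TITLE_CHARS = 60
MAX_DESCRIPTION_CHARS = 160


def build_pdf_metadata(title, description, author=""):
    """Return a PDF document-information dictionary, rejecting weak values."""
    # Normalize stray whitespace so the length checks are meaningful.
    title = " ".join(title.split())
    description = " ".join(description.split())
    if not title or not description:
        raise ValueError("title and description must both be filled in")
    if len(title) > MAX_TITLE_CHARS:
        raise ValueError(f"title is {len(title)} characters; limit is {MAX_TITLE_CHARS}")
    if len(description) > MAX_DESCRIPTION_CHARS:
        raise ValueError(
            f"description is {len(description)} characters; limit is {MAX_DESCRIPTION_CHARS}"
        )
    # Assumption: the description is stored under the /Subject key.
    metadata = {"/Title": title, "/Subject": description}
    if author:
        metadata["/Author"] = author
    return metadata


metadata = build_pdf_metadata(
    "PDF Optimization for Search Engines",
    "How to make PDF documents crawlable, well titled, and well linked.",
)
print(metadata)
```

The resulting dictionary still has to be written into the PDF by whatever tool you use. As far as I know, the pypdf library accepts a dictionary in this shape through `PdfWriter.add_metadata`, but check that against the version you have installed.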

Anchor Text

Just as with any web page, the text of the link pointing to the PDF document will have a huge effect on the terms it will rank for. If the link to your document says “download the pdf”, you are in essence telling the search engines that your document is about downloading PDFs. Another common mistake is to use the standard PDF graphic as the link anchor. If you absolutely must have the link as a graphic, remember to add alt text to the image tag.
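Weak anchor text is easy to miss on a large site, so here is a small sketch that scans a page’s HTML for links to PDFs and flags the ones with generic or empty anchor text. It uses only Python’s standard `html.parser` module. The class name `PdfLinkAuditor` and the list of “generic” phrases are my own assumptions, so adjust them to taste.

```python
from html.parser import HTMLParser

# Hypothetical list of anchor phrases that say nothing about the document.
GENERIC_ANCHORS = {"download the pdf", "download", "pdf", "click here", "here"}


class PdfLinkAuditor(HTMLParser):
    """Collect links to .pdf files whose anchor text is missing or generic."""

    def __init__(self):
        super().__init__()
        self.problems = []   # (href, anchor text) pairs that need attention
        self._href = None    # href of the PDF link currently being read
        self._parts = []     # text and alt text seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            href = attrs.get("href") or ""
            # Ignore any query string when checking the file extension.
            is_pdf = href.lower().split("?")[0].endswith(".pdf")
            self._href = href if is_pdf else None
            self._parts = []
        elif tag == "img" and self._href is not None:
            # An image link only carries anchor text through its alt attribute.
            self._parts.append(attrs.get("alt") or "")

    def handle_data(self, data):
        if self._href is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            anchor = " ".join(" ".join(self._parts).split())
            if not anchor or anchor.lower() in GENERIC_ANCHORS:
                self.problems.append((self._href, anchor))
            self._href = None


auditor = PdfLinkAuditor()
auditor.feed(
    '<a href="/docs/pdf-optimization.pdf">Download the PDF</a>'
    '<a href="/docs/pdf-optimization.pdf">PDF optimization guide</a>'
    '<a href="/docs/pdf-optimization.pdf"><img src="pdf-icon.png"></a>'
)
for href, anchor in auditor.problems:
    print(f"{href}: weak anchor text {anchor!r}")
```

In this sample, the first link is flagged for generic text and the third for an image with no alt text, while the descriptive second link passes.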

Remember how we optimized the file name above? This has the added effect that inbound links from other sites will often use the file name as the anchor text. If you have done your job correctly, these links come preset with good anchor text.

Outbound Links

A major pitfall of using PDFs that is almost never accounted for is that they are dead ends. It usually comes as a surprise that you can, and should, link out from a PDF.

First of all, if you are successful and your PDF starts ranking and getting search engine traffic, a document with no links gives no outlet for potential customers. Not only should it be clear where the document came from (branding), but you should make it as easy as possible to move back to the main website. Ideally, you should always be pushing visitors toward some goal (a sale, for example) or deeper into your website (for informational sites). A PDF is no different in this respect from any other web page.

Second, any link from your site to a PDF leaks your hard-earned link juice to a place with no outlet, and the same goes for any link from an outside site. At the bare minimum, you should make a standard template with a targeted link in the header or footer that goes to your homepage. After that, feel free to link throughout the content to related pages on your website.


The Just Search SEO blog makes an excellent point that I didn’t think of, namely that Word and PowerPoint files will also show up in the SERPs. Pretty much all of the tips above apply to these files as well.
