A couple of months ago, Google released an incredibly useful feature in the Webmaster Tools labs named Fetch as GoogleBot. It basically lets you view a page the way GoogleBot does, so you can spot crawl issues, hacks, injected links and other webmaster-related problems. But when it comes to PDFs, I don't think the tool works properly (yes, it is in labs).
A thread in the Google Webmaster Help forums has one webmaster asking why the feature doesn't work with his PDFs. He wrote:
For example, in the URL in question, http://www.knowitall.com/literature/spec/95731_Pharmaceutical_Excipients_Spec_Sheet.pdf, the text "Pharmaceutical Excipients Database" is in the pdf, but in the "Fetch as GoogleBot" results window, none of those terms are found--the results are basically in binary format. The document is found by the Google Search engine so it is apparently extracting the human readable text.
I couldn't run a test on that document, but I used a PDF on my server to compare. I ran five different tests on two different domains, with a bunch of different types of PDF documents, and every one of them came back as gibberish, binary-looking output in Fetch as GoogleBot. Here is one sample screenshot:
Now, Susan Moskwa from Google said in that thread:
FYI we're looking into this issue, so sit tight. If it looks okay in search results the problem is probably not with your site (we've been able to reproduce it for other sites as well). Thanks for letting us know.
So it seems they may get the Fetch as GoogleBot feature working for PDF documents.
Forum discussion at Google Webmaster Help.