Bill Slawski's post, Google Omits Needless Words (On Your Pages?), at SEO By The Sea covers Google papers, patents and patent applications that discuss how Google may handle boilerplate content.
Boilerplate content on a web page is content that is repeated on every page of the site. It may include legal text, copyright notices, terms and conditions, and so on. Bill runs through some scientific documents by Google and comes up with some observations on how Google may treat boilerplate content. They include:
- Most important, search engines might ignore boilerplate content
- If the above is true, the location and number of times that text is used on your site may be a critical detail
- Google may still look at the anchor text within your boilerplate content
- We do not know exactly how Google treats boilerplate content now, or whether that may change in the future
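To make "content that is repeated on every page" a bit more concrete, here is a minimal Python sketch that flags text blocks appearing on all pages of a site. It is only an illustration, not how Google does it: the function name `find_repeated_blocks`, the `min_share` parameter and the sample pages are all made up for this example.

```python
from collections import Counter

def find_repeated_blocks(pages, min_share=1.0):
    """Return text blocks that appear on at least min_share of the pages.

    pages: dict mapping a URL to a list of text blocks (hypothetical input).
    """
    counts = Counter()
    for blocks in pages.values():
        # Count each block once per page, ignoring case and extra whitespace.
        counts.update({" ".join(b.lower().split()) for b in blocks})
    threshold = min_share * len(pages)
    return {block for block, n in counts.items() if n >= threshold}

# Made-up sample site: only the copyright line is on every page.
pages = {
    "/": ["Welcome to our shop", "Copyright 2008 Example Inc."],
    "/about": ["About our team", "Copyright 2008 Example Inc."],
    "/contact": ["Get in touch", "Copyright 2008 Example Inc."],
}
boilerplate = find_repeated_blocks(pages)
print(boilerplate)
```

Lowering `min_share` (say, to 0.8) would also catch blocks that show up on most pages rather than all of them, which is closer to how sidebars and footers behave on real sites.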
Bill's post, like most of his posts, makes it into the forums for discussion. We have threads at Sphinn and Cre8asite Forum.
IAMLost in the Cre8asite Forum thread expands Bill's thoughts a bit by adding some more context from the patent Bill explored:
identification and classification of repeat non-content across pages (boilerplate):
* certain words, especially if attached to links, e.g. home, about us.
* certain spatial areas, especially if including links, e.g. blogroll, nav links, but even if few/no links, e.g. header, footer.
* certain markup, e.g. javascript, but possibly also CSS id/class names such as header, footer, nav.
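The three signals in that list can be sketched in a few lines of Python using the standard library's `html.parser`. This is a rough illustration under stated assumptions, not the method from the patent: the word list `NAV_WORDS`, the region names in `REGION_HINTS`, and the class `BoilerplateSplitter` are hypothetical and chosen only for the example.

```python
from html.parser import HTMLParser

# Hypothetical hint lists, chosen only for illustration.
NAV_WORDS = {"home", "about us", "contact"}
REGION_HINTS = {"header", "footer", "nav", "blogroll", "sidebar"}
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

class BoilerplateSplitter(HTMLParser):
    """Split page text into likely content and likely boilerplate."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one (tag, is_boilerplate) entry per open element
        self.content = []
        self.boilerplate = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:   # void elements never hold text
            return
        attrs = dict(attrs)
        names = set((attrs.get("id") or "").lower().split())
        names |= set((attrs.get("class") or "").lower().split())
        flagged = (
            tag in REGION_HINTS                      # e.g. <header>, <nav>
            or tag in {"script", "style"}            # markup that is not content
            or bool(names & REGION_HINTS)            # id/class such as "footer"
            or (self.stack and self.stack[-1][1])    # inherit from the parent
        )
        self.stack.append((tag, bool(flagged)))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = " ".join(data.split())
        if not text:
            return
        in_region = bool(self.stack) and self.stack[-1][1]
        in_link = any(t == "a" for t, _ in self.stack)
        # Boilerplate: text in a flagged region, or a link whose anchor
        # text is a common navigation word.
        if in_region or (in_link and text.lower() in NAV_WORDS):
            self.boilerplate.append(text)
        else:
            self.content.append(text)

html = """
<div id="header"><a href="/">Home</a> <a href="/about">About Us</a></div>
<div class="post"><p>Bill reviews patents on boilerplate.
<a href="/more">Read the patent</a></p><p><a href="/">Home</a></p></div>
<div class="footer">Copyright notice</div>
<script>var x = 1;</script>
"""
splitter = BoilerplateSplitter()
splitter.feed(html)
splitter.close()
print(splitter.content)
print(splitter.boilerplate)
```

In this sample, the "Home" link inside the post body is flagged because of its anchor text alone, while "Read the patent" is kept as content. A real classifier would need far more than fixed word lists, so treat the hint sets as placeholders.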
Forum discussion at Sphinn and Cre8asite Forum.