With every day that goes by, I become a bigger fan of setting up an XML Sitemap file for Google and the other search engines to chew on. I think Sitemap files are important for sites that want to take full advantage of being indexed in Google. Of course, submitting a Sitemap file is just one small step toward ranking well in Google, and it is no guarantee of indexing: you can submit a Sitemap file and Google still may not index all your pages. We discussed this topic yesterday in My Pages Are Dropping Out of Google: What Do I Do?
What I find interesting is that a Sitemap not only helps you tell Google about your pages, it also gives Google another document to index and include in its search results. Yes, Google may index your XML Sitemap file and rank it in the search results. For example, a search on inurl:sitemap.xml returns Google's XML Sitemap towards the top of the search results for me.
That being said, hundreds of Sitemap files are indexed and show up in the search results. They typically only come up for very specific searches that are unlikely to affect the normal searcher.
Two Google Groups threads are discussing this. JohnMu of Google replied to one of them, saying:
It does look like we have some of your Sitemap files indexed. It's possible that there is a link to them somewhere (or perhaps to the Sitemaps Index file). At any rate, I wouldn't worry about this since these are generally not URLs that will come up in the search results, so apart from people like you who look at the details, nobody will really be seeing them.
If you really don't want them to show up in the Google results, you have a way out. Here is how, according to JohnMu:
If you do want to have them removed from the index, you could have your server send a "x-robots-tag" HTTP header tag with the contents of the file. Since they all appear to be originating from a single script, I imagine adding this would be fairly easy. For more information on the "x-robots-tag", please see our blog post.
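To make that a bit more concrete, here is a minimal sketch of what "having your server send the header" could look like if your Sitemap comes from a Python script. This is only an illustration, not the setup from the thread: the app name sitemap_app, the /sitemap.xml path, the example.com URL and the Sitemap contents are all made up for the example. The one real detail is the header itself, X-Robots-Tag with a value of noindex.

```python
# Hypothetical Sitemap body; a real script would build this from the site's URLs.
SITEMAP_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
</urlset>
"""


def sitemap_app(environ, start_response):
    """WSGI app that serves /sitemap.xml with an X-Robots-Tag: noindex header."""
    # Assumed path; anything else gets a plain 404 without the header.
    if environ.get("PATH_INFO") != "/sitemap.xml":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]

    headers = [
        ("Content-Type", "application/xml"),
        ("Content-Length", str(len(SITEMAP_XML))),
        # Asks search engines not to show this file in their results.
        ("X-Robots-Tag", "noindex"),
    ]
    start_response("200 OK", headers)
    return [SITEMAP_XML]
```

To try it locally, you could hand sitemap_app to the standard library's wsgiref.simple_server.make_server and request /sitemap.xml. The header only affects whether the Sitemap file itself appears in search results; the file is still served in full, so the search engines can keep reading the URLs listed in it.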
Is Google indexing your Sitemap file? Do you care?