We had a slew of tweets triggered by Gary Illyes of Google and then followed up by John Mueller of Google around robots.txt and XML sitemap files ranking in Google. In short, if they rank for normal queries, John Mueller said "that's usually a sign that your site is really bad off and should be improved instead."
Let's start with Gary's tweet:
Triggered by an internal question: robots.txt from indexing point of view is just a url whose content can be indexed. It can become canonical or it can be deduped, just like any other URL.
— Gary "鯨理/경리" Illyes (@methode) November 6, 2019
It only has special meaning for crawling, but there its index status doesn't matter at all. pic.twitter.com/bBMXy1XcRF
What he is saying is that a robots.txt file is a URL like any other, so it can be indexed and ranked in Google.
John then adds that you can block these from being indexed using the x-robots-tag HTTP header.
Use the x-robots-tag HTTP header to block indexing of the robots.txt or sitemaps files. Also, if your robots.txt or sitemap file is ranking for normal queries (not site:), that's usually a sign that your site is really bad off and should be improved instead. https://t.co/DpWz6sYanN
— 🍌 John 🍌 (@JohnMu) November 7, 2019
But if you do see your robots.txt file or your sitemap file ranking, he said "if your robots.txt or sitemap file is ranking for normal queries (not site:), that's usually a sign that your site is really bad off and should be improved instead."
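John doesn't spell out how to set that header, but since x-robots-tag is just an HTTP response header, it can be attached at the server or application layer. Here is a minimal sketch as WSGI middleware; the path list and function names are my own illustration, not anything from the tweets:

```python
# Sketch: attach "X-Robots-Tag: noindex" to robots.txt and sitemap
# responses so Google can still crawl them but won't index their content.
# (Paths and names here are illustrative assumptions.)
NOINDEX_PATHS = {"/robots.txt", "/sitemap.xml"}

def add_noindex_header(app):
    """Wrap a WSGI app, adding the header only for the listed paths."""
    def middleware(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            if environ.get("PATH_INFO") in NOINDEX_PATHS:
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return middleware
```

The same effect is usually achieved with a one-line rule in the web server config instead of application code; the point is only that the header rides on the response, so the file itself never changes.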
You can also use a disallow directive, John added:
Tip: "disallow: /" also includes /robots.txt.
— 🍌 John 🍌 (@JohnMu) November 7, 2019
Maybe I am misunderstanding, but didn't John say back in 2018 that disallow doesn't work here? Here is his tweet from back then:
It doesn't affect how we process the robots.txt, we'll still process it normally. However, if someone's linking to your robots.txt file and it would otherwise be indexed, we wouldn't be able to index its content & show it in search (for most sites, that's not interesting anyway)
— 🍌 John 🍌 (@JohnMu) June 29, 2018
I guess when you disallow in robots.txt it is too late anyway.
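John's tip that "disallow: /" also includes /robots.txt matches how standard robots.txt matchers evaluate the rule: the path pattern "/" is a prefix of every URL on the site, /robots.txt included. A quick check with Python's built-in urllib.robotparser (my own illustration, not Google's parser):

```python
from urllib.robotparser import RobotFileParser

# "Disallow: /" matches every path on the site, /robots.txt itself
# included -- even though, as John notes, crawlers still have to fetch
# and process robots.txt in order to read the rules in the first place.
rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

print(rp.can_fetch("Googlebot", "https://example.com/robots.txt"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/"))            # False
```

That squares both tweets: the directive covers /robots.txt as a matter of rule matching, but it can't stop Google from processing the file, only from indexing its content.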
John said there is really no reason to let Google index your sitemap file; Google processes it differently:
No. A sitemap file is usually just meant for direct usage by programs, it doesn't need to be indexed.
— 🍌 John 🍌 (@JohnMu) November 7, 2019
Anyway, I thought you'd find these tweets, compiled in one post, useful.
Forum discussion at Twitter.