This morning, I spotted a group of SEOs arguing about whether a URL can be "indexed" if the page is blocked by a noindex tag or a robots.txt file.
It is a valid SEO question, but when you ask it in a room of SEO geeks, the responses can get pretty wild.
You've all seen examples of URLs in the search results that show just the URL, without the page's actual title tag and snippet. That is typically because Google has a copy of the URL in its database as a reference but has not crawled or indexed the content of the page, because it is restricted from doing so for one reason or another.
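For reference, here is roughly what those two kinds of restrictions look like (the /private/ path is just a placeholder, not something from this discussion). A robots.txt block:

User-agent: *
Disallow: /private/

And a noindex meta tag in a page's HTML head:

<meta name="robots" content="noindex">

The wrinkle that fuels debates like this one: if robots.txt blocks the crawl, Google never gets to see a noindex tag on the page, yet it can still list the bare URL based on links it found elsewhere.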
The question is, is that URL considered indexed or not? That depends on the definition of "indexed" and which SEO you ask.
Let me share the tweets about this:
Yep -> RT @rustybrick: @AndyBeard but a page URL can be indexed without being crawled @seosmarty @rishil
— Ann Smarty (@seosmarty) February 13, 2012
@rustybrick @seosmarty @rishil In google terminology that is just a reference
— Andy Beard (@AndyBeard) February 13, 2012
@AndyBeard but a page URL can be indexed without being crawled @seosmarty @rishil
— Barry Schwartz (@rustybrick) February 13, 2012
@rishil indexation=appears in search results (index). crawling=going through the page itself - right? cc @AndyBeard
— Ann Smarty (@seosmarty) February 13, 2012
@seosmarty @rishil A page has to be crawled first to be indexed... a reference link != indexed
— Andy Beard (@AndyBeard) February 13, 2012
The discussion went on for dozens and dozens of tweets, with no clear winner.
Matt or John, want to chime in and give the final answer?
Forum discussion at, um, Twitter.