If you're like me and other obsessive SEOs who are addicted to fixing every detail on your own or a client’s website, then Google’s recent addition of showing you where your 404 crawl errors come from is a much-applauded feature. Before, you might have ended up in the Web Crawl section of Webmaster Tools only to scratch your head and wonder how Googlebot discovered many of these broken links in the first place. Was this Google’s attempt to make us better forensic SEOs, or to torture us with a dribble of incomplete crawl information? You might even have taken it a step further and scoured the Google index and your logs trying to locate those errors. As of Monday, those headaches have ended: we can now get a source for those sneaky broken URLs and for the sites linking improperly to your site.
Since this information is a couple of days old, I thought I would explore some of the discussion on the benefits of this new tool. WebmasterWorld has a pretty healthy discussion about the feature, with examples of how people are using it to locate incoming links to valid URLs and other 404 errors.
G1smd explains, “I think people will be extremely shocked as to how many duff links they have pointing at their site and how careless the average netizen is when they cut and paste links. My pet peeve is people who post links with lots of unnecessary parameters in them, including session IDs.”
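If you want to see how many of the linking URLs in your report differ from a real page only by junk parameters, a rough Python sketch like the one below can help. The parameter names in `NOISE_PARAMS` and the example URL are my own assumptions for illustration, not anything Webmaster Tools reports; swap in whatever your own site actually emits.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session-style parameter names (an assumption; adjust for your site).
NOISE_PARAMS = {"phpsessid", "sid", "sessionid", "jsessionid"}

def strip_noise_params(url):
    """Return the URL with session-style query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NOISE_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )

# Made-up example link of the kind people paste into forums.
print(strip_noise_params("http://www.example.com/page.html?id=7&PHPSESSID=abc123"))
```

Comparing the cleaned URL against your list of valid pages tells you whether a reported 404 is a truly missing page or just a sloppy copy-and-paste of a good one.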
This tool has also helped people find pages that disappeared during a server move or site redesign. As icedowl says, “I found that I'd lost the page when I did a site rebuild back in 2005.”
The discussion then moved to how Google might be trying to “guess” at URL structure on websites, and the 404 errors generated because of this.
“WMT also shows some 404s with no linking page for www.mysite/dir/zzzz.html where zzzz is a random number. Is Google guessing at pages on my site?”
SEO Mike explains, “GBot will do that sometimes in order to determine how your server responds to random queries.” However, I can’t confirm myself whether Google is actually doing this or not. If you're learning about this new feature for the first time, I would recommend logging in, checking your Web Crawl report, and fixing those URLs!
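If you still want to cross-check the report against your own server logs, as mentioned at the top of this post, here is a minimal Python sketch. It assumes your server writes the common "combined" access log format (the Apache default layout with referrer and user agent at the end); the function name and the sample lines are made up for illustration, so treat it as a starting point rather than a finished tool.

```python
import re
from collections import defaultdict

# Assumed layout (combined log format):
# host ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def find_404_sources(lines):
    """Map each path that returned 404 to the set of referrers seen for it."""
    sources = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None or match.group("status") != "404":
            continue  # skip unparseable lines and non-404 responses
        referer = match.group("referer")
        # A "-" referer means the client did not send one.
        sources[match.group("path")].add(
            referer if referer != "-" else "(no referrer)"
        )
    return dict(sources)

# Made-up sample lines; in practice you would pass an open log file instead.
sample = [
    '66.249.66.1 - - [01/Jan/2000:10:00:00 +0000] "GET /old-page.html HTTP/1.1" 404 512 "http://example.org/links.html" "Googlebot/2.1"',
    '203.0.113.5 - - [01/Jan/2000:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [01/Jan/2000:10:02:00 +0000] "GET /dir/1234.html HTTP/1.1" 404 512 "-" "Googlebot/2.1"',
]
for path, referers in sorted(find_404_sources(sample).items()):
    print(path, sorted(referers))
```

Requests that show up with no referrer, like the `/dir/1234.html` line in the sample, are the ones to compare against the "no linking page" 404s discussed above.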
Discussion continued on WebmasterWorld