Google has added a help document that shows how HTTP status codes, network errors, and DNS errors affect Google Search. The page goes through each type of code and error and tells you how Googlebot and Google Search are impacted by it.
You can find this help document over here. It says "This page describes how different HTTP status codes, network errors, and DNS errors affect Google Search. We cover the top 20 status codes that Googlebot encountered on the web, and the most prominent network and DNS errors. More exotic status codes, such as 418 (I'm a teapot), aren't covered. All issues mentioned on this page generate a corresponding error or warning in Search Console's Crawl Stats report."
I am not going to copy and paste all the details, but I took a screenshot of the page so I can tell if it changes in the future. Some items that stand out to me include:
- Googlebot follows up to 10 redirect hops; after 10, Google Search Console will show a redirect error.
- A 301 redirect is treated as a "strong signal" that the redirect target should be canonical, while a 302 is only a "weak signal."
- Don't use 401 and 403 status codes for limiting the crawl rate. The 4xx status codes, except 429, have no effect on crawl rate.
- And you have to love the teapot 🫖 418 code
Those are just a few of the highlights, but the whole document is wonderful to have from Google. It tells you what to expect from Google Search for HTTP status codes, DNS errors, and typical network errors.
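To make the redirect and crawl-rate points from the list above concrete, here is a minimal Python sketch. It is not how Googlebot is implemented; it only models the rules as summarized in this post. The function names, the `responses` map (URL to status and redirect target), and the example.com URLs are all hypothetical, and only 301 and 302 are modeled as redirects.

```python
MAX_HOPS = 10  # hop limit as summarized in the list above

# Signal strength per redirect type, as summarized in the list above.
REDIRECT_SIGNAL = {301: "strong", 302: "weak"}


def follow_redirects(start_url, responses, max_hops=MAX_HOPS):
    """Follow a redirect chain through a hypothetical url -> (status, location) map.

    Returns (last_url, hops, error). error is "redirect error" when the
    chain needs more than max_hops hops, otherwise None.
    """
    url = start_url
    hops = 0
    while True:
        status, location = responses[url]
        if status not in REDIRECT_SIGNAL:
            return url, hops, None  # reached a non-redirect response
        if hops == max_hops:
            return url, hops, "redirect error"  # an 11th hop would be needed
        url = location
        hops += 1


def client_error_affects_crawl_rate(status):
    """For 4xx codes only: 429 affects crawl rate, the rest (401, 403, ...) do not."""
    if not 400 <= status < 500:
        raise ValueError("this sketch only covers 4xx status codes")
    return status == 429


# Hypothetical two-hop chain: /old -> /newer -> /newest
example = {
    "https://example.com/old": (301, "https://example.com/newer"),
    "https://example.com/newer": (302, "https://example.com/newest"),
    "https://example.com/newest": (200, None),
}
print(follow_redirects("https://example.com/old", example))
print(client_error_affects_crawl_rate(403), client_error_affects_crawl_rate(429))
```

A redirect loop in the map also ends in a "redirect error" here, since the hop counter keeps growing until it hits the limit.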
Some commentary:
"We cover the top 20 status codes that Googlebot encountered on the web, and the most prominent network and DNS errors."
— Gary 鯨理/경리 Illyes (@methode) June 26, 2021
I feel like we've been pretty clear for quite some time that the difference is minimal if at all -- and that there's no value in switching from 404 to 410. Now I wonder what else we're suggesting is minimal but which folks are over-indexing on in the name of "optimization"...
— 🍌 John 🍌 (@JohnMu) June 27, 2021
The arguments I find useful are that a 410 is "technically correct" in many cases, and that it's easier to separate 410 from 404 in log files. IMO both are good arguments for using 410 & 404; they're non-SEO arguments, but not everything has to be SEO.
— 🍌 John 🍌 (@JohnMu) June 27, 2021
For urgent individual URLs, use the removal tool. For everything else, just let it crawl & drop naturally. If you remove 100k URLs, I doubt a 404 vs a 410 will be noticable for the rest of the site's visibility, even in a best-case scenario. If they're already gone, gone is gone.
— 🍌 John 🍌 (@JohnMu) June 27, 2021
Forum discussion at Twitter.