Earlier this week, we reported that Google would fix the soft-404 errors that were inaccurately showing up in the new 404-like reports within Webmaster Tools.
Today, I spotted a Google Webmaster Help thread that notes that Google also includes 5xx-like status errors in the soft-404 reports.
JohnMu from Google said in the thread:
Soft-404 errors include server side errors that are shown on your pages which do not result in an error HTTP result code (5xx or 4xx). For example, it could be that some of your pages were showing database errors as part of the content.
I have not seen real-life examples of a page with a 5xx-style error showing up in my own soft-404 reports. Then again, I don't have any soft-404s at all. But John from Google said it happens, and he even showed examples of such pages in the search results.
So what do you do? John from Google offers a suggestion:
These are issues which are hard to diagnose, which is one of the reasons we've started reporting them in Webmaster Tools. A possible solution would be to make sure that all of your content could be fetched from the database before returning it with a HTTP result code. If you run into server errors along the way, you could return a 500 or 503 HTTP result code with the content. This would still allow you to show the error messages to users (and you -- as you are working on your site) but would prevent search engine crawlers from accidentally crawling and indexing those errors. I'd also recommend logging those errors and working to try to find out more about them, so that you can prevent them in the future, making sure that your site is always a great user-experience, even if a bunch of users should come at the same time :).
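Here is a minimal sketch of what John describes, assuming a Python site. The names `build_response` and `fetch_content` are hypothetical and stand in for whatever your framework and database layer provide. The idea is simply to fetch everything first and only then decide which HTTP result code to send.

```python
import logging
from typing import Callable, Tuple

logger = logging.getLogger("site")


def build_response(fetch_content: Callable[[], str]) -> Tuple[int, str]:
    """Fetch all content first, then choose the HTTP result code.

    `fetch_content` is a hypothetical callable that wraps the database
    queries for a page and raises an exception if any of them fail.
    """
    try:
        content = fetch_content()
    except Exception:
        # Log the failure so it can be investigated later.
        logger.exception("content fetch failed")
        # Users still see a message, but the 503 status tells crawlers
        # that this is an error and not real page content.
        return 503, "<h1>Temporarily unavailable</h1><p>Please try again shortly.</p>"
    # Only return 200 once everything was fetched successfully.
    return 200, "<html><body>" + content + "</body></html>"
```

In a real application you would pass the returned status and body to your framework's response object. Whether to use 500 or 503 depends on the failure; 503 is the usual choice when the problem is expected to be temporary.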
Forum discussion at Google Webmaster Help.
Update from JohnMu at Google:
Hi Barry - We don't directly include 5xx HTTP result code URLs in the soft-404 section; it's more that there are many sites that return server errors within the context of normal pages. For instance, when a database-query fails, it might be that they include the database-error as a part of the content of a page that returns 200 OK (instead of changing the HTTP result code to 5xx).
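If you want to look for this on your own site, a rough check might look like the sketch below. It is an assumption on my part, not how Google detects these pages: the function name `looks_like_soft_error` and the marker strings are hypothetical, and you would replace the markers with the error text your own stack actually prints.

```python
from typing import Iterable

# Hypothetical error phrases; replace with the messages your own
# database layer or framework prints when a query fails.
ERROR_MARKERS = ("database error", "sql syntax", "could not connect to")


def looks_like_soft_error(
    status: int, body: str, markers: Iterable[str] = ERROR_MARKERS
) -> bool:
    """Return True if a page reports success but contains error text."""
    # A page that already returns 4xx or 5xx is a real error, not a soft one.
    if not 200 <= status < 300:
        return False
    text = body.lower()
    return any(marker in text for marker in markers)
```

Running your own pages through a check like this (for example, from your server logs or a site crawl) would flag a page that returns 200 OK with "Database error" in the content, while leaving a page that correctly returns 503 alone.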