Google's John Mueller explained how Google handles redirects within its index. At the 5:30 mark in his November 10th SEO office hours hangout, John said they "put them (the URLs) into a shared cluster that we then use for canonicalization."
Once they are in this shared cluster, Google goes through the dupe detection process for selecting the canonical. Signals Google uses for that, as we covered before, include not just whether there is a redirect but also the "internal links within your your website, external links, sitemap files, other annotations that you have on these pages," John said. Keep in mind that Google does not assign weights to these signals manually; machine learning handles that, as Gary Illyes said.
In short, you want to be as clear and consistent as possible with all those signals. This way Google knows which URL you want to be the canonical URL. So if you have a 301 redirect from URL A to URL B, but there is a canonical link back from URL B to URL A, that is not a consistent signal. Things like this happen, and happen a lot on the internet. So Google uses many signals to try to figure out which URL should be the canonical.
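The 301 versus canonical conflict described above is easy to check for on your own site. Here is a minimal sketch in Python, assuming you have already collected your redirects and rel=canonical declarations into two dictionaries. The function name, the data layout, and the example.com URLs are all hypothetical, and this only flags inconsistent signals; it says nothing about which URL Google will actually pick.

```python
def find_conflicts(redirects, canonicals):
    """Find redirect targets whose rel=canonical points somewhere else.

    redirects:  hypothetical dict mapping redirect source URL -> target URL
    canonicals: hypothetical dict mapping page URL -> its declared canonical URL
    Returns a list of (source, target, declared_canonical) tuples.
    """
    conflicts = []
    for source, target in redirects.items():
        declared = canonicals.get(target)
        # A target with no canonical tag, or a self-referencing one, is consistent.
        if declared is not None and declared != target:
            conflicts.append((source, target, declared))
    return conflicts


# URL A redirects to URL B, but URL B declares URL A as its canonical.
redirects = {"https://example.com/a": "https://example.com/b"}
canonicals = {"https://example.com/b": "https://example.com/a"}

conflicts = find_conflicts(redirects, canonicals)
for source, target, declared in conflicts:
    print(f"{source} redirects to {target}, but {target} declares canonical {declared}")
```

Any pair this reports is a spot where the redirect and the canonical annotation disagree, so it is worth fixing one of them so both point at the URL you want indexed.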
One visible side effect of this shows up in how Google Search Console displays the data. "Of course the part that's also associated with this is in Search Console we show the data by canonical URL. So you will also see that shift in Search Console and it makes tracking a little bit confusing sometimes. So I agree that can be annoying," John Mueller said. It is something you should be aware of.
This is not a ranking issue, because one of the URLs will rank. The question is whether the URL you want to rank is the one Google is showing.
Here is the video embed with the question and answer:
Here is a transcript of the answer John gave, although I didn't transcribe when he talked about internationalization, so keep listening if you want to hear that part:
When you have redirects from one page to another. From our point of view what happens there is we take those two urls, the old url and the new url and put them into a shared cluster that we then use for canonicalization. So we essentially say these two urls lead to the same content, which of these should we be showing.
And we use redirects to try to figure out which of these we should be showing but we also use a lot of other factors things like internal links within your your website, external links, sitemap files, other annotations that you have on these pages, all of that kind of comes together. And with all of those extra pieces of information we then say okay for each of these pages that are in this cluster of pages that essentially lead to the same content which of these is the best one to show. And sometimes that can be the redirect source, sometimes that can be the redirect target, depending on essentially what what the bigger picture says. So just that things shift from the redirect target to the redirect source at some point even if those redirects have been in place for a while, that's not necessarily broken from our point of view. So that's not something that we would say is unusual or anything that you explicitly need to fix. Because purely from a ranking point of view nothing changes there. It's really just the url that is shown in search. There's nothing with regards to kind of different ranking there, it's purely just the url.
Of course the part that's also associated with this is in Search Console we show the data by canonical URL. So you will also see that shift in Search Console and it makes tracking a little bit confusing sometimes. So I agree that can be annoying.
One of the things that I would recommend doing in a case like this, especially if you're seeing this change happen on a large scale across your website, is to double check all of the other factors that are involved with canonicalization. So in particular I would look at things like internal links, sitemap files, other annotations that you have on these pages with regards to hreflang or any kind of cross link that you have there and make sure that they're all aligned with the url that you now want to have indexed. And the more you can make all of those factors align the more likely we'll choose the url that kind of comes out on top after canonicalization.
So from from our side again it's not a sign that we're de-indexing pages or that anything is broken it's just we're picking the other url to show instead of this one and we're showing it the same place.
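The double check John recommends in the transcript above can be roughed out as a small audit. This is only a sketch in Python, assuming you have exported the URLs that each signal (internal links, sitemap, hreflang) points at for one cluster. The signal names, the data layout, and the example.com URLs are hypothetical, and the script does not model how Google weighs these signals; it just lists the ones that do not match the URL you want indexed.

```python
def misaligned_signals(preferred_url, signals):
    """Report, per signal, the URLs that differ from the preferred URL.

    preferred_url: the URL you want Google to index
    signals: hypothetical dict mapping a signal name (for example "sitemap")
             to the list of URLs that signal currently points at
    Returns a dict of signal name -> sorted list of non-preferred URLs.
    """
    report = {}
    for name, urls in signals.items():
        off_target = sorted({url for url in urls if url != preferred_url})
        if off_target:
            report[name] = off_target
    return report


preferred = "https://example.com/new"
signals = {
    "internal_links": ["https://example.com/new", "https://example.com/old"],
    "sitemap": ["https://example.com/new"],
    "hreflang": ["https://example.com/old"],
}

report = misaligned_signals(preferred, signals)
for name, urls in report.items():
    print(f"{name} still points at: {', '.join(urls)}")
```

An empty report would mean every signal you checked points at the preferred URL, which is the alignment John describes.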
Hat tip to Glenn:
More from John about "canonical clusters": In GSC, Google will show the data by canonical url. You'll see the shift there too. If you are seeing this happen, double check all factors links internal links, sitemaps, rel canonical, hreflang, and more. Make sure they are aligned: https://t.co/XnETAmRLoJ
— Glenn Gabe (@glenngabe) November 12, 2020
Forum discussion at Twitter.