Google Search Difference Between Clustering & Canonicalization

Dec 6, 2024 - 7:11 am


Google's John Mueller explained the difference between clustering and canonicalization within Google Search. He said, "Clustering is basically taking the pages that we think are the same. And then canonicalization is, from those pages, which one is the best one." John said this at the 3:03 minute mark of the interview.

This came up in the excellent Search Off The Record interview of Allan Scott from the Google Search team, who works specifically on duplication within Google Search. Martin Splitt and John Mueller from Google interviewed Allan.

Allan explained at the beginning of the video, "when people think canonicalization, they sort of imagine this one black box that does all the magic things together. And it's very difficult to handle requests from people that are like, 'Well, why is canonicalization wrong?' And so I tend to push people to think of it as, well, canonicalization is one step. It's I have a bunch of URLs, and I want to know which of them is the canonical, but there are other steps that are as, if not more important here, like the first one being clustering."

Allan continued to explain, "Usually, when people come to us and complain about canonicalization, the immediate thing we say is, 'Oh, that's a clustering problem, because these two pages shouldn't be in the same cluster, let alone cases of canonical selection.' Like, if you want to bring a canonicalization problem to me, what that is, is these two pages are in the same cluster, but they aren't actually, like we picked the wrong one. The most dire case being a hijacking, we see those and we act really fast because those are just disasters."

That is when John Mueller stepped in to summarize the answer as, "Clustering is basically taking the pages that we think are the same. And then canonicalization is, from those pages, which one is the best one? Is that about right?" To which Allan responded, "Exactly. Yes."

Then Allan gave this example, "So, for example, rel="canonical" is a bit of a magic factor that crosses both these lines. rel="canonical" will actually first try to put two pages in the same cluster. It may or may not succeed, but if two pages are in the same cluster and there is a rel="canonical" between them, then it's also a canonical selection signal."
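
To make the two steps concrete, here is a toy Python sketch. It is not how Google does it. Everything in it is an assumption made for illustration: the Page class, the exact-text fingerprint used to decide that pages are "the same," and the tie-break rules. It also only models the second half of what Allan described for rel="canonical": the annotation counts as a selection signal when both pages already sit in the same cluster, and it does not try to pull pages into a cluster.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional
import hashlib


@dataclass
class Page:
    url: str
    body: str
    rel_canonical: Optional[str] = None  # URL declared via rel="canonical", if any


def fingerprint(body: str) -> str:
    # Hypothetical duplicate test: same text after case/whitespace normalization.
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cluster_pages(pages):
    """Step 1 (clustering): group the pages we think are the same."""
    clusters = defaultdict(list)
    for page in pages:
        clusters[fingerprint(page.body)].append(page)
    return list(clusters.values())


def pick_canonical(cluster):
    """Step 2 (canonicalization): from one cluster, pick a single URL."""
    urls = {page.url for page in cluster}
    votes = defaultdict(int)
    for page in cluster:
        # rel="canonical" only counts when its target is in the same cluster.
        if page.rel_canonical in urls:
            votes[page.rel_canonical] += 1
    # Assumed tie-breaks: most votes, then shortest URL, then alphabetical.
    return min(urls, key=lambda u: (-votes[u], len(u), u))


pages = [
    Page("https://example.com/shoes?ref=ad", "Red running shoes",
         rel_canonical="https://example.com/shoes"),
    Page("https://example.com/shoes", "Red  running shoes"),
    Page("https://example.com/hats", "Blue hats",
         rel_canonical="https://example.com/shoes"),
]

for cluster in cluster_pages(pages):
    print(pick_canonical(cluster), "<-", sorted(p.url for p in cluster))
```

In this sketch the hats page stays its own canonical even though it points at the shoes URL, because the two pages never land in the same cluster. That mirrors the distinction Allan draws: a complaint about the wrong URL being chosen only makes sense once the pages are clustered together.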

This exchange starts pretty much at the beginning of the video, if you want to listen to it.

Forum discussion at X.

 
