Google Search Difference Between Clustering & Canonicalization

Dec 6, 2024 - 7:11 am 0 by

Google Cluster Chips

Google's John Mueller explained the difference between clustering and canonicalization within Google Search. He said, "Clustering is basically taking the pages that we think are the same. And then canonicalization is, from those pages, which one is the best one." John said this at the 3:03 minute mark of the interview.

This came up in the excellent Search Off The Record interview of Allan Scott from the Google Search team, who works specifically on duplication within Google Search. Martin Splitt and John Mueller from Google interviewed Allan.

Allan explained at the beginning of the video, "when people think canonicalization, they sort of imagine this one black box that does all the magic things together. And it's very difficult to handle requests from people that are like, "Well, why is canonicalization wrong?" And so I tend to push people to think of it as, well, canonicalization is one step. It's I have a bunch of URLs, and I want to know which of them is the canonical, but there are other steps that are as, if not more important here, like the first one being clustering."

Allan continued to explain, "Usually, when people come to us and complain about canonicalization, the immediate thing we say is, "Oh, that's a clustering problem, because these two pages shouldn't be in the same cluster, let alone cases of canonical selection." Like, if you want to bring a canonicalization problem to me, what that is, is these two pages are in the same cluster, but they aren't actually, like we picked the wrong one. The most dire case being a hijacking, we see those and we act really fast because those are just disasters."

That is when John Mueller stepped in to summarize the answer as, "Clustering is basically taking the pages that we think are the same. And then canonicalization is, from those pages, which one is the best one? Is that about right?" In which Allan responded, "Exactly. Yes."

Then Alan gave this example, "So, for example, rel="canonical" is a bit of a magic factor that crosses both these lines. rel="canonical" will actually first try to put two pages in the same cluster. It may or may not succeed, but if two pages are in the same cluster and there is a rel="canonical" between them, then it's also a canonical selection signal."

This started pretty much at the beginning of this video if you want to listen to it:

Forum discussion at X.

 

Popular Categories

The Pulse of the search community

Search Video Recaps

 
- YouTube
Video Details More Videos Subscribe to Videos

Most Recent Articles

Web Analytics

Google Analytics Real Time Reporting Glitching

Apr 23, 2025 - 10:54 am
Search Forum Recap

Daily Search Forum Recap: April 23, 2025

Apr 23, 2025 - 10:00 am
Google Search Engine Optimization

Google Stops Supporting Special Announcement Structured Data On July 31

Apr 23, 2025 - 7:51 am
Google

More Studies Show AI Overviews Harm Google Click Through Rates

Apr 23, 2025 - 7:41 am
Google Maps

Google Posts Go Missing From Local Panels

Apr 23, 2025 - 7:35 am
Google Maps

Google Business Profiles Video Verification Gets Video Previews

Apr 23, 2025 - 7:31 am
Previous Story: Google Android Boxer Statue By Fitness Center