Google Doesn't Technically Follow Links, It Extracts, Collects & Checks Later

Aug 16, 2024 - 7:31 am


Google's Gary Illyes clarified on the Search Off The Record podcast that Google technically does not follow links. Instead, Google extracts the links, collects them in a database, and then checks them later. Of course, most of you know this already, and knowing the difference doesn't really matter much for SEO, but hey.

Gary Illyes from Google said at the 25:26 mark in that podcast:

Well, yeah, it's my pet peeve. On onesie [Google Search Central Site], we keep saying Googlebot is following links, like, no, it's not following links. It's collecting links, and then it goes back to those links. It's not like properly following links. The picture that we are painting is that Googlebot is like hopping from--

Gary then posted about this on LinkedIn to explain further. "You probably heard it before that Googlebot 'follows' links. It doesn't. But it's a pretty illustrative way to describe what Googlebot does," he said.

He wrote:

A recent Search Off the Record episode (https://lnkd.in/eG566yve) caused some ruckus because we apparently "leaked" that Googlebot doesn't just "follow" links it finds in a page it just downloaded. If you ever spent some time analyzing your server's access logs in the past, say, 15 years, you already knew that that's not the case. There's more involved than just blindly making a request to URLs found in <a> elements; there's deduplication across protocol variants, there's prioritization of URLs, there's coffee or lack of, thereof.

So why "follow" then? As much as I don't like it, it is a very simple way to explain what Googlebot actually does. There's value in using simple analogies (similes?), but there's also a place for going for more indepth explanations. You choose the one that you think will work for the audience you're talking to at the time.


Gary also added, in a LinkedIn comment on a thread in a different language, "btw, we have another link extraction system in the indexing process (for fancy/stupid links)."

Kristine Schachinger also asked, "I am confused. I know that Google can trip dynamic sites to 'create pages' from internal links, which I assumed only happens on crawl, so how does that happen in this scenario?"

Gary responded, "I don't think there's a relation between the two things. Crawlers see a link and eventually they go back to that link (and if they don't, at least in Googlebot's case, you end up with 'Discovered, not crawled', or whatever Search Console reports). If they go back, the new page is dynamically created. The thing we've used to do with wget to recursively download stuff in ~realtime doesn't exist with modern crawlers."

So Google extracts links in more than one way, and it does not immediately follow the links it extracts.
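
For readers who want to picture the difference, here is a minimal sketch of the extract, collect, and check-later pattern Gary describes. It is not how Googlebot is actually built. The names `LinkExtractor`, `Frontier`, `dedup_key`, and `collect_links` are hypothetical, the scheme-ignoring normalization is just one assumed way to deduplicate across protocol variants, and the "shallower paths first" priority is made up for illustration. Nothing is fetched; the sketch only shows that links are queued rather than requested on the spot.

```python
import heapq
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkExtractor(HTMLParser):
    """Collects href values from <a> elements; it never requests them."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))


def dedup_key(url):
    """Assumed normalization: http and https variants share one key."""
    parts = urlsplit(url)
    return (parts.netloc.lower(), parts.path or "/", parts.query)


class Frontier:
    """Hypothetical URL store: deduplicated and ordered by priority."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._counter = 0  # tie-breaker keeps insertion order stable

    def add(self, url, priority):
        key = dedup_key(url)
        if key in self._seen:
            return False  # already discovered, skip the duplicate
        self._seen.add(key)
        heapq.heappush(self._heap, (priority, self._counter, url))
        self._counter += 1
        return True

    def pop(self):
        """Return the next URL to check, or None if nothing is queued."""
        return heapq.heappop(self._heap)[2] if self._heap else None

    def __len__(self):
        return len(self._heap)


def collect_links(html, base_url, frontier, priority_fn):
    """Extract links from a downloaded page and queue the new ones."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return [u for u in parser.links if frontier.add(u, priority_fn(u))]


# Made-up page content for the demo.
PAGE = """
<a href="/about">About</a>
<a href="http://example.com/about">About (http variant)</a>
<a href="https://example.com/blog/post-1">Post</a>
"""

# Made-up priority: lower number means checked sooner.
def depth(url):
    return urlsplit(url).path.count("/")

frontier = Frontier()
added = collect_links(PAGE, "https://example.com/", frontier, depth)
print("discovered:", added)

# Some time later, a separate step works through the queue.
while (url := frontier.pop()) is not None:
    print("would fetch later:", url)
```

In this toy model, a URL that sits in the frontier and never gets popped is roughly the situation Gary describes as "Discovered, not crawled."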

Forum discussion at LinkedIn.

 
