Google Search Indexing Tiers

Jan 20, 2021 - 7:21 am
Filed Under Google

In addition to discussing language diversity in indexing, Gary Illyes from Google said on the Search Off the Record podcast that Google uses different indexing tiers. He said the search company "might use different kinds of storages to build the index." Some of the index goes on cheaper storage, and some goes on more expensive storage so it can be served and accessed faster.

If a document needs to be served often, Google might put it on a faster, more expensive type of storage than a document that is rarely served. The goal is to balance cost and efficiency.

This part starts at about the 7:03 mark of the podcast.

Gary used the way personal computers are built to explain why Google uses different types of storage for its indexing tiers. Gary said:

If you think about it, when you are building your computer, for example, if you are an idiot like me and builds their own computer, then you will think a lot about the storage mechanisms that you put in your computer. First, you are going to have RAM, for example, R-A-M, random access memory, which is the most expensive kind of storage that you could possibly put in your computer. While maybe the L1 caches or L2 caches are more expensive, but you are not putting those in your computer. Those are integrated.

But the first one that you can put in your computer, that's RAM. That's the most expensive kind of storage. They come in small capacities. And then after that, you have to choose between a hard drive, like a magnetic hard drive, or a solid state drive. The solid state drive is more expensive, but it's way faster. I don't remember the exact number, but it's orders of magnitude faster than a hard drive.

And that's because, for example, you don't have seek time on solid state drives. You can just go to a specific section right away at the speed of light quite literally and start reading from that section. While with a magnetic drive, like a hard drive, you actually have to move the arms of the hard drive to a specific section, to a specific disk, and start reading from the section where you believe that the data is.

He then explained that, based on "how many times we think that the document might be served, we might store the documents in our index in these different kinds of storage mechanisms." This, he said, is how Google defines its indexing tiers: "And that's what practically defines the index tiers that we have." Gary added, "So for example, for documents that we know that might be surfaced every second, for example, they will end up on something super fast. And the super fast would be the RAM. Like part of our serving index is on RAM."

He went on a bit more: "Then [we] will have another tier, for example, for solid state drives because they are fast and not as expensive as RAM. But still not-- the bulk of the index wouldn't be on that. The bulk of the index would be on something that's cheap, accessible, easily replaceable, and doesn't break the bank."
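
To make the idea concrete, here is a minimal sketch of what tier assignment by expected serving frequency could look like. This is not Google's implementation: the thresholds, the `Document` class, the `pick_tier` function, and the tier names are all hypothetical, chosen only to mirror the RAM, solid state drive, and cheap bulk storage tiers Gary described.

```python
from dataclasses import dataclass

# Hypothetical thresholds, in expected serves per day. Google has not
# published real cutoffs; these numbers are illustrative assumptions only.
RAM_MIN_SERVES_PER_DAY = 86_400  # roughly once per second
SSD_MIN_SERVES_PER_DAY = 1_000


@dataclass
class Document:
    url: str
    expected_serves_per_day: float


def pick_tier(doc: Document) -> str:
    """Return a storage tier name based on how often the document is expected to be served."""
    if doc.expected_serves_per_day >= RAM_MIN_SERVES_PER_DAY:
        return "ram"   # fastest and most expensive
    if doc.expected_serves_per_day >= SSD_MIN_SERVES_PER_DAY:
        return "ssd"   # fast, cheaper than RAM
    return "bulk"      # cheap, easily replaceable storage for most of the index


docs = [
    Document("https://example.com/popular", 200_000),
    Document("https://example.com/steady", 5_000),
    Document("https://example.com/rare", 3),
]

# Map each URL to the tier it would land on under these assumed thresholds.
tiers = {doc.url: pick_tier(doc) for doc in docs}
```

In this sketch the only input is an estimated serving rate, which matches the "how many times we think that the document might be served" framing in the podcast. Any real system would presumably weigh other factors that were not discussed.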

It makes sense that Google would take this approach to storing information in its search index.

Now, you will ask, how does one optimize to be on the most expensive indexing tier? :)

Forum discussion at Twitter.

 
