Google: Crawl Budgets & Delays Not About Page Size

May 13, 2019 - 8:35 am


This past Friday, while at the Googleplex, John Mueller, Martin Splitt and Lizzi Harvey from the Google team hosted an office hours session, and Martin MacDonald was there to represent the SEOs. He asked a question about crawl budget, specifically whether a 10MB page reduces crawl budget compared to a page that is only 400KB. All the Googlers shook their heads no.

What does matter is the number of requests your server can handle. If Google detects that your server is slowing down under the volume of requests, Google will back off a bit to make sure Googlebot isn't the reason your server crashes. But page size isn't a direct factor in Google slowing down the crawl of your web site.
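To make that back-off behavior concrete, here is a minimal sketch in Python (purely illustrative, not Google's actual implementation) of a crawler that adjusts how many simultaneous requests it sends to a host based on how fast the server responds, rather than on page size:

```python
class HostCrawlBudget:
    """Illustrative crawl throttle: back off when a host responds slowly,
    recover when it responds quickly. Page size is never consulted."""

    def __init__(self, max_concurrent: int = 8, slow_threshold_s: float = 2.0):
        self.max_concurrent = max_concurrent      # hard ceiling on parallel fetches
        self.slow_threshold_s = slow_threshold_s  # response time we treat as "slow"
        self.current_limit = max_concurrent       # what we currently allow

    def record_response(self, response_time_s: float) -> None:
        """Update the allowed concurrency after each fetch completes."""
        if response_time_s > self.slow_threshold_s:
            # Host looks overloaded: halve the simultaneous connections.
            self.current_limit = max(1, self.current_limit // 2)
        else:
            # Host is healthy: ramp back up slowly toward the ceiling.
            self.current_limit = min(self.max_concurrent, self.current_limit + 1)
```

Under logic like this, shrinking a page from 10MB to 300KB only helps indirectly, by making each response finish sooner, which is exactly the point the Googlers make in the transcript below.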

Here is the transcript; it starts at the 31:11 mark in the video (you can also scroll back a bit more to hear more of the question):

Martin MacDonald: Is that [crawl budget] tied to a hard number of URLs... or to transfer size? Say a website reduced its pages from 10MB to 300KB, would that dramatically increase the number of pages you can crawl?

John Mueller: I don't think that would change anything.

Martin Splitt: It's requests.

John Mueller: I mean, what sometimes happens is if you have a large response then it just takes longer for us to get that, and with that we'll probably crawl less because we're really trying to avoid having too many simultaneous connections to a server. So if you have a smaller response size, obviously we can make more simultaneous requests and we could theoretically get more. But it's not the case that you reduce the size of your pages and suddenly solve problems.

Martin Splitt: It's also that when the response takes a long time, it's not just the size of the page, it is also the response time; servers tend to respond slower when they are overloaded or about to be overloaded. So that's also a signal that we're picking up. Like, this takes a really long time to get data from the server, maybe we should look into the crawl limits of the host load on this particular server so that we're not taking down the server.

John Mueller: We look at it on a per-server level. So if you have content from a CDN or from other networks, other places, then that would apply to them separately. Essentially because how slow an embedded resource is doesn't really affect the rest of the content on the site.
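To illustrate the per-server point John makes here, the sketch below extends the hypothetical HostCrawlBudget class from earlier: each hostname gets its own budget, so a slow CDN serving embedded resources does not throttle crawling of the main site's pages. The hostnames and URLs are made up for the example.

```python
from urllib.parse import urlparse

# Hypothetical per-host bookkeeping, reusing the HostCrawlBudget sketch above.
budgets: dict[str, HostCrawlBudget] = {}

def budget_for(url: str) -> HostCrawlBudget:
    """Return the crawl budget tracked for the host that serves this URL."""
    host = urlparse(url).netloc
    if host not in budgets:
        budgets[host] = HostCrawlBudget()
    return budgets[host]

# The page and its CDN-hosted image are throttled independently:
budget_for("https://www.example.com/page").record_response(0.3)      # main host: fast
budget_for("https://cdn.example.net/hero.jpg").record_response(4.0)  # CDN host: slow
```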

Here is the video embed:

Forum discussion at YouTube.
