Alexis Sanders of Merkle sat down with Martin Splitt of Google last year (before COVID) and spoke about crawl budget. It may be one of the more informative videos in the SEO Mythbusting series to date.
Here is what was covered, with timestamps, in case you just want to scan the video:
- Why is crawl budget an interesting topic to discuss (0:00)
- What is crawl budget? (1:15)
- What is crawl rate, and what is crawl demand? (1:47)
- How does Googlebot make its crawl rate and crawl demand decisions? (2:44)
- ETags, HTTP headers, last modified dates, and similar (3:43; sketched in the code example after this list)
- What size of site should worry about crawl budget? (4:35)
- Server setup vs crawl budget (5:00)
- Crawl frequency vs quality of content (6:18)
- What to expect in one’s log files when Google is testing one’s server (7:45)
- Tips on how to get your site crawled accurately during a site migration (8:18)
- Crawl budget and the different levels of one’s site’s infrastructure (9:40)
- Does crawl budget affect rendering as well? (10:37)
- Caching of resources and crawl budget (11:46)
- Crawl budget and specific industries such as publishing (13:34)
- Generally speaking, what can be recommended to help Googlebot out when crawling one’s site? (15:03)
- What are the usual pitfalls people get into with crawl budget? (16:52)
- Can one tell Googlebot to crawl one’s site more? (17:40)
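The ETags and last-modified item (3:43) boils down to HTTP conditional requests: a page advertises a validator, the crawler sends it back on the next visit, and an unchanged page can be answered with a 304 Not Modified instead of the full body. Here is a minimal sketch of that exchange using Python's standard library; the page content, fingerprinting scheme, and port are illustrative assumptions, not anything Google prescribes.

```python
# A minimal sketch (not Google's implementation) of serving validators and
# honoring conditional requests. Content, ETag scheme, and port are made up.
import hashlib
from email.utils import formatdate
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<html><body>Hello, Googlebot</body></html>"
ETAG = '"%s"' % hashlib.md5(BODY).hexdigest()        # fingerprint of the content
LAST_MODIFIED = formatdate(1594771200, usegmt=True)  # when it last changed

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # A crawler that sends back a matching validator gets 304 Not
        # Modified: no body is re-transmitted and nothing needs re-parsing.
        if (self.headers.get("If-None-Match") == ETAG or
                self.headers.get("If-Modified-Since") == LAST_MODIFIED):
            self.send_response(304)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("ETag", ETAG)
        self.send_header("Last-Modified", LAST_MODIFIED)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(BODY)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), Handler).serve_forever()
```

A 304 still costs a request, but the body is never re-sent, which keeps repeat crawls cheap on both sides.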
As an added bonus, here are some of Martin's Twitter replies to questions related to this talk:
> That pattern is kinda normal as Googlebot might zig-zag around the maximal reasonable crawl rate.
>
> — Martin Splitt at 🏡🇨🇭 (@g33konaut) July 15, 2020
> Crawl budget issues are when you see us discover but not crawl pages you care about for quite a while and the pages have no other issues.
>
> It's not a significant cost on our end
>
> — Martin Splitt at 🏡🇨🇭 (@g33konaut) July 15, 2020
> Either 404em or keep em around.
>
> — Martin Splitt at 🏡🇨🇭 (@g33konaut) July 15, 2020
> That'd qualify as dynamic rendering but in general these setups are "footguns" - sounds good & might work, but turns out to introduce lots of unnecessary complexity that backfires eventually.
>
> — Martin Splitt at 🏡🇨🇭 (@g33konaut) July 15, 2020
> If that's something you're concerned about, it might make sense. I don't think it's necessary normally, tho.
>
> — Martin Splitt at 🏡🇨🇭 (@g33konaut) July 15, 2020
> Correlation isn't causation 🙃
>
> — Martin Splitt at 🏡🇨🇭 (@g33konaut) July 15, 2020
> So in short: No.
>
> It depends on how that drop down is implemented. If the links are valid links and in the rendered HTML, then the crawler can pick them up.
>
> — Martin Splitt at 🏡🇨🇭 (@g33konaut) July 15, 2020
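Martin's drop-down answer is about link discovery: a crawler extracts URLs from real `<a href>` elements in the rendered HTML, so a menu item that only navigates via a JavaScript handler never surfaces as a link. A rough sketch of that extraction step in Python, with hypothetical markup (the class names and URLs are made up):

```python
# A rough sketch of the link-discovery step: only real <a href> elements in
# the rendered HTML yield URLs. The sample markup below is hypothetical.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

RENDERED_HTML = """
<ul class="dropdown">
  <li><a href="/category/shoes">Shoes</a></li>
  <li><span onclick="go('/category/hats')">Hats</span></li>
</ul>
"""

parser = LinkExtractor()
parser.feed(RENDERED_HTML)
print(parser.links)  # ['/category/shoes'] -- the onclick-only item is invisible
```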
> Lots of webmasters give us unhelpful dates.
>
> — Martin Splitt at 🏡🇨🇭 (@g33konaut) July 14, 2020
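"Unhelpful dates" typically means Last-Modified headers or sitemap `<lastmod>` values stamped with the current time on every request instead of when the content actually changed. A hedged sketch of the honest version, assuming each URL maps to a file whose modification time is meaningful (the example URL is hypothetical):

```python
# A sketch of emitting a truthful sitemap <lastmod>: derive it from when the
# underlying file actually changed, not from datetime.now() at request time.
import os
from datetime import datetime, timezone

def lastmod(path: str) -> str:
    mtime = os.path.getmtime(path)  # real modification time of the source
    return datetime.fromtimestamp(mtime, tz=timezone.utc).strftime("%Y-%m-%d")

# Demo on this script itself so the sketch runs anywhere; in practice the
# path would point at the file or data source behind each URL.
entry = f"<url><loc>https://example.com/</loc><lastmod>{lastmod(__file__)}</lastmod></url>"
print(entry)
```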
Forum discussion at Twitter.