Google announced yesterday, as part of its effort to standardize the Robots Exclusion Protocol, that it is open sourcing its robots.txt parser. That means how Googlebot reads and interprets robots.txt files will be available for any crawler developer to look at or use.
It is rare for Google to share anything it does in core search with the open source world - it is its secret sauce - but here Google has published it to GitHub for all to access.
Google wrote that it "open sourced the C++ library that our production systems use for parsing and matching rules in robots.txt files. This library has been around for 20 years and it contains pieces of code that were written in the 90's. Since then, the library evolved; we learned a lot about how webmasters write robots.txt files and corner cases that we had to cover for, and added what we learned over the years also to the internet draft when it made sense."
It's been awesome working with @methode and https://t.co/CPJfDQnxn1 on this. I am very happy that it is finally ready to be shared with you all! 😃 https://t.co/gyxvzrFLtp
— Edu Pereda (@epere4) July 1, 2019
If you have SERIOUS ideas about what else could be useful as OSS, leave a comment with the idea and an explanation how would you use that OSS https://t.co/cxxqhI9Nzo
— Gary "鯨理" Illyes (@methode) July 1, 2019
I helped write some of the earliest parts of this code from 1999-2002. Lots of fun times:
— Jeff Dean (@JeffDean) July 2, 2019
What should you do with robots.txt files in MS Word format?
One site had:
User-agent *
Disallow: /
Instead of:
User-agent: *
Disallow: /
(We made our parser less strict) https://t.co/7FnX8lFKqu
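The leniency Gary describes - accepting "User-agent *" even though the colon is missing - can be illustrated with a short sketch. This is not Google's actual C++ parser; it is a hypothetical Python example, with a made-up function name `parse_robots_line`, showing one way a lenient parser might recover from that malformed directive:

```python
import re

def parse_robots_line(line):
    """Parse one robots.txt line into a (directive, value) pair,
    tolerating a missing colon after the directive name."""
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line:
        return None
    # Strict form: "User-agent: *"
    m = re.match(r"(?i)^(user-agent|disallow|allow|sitemap)\s*:\s*(.*)$", line)
    if not m:
        # Lenient fallback: directive and value separated only by whitespace,
        # as in the malformed "User-agent *" example above
        m = re.match(r"(?i)^(user-agent|disallow|allow|sitemap)\s+(.*)$", line)
    if not m:
        return None
    return (m.group(1).lower(), m.group(2).strip())

print(parse_robots_line("User-agent *"))   # lenient match still works
print(parse_robots_line("Disallow: /"))    # strict match
```

A strict parser would reject the first line and potentially ignore the whole group, which could accidentally open a site the webmaster meant to block - one reason a crawler might prefer to be forgiving here.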
Forum discussion at Twitter.