Google updated its open source robots.txt parser code on GitHub the other day. Gary Illyes from Google pushed the update to the repository yesterday morning. Google originally released the parser to the world back in 2019.
Gary Illyes explained on LinkedIn that Google has been using this updated parser internally for a while, but only now has Google published the update to GitHub.
Gary wrote, "The release introduces new capabilities in the parser class that allows you to export parsing information about the passed in robotstxt body, and adds a new library to access that information. This new library has been used by Google Search Console for many moons now (in conjunction with the Java port) and so far we haven't encountered issues with it; if you do, file an issue on GitHub!"
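For context, here is a minimal sketch of how the existing parser library is used, based on the RobotsMatcher API documented in the google/robotstxt repository. Note this shows the long-standing matching interface, not the new export/reporting library Gary describes above; that library's exact interface isn't quoted here, so check the repository for its headers.

```cpp
// Minimal sketch: checking whether a user agent may crawl a URL,
// using Google's open source robots.txt parser
// (https://github.com/google/robotstxt).
#include <iostream>
#include <string>

#include "robots.h"  // googlebot::RobotsMatcher from the robotstxt repo

int main() {
  // Example robots.txt body; "FooBot" is a hypothetical crawler name.
  const std::string robots_body =
      "user-agent: FooBot\n"
      "disallow: /private/\n";
  const std::string url = "https://example.com/private/page.html";

  googlebot::RobotsMatcher matcher;
  // Returns true if the given user agent is allowed to crawl the URL
  // according to the robots.txt body passed in.
  const bool allowed =
      matcher.OneAgentAllowedByRobots(robots_body, "FooBot", url);

  std::cout << (allowed ? "allowed" : "disallowed") << std::endl;
  return 0;
}
```

The new capability, as Gary describes it, layers on top of this: rather than only answering allow/disallow questions, the parser can now export information about how the robots.txt body was parsed, which is what Search Console's reporting relies on.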
When Google first released this parser in 2019, it wrote that it "open sourced the C++ library that our production systems use for parsing and matching rules in robots.txt files. This library has been around for 20 years and it contains pieces of code that were written in the 90's. Since then, the library evolved; we learned a lot about how webmasters write robots.txt files and corner cases that we had to cover for, and added what we learned over the years also to the internet draft when it made sense."
Forum discussion at LinkedIn.