As part of the Search Off The Record podcast from Google on crawling, which we briefly covered on Friday, Gary Illyes from Google said he is investigating ways for Google to handle URL parameters better.
Google's John Mueller asked Gary Illyes, "What other kinds of optimizations do you see happening with regards to crawling?"
Gary responded, "Maybe better URL parameter handling."
URL parameters are the extra key-value pairs appended to the end of a URL, often used for tracking but also for dynamically generating pages, among other things. They can cause Google to crawl an infinite space, as with calendar pages that go on and on forever.
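To make that concrete, here is a minimal Python sketch of what "parameters at the end of a URL" look like to a program, and how tracking-style parameters could be dropped to get back to one main URL. The parameter names in `TRACKING_PARAMS` and the `example.com` URLs are assumptions for illustration only; nothing here reflects how Google actually handles parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of parameters assumed not to change page content.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def strip_params(url, ignored=TRACKING_PARAMS):
    """Return the URL with every parameter named in `ignored` removed."""
    parts = urlsplit(url)
    # parse_qsl turns "a=1&b=2" into [("a", "1"), ("b", "2")].
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignored
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_params("https://example.com/shoes?color=red&utm_source=news"))
```

Parameters that do change the page, like `color` in this made-up URL, are kept, while the ones assumed to be tracking-only are removed.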
John then asked Gary what he meant: "Something like the URL Parameter handling tool that we used to have, more in a protocol format where you say, 'This parameter is optional'?" Gary responded, "Oh, that's a good idea."
The issue is that Google needs to crawl all the URL variations to learn the parameters before it knows what to canonicalize to the main URL. Gary said, "We basically have to crawl first to know that something is different, and we have to have a large sample of URLs to make the decision that, 'Oh, these parameters are useless.'"
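The "crawl first, then decide" problem Gary describes can be illustrated with a toy Python sketch. It assumes a crawler has already collected a sample of URLs along with some fingerprint of each page's content, and it flags a parameter as useless when changing its value never changes the fingerprint. The function name `find_useless_params`, the sample data and the fingerprint labels are all hypothetical; this is not Google's method, just one simple way to picture why a large sample is needed.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def find_useless_params(samples):
    """samples: iterable of (url, content_fingerprint) pairs from a crawl.

    A parameter is reported when, for every group of URLs that differ only
    in that parameter's value, all pages share the same fingerprint.
    """
    # groups[param][(path, other_params)] -> values and fingerprints seen
    groups = defaultdict(
        lambda: defaultdict(lambda: {"values": set(), "fingerprints": set()})
    )
    for url, fingerprint in samples:
        parts = urlsplit(url)
        params = parse_qsl(parts.query, keep_blank_values=True)
        for i, (name, value) in enumerate(params):
            rest = tuple(sorted(p for j, p in enumerate(params) if j != i))
            group = groups[name][(parts.path, rest)]
            group["values"].add(value)
            group["fingerprints"].add(fingerprint)

    useless = set()
    for name, by_key in groups.items():
        # Only groups where the value actually varied tell us anything.
        varied = [g for g in by_key.values() if len(g["values"]) > 1]
        if varied and all(len(g["fingerprints"]) == 1 for g in varied):
            useless.add(name)
    return useless


crawl_sample = [
    ("https://example.com/shoes?color=red&sid=1", "page-A"),
    ("https://example.com/shoes?color=red&sid=2", "page-A"),
    ("https://example.com/shoes?color=blue&sid=1", "page-B"),
    ("https://example.com/shoes?color=blue&sid=3", "page-B"),
]
print(find_useless_params(crawl_sample))
```

In this made-up sample, `sid` never changes the content while `color` does, so only `sid` is flagged. With fewer sampled URLs there would be no group where a value varied, and no decision could be made, which is the cost Gary is pointing at.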
Google had a URL parameter tool in Search Console but it decommissioned it back in April 2022. The reason Google got rid of it was "because it was not used," Gary said.
So what can Google do better? Gary said, "If someone is complaining that we are over crawling them because they have one of these weird URL spaces with an infinite number of URL parameters, then we could just tell them that, 'Okay, use this method to block that URL space.'"
Gary added that controls like "robots.txt could be used" for this, or rules such as ignoring anything after a given symbol, or a combination of all of that.
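As a rough illustration of blocking a parameter-driven URL space with robots.txt today, the Python sketch below checks paths against wildcard `Disallow` patterns of the kind Google documents (`*` matches any run of characters, a trailing `$` anchors the end). The matcher is a simplified stand-in written for this example, not Google's parser, and the `/*?*month=` rule and calendar URLs are hypothetical.

```python
import re

# Hypothetical robots.txt rule a site owner might add:
#   User-agent: *
#   Disallow: /*?*month=
DISALLOW_PATTERNS = ["/*?*month="]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each "*" match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path_and_query, patterns=DISALLOW_PATTERNS):
    """True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path_and_query) for p in patterns)


for candidate in ["/calendar?month=2030-01", "/calendar", "/calendar?view=list"]:
    print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```

A real robots.txt file also involves `Allow` rules and precedence between matching rules, which this sketch leaves out.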
Here is the podcast; the discussion starts at about the 22:42 mark:
Forum discussion at X.