Google has added a support document for the web crawler it uses for Duplex, the Google Assistant voice feature that can hold conversations with people. The bot's user agent is named DuplexWeb-Google, and it is now part of Google's documented family of Googlebot crawlers.
Google wrote that "DuplexWeb-Google is the user agent that supports the Duplex on the web service."
Here is how it crawls:
- Services that use the DuplexWeb-Google user agent will not make purchases or perform any other significant actions when crawling your site.
- Crawls by the DuplexWeb-Google user agent occur anywhere from a few times a day to a few times an hour, depending on the feature being trained, but these runs are throttled so they do not overload your site or disrupt your traffic.
- Crawls by the DuplexWeb-Google user agent are not used by Google Search for indexing. Because they are not used for indexing, the DuplexWeb-Google user agent does not recognize the noindex directive.
- Google Analytics does not record page requests made by the DuplexWeb-Google user agent during crawling and analysis.
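Since Google Analytics won't show these requests, your server access logs are the place to look for them. Here is a minimal sketch of filtering log lines for the crawler; the only detail taken from Google's documentation is the user agent token "DuplexWeb-Google" — the sample log lines and the full user agent strings are invented for illustration.

```python
# Minimal sketch: spot DuplexWeb-Google requests in an access log.
# Only the "DuplexWeb-Google" token comes from Google's documentation;
# the sample log lines below are made up.

def is_duplexweb_hit(log_line: str) -> bool:
    """Return True if the log line's user agent contains the DuplexWeb-Google token."""
    return "DuplexWeb-Google" in log_line

sample_log = [
    '203.0.113.5 - - "GET /reservations HTTP/1.1" 200 "Mozilla/5.0 ... DuplexWeb-Google"',
    '198.51.100.7 - - "GET /menu HTTP/1.1" 200 "Mozilla/5.0 (Windows NT 10.0)"',
]

duplex_hits = [line for line in sample_log if is_duplexweb_hit(line)]
print(len(duplex_hits))  # -> 1
```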
To block it, Google said, "you must explicitly block the DuplexWeb-Google user agent using the Disallow robots.txt rule to prevent it from crawling your site." The DuplexWeb bot follows the robots.txt ruleset, with these caveats:
- When Duplex on the web is enabled using Search Console (the default), the DuplexWeb-Google user agent ignores the Disallow rules in the * wildcard user agent groups.
- When Duplex on the web is disabled using Search Console, the DuplexWeb-Google user agent respects Disallow rules in the * wildcard user agent groups.
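Given the caveats above, a wildcard Disallow alone may be ignored while Duplex on the web is enabled, so the safe way to opt out is an explicit user agent group naming the bot, along the lines of:

```
User-agent: DuplexWeb-Google
Disallow: /
```

This example blocks the entire site; adjust the Disallow path if you only want to keep the crawler away from specific sections.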
Just to be clear, Duplex itself is not new; it has been around since 2018 or earlier. But I had never seen any details on a spider/bot for Duplex before. I am sure it has been out there, but I never noticed it.
Forum discussion at Twitter.