A thread over at Cre8asite Forums titled "New kind of spider is in town" links to a Wired article, "Covert Crawler Descends on Web." In short, the article describes a new kind of spider designed to crawl the Web in as human-like a manner as possible.
How Does It Work?
The program comes from different internet addresses, simulates different browsers and throttles itself to human-like speeds... Hoffman's program downloads everything that comes with a page -- images, JavaScript and components like ActiveX and Flash -- instead of just hitting the page itself like traditional spiders do. It also simulates a full web browser, keeping a cache and requesting only new material... To select which links to click on, Hoffman has settled on a solution somewhere between a masterful AI and completely random selection. "In some ways it's a very simplified Turing test -- you can assign the different threads a personality. This crawler, you're the slow reader, you read the entire page." Another thread may spend less time on a page before it starts clicking on different links. "Each individual crawler has its own browser habits," he added.
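To make the description concrete, here is a minimal sketch of what one such "human-like" crawler thread might look like: a rotating User-Agent, randomized dwell times standing in for the thread's "personality," fetching the embedded assets a real browser would pull down, and a simple ETag cache so only new material is re-requested. This is illustrative only, not Hoffman's actual program; the user-agent strings, timing ranges, and starting URL are all assumptions.

```python
# Illustrative sketch of a "human-like" crawler thread (not Hoffman's code).
import random
import time
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin

USER_AGENTS = [  # hypothetical browser strings to rotate through
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]

class PageParser(HTMLParser):
    """Collect embedded assets (img/script src) and outbound links (a href)."""
    def __init__(self):
        super().__init__()
        self.assets, self.links = [], []
    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.assets.append(attrs["src"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

def fetch(url, agent, etags):
    """GET with a conditional request; returns the body, or None on 304/error."""
    req = urllib.request.Request(url, headers={"User-Agent": agent})
    if url in etags:
        req.add_header("If-None-Match", etags[url])
    try:
        with urllib.request.urlopen(req) as resp:
            if resp.headers.get("ETag"):
                etags[url] = resp.headers["ETag"]
            return resp.read()
    except urllib.error.URLError:
        return None  # a 304 Not Modified also lands here, via HTTPError

def crawl(start_url, dwell=(5.0, 30.0), max_pages=10):
    """One thread's 'personality': dwell controls how slowly it 'reads'."""
    etags, agent, url = {}, random.choice(USER_AGENTS), start_url
    for _ in range(max_pages):
        body = fetch(url, agent, etags)
        if body is None:
            return
        parser = PageParser()
        parser.feed(body.decode("utf-8", errors="replace"))
        for asset in parser.assets:          # pull images, scripts, etc.
            fetch(urljoin(url, asset), agent, etags)
        time.sleep(random.uniform(*dwell))   # "read" the page like a human
        if not parser.links:
            return
        url = urljoin(url, random.choice(parser.links))  # then click a link

crawl("http://example.com/")
```

A "slow reader" thread would simply get a wider dwell range than a fast one, which is all the quoted "personality" assignment really requires.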
Barry Welford calls this spider "somewhat scary," and I agree. Ron Carnell has it right: "any robot that doesn't ask for and then follow robots.txt is, by definition, unethical." Ron also gives a technique you can use to track and then block this type of bot.
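Ron's exact technique is in the thread; as one illustration of the general idea (my own sketch, not necessarily his method), a script can flag clients that request many pages without ever touching /robots.txt. The log path, log format, and thresholds below are assumptions; flagged IPs could then be denied in .htaccess or a firewall.

```python
# Sketch: flag IPs that crawl pages but never request /robots.txt,
# reading a standard Apache/nginx combined-format access log.
import re
from collections import defaultdict

LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (\S+)')
ASSET_EXTS = ("css", "js", "png", "gif", "jpg")  # skip static assets

def suspicious_ips(log_path="access.log", min_pages=20):
    pages = defaultdict(int)   # ip -> count of page requests
    saw_robots = set()         # ips that ever fetched /robots.txt
    with open(log_path) as f:
        for line in f:
            m = LOG_LINE.match(line)
            if not m:
                continue
            ip, path = m.groups()
            if path.startswith("/robots.txt"):
                saw_robots.add(ip)
            elif path.rsplit(".", 1)[-1] not in ASSET_EXTS:
                pages[ip] += 1
    # heavy requesters that never asked for robots.txt look like covert bots
    return [ip for ip, n in pages.items()
            if n >= min_pages and ip not in saw_robots]

for ip in suspicious_ips():
    print(ip)  # candidates to investigate or block
```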
Forum discussion at Cre8asite Forums.