I have been corresponding with Microsoft about the odd, spam-like referrals that Live Search has been sending to the log files of hundreds, if not thousands, of web sites. On September 6th, a Microsoft representative confirmed that these were actual tests being conducted by the Live Search team - but did not expand upon that. Because the tests have continued through today, Microsoft has now shared more details with us about what exactly is being tested.
The answer is that Microsoft was testing for cloaking. A post by the Live Search team, titled "Live Search and Cloaking Detection" and scheduled to go live at 3PM (EST), has all the details. In short, Microsoft explains that "one of these tools is an extension to MSNBot, giving us an additional way to detect cloaking."
But as I reported three times in the past by way of the WebmasterWorld thread, these tests were wreaking havoc on log files, raising concern and questions about where these referrals were coming from and why. So the answer is that they come from a form of MSNBot used to detect cloaking.
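For webmasters who want to gauge how much of this traffic reached their own logs, here is a minimal sketch in Python that tallies the search keywords found in referrers. It rests on several assumptions that do not come from Microsoft: that the log uses the common Combined Log Format, that the referrals carry a referrer on the host `search.live.com`, and that the keyword sits in a `q` query parameter. The function name and the sample log lines are invented for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumption: the test referrals use this host in the referrer URL.
ASSUMED_REFERRER_HOST = "search.live.com"

# Combined Log Format tail: "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"')


def count_referral_keywords(lines, host=ASSUMED_REFERRER_HOST):
    """Count the search keywords in referrers that point at `host`."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # not a Combined Log Format line
        url = urlparse(match.group("referrer"))
        if url.hostname != host:
            continue  # referrer is empty ("-") or from another site
        # Assumption: the keyword is carried in the "q" parameter.
        for keyword in parse_qs(url.query).get("q", []):
            counts[keyword.lower()] += 1
    return counts


# Invented sample lines, for illustration only.
sample = [
    '1.2.3.4 - - [06/Sep/2007:10:00:00 +0000] "GET / HTTP/1.1" 200 512 '
    '"http://search.live.com/results.aspx?q=cheap+pills" "Mozilla/4.0"',
    '1.2.3.5 - - [06/Sep/2007:10:00:05 +0000] "GET /about HTTP/1.1" 200 734 '
    '"-" "Mozilla/4.0"',
]
keyword_counts = count_referral_keywords(sample)
```

Keywords that have nothing to do with your site's content and rank near the top of the tally are the likely candidates for the test traffic described above.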
Microsoft has now promised that this MSNBot will not impact your AdSense/Overture reporting, will not statistically impact your site statistics with unfilterable bot traffic, and will not continue to "pollute" your HTTP logs with inappropriate terms (spam keywords). Microsoft also says it will respond to your questions in their forum or via this form.
I asked Microsoft a few questions about their announcement. Here is the Q&A:
Q: How have you come up with a way "to optimize the crawler and most webmasters should notice the referrer traffic dropping to almost nothing over the next month"? Will traffic still be inflated overall, but you won't pass referral data?
A: Webmasters might still see some referral data being passed, but the keywords will be relevant to their sites, and it should not be statistically significant for any sized website. If webmasters are continuing to see issues, we recommend they contact us through our forums or feedback form.
Q: Why did Microsoft use spammy keywords as the referrer data in your initial tests?
A: We were using a common list of terms to test against all websites when we first launched this tool. We have now optimized the tool to use only keywords that are relevant to your website.
Q: Have you consulted with Google or Yahoo on how they handle the cloaking issue? If so, can you provide any details on that? Would you say you handle the cloaking issues the same way?
A: There are some commonalities in how all the major search engines detect cloaking, however, we can only comment on our own system.
Forum discussion continued at WebmasterWorld.