There is a nice new forum thread at WebmasterWorld titled "Google is refusing to act on DMCA notices," where a member reports that Google declined his request, made under the DMCA, to remove a page from its index. Further down the thread, the thread creator writes:
The sites are all scraper sites of one kind or another. Most are pulling 2-4 sentences of content from my sites (about 40-50 words of content). This content is then listed alongside content scraped from 10-12 other sites and capped with a block of AdSense ads to monetize it.
It's probably not enough duplicate text to have any effect on my rankings, but it's pretty annoying to me that both spammers and Google are knowingly making money off my original content.
From this statement you can see that bits and pieces were "scraped" from his site and put together to make a highly targeted AdSense page. This is nothing new; it is done all the time, and it works. It is extremely hard for Google to detect this type of content automatically, since it is far from mathematically "duplicate," and it is not reasonable to expect Google to step in manually except in embarrassing cases. I like how member walkman put it: "don't blame Google then. Copyright is not absolute. 50 words is nothing and easily falls within fair use."
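To make the "far from mathematically duplicate" point concrete, here is a rough sketch using word shingles and Jaccard similarity, a common textbook way to measure near-duplicate text. It is only an illustration: nothing in the thread says how Google actually detects duplicates, and the page sizes (a 500-word original, 45-word excerpts from 12 sources) and all names below are assumptions chosen to match the numbers the thread creator describes.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical original page: 500 distinct placeholder words.
original_words = [f"orig{i}" for i in range(500)]
original_text = " ".join(original_words)

# Hypothetical scraper page: a 45-word excerpt from the original,
# followed by 45-word excerpts from 11 other (made-up) sources.
scraped_words = original_words[100:145]
for source in range(1, 12):
    scraped_words += [f"src{source}_{i}" for i in range(45)]
scraped_text = " ".join(scraped_words)

similarity = jaccard(shingles(original_text), shingles(scraped_text))
print(f"Jaccard similarity: {similarity:.3f}")
```

Under these assumptions the two pages share only a few percent of their shingles, so a threshold tuned to catch copied pages would pass right over the scraper page, even though every word on it was taken from somewhere else.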