Google's Aurora Morales, in her latest video, spoke about duplicate content from the perspective not just of organic search but also of publishers who want to monetize their websites. She started off by saying that your site needs to add value, more value than what users can find elsewhere.
Google may limit or disable Google AdSense ads on your pages if your content is duplicative or low value. Google's own AdSense guidelines disallow placing AdSense ads on "sites with scraped or copyrighted content." Examples include:
- Sites that copy and republish content from other sites without adding any original content or value
- Sites that copy content from other sites, modify it slightly (for example, by substituting synonyms or using automated techniques), and republish it
- Sites dedicated to embedding content such as video, images, or other media from other sites without substantial added value to the user
She said to make sure your content is not copied or duplicated from other sites, or even internally within your own website. If you have many similar pages, she said to consider either expanding them or consolidating them into a single page.
Then on the "organic" side, she explained that duplicate content is sometimes, and maybe often, not meant to be deceptive by the site owner. A specific case is when several URLs serve the same content and need to be canonicalized. She said this technical duplication "is fine," but she explained it can cause issues for crawling and indexing in some cases. She also explained that these issues can arise with printer-friendly pages or category pages with snippets. All of this is fine, but it might leave Google not knowing which page to show in search. Google recommends using canonicalization to communicate to Google which page you want in search.
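To make the canonicalization idea concrete, here is a minimal Python sketch of how a site might collapse URL variants (a printer-friendly version, tracking parameters) to one preferred URL and emit a `rel="canonical"` link tag for the page head. It is not from the video. The function names `canonical_url` and `canonical_link_tag` and the `IGNORED_PARAMS` list are hypothetical, and which parameters are safe to ignore depends entirely on your own site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list: query parameters assumed NOT to change the page content.
# Adjust this for your own site; these names are illustrative only.
IGNORED_PARAMS = {"print", "utm_source", "utm_medium", "utm_campaign", "sessionid"}


def canonical_url(url: str) -> str:
    """Return one preferred URL for all variants of the same page."""
    parts = urlsplit(url)
    # Keep only parameters that change the content, in a stable sorted order.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    # Lowercase scheme and host, drop the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), "")
    )


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag for the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    print(canonical_link_tag("https://Example.com/shoes?print=1&color=red"))
```

With this sketch, both the regular page and its printer-friendly variant would carry the same canonical tag, which is the signal Google recommends for indicating the page you want shown in search.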
She also explained how to report sites that copy your content, including through DMCA requests.
Here is the video:
Here is the video from John Mueller she referenced above:
Forum discussion at Twitter.