Below are a few examples of actions that lead to duplicate content online. Not all of them are a bad thing: syndicated content, for example, increases brand awareness by expanding your digital footprint.
Created on Purpose for Accessibility:
Often, duplicate content is the result of a webmaster or editor wanting to make their site look more robust. They take content living in a single category and allow it to be found in multiple categories, sometimes just by checking a box in their user-friendly CMS. The result is the exact same content under different URLs, which forces search engines to decide on their own which URL should have the most authority.
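To make the "same content, different URLs" problem concrete, here is a minimal sketch of how you might spot it in a crawl export. It assumes you already have each page's body text in a dictionary; the URLs, the sample text, and the function name `group_duplicate_urls` are hypothetical, and matching on an exact hash only catches identical copies, not near-duplicates.

```python
import hashlib
from collections import defaultdict


def group_duplicate_urls(pages):
    """Group URLs whose body text is identical after normalizing whitespace and case."""
    groups = defaultdict(list)
    for url, body in pages.items():
        # Collapse runs of whitespace and lowercase so trivial differences do not hide a match.
        normalized = " ".join(body.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical crawl data: one article filed under two categories.
pages = {
    "https://maindomain.com/news/widget-launch": "Our new widget ships today.",
    "https://maindomain.com/products/widget-launch": "Our new  widget ships today.",
    "https://maindomain.com/about": "We make widgets.",
}
duplicates = group_duplicate_urls(pages)
```

Each group in `duplicates` is a set of URLs competing for the same authority, which is where you would pick one URL as the preferred version.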
Duplicate content has also been appearing a lot lately as more and more companies move from http to https for the SEO value. However, it can be “one step forward, two steps back” if you don’t set things up correctly, because the same pages can stay reachable under both versions.
Warning: The same issues could also be happening if your site can be found on both maindomain.com and www.maindomain.com. For these types of issues, I recommend creating 301 redirects so that only a single version of your site can be seen (pick either www or non-www, and either http or https). In my opinion, there is no need for all versions to exist and rely on a canonical here.
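As a rough sketch of the redirect rule described above, the function below works out where a request should be 301-redirected so that only one version of the site stays visible. The choice of https and www as the preferred version is an assumption made for illustration (either option works as long as you pick one), and `redirect_target` is a hypothetical helper, not part of any particular server or CMS.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed preferences for this example; pick whichever single version suits your site.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.maindomain.com"
KNOWN_HOSTS = {"maindomain.com", "www.maindomain.com"}


def redirect_target(url):
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in KNOWN_HOSTS:
        return None  # Not one of our hostnames, so leave it alone.
    if parts.scheme == PREFERRED_SCHEME and host == PREFERRED_HOST:
        return None  # Already the preferred version.
    # Keep path, query string, and fragment so deep links survive the redirect.
    return urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path or "/", parts.query, parts.fragment)
    )
```

For example, `redirect_target("http://maindomain.com/blog?page=2")` gives `https://www.maindomain.com/blog?page=2`, while a URL that is already on the preferred version gives `None`. In practice this rule usually lives in your web server or CDN configuration rather than in application code.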
Separate Mobile Sites:
Another reason duplicate content is accidentally created is that companies run both a maindomain.com site and an m.maindomain.com site, which show the same content with different layout dimensions.
Digital Presence Wanted Globally:
You want to leverage content that performs well in the U.S. on your UK site. If you simply copy and paste the strong content from one site to the other, it’s duplicate content because the same material lives on two separate URLs.
Syndicated Content:
In this pay-to-play world, it’s not uncommon for a piece of content to exist on multiple URLs. Even if the duplicate content isn’t contained within your site, having content on several URLs living on different domains can lead to link and ranking signals being divided across several variations of the same piece of content. You’re forcing a search engine to determine which site to serve, and if your content also exists on a site with more domain authority or a lower spam score, it is very likely that your site (the one with the original content) will not get served in the search-results page.