A Content Marketer’s Guide to Managing Duplicate Content

400 hours of video content are uploaded to YouTube every minute, and more and more content is being produced daily on sites big and small. As the volume grows, search engines like Google are fighting to keep up and provide the most relevant content on their results pages for the 6 billion searches made each day.

There’s just too much content out there to be crawled, so search engines are pushing themselves to be smarter, creating rules for their bots to grade, praise and penalize online content. Original, relevant content is the golden ticket to online success, and on the opposite end is duplicate content.

What is Duplicate Content?

Simply put, duplicate content is content that appears at more than one URL anywhere on the web. And I do mean anywhere: it could be duplicated within your own site, or one version could live on your site and another on someone else's.

Is Duplicate Content Bad?

No and yes—it’s complicated.

NO: Search engines don’t actually give you a penalty for duplicate content.

YES: It forces Google to guess which piece of content should be the main authority, leaving much to chance. Or worse, the search engines could be so unsure which piece is best that they de-prioritize all of your versions in favor of a single competitor version on their search results pages.

The search engines want to provide the best, most relevant content to their audience, so if they aren’t sure what content of yours should be served, they lose confidence and trust in your site. Try not to let this happen!

How Duplicate Content Is Created

Below are a few examples of actions that lead to content duplication online. Not all of them are a bad thing: syndicated content, for example, increases brand awareness by expanding your digital footprint.

Created on Purpose for Accessibility:

Often, duplicate content is the result of a webmaster or editor wanting to make their site look more robust. They take content living in a single category and allow it to be found in multiple categories, sometimes by just checking a box in their user-friendly CMS. The result is the exact same content at different URLs, which forces search engines to determine on their own which URL should have the most authority.
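If you can export your pages as URL and body-text pairs, a quick way to spot this kind of duplication is to group URLs by a fingerprint of their text. What follows is only a minimal sketch: the URLs and the `pages` dictionary are made up for illustration, and it only catches copies that are identical once whitespace and letter case are ignored, not near-duplicates.

```python
import hashlib
from collections import defaultdict


def normalize(text):
    """Collapse whitespace and lowercase so trivial differences don't hide a match."""
    return " ".join(text.split()).lower()


def find_duplicates(pages):
    """Group URLs whose body text is identical after normalization.

    `pages` maps URL -> body text. Returns a list of URL groups,
    keeping only groups that contain more than one URL.
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical export: the same article filed under two categories.
pages = {
    "https://www.maindomain.com/news/launch": "Our new product launches today.",
    "https://www.maindomain.com/blog/launch": "Our new   product launches today.",
    "https://www.maindomain.com/about": "About our company.",
}
duplicate_groups = find_duplicates(pages)
```

Each group it returns is a set of URLs that search engines would have to choose between.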

Duplicate content has also been cropping up a lot lately as more and more companies make the move from http to https for the SEO value. If both versions of the site stay live, every page exists at two URLs, so it can be "one step forward, two steps back" if you don't set things up correctly.

Warning: The same issue could also be happening if your site can be found on both maindomain.com and www.maindomain.com. For these types of issues, I recommend creating 301 redirects so that only a single version of your site can be seen (pick either www or non-www, and either http or https). In my opinion, there's no need to keep every version live and rely on a canonical tag here.
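To make the redirect rule concrete, here is a rough sketch of the decision behind it: given any incoming URL, work out the one preferred URL it should 301 to. The preferred scheme and host below are assumptions chosen for illustration (https and www), the function name is hypothetical, and a real site would normally do this in its server or CDN configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed choices for this example: pick whichever version you prefer.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.maindomain.com"
KNOWN_HOSTS = {"maindomain.com", "www.maindomain.com"}


def redirect_target(url):
    """Return the URL to 301-redirect to, or None if no redirect is needed.

    None is returned both for URLs already on the preferred scheme and host
    and for URLs on hosts we don't manage. Ports are ignored in this sketch.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host not in KNOWN_HOSTS:
        return None
    if parts.scheme == PREFERRED_SCHEME and host == PREFERRED_HOST:
        return None
    # Keep path, query string, and fragment; swap only scheme and host.
    return urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path or "/", parts.query, parts.fragment)
    )


target = redirect_target("http://maindomain.com/blog?page=2")
```

All four combinations of http/https and www/non-www end up at the same address, so only one version of each page remains visible.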

Separate Mobile Sites:

Another way duplicate content gets created by accident is when a company runs both a maindomain.com site and an m.maindomain.com site. Both show the same content, just with different layout dimensions.

Digital Presence Wanted Globally:

You want to leverage content that performs well in the U.S. on your UK site. If you simply copy and paste the strong content from one site to the other, it’s duplicate content because the same material lives on two separate URLs.

Syndicated Content:

In this pay-to-play world, it's not uncommon for a piece of content to exist on multiple URLs. Even if the duplicate content isn't contained within your site, having content on several URLs living on different domains can lead to problems, with link and ranking signals being divided across several variations of the same piece of content. You're forcing a search engine to determine which site to serve, and if your content exists on a site with more domain authority or a lower spam score, your site (the one with the original content) may well not get served on the search results page.

How to Remove Duplicate Content From Your Site

After checking for the instances above, did you find duplicate content on your site? If so, what can you do about it? Here are some recommendations.

Leverage Canonical Tags:

A canonical tag (rel=canonical) is a line of code you put in the <head> of your page to tell search engines that you know you're creating duplicate content and to give them guidance as to which page should be the main page that collects all authority and rank. Search engines that support the tag treat it as a strong hint and will generally consolidate link and rank signals for all variations of this content onto the single URL you've chosen.

You must add the canonical tag to all pages that have this duplicate content, including the canonical page you've selected. You should also ask your native advertising partners, influencers, and content syndication networks to include canonical tags on their pages as well. Google's documentation discusses canonicals in depth.

Steps to Install a Canonical Tag:

  1. Locate all duplicate content
  2. Decide which page will be the master page
  3. Put the below <link> tag in the <head> of all pages to assign which page should receive the authority

<link rel="canonical" href="https://www.maindomain.com/master-page-chosen" />

Note: Make sure you specify the absolute URL (the full address) rather than a relative path (a partial one). For example, note how the above is the full URL and not simply /master-page-chosen.
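If your templates are generated by code, the tag can be built in one place so the absolute-URL rule is enforced automatically. This is just a sketch under that assumption: `canonical_tag` is a hypothetical helper, not part of any CMS, and it simply refuses anything that isn't a full http or https URL.

```python
from html import escape
from urllib.parse import urlsplit


def canonical_tag(master_url):
    """Build a rel=canonical <link> tag, rejecting relative paths."""
    parts = urlsplit(master_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("canonical URL must be absolute: " + repr(master_url))
    # Escape the URL so characters like & are safe inside the attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(master_url, quote=True))


tag = canonical_tag("https://www.maindomain.com/master-page-chosen")
```

Calling it with /master-page-chosen raises an error instead of silently producing a relative canonical.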

Implementing a Proper Geo-targeting Tag:

To avoid creating duplicate content living on sites created for different countries (like .com, .ca and .uk), you must use correct geo-targeting tags so the search engines understand what you’re doing. A canonical tag should not be used in this instance.

Place an hreflang tag on each page to tell the search engines which language and region that page targets. Each version of the page should carry an hreflang tag for every language/region variant, including itself. This not only ensures correct indexing but also avoids problems with duplicate content, because the search engines can see that each page specifically targets a region/language. An example is listed below.

<link rel="alternate" href="http://example.com" hreflang="en-us" />
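Since hreflang annotations are generally expected to list every regional version of a page, it can help to generate the whole set from one mapping and paste the same set into each version. The sketch below assumes three made-up regional URLs and a hypothetical `hreflang_tags` helper. Swap in your own language-region codes and addresses.

```python
from html import escape


def hreflang_tags(alternates):
    """Build one rel=alternate <link> tag per language-region code.

    `alternates` maps an hreflang code (e.g. "en-us") to the absolute URL
    of the page version that targets it. Tags are returned sorted by code.
    """
    return [
        '<link rel="alternate" href="{}" hreflang="{}" />'.format(
            escape(url, quote=True), escape(code, quote=True)
        )
        for code, url in sorted(alternates.items())
    ]


# Hypothetical regional versions of the same page.
alternates = {
    "en-us": "https://www.maindomain.com/page",
    "en-gb": "https://www.maindomain.co.uk/page",
    "en-ca": "https://www.maindomain.ca/page",
}
tags = hreflang_tags(alternates)
```

The same list of tags would go in the <head> of all three versions.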

It’s also important to set the country code of your site within your Google Search Console account.

–––

It’s a new year, and as we all set rules, start diets and prepare for the best year ever, one of your goals should be gaining a better understanding of the online rules of content. Do your part to make sure you aren’t accidentally creating duplicate content on your site(s).

#SavingTheWorldOneBlogPostAtATime
