Duplicate content has been known to cause many sleepless nights as people worry about receiving a dreaded Google penalty, or fret over just how much duplicate content is acceptable.
However, there are also plenty of myths and misunderstandings around duplicate content, and a few things you can do to fix it. Let’s take a deeper dive into finding and solving issues with a duplicate content checker.
What is duplicate content?
The most important question you could ask: what actually is duplicate content? Essentially, it’s anything that isn’t unique content – and amazingly, according to Google Search Central, it makes up as much as 30% of the internet. Duplicate content can be a direct copy of a page, which is either republished by its original author or copied (with or without permission) by another site. But importantly, it can also be a near-direct copy, where a couple of small tweaks have been made but the basic content is the same.
What can happen if you have duplicate content?
Many people believe that the first thing that happens when they create duplicate content (most often accidentally) is a penalty. This is most likely issued by Google, but could also come from another search engine or platform. The thing is, Google prefers unique content and does its best to blacklist sites that engage in deceptive or spammy practices. And even though your content might be the original or posted in good faith, the duplicate might start to outrank it and cause Google to penalize your site. One search engine penalty can take away years of hard SEO work, and restoring your website to its rankings can be a long process – or even impossible. So, as you can see, the fear is real.
That said, there’s some good news. Google has repeatedly said it doesn’t impose a duplicate content penalty UNLESS it has good reason to believe the copied content has been created in bad faith (i.e. spam). However, some other issues can affect your SEO:
- Scraped (illegally copied) content may start outranking your original in search results
- Too much duplicate content uses up Google’s crawl budget, so your site may not get fully indexed
- Your number of backlinks may decrease, as duplicate content is referred to instead
- Unfriendly (fake) URLs can put off users and search engines, thereby lowering your organic traffic
In short, duplicate content isn’t good news for your SEO. But sometimes it happens naturally – so how much is okay?
How much duplicate content is acceptable?
This is a hard one to gauge, but a small amount of duplicate content is acceptable. After all, it can happen easily:
- Ecommerce sites with product pages may create duplicate versions with only small differences, e.g., garment color or size
- Written a guest post and republished it on your own blog? Yep, that’s a duplicate.
- Created several specific pages for local SEO, with very similar content but some geographic info swapped? Again, these can be indexed as duplicates.
- Even different aspects of your URLs (https vs. http; mobile- and print-friendly URLs), paginated blog comments, and the use of tags or categories can result in duplicate content.
Most of the time, you probably don’t even know you’re creating it. So, how can you find and consolidate those duplicated pages?
How to fix duplicate content issues
There are two simple steps to the process: find and fix. First up, you’ll need to make use of a duplicate content checker. There are plenty of tools out there, and most SEO suites offer one as standard. If you aren’t signed up to a service like Ahrefs or SEMRush, you could consider using one of the following:
- Google Search Console
Not only do these allow you to spot duplicate content you may have accidentally created, but they also find websites that have scraped (plagiarized, or illegally republished) your content. You can then take appropriate action.
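To get a feel for how a checker might spot near-duplicates, here is a minimal Python sketch. It assumes a simple approach: break each text into overlapping three-word chunks ("shingles") and measure how many chunks the two texts share (Jaccard similarity). The function names and sample texts are made up for illustration, and commercial tools may well work differently.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts are treated as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are trivially identical
    return len(a & b) / len(a | b)


# Hypothetical product descriptions that differ by a single word.
original = "Our blue cotton shirt is soft, durable and machine washable."
variant = "Our red cotton shirt is soft, durable and machine washable."
unrelated = "Read our guide to choosing the right hiking boots for winter."

print(round(similarity(original, variant), 2))
print(round(similarity(original, unrelated), 2))
```

A score close to 1.0 suggests a near-direct copy, while a score close to 0.0 suggests unrelated pages. Where to draw the line is a judgment call, so treat any threshold as an assumption to tune on your own content.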
Once you’ve run the checker, you can start fixing issues to minimize your number of duplicates. First up, insert a canonical tag (a rel="canonical" link element in the page's head) on the duplicate page, pointing to the original. This tells Google which version you prefer and is a strong hint, though not a guarantee, that the search engine should index and rank the original rather than the copy. This is preferable to simply blocking the offending page.
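If you want to confirm that a page actually declares a canonical URL, a short script can check for you. The sketch below uses Python's built-in html.parser module to pull the canonical link out of a page's HTML. The class name, function name, sample page, and example.com URL are all hypothetical, and a real audit would also need to fetch the pages first.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one is kept.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical duplicate product page that points back to the original.
page = """
<html>
  <head>
    <title>Blue cotton shirt</title>
    <link rel="canonical" href="https://www.example.com/shirts/cotton-shirt">
  </head>
  <body>...</body>
</html>
"""

print(find_canonical(page))
```

A result of None means the page declares no canonical URL, which makes it a candidate for the fix described above.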
You could also choose to delete unnecessary duplicates, and streamline your use of tags and categories so that fewer duplicates are created in the first place. And, where possible, use 301 redirects, which are particularly important if your website can be found under both http and https.
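Redirects are normally configured on your web server or in your CMS, but the underlying logic is simple. As a rough illustration, this Python sketch works out where an http URL should be permanently redirected. The function name and example.com URLs are hypothetical, and it assumes the https version lives at the same host and path with no explicit port in the URL.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url: str):
    """Return the https URL to 301-redirect to, or None if none is needed."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None  # already https (or not a web URL), so leave it alone
    # Assumption: same host, path, and query string, with only the scheme swapped.
    return urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))


for requested in ("http://example.com/blog?page=2", "https://example.com/blog?page=2"):
    target = redirect_target(requested)
    if target:
        print(f"301 {requested} -> {target}")
    else:
        print(f"200 {requested}")
```

With a rule like this in place, both versions of a URL lead visitors and search engines to a single page instead of two competing copies.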
Finally, if you’ve found other websites scraping your content, you can contact them and ask them to remove it or insert a canonical tag. As a last resort, you can request that Google take down the page – though remember, if it’s an inferior-quality website known for scraping, the search engine will likely catch and penalize it anyway.
Duplicate content can be a challenge, especially if you work hard on your SEO. But don’t lose sleep over a potential penalty – these are fortunately rare. Instead, put a checker to good use and fix those issues, and all your unique content can shine again.