Do you fear appearing twice on Google when someone somewhere hits the ‘search’ button? It could be a quote, a boilerplate, or even product details on your e-commerce site that you copied from a brand.
The fact is, it’s impossible to be 100% free of duplicate content on the internet. However, this doesn’t mean anybody is free to copy and paste the content they see online without getting noticed.
Sure, you’ll still be ranked if you have duplicate content on your site, but your SERP positions will suffer and, eventually, so will your SEO scores. Your best move is to handle it smartly. Many marketers simply state that duplicate content is terrible and move on without reading Google’s guidelines on it, which brings us to one question: how can you deal with duplicate content if you have no idea what it is or how to fix it?
Don’t worry, though: by the time you finish reading this, you’ll know what duplicate content is, how to spot non-original content both internally and externally, and how to find out whether it’s affecting your SEO.
According to Google, duplicate content means substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Most of the time, however, this isn’t deceptive in origin.
Now, it’s pretty straightforward to see that, per Google, there are two sorts of duplicate content:
- On the same domain (internal)
- On several domains (external)
Why Doesn’t Google Encourage SEO Duplicate Content?
Google doesn’t encourage duplicate content. If multiple pieces of appreciably similar text pop up in more than one place, it’s hard for search engines to pick the most relevant version for searchers.
Duplicate Content—What’s the Penalty?
No, you won’t get flagged or penalized for duplicate content. It doesn’t drive Google mad per se, but it sure leaves it puzzled. And since it’s next to impossible to eliminate all duplicate text from blogs, social media, and websites, Google won’t come after you to wreck your ranking (usually).
However, if you copy and paste content that already appears in plenty of places online, Google uses that as a signal to overlook your site. To Google, identical or nearly identical content won’t help searchers as much as high-quality original sites will. The result? Low traffic, fewer leads, and an even lower SERP ranking.
Tip: Always strive for great on-site design and unique content to stay in Google’s good books.
Duplicate Content and SEO
As mentioned, Google doesn’t penalize you for posting duplicate content, but it surely wants you to get it right. Google filters out matching blocks of text and, in some cases, won’t even show your site on the SERP. The reason? Search engines treat such content as spam, especially when it aims to trick or manipulate search results. This means that if your entire site is flooded with copied content, don’t expect Google to rank it above genuine, more actionable content.
Let’s see how duplicate content SEO confuses the search engine:
- It won’t know which version to include or exclude from its indices.
- It won’t know whether to direct link metrics to one page or split them between the different versions.
- It doesn’t know which version is the best to rank for in search results.
How Do Duplicate Content SEO Issues Happen?
Duplicate content exists everywhere; in fact, nearly 30% of the internet consists of duplicate content, and interestingly, many webmasters don’t even intend to create it.
Let’s see some of the common ways site owners create accidental duplicate content.
Types of Duplicate Content on the Same Domain
Boilerplate Content
In simple words, boilerplate is any content that is repeated across different sections or pages within your website: the ‘about us’ section, homepage elements, the navigation bar, the sidebar, and so on. Search bots are smart enough to recognize boilerplate as normal, so it usually isn’t treated as malicious duplication and doesn’t affect your SEO ranking. Still, it’s wise to avoid pages that share identical body text or meta descriptions so that bots never mistake them for duplicates.
That said, nothing beats writing a unique meta description for every page; after all, no one minds more click-throughs.
Always remember these two points if you want to avoid being flagged as ‘duplicate.’
- Each page must include a unique page title and meta description in the HTML.
- All headings and subheadings must also be unique.
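In HTML, that means each page carries its own title and meta description in its `<head>`. Here’s a minimal sketch (the page and wording are hypothetical):

```html
<!-- Unique <title> and meta description for this one page only -->
<head>
  <title>Red Summer Dresses | MySite</title>
  <meta name="description" content="Browse our hand-picked collection of red summer dresses, updated weekly.">
</head>
```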
Parameterized Duplicate Content
Many duplicate content issues arise from URL variations mainly because parameters and URL extensions form multiple versions of the same content.
In most cases, e-commerce websites fall victim to this issue.
For instance, on a fashion brand website, if you have filters to sort products by color, you’ll most probably end up with several URL variants of the same page, such as (hypothetical examples):
- http://mysite.com/dresses
- http://mysite.com/dresses?color=red
- http://mysite.com/dresses?color=red&sort=price
This is called faceted navigation. Simply put: one object, many faces. The pages are almost the same but have differently written URLs, which makes the number of pages grow and ultimately causes more duplicate content issues. Moreover, different users will land on different variants depending on the last filter they used. Remember that there can be many filters, so the number of possible facets can be huge. This dilutes link signals by making every version visible to bots instead of making just one version prominent. The result? A drop in your SERP ranking.
Different URL Structures
A lot of times, organic search experts forget to specify their preferred version of a URL. The variants may look the same to users, but search engine bots treat them as different URLs.
- With www (http://www.mysite.com) and without www (http://mysite.com): two different URLs that contain the same content (www is technically a subdomain).
- HTTP (http://www.mysite.com) and HTTPS (https://www.mysite.com). Both look similar, right? But Google thinks otherwise: they are different URLs with the same content, so bots may detect them as duplicates.
- The same goes for a URL with a trailing slash (http://www.mysite.com/) versus one without it (http://www.mysite.com).
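To make the problem concrete, here is a small Python sketch (an illustration, not Google’s actual logic) that collapses those variants into one canonical form:

```python
from urllib.parse import urlsplit, urlunsplit

def normalize(url):
    """Collapse common URL variants (scheme, www, trailing slash)
    into a single canonical form. A simplified illustration only."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    scheme = "https"                      # prefer HTTPS over HTTP
    netloc = netloc.lower()
    if netloc.startswith("www."):
        netloc = netloc[4:]               # prefer the bare domain
    path = path.rstrip("/") or "/"        # drop trailing slash (keep root "/")
    return urlunsplit((scheme, netloc, path, query, fragment))

variants = [
    "http://www.mysite.com/",
    "http://mysite.com",
    "https://www.mysite.com",
    "https://mysite.com/",
]
print({normalize(u) for u in variants})   # → {'https://mysite.com/'}
```

In practice you don’t normalize URLs yourself this way; you pick one preferred format and enforce it site-wide with redirects and canonical tags.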
Instances of Duplicate Content on Different Domains
Content scraping is when a site owner lifts content from other high-quality sources on the internet to drive more organic traffic. They may:
- Copy/paste exact content
- Steal content from other sites, tweak it a little and publish it as their own
Site owners who scrape content may also use spinning tools and automated programs to “rewrite” the content they lifted from other sites.
However, scrapers usually give themselves away. Scraped content can easily be spotted as some scrapers are too lazy to swap branded keywords in the content.
So, what are the stakes for scraping content? You risk being penalized through Google’s manual actions. Here’s how it works:
- A human reviewer at Google will evaluate pages on your site to see if it conforms to Google’s Webmaster Quality Guidelines.
- If you are caught manipulating Google’s search index, your site’s ranking will either be dropped significantly or Google will remove it completely from the SERPs.
Also, if you notice any traffic drops and want to check if it’s caused by manual action:
- Go to Webmasters Tools in Google Search Console
- Check for warning messages telling you whether you’ve been flagged for “artificial” or “unnatural” links.
Tip: When you’re done fixing all the issues listed in the manual action report, you can ‘Request Review’ to get your site back on the SERPs.
Copying content from someone’s site without their consent is wrong in Google’s eyes, too. If you cannot produce original content for your visitors, hire someone to do it for you, or consider another line of work, because copied content can’t hide from Google’s bots. Your site’s future will be in jeopardy: Google may remove it from search pages or demote it to one of the last SERPs.
Content syndication is growing in popularity in content marketing circles. It’s when you willingly permit other websites or blogs to repost content that was originally published on your site. Don’t confuse it with content scraping, though: your consent doesn’t matter to scrapers.
Many elite websites syndicate content to drive more traffic since it makes their content stand out more and helps them reach a greater audience.
Checking for Duplicate Content
Specific tools can help you spot duplicate content on your site. Use them alongside the tips below to find duplicate content issues with ease.
Use Google Search
An easy way to find duplicate content is to use Google search. Just take a 2-3 sentence piece of text from your site and search for it “in quotes” on Google to see if your content has been plagiarized.
Keep an Eye on Google Webmasters Alerts
You can also use the Google Search Console webmaster tool to get regular alerts about the duplicate text.
Look for Crawler Metrics
The crawler metrics in your Webmasters dashboard also contain valuable insight into the pages crawled by Google. If you find bots crawling everywhere on your website, you may have inconsistent URL references or link text, or you probably need to apply rel=canonical tags.
To check Crawlers:
- Sign in to Google Webmasters account.
- Next, select the ‘Crawl’ option in the left panel to open the expanded menu.
- Now click ‘Crawl Stats’ to see crawler activity.
Use Crawler Tools
You can also use various site audit tools (free and paid) that crawl your site to catch multiple types of non-original content and detect other URL issues.
The above methods will surely come in handy for identifying duplicate content. Now let’s find out how you can fix duplicate content issues.
Standardize Your Link Structure
If you’ve made it this far, you don’t need a recap of how inconsistent URLs are the biggest killer of your SEO gains. The fix is to bring consistency to your link structure and apply canonical tags correctly to the relevant pages on your site.
Unfortunately, Google no longer offers the “preferred domain” setting in Search Console. However, you can still tell Google which version of your domain you prefer through these methods:
- Using a rel="canonical" link tag on HTML pages
- Using a rel="canonical" HTTP header
- Using a sitemap
- Using 301 redirects for retired URLs
Thankfully, to combat content duplication, canonical tags let you organize your content, help bots differentiate between similar pages, and tell them which version to show. By adding a canonical tag that points to your chosen URL, you let search engine bots follow it straight to the source page. Moreover, all duplicate page links are counted toward the source page, so your SEO value stays intact.
To implement canonical tags, follow these steps:
- Choose the page you prefer as the canonical version.
- Add a rel=canonical link from the non-canonical page to the canonical one. If we pick the shorter URL as canonical, the other URL will point to it in the <head> section like so:
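For example, if https://mysite.com/dresses is chosen as the canonical version (a hypothetical URL), the filtered variant declares it in its markup:

```html
<!-- Placed in the <head> of https://mysite.com/dresses?color=red -->
<link rel="canonical" href="https://mysite.com/dresses">
```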
Set up a 301 Redirect
Many times, content duplication results from moving to a new domain or changing your link format. To tackle that, set up 301 redirects from duplicate pages to the original resource pages to keep bots from getting confused. A bot visits the URL, sees the 301 redirect, and follows it to the original resource page, so the ‘correct’ page is ranked on the SERPs.
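As a sketch, assuming an Apache server with mod_rewrite enabled (the paths and domains here are hypothetical), the redirects might look like this in an .htaccess file:

```apache
# Permanently redirect a single retired URL to its new home
Redirect 301 /old-page https://mysite.com/new-page

# Permanently redirect a whole retired domain to the new one
RewriteEngine On
RewriteCond %{HTTP_HOST} ^old-domain\.com$ [NC]
RewriteRule ^(.*)$ https://mysite.com/$1 [R=301,L]
```

Search engines treat a 301 as a permanent move and transfer ranking signals to the target URL.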
Apply Meta Robots Noindex
Another solution is to use the noindex meta tag to stop certain pages from being indexed. The noindex meta tag tells search engine bots not to index a particular page while still letting them crawl it, so the links on duplicate pages aren’t ignored. That said, if the duplicate pages have no backlinks, simply remove them to resolve the issue once and for all; if they do have backlinks, go with the two methods above.
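The tag itself is a one-liner in the page’s <head>; the “follow” directive keeps bots crawling the links on the page even though the page itself stays out of the index:

```html
<!-- In the <head> of the duplicate page -->
<meta name="robots" content="noindex, follow">
```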
To Sum Up
Even though Google has no explicit penalty for duplicate content, the threat is still out there, very much alive, and affecting thousands of websites daily. Someone somewhere might be using your original work, and you might not even know.
So if you’re fighting a long, hard battle to knock out duplicate content, keep going. It might be taxing, but it will be well worth the returns you’ll get in the end. The good news is that most non-original content issues can easily be fixed once you understand what they are.
Hopefully, this guide has taught you more about duplicate content and the issues surrounding it so you can improve your rankings and fend off scrapers and content thieves.