There are many technical reasons for duplicate content, from session IDs and syndicated content to the use of printer-friendly pages and URL parameters.
Yet they all boil down to one thing: duplicate URL structures that lead to the same content, which can kill search performance. Here’s how to fight back.
While the ultimate best practice is site architecture that doesn’t create duplicate content, 301 redirects and rel=canonical are two best-practice solutions to reduce duplicate content and encourage Google and other search engines to crawl your web pages effectively.
Note: For a comprehensive understanding of URL redirects, take advantage of our Technical SEO Guide to Redirects.
301 redirects help search engines and users find content that has moved to a new URL. It’s like moving and giving the post office a change of address: it shows that the content of the page has permanently moved somewhere else.
Let’s say your home page can be reached via http://site.com/home, http://site.com, or http://www.site.com. It’s important to pick one URL as your canonical (preferred) destination, and use 301 redirects to route traffic from the other URLs to the preferred URL. Google Search Console once offered a preferred domain setting, but Google has since retired it, so rely on redirects and canonical signals instead.
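As a rough illustration of the home page case, the sketch below decides whether a requested URL is a duplicate home page address and, if so, which 301 target to send. It assumes http://www.site.com/ is the preferred URL; the `HOME_VARIANTS` set and the `home_redirect` helper are hypothetical names, not part of any server or framework.

```python
from urllib.parse import urlsplit

# Assumption: http://www.site.com/ is the preferred (canonical) home page.
CANONICAL_HOME = "http://www.site.com/"

# Hypothetical (host, path) pairs that all serve the same home page content.
HOME_VARIANTS = {
    ("site.com", ""),
    ("site.com", "/"),
    ("site.com", "/home"),
    ("www.site.com", "/home"),
}


def home_redirect(url):
    """Return (301, canonical URL) for a duplicate home page URL, else None."""
    parts = urlsplit(url)
    key = (parts.netloc.lower(), parts.path)
    if key in HOME_VARIANTS:
        return 301, CANONICAL_HOME
    return None  # already canonical, or not a home page variant
```

In a real deployment this rule would usually live in the web server or CDN configuration; the function only shows the decision being made.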
301 redirects can effectively redirect results from the old URLs to the new ones.
Use 301 redirects to link outdated URLs to the correct pages.
Developers often implement a temporary 302 redirect instead of a permanent 301 redirect, which can keep link equity ("link juice") from flowing to the new URL. There are many important differences between 301 redirects and 302 redirects that you should know.
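One way to catch this mistake is to audit the status codes your redirects actually return. The sketch below assumes you already have a list of (URL, status code) pairs from a crawl or server log; the `flag_temporary_redirects` helper is hypothetical, and the 307 and 308 codes are included as the other temporary and permanent redirect statuses even though the text above only discusses 301 and 302.

```python
from http import HTTPStatus

# Permanent redirects: 301 and 308. Temporary redirects: 302 and 307.
PERMANENT = {HTTPStatus.MOVED_PERMANENTLY, HTTPStatus.PERMANENT_REDIRECT}
TEMPORARY = {HTTPStatus.FOUND, HTTPStatus.TEMPORARY_REDIRECT}


def flag_temporary_redirects(responses):
    """Return the URLs whose redirect status is temporary rather than permanent.

    `responses` is an iterable of (url, status_code) pairs.
    """
    return [url for url, status in responses if status in TEMPORARY]


# Hypothetical audit data for illustration only.
audit = [("/old-pricing", 301), ("/old-contact", 302), ("/about", 200)]
print(flag_temporary_redirects(audit))  # ['/old-contact']
```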
For example, if there is a site migration of 500 pages, each page should have its own 301 redirect to the relevant page on the new site. A common mistake is to redirect all 500 pages to a single URL, typically the homepage.
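A quick sanity check on a migration redirect map can surface the "everything goes to the homepage" mistake before launch. This is a minimal sketch: the `overused_targets` helper, the 50% threshold, and the sample URLs are assumptions chosen for illustration.

```python
from collections import Counter


def overused_targets(redirect_map, max_share=0.5):
    """Return targets that receive more than `max_share` of all redirects.

    `redirect_map` maps each old URL to its 301 destination.
    """
    total = len(redirect_map)
    if total == 0:
        return []
    counts = Counter(redirect_map.values())
    return sorted(t for t, n in counts.items() if n / total > max_share)


# Hypothetical map in which most old pages were lazily sent to the homepage.
lazy_map = {
    "http://old.site.com/paper": "http://www.site.com/",
    "http://old.site.com/pens": "http://www.site.com/",
    "http://old.site.com/ink": "http://www.site.com/",
    "http://old.site.com/about": "http://www.site.com/about",
}
print(overused_targets(lazy_map))  # ['http://www.site.com/']
```

A one-to-one map returns an empty list, which is the outcome you want for a 500-page migration.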
Since http://site.com and http://www.site.com create different versions of the URL, be sure to set up a redirect from all of the different iterations of your brand's domain. Otherwise, if your preferred domain is www.site.com and someone types or links to site.com, they may get an error that this site doesn’t exist.
The link rel=canonical tag, often called a “canonical link” (or simply a “canonical tag”), is an HTML element that helps webmasters prevent duplicate content issues. A rel=canonical tag lets search engines know that certain similar URLs are actually one and the same. It does this by specifying the “canonical URL” as a preferred version of the page.
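Unlike 301 redirects, which send visitors and crawlers to a different URL, canonical link elements are placed in the <head> section of your page's HTML (or sent as a rel="canonical" HTTP Link header). Visitors stay on the page they requested, but search engines are led to the canonical URL, where the original content (that should be ranked) lives.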
In other words, use rel=canonical tags to push search engines to your most-complete content (your canonical URL), when you have multiple URLs or multiple pages with similar content for the same topic or item.
Rel=canonical tags pass the same amount of link juice as a 301 redirect, and can be implemented quickly. Of course, it’s vital to determine your canonical URL first.
For example: https://www.site.com/product?category=paper&color=white and https://www.site.com/paper/white/whitepaper.html
Rel=canonical tags help search engines push the URLs of each category to your canonical URL, for example: http://www.blog.example.com/category1/blog-post-name and http://www.blog.example.com/category2/blog-post-name
Rel=canonical tags help search engines push both of these URLs to your canonical URL, for example: http://www.site.com/example and https://www.site.com/example
Rel=canonical tags can push search results — from those various blogs, web feeds and the like — to your canonical URL.
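The duplicate-URL cases above can be sketched as a lookup from each duplicate to its canonical URL, followed by the tag that would go in the page's <head>. The source does not say which URL in each pair is the canonical one, so the choices in `CANONICAL_FOR` are assumptions, and `canonical_tag` is a hypothetical helper.

```python
from html import escape

# Assumption: the right-hand URL in each pair is the chosen canonical URL.
CANONICAL_FOR = {
    "https://www.site.com/product?category=paper&color=white":
        "https://www.site.com/paper/white/whitepaper.html",
    "http://www.site.com/example": "https://www.site.com/example",
}


def canonical_tag(page_url):
    """Build the <link rel="canonical"> element for a page.

    Pages without a known duplicate mapping reference themselves.
    """
    target = CANONICAL_FOR.get(page_url, page_url)
    # Escape the URL so characters such as & are valid inside the attribute.
    return f'<link rel="canonical" href="{escape(target, quote=True)}">'


print(canonical_tag("http://www.site.com/example"))
# <link rel="canonical" href="https://www.site.com/example">
```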
An absolute URL uses the entire address of the page that you link to. For example:
<a href="http://www.site.com/xyz.html">
A relative URL uses the “relative” path, not the full address. It assumes that the page you link to is on the same site. For example:
<a href="/xyz.html">
If you mistakenly use a relative URL, the search engines might ignore your rel=canonical tag. So, be sure to specify the full absolute URL. Make sure you are familiar with the differences between absolute and relative URLs.
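A small check can catch relative canonical URLs before they ship. The sketch below uses Python's standard `urllib.parse` module; the `is_absolute` and `to_absolute` helper names are hypothetical, and the base URL is only an example.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(url):
    """True when the URL carries both a scheme and a host."""
    parts = urlsplit(url)
    return bool(parts.scheme and parts.netloc)


def to_absolute(base_url, href):
    """Resolve a possibly relative href against the page's own URL."""
    return urljoin(base_url, href)


print(is_absolute("/xyz.html"))                    # False
print(is_absolute("http://www.site.com/xyz.html")) # True
print(to_absolute("http://www.site.com/dir/page.html", "/xyz.html"))
# http://www.site.com/xyz.html
```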
Don’t point the rel=canonical tag on every page of a paginated series to the first page. If you do this, the search engines will only index the first page, skipping the content on subsequent pages. It’s best to specify a URL that has all of the content on a single page.
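For example, each paginated URL points to a single page that holds all of the content:
PAGE 1 CONTENT: example.html?page=1 -> example.html?page=all
PAGE 2 CONTENT: example.html?page=2 -> example.html?page=all
PAGE 3 CONTENT: example.html?page=3 -> example.html?page=all
SINGLE PAGE WITH ALL CONTENT: example.html?page=all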
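The view-all pattern in this example can be sketched as a function that rewrites any paginated URL to its single-page counterpart, which is then used as the canonical URL. It assumes the `page` query parameter and the `page=all` value shown in the example; `view_all_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def view_all_url(url):
    """Return the view-all version of a paginated URL.

    Only the `page` parameter is changed; other parameters are preserved.
    """
    parts = urlsplit(url)
    query = [(k, "all" if k == "page" else v) for k, v in parse_qsl(parts.query)]
    return urlunsplit(parts._replace(query=urlencode(query)))


print(view_all_url("http://www.site.com/example.html?page=2"))
# http://www.site.com/example.html?page=all
```

The canonical tag itself should still carry the full absolute URL, so pass in the absolute address of the paginated page.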
It's always wise to double-check your work – especially if you're picking up a template, setting multiple rel=canonical tags to different URLs or using an SEO plugin with default rel=canonical tags (Yoast is a popular plugin option). At the end of the day, use caution when implementing 301 redirects and rel=canonical tags. Used correctly, they can turn duplicate content into seamless search results. Used incorrectly, they have the potential to impede search performance and harm your site and analytics performance. It’s always good to test on a small set of URLs first to make sure you get the visibility you want, before implementing either of these solutions across your site.