Each piece of content should be accessible under a single URL only; otherwise, the problem known as duplicate content arises. If similar or identical content also appears on another page, for example on a partner website, the second URL should point to the original URL.
If there is no such reference, Google has to decide on its own which version to index, and one of the two pages may be filtered out of the search results. To avoid this, you can use so-called canonical URLs. Canonical URLs, or canonical tags, are annotations in the source code of a web page that name the preferred URL for its content.
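To make the idea of a tag in the source code concrete, here is a minimal sketch that reads the canonical URL out of a page's HTML. The page markup and the helper names (CanonicalFinder, find_canonical) are made up for illustration; only Python's standard html.parser module is used.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens; compare in lowercase.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page that declares its original URL as canonical.
page = """
<html>
  <head>
    <title>Example product</title>
    <link rel="canonical" href="https://www.example.com/product/exampleproduct">
  </head>
  <body>...</body>
</html>
"""

print(find_canonical(page))
```

A page without such a link element yields None, which is the case in which the search engine has to pick the preferred URL by itself.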
Why should website owners use canonical URLs?
Avoiding duplicate content is not the only reason for website owners to choose the canonical URL manually. There are several more:
Avoiding keyword cannibalization
If two pages on the same domain are optimized for the same keyword, they might cannibalize each other. The search engine algorithm then independently decides which of the two pages is more relevant to a search query.
Differentiating multiple URLs for a page
If multiple URLs exist for a page, you must decide which URL should appear in the search results.
As a site operator, you may want to prioritize a specific URL, for example www.example.com/product/exampleproduct. Furthermore, content may be accessible through different URLs and/or domains, e.g. www.example.com / www.example.net / www.example.com/home.
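The decision about which URL should represent a page can be written down as a simple mapping. The following is only a sketch: the alias tables, the choice of www.example.com as the preferred host, and the choice of https as the preferred scheme are assumptions for illustration, not something a search engine prescribes.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed preferences of a hypothetical site owner.
PREFERRED_HOST = "www.example.com"
HOST_ALIASES = {"example.com", "www.example.net", "example.net"}
PATH_ALIASES = {"": "/", "/home": "/"}


def preferred_url(url):
    """Map a known variant of a page URL to the URL the owner prefers."""
    if "://" not in url:
        url = "https://" + url  # bare addresses such as www.example.com/home
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is already lowercased
    if host in HOST_ALIASES:
        host = PREFERRED_HOST
    path = PATH_ALIASES.get(parts.path, parts.path)
    # Assumption: https is the preferred scheme; fragments are dropped.
    return urlunsplit(("https", host, path, parts.query, ""))


for variant in ("www.example.com", "www.example.net", "www.example.com/home"):
    print(variant, "->", preferred_url(variant))
```

All three variants from the example resolve to the same address, which is the value the canonical tag on each of them would carry.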
Consolidating link signals for similar or duplicate pages
For search engines, it helps to consolidate the information available on several individual URLs into a single preferred URL.
For example, links from other websites pointing at example.com/product/exampleproduct should be consolidated at the canonical URL www.example.com/product/color/productin.html. It is also possible that your server is configured to serve the same content over both HTTP and HTTPS, and both with and without "www".
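What consolidation means for the HTTP/HTTPS and "www" variants can be sketched with a small link count. The canonical URL and the list of inbound links below are invented; the point is only that four technically different addresses end up credited to one preferred URL.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical canonical URL chosen by the site owner.
CANONICAL = "https://www.example.com/product/exampleproduct"


def _key(url):
    """Reduce a URL to host (without a leading 'www.') and path."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host, parts.path.rstrip("/")


def is_variant_of_canonical(url):
    """True if url differs from CANONICAL only by scheme or the 'www.' prefix."""
    return _key(url) == _key(CANONICAL)


# Made-up inbound links as they might appear in a backlink report.
inbound_links = [
    "http://example.com/product/exampleproduct",
    "https://example.com/product/exampleproduct",
    "http://www.example.com/product/exampleproduct",
    "https://www.example.com/product/exampleproduct",
    "https://www.example.com/other",
]

# Credit every variant to the canonical URL; leave unrelated URLs untouched.
link_counts = Counter(
    CANONICAL if is_variant_of_canonical(link) else link
    for link in inbound_links
)

print(link_counts[CANONICAL])  # 4
```

Without the consolidation step, each of the four variants would be counted with a single link instead of one URL with four.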
Gathering correct metrics on a product or topic
Setting a canonical tag makes it easier to generate consolidated metrics for specific content.
Syndicated content management
If content is to be syndicated to other domains for publication, the canonical tag has to specify the preferred URL.
Saving crawl time on duplicate pages
If you want the best possible Google rankings for your pages, Googlebot should spend its crawl time on new or updated pages rather than on duplicate versions of the same content.
How do site owners know which URL Google considers the canonical one?
The URL Inspection tool in Google Search Console lets you check which page Google considers to be canonical. Note that for various reasons (such as the performance or content of a page), Google may select a different page as canonical than the one you specified. This can be due to incorrectly marked language versions or incorrectly configured servers.
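The comparison between the declared and the selected canonical can also be automated once the two values are available. The sketch below assumes a result dictionary with the keys userCanonical and googleCanonical, modeled on the index status section of the Search Console URL Inspection API response; the sample values are invented, and the field names should be checked against the API reference before relying on them.

```python
def canonical_mismatch(index_status):
    """Compare the declared canonical with the one Google selected.

    Returns True if they differ, False if they match, and None if either
    value is missing from the (assumed) result dictionary.
    """
    declared = index_status.get("userCanonical")
    selected = index_status.get("googleCanonical")
    if declared is None or selected is None:
        return None
    return declared != selected


# Invented sample data, not a real inspection result.
sample = {
    "userCanonical": "https://www.example.com/product/exampleproduct",
    "googleCanonical": "https://www.example.com/product/color/productin.html",
}

if canonical_mismatch(sample):
    print("Google selected a different canonical:", sample["googleCanonical"])
```

A mismatch is a prompt to look for the causes named above, such as language versions or server configuration.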
In this Google Webmasters video, John Mueller explains how Google chooses which page to rank when there is duplicate content: