Without a canonical tag, Google's bots decide for themselves which address looks like the best original version, index it, and display it in search results. The real problem arises when Google's choice is not the same as ours.
Understanding the concept and application of canonical tags is one of the harder parts of learning SEO. Yet this simple tag is so valuable and effective that it arguably belongs on 99.9% of sites. If we use the WordPress content management system, we have less to worry about, because the WordPress core manages this tag well. On sites built with custom code or a custom CMS, however, deciding on and managing the canonical tag plays a special role.
First we define the canonical concept, and then we look together at the problems that ignoring it creates for a site. After reading this article, the first thing you will check in the SEO analysis of a site is the correct use of the canonical tag!
What is a Canonical Tag?
The canonical tag is a link tag placed in the head section of the page that tells Google the address of the best version of this content. It has no effect on the content of the page, the user experience, or how the page is displayed, so the user does not notice whether it is present or absent.
<link rel="canonical" href="https://seo-teaching.com/" />
The most important use of the canonical tag is to prevent the indexing of pages that have different URLs but the same content. In such cases, Google tries to select the best version among these URLs and show it to the user in search results. With the canonical tag, we declare the original (suggested) version to Google ourselves and prevent confusion and possible mistakes by the bots.
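To check which address a page declares, the tag can be read straight from the HTML. The following is a minimal sketch using only Python's standard library; the names `CanonicalFinder` and `find_canonical` and the sample page are illustrative assumptions, not part of any SEO tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonical(html):
    """Return the first canonical URL found in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


# Hypothetical page source used only for this demonstration.
sample_page = """
<html><head>
  <title>Home</title>
  <link rel="canonical" href="https://seo-teaching.com/" />
</head><body>Hello</body></html>
"""
print(find_canonical(sample_page))
```

A page that returns None here leaves the choice of the original version entirely to Google.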
You might think this happens rarely: why would we have two pages with different URLs but the same content?! We will talk a little later about the conditions and reasons that create these pages, but first it is better to get acquainted with the types of addressing in a canonical tag:
1- Self-Referencing: The URL in the canonical tag is the same as the URL of the page we are on.
2- Preferred-URL: The canonical tag points to another page on the same domain as the suggested (original) version.
3- Cross-Domain: The canonical tag points to a page on another domain as the original version.
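The three types above can be told apart by comparing a page's own URL with the URL in its canonical tag. The sketch below is a simplification: the function name `canonical_type` is hypothetical, and it assumes that two URLs belong to the same domain when their host names match after removing a leading "www." (other subdomains are treated as different domains).

```python
from urllib.parse import urlsplit


def _site(url):
    """Host name of a URL without a leading 'www.' (simplifying assumption)."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def canonical_type(page_url, canonical_url):
    """Classify a canonical tag as one of the three addressing types."""
    if page_url == canonical_url:
        return "self-referencing"
    if _site(page_url) == _site(canonical_url):
        return "preferred-url"
    return "cross-domain"


# example.com and example.org are placeholder domains.
print(canonical_type("https://example.com/a", "https://example.com/a"))
print(canonical_type("https://example.com/a?color=blue", "https://example.com/a"))
print(canonical_type("https://example.com/a", "https://example.org/a"))
```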
I think this has been confusing enough so far, so before we say more about the canonical tag, let's look at the reasons duplicate pages appear on a site.
How are duplicate pages created on the site?
Duplicate content can be produced under many different conditions, and no site is an exception to this rule. Here are the most common reasons:
1- URLs that change with how the page is used
A page on our site may have different URLs depending on how it is used or displayed.
A separate URL for the mobile version, the AMP framework, a dedicated print version, and RSS feeds are all examples of pages with separate URLs but the same content. Each of these pages needs a canonical tag that points to the original version.
2- Managing www and https
Every web page can be reached through four different URLs: with http or https, and with or without www. But Google's bots need only one of them to display in search results. If the canonical tag is not used properly on our site, some pages will be indexed with www, some with https, and others in yet another form. This makes it very difficult to analyze the site's performance in different tools.
Multiply the number of pages on your site by 4, and the importance of the canonical tag and its effect on the site's indexing speed becomes clear. Of course, in this particular case the best approach is to 301-redirect the three secondary variants to the original URL, so that they are not available to Google at all. For sample code, see What is a 301 redirect?
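As a rough illustration of the four addresses, the sketch below lists the scheme and www variants of a URL and maps each of them to one preferred form. The function names are hypothetical, example.com is a placeholder, and the choice of https without www as the preferred form is only an assumption; a real site would use whichever variant it has chosen as the redirect target.

```python
from urllib.parse import urlsplit, urlunsplit


def _bare_host(netloc):
    """Lower-case the host part and drop a leading 'www.'."""
    host = netloc.lower()
    return host[4:] if host.startswith("www.") else host


def url_variants(url):
    """Return the four http/https and www/non-www variants of a URL, sorted."""
    parts = urlsplit(url)
    bare = _bare_host(parts.netloc)
    variants = set()
    for scheme in ("http", "https"):
        for netloc in (bare, "www." + bare):
            variants.add(
                urlunsplit((scheme, netloc, parts.path, parts.query, parts.fragment))
            )
    return sorted(variants)


def preferred_url(url, scheme="https", use_www=False):
    """Map any variant to the single preferred form (assumed: https, no www)."""
    parts = urlsplit(url)
    bare = _bare_host(parts.netloc)
    netloc = "www." + bare if use_www else bare
    return urlunsplit((scheme, netloc, parts.path, parts.query, parts.fragment))


for variant in url_variants("https://example.com/page"):
    print(variant, "->", preferred_url(variant))
```

All four variants map to the same address, which is the URL that both the 301 redirects and the canonical tag should point to.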
3- Dynamic addressing (filter, search and…)
This problem is most common in online stores or large sites where search plays an important role. A common technique for searching or filtering results in an online store is to add the user's selections to the page address and fetch data from the server based on them. Simply put: when we select the “blue” option on the T-shirt page, the parameter color=blue is added to the page address and only blue products are displayed.
Take a look at the image below. After choosing the “Arldam” brand, the “available” status, and sorting by “cheapest”, the page address has changed considerably, but its content, title, description, and many other factors are not fundamentally different from the main Mobile Holder page. If Google mistakenly indexes this version of the page as the main URL and displays it to users, a large part of our products will effectively be invisible to them.
You may think some of these pages could be valuable, and rightly so. For example, a “Samsung Power Bank” page may have been created with these same filters and be very popular. But are pages produced this way always valuable? How many pages might be created this way? If our site has 5 filters and each offers 5 choices, thousands of new URLs can be generated for each product category, most of which have no content value or return no results. For example, a large, discounted, blue, round-neck, patterned T-shirt!
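To see how quickly filter URLs multiply, the sketch below enumerates every combination for 5 filters with 5 choices each. The filter names, values, and base URL are invented for illustration, and it assumes each filter is either left unset or set to exactly one value.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical filters for a T-shirt category: 5 filters with 5 choices each.
FILTERS = {
    "color": ["blue", "red", "green", "black", "white"],
    "size": ["s", "m", "l", "xl", "xxl"],
    "neck": ["round", "v", "polo", "high", "boat"],
    "pattern": ["plain", "striped", "printed", "checked", "dotted"],
    "discount": ["10", "20", "30", "40", "50"],
}


def filtered_urls(base_url, filters):
    """Yield one URL for every non-empty combination of filter choices."""
    names = list(filters)
    # None means "this filter is not applied".
    choices = [[None] + list(filters[name]) for name in names]
    for combination in product(*choices):
        selected = [(n, v) for n, v in zip(names, combination) if v is not None]
        if selected:  # skip the unfiltered category page itself
            yield base_url + "?" + urlencode(selected)


urls = list(filtered_urls("https://example.com/t-shirts", FILTERS))
print(len(urls))  # 6**5 - 1 = 7775
```

Under these assumptions a single category yields 7,775 filter URLs, before sort order or pagination parameters are added on top.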
The following URLs are examples of dynamic page creation:
Did the last two catch your attention? This is where the great danger that lurks on every site appears.
4- Pages created by external links
A new URL (with duplicate content) does not have to come from a problem in our site or its technical infrastructure. Sometimes incorrect links we receive from other sites, or UTM campaign parameters used in advertising, make a page of our site available to Google under multiple URLs.
This is where the self-referencing canonical tag comes in. By pointing a page's canonical tag at its own clean URL, we effectively prevent duplicate pages from being created through the dynamic URLs used in external links.
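One way to produce such a self-referencing tag is to remove the campaign parameters from the requested URL before writing it into the page. This is only a sketch: `strip_tracking` and `canonical_link_tag` are hypothetical helpers, and it assumes that every parameter starting with `utm_` is tracking-only while all other parameters are meaningful, which each site has to decide for itself.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tracking(url):
    """Remove utm_* query parameters and the fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(requested_url):
    """Build a self-referencing canonical tag for the clean version of the URL."""
    clean = strip_tracking(requested_url)
    return f'<link rel="canonical" href="{escape(clean, quote=True)}" />'


print(canonical_link_tag(
    "https://example.com/shoes?utm_source=newsletter&utm_campaign=spring"
))
```

However many campaign variants of the address circulate, every one of them then declares the same clean URL as its original.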
As we read this article, many pages are indexed in Google under their ad campaign address because they lack a canonical tag! That address received more valuable links than the original version, so Google selected it as the main reference.
There are other ways to end up with duplicate content on separate URLs, but I think we are now familiar enough with the main reasons. With that in mind, let's take a closer look at Google's criteria for determining the original version.
How does Google choose the reference URL?
When Google's bots encounter several pages with similar content, they try to identify the best one in terms of content volume and quality and display it in search results. According to Google, the best (reference) address is determined by the following factors:
Presence in the sitemap
Pages listed in our sitemap have a better chance of being selected as the reference by Google than other versions. So once again, we see the importance of sitemaps and their impact on page indexing.
Number and quality of internal and external links
The address that is recommended more often than the others (that is, receives more links) is the better option for users in Google's view. Most internal pages of our site have few external links, so receiving one or two authoritative links to the wrong address can change the reference version from Google's point of view.
Content volume and quality
In filtering systems, dynamic URLs usually show less content (fewer products or articles) to the user, so they are less likely to be selected as the reference. But if we offer options that only change the display order, such as “cheapest”, “most visited” and…, the content of the generated pages is very similar, and this is where the canonical tag plays a prominent role.
If you are still not convinced to use the canonical tag, you are a stubborn one! Here are some important reasons that should leave you with no doubts.
How does the Canonical tag affect SEO?
Some SEOs believe that it is Google's job to understand the structure of the site and find the reference version, that it does this well, and that there is therefore no need to manage the canonical tag. To some extent they are right, because in most cases Google chooses the best address. But the problem is not just duplicate content. Consider the following:
1- Consolidating the authority of pages in one address
When we use the canonical tag to declare a page to Google as the primary URL, the internal and external links pointing to the copies are attributed to that page, and the value and credibility of all of them are pooled together. Before the canonical tag was in place, links to secondary URLs were a risk for us; with it, Google effectively credits all those links to the original version.
In some cases, a canonical tag is not enough to consolidate the authority of the pages, and they continue to compete with each other in search results. For more information on this, we suggest the article What is Cannibalization?
If you are unsure about this, I suggest reading the Consolidate duplicate URLs section of Google's guide to the canonical tag. The image below is an excerpt from Google's description of the effect of this tag.
2- Preventing periodic changes of the reference version
As content is added and removed, or the balance of external links shifts, the reference address may change from Google's point of view and another URL may appear in search results (for example, http until today and https from now on). Our ranking may well survive, but every statistics tool gets confused. The data we see in Search Console, Google Analytics, or any other analytics platform becomes inconsistent, which effectively prevents accurate analysis.
The image below is an example from the Google Search Console panel of a site that changed its address from http to https. Since the change of address, the Search Console data has been practically split into two parts, making it very difficult to compare statistics with the past.
3- Time management for Google bots
Google's bots spend a certain amount of time on each site, depending on how often content is published, the site's credibility, and the domain's history. For example, they visit the website of WebSima Academy for two hours a day and index its latest changes. This time is called the crawl budget.
The more low-value or empty pages our site has, the less chance new articles have of being crawled and indexed, and the more confused Google's bots become on our site. Proper use of the canonical tag plays an important role in managing the crawl budget and increasing indexing speed.
Closing remarks; Having a Canonical tag on the page is not enough!
What we have reviewed together in this article helps us choose the correct canonical address. If the canonical URL of similar pages differs from the version in the sitemap, practically all our efforts are wasted. And if we use this tag to link two pages that are unrelated in concept and content, Google ignores our suggestion and indexes both pages separately.
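The agreement between canonical addresses and the sitemap can be checked mechanically. The sketch below assumes we already have a mapping from each crawled page URL to the canonical URL it declares; the function names, the sample sitemap, and the sample URLs are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def canonicals_missing_from_sitemap(canonical_by_page, sitemap_xml):
    """Return the pages whose declared canonical URL is not in the sitemap."""
    listed = sitemap_urls(sitemap_xml)
    return {
        page: canonical
        for page, canonical in canonical_by_page.items()
        if canonical not in listed
    }


# Hypothetical sitemap and crawl result.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/t-shirts</loc></url>
  <url><loc>https://example.com/mobile-holder</loc></url>
</urlset>
""".strip()

CANONICAL_BY_PAGE = {
    "https://example.com/t-shirts?color=blue": "https://example.com/t-shirts",
    "https://example.com/mobile-holder?sort=cheapest": "https://example.com/mobile-holders",
}

print(canonicals_missing_from_sitemap(CANONICAL_BY_PAGE, SAMPLE_SITEMAP))
```

Here the second page is reported because its canonical address differs from the sitemap entry by a single character.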
This tag is so important that the Coverage section of Google Search Console reports four different statuses based on the page's canonical tag:
Alternate page with proper canonical tag: Pages that point to another address with the canonical tag, where Google agrees that the suggested address is appropriate.
Duplicate without user-selected canonical: Pages that Google considers duplicate content but on which we have not used the canonical tag.
Duplicate, submitted URL not selected as canonical: The content of the page is duplicate, but Google has not accepted the URL we submitted as the reference.
Duplicate, Google chose different canonical than user: Google considers the page's content duplicate, did not accept the URL we declared as canonical, and chose another page as the original version.
Did you expect the issue to be this important and serious?! If this article was useful to you and clarified the importance of the canonical tag, please let us know in the comments section of this page. If you have any questions or anything is unclear after reading the article, ask here and we will do our best to give you a good answer.
Canonical related questions
Tips on this topic from John Müller, shared in Google Hangouts.
In addition to the canonical tag, Google also pays attention to other signals
If your site has many pages that show the same content to the user and cause cannibalization, Google looks beyond the canonical tag to other signals, such as internal linking and the sitemap, to recognize the original. This happens, for example, on store pages where selecting each feature generates a different URL with the same content.
Having multiple pages for different types of a product is not a big problem
John Müller suggested two solutions for managing such products. The first is to create a separate page for each product variant and have it indexed in Google results. The second is to create a single page on which the different variants of the product can be selected. He explained that the choice between the two depends on factors such as the size of the site, the number of products, and how unique each variant is.
Keep the different language versions of the site accessible to Google bots
Some multilingual sites use the user's IP and browser language to detect the user's language and redirect them to the matching language version. John Müller recommends that these websites keep the landing pages for all languages accessible to Google's bots, and that they use the hreflang tag to specify the language of each page and the canonical tag to specify the main language version of the website.
HTML and AMP pages with similar content are not considered duplicates
Do not worry if the AMP and HTML versions of your pages have similar content. Google notices that the content of these pages is duplicated, but this has no negative effect on your site's ranking. It can, however, lead to cannibalization and competition between the two pages in search results, which can be solved with the canonical tag: it shows Google which page it should focus on ranking.
Using a canonical tag is much better than not using it!
John Müller stressed that it is best to use the canonical tag (pointing to the page itself) on pages that are the original version, but explained that this is only one of the ways Google's bots identify the original version. He explained that there are other ways to distinguish the original page from duplicates, and that even without a canonical tag Google can in most cases recognize the original version. However, using the canonical tag is better than accepting the risk of not using it.
Use the canonical tag to fix the Duplicate content problem
When we noindex a page, all the signals pointing to that page are lost. When you want to limit Google's access to two identical or duplicate pieces of content, instead of noindexing each of them, it is better to use the canonical tag to point one at the other so that Google understands the two are the same.
Do not use the canonical tag for Noindex pages
If you use a canonical tag whose target page is set to noindex, Google receives conflicting signals. In this case, Google decides which page to index and which to ignore based on other criteria, such as internal linking.
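This conflict can be detected before Google sees it. The following sketch assumes we hold the HTML source of each page in a dictionary keyed by URL; the class and function names are hypothetical, and it only inspects the robots meta tag, not the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Reads the canonical URL and the robots noindex directive of a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link":
            rel_tokens = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel_tokens and self.canonical is None:
                self.canonical = attributes.get("href")
        elif tag == "meta":
            if (attributes.get("name") or "").lower() == "robots":
                content = (attributes.get("content") or "").lower()
                directives = [item.strip() for item in content.split(",")]
                self.noindex = self.noindex or "noindex" in directives


def read_signals(html):
    """Return (canonical_url_or_None, is_noindex) for one page."""
    parser = HeadSignals()
    parser.feed(html)
    return parser.canonical, parser.noindex


def conflicting_canonicals(pages):
    """Return (page_url, canonical_url) pairs whose canonical target is noindex.

    `pages` maps each URL to its HTML source.
    """
    signals = {url: read_signals(html) for url, html in pages.items()}
    conflicts = []
    for url, (canonical, _) in signals.items():
        if canonical and canonical != url and canonical in signals:
            if signals[canonical][1]:
                conflicts.append((url, canonical))
    return conflicts


# Hypothetical crawl: page /a is noindex but is used as a canonical target.
PAGES = {
    "https://example.com/a?ref=x": '<link rel="canonical" href="https://example.com/a">',
    "https://example.com/a": '<meta name="robots" content="noindex, follow">',
    "https://example.com/b?ref=x": '<link rel="canonical" href="https://example.com/b">',
    "https://example.com/b": "<title>B</title>",
}
print(conflicting_canonicals(PAGES))
```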
Google ignores canonical tags on pages that are not the same
If the canonical tag is used where the source and target pages are not the same, Google ignores it. The canonical tag only applies to pages with the same content.
Even if you use the canonical tag correctly, parameterized URLs can still be indexed by Google
Links with additional parameters are commonly used for product features in online stores, UTM links, and the like. Even if the canonical tag is implemented correctly, Google may still ignore the main pages and index the parameterized versions.