Start here: how our SEO split tests work
If you aren't familiar with the fundamentals of how we run controlled SEO experiments that form the basis of all our case studies, then you might find it useful to start by reading the explanation at the end of this article before digesting the details of the case study below.
In this week's #SPQuiz, we did a deep dive into canonical tags and how they influence SEO. We asked our followers: does Google recommend combining rel="canonical" tags with canonicals specified in your sitemap to send extra-strong signals about canonical pages?
Read on to see what our followers thought, and our insights into testing canonical tags for SEO.
Poll Results:
Table of contents
- What are canonical tags?
- Why should you use canonical tags?
- What Google says about canonical tags
- How to implement canonical tags
- Best practices for canonical tags
- The impact of canonical tags for SEO
- 8 ways to test canonical tags for SEO
What are canonical tags?
When search engines index pages, they need to decide which page is the best result for a search query. For sites with lots of pages with similar content, there’s a risk that Google will make the wrong determination or miss important ranking signals if it’s treating each different URL as its own distinct page.
Fortunately, there’s an easy way to make sure that ranking signals are pointing to the best version of pages. To show search engine crawlers which page should take priority for crawling and displaying in SERPs, you can use canonical tags to point crawlers to the appropriate URL. This means that your canonical page has the best possible chance of ranking for relevant search queries.
Why should you use canonical tags?
On your website, you may need to have multiple pages or URL paths displaying similar or identical content. For example, you may have pages for offering the same service to different regions, pages for products in a range of sizes and colors, or pages with protocol variants for HTTP and HTTPS versions of your site. It’s important for users to have different pages for these, but creating completely unique content for each variant of a page is unrealistic, especially on large websites with thousands or tens of thousands of pages for every possible variation.
What Google says about canonical tags
Google recommends using canonical tags to specify which URL users should see in search results, to consolidate ranking signals for similar pages, and to avoid wasting your crawl budget on duplicate pages. There are a variety of ways to set the canonical URL of a page. You can add a rel="canonical" link within the code of duplicate pages, put a rel="canonical" HTTP header in your page response, or use a sitemap to specify which pages are canonical. Google recommends using redirects for canonicalization only when you are retiring a duplicate page.
How to implement canonical tags
There are a couple of ways that canonical tags can be implemented on a page.
Link rel element in the HTML
The first is the rel="canonical" link element. It sits in the head section of the HTML, with an href attribute pointing to the URL you have chosen as the canonical.
<html>
  <head>
    <title>Search Pilot Case Studies</title>
    <link rel="canonical" href="https://www.searchpilot.com/resources/case-studies/" />
    <!-- other elements -->
  </head>
  <!-- rest of the HTML -->
</html>
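To audit a page, it can help to pull the canonical out of the HTML programmatically. The following is a minimal sketch using only Python's standard library; the names `CanonicalFinder` and `find_canonicals` are our own for illustration, not part of any SEO tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return every canonical href found in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page = """<html><head>
<title>Search Pilot Case Studies</title>
<link rel="canonical" href="https://www.searchpilot.com/resources/case-studies/" />
</head><body></body></html>"""

print(find_canonicals(page))
```

Returning a list rather than a single value is deliberate: it lets the caller notice pages with no canonical or with more than one.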
Link rel element in the HTTP header
If you publish files such as PDFs on individual URLs, you can still declare a canonical by using an HTTP header. This works in a similar way: the response includes a Link header containing the canonical URL and rel="canonical".
HTTP/1.1 200 OK
Content-Length: 20
...
Link: <https://www.searchpilot.com/downloads/canonical-white-paper.pdf>; rel="canonical"
...
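If you want to check which canonical a file response declares, the Link header value can be parsed with a few lines of Python. This is a simplified sketch: `canonical_from_link_header` is a hypothetical helper, and the pattern assumes parameter values do not themselves contain commas or semicolons.

```python
import re

# Matches one link-value such as: <https://example.com/a.pdf>; rel="canonical"
LINK_VALUE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')


def canonical_from_link_header(header_value):
    """Return the URL whose rel includes "canonical", or None if absent."""
    for match in LINK_VALUE.finditer(header_value):
        url, params = match.group(1), match.group(2)
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            rel_tokens = value.strip().strip('"').lower().split()
            if name.strip().lower() == "rel" and "canonical" in rel_tokens:
                return url
    return None


header = '<https://www.searchpilot.com/downloads/canonical-white-paper.pdf>; rel="canonical"'
print(canonical_from_link_header(header))
```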
Best practices for canonical tags
No matter which method you use for canonicalization, you should follow Google’s best practices.
- Make sure that each page has only one canonical URL. If you’re using too many different canonicalization techniques at the same time, you may accidentally end up with one URL in your rel="canonical" tags but a different URL in your sitemap.
- When Googlebot crawls your site, it follows the links it finds on each page. Make sure that all links crawlers can follow point to the proper canonical URL.
- Don’t use a redirect to set a canonical URL unless you’re deprecating a duplicate page.
- The robots.txt file shouldn’t be used for canonicalization. Even if you try to hide a page from Google with robots.txt, Google may index it anyway if other pages link to it.
- You shouldn’t use noindex to try to point Google away from a non-canonical page, since that will block the page from appearing in all search results.
- The URL removal tool shouldn’t be used for canonicalization, since it will hide all versions of a URL, not just the ones you don’t want canonicalized.
- One common reason to use canonical tags is to specify which page is the canonical version when you have regional variations. If you’re using hreflang elements on your localized pages, ensure your canonical tags point to a page with the same language. If that isn’t an option, Google recommends canonicalizing to the best possible substitute language.
- Don’t use a rel="canonical" link element on PDFs or other files that may appear in search results, since this will only work on HTML pages. The rel="canonical" HTTP header should be used on files.
- If you have AMP (Accelerated Mobile Pages) versions of URLs on your site, make sure that the domain you’re hosting them on does not look completely different from your regular pages so users don’t get confused about which site they’re on.
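The first practice above, one canonical URL per page, is easy to check once you have gathered what each method declares. The sketch below assumes you have already collected those values into a dictionary; the signal names and example.com URLs are hypothetical.

```python
def conflicting_canonicals(signals):
    """Return the distinct canonical URLs declared for one page.

    `signals` maps a signal name to the URL it declares, or None if that
    signal is absent. An empty list means the signals agree.
    """
    declared = [url for url in signals.values() if url]
    distinct = sorted(set(declared))
    return distinct if len(distinct) > 1 else []


page_signals = {
    "html_link": "https://www.example.com/shoes",
    "http_header": None,
    "sitemap": "https://www.example.com/shoes/",
}
print(conflicting_canonicals(page_signals))
```

Here the rel="canonical" tag and the sitemap disagree only by a trailing slash, which is exactly the kind of accidental mismatch the first bullet warns about.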
The impact of canonical tags for SEO
Canonical tags influence several factors that affect organic traffic and rankings: managing duplicate content, consolidating PageRank and authority, crawl management, user experience, and international SEO.
Managing Duplicate Content
Canonical tags are instrumental in resolving the issue of duplicate content, a common challenge faced by many large websites. By indicating the preferred version of a page, canonical tags guide search engines to index the intended content, preventing the dilution of SEO efforts.
Managing duplicate content with canonical tags ensures that search engines direct traffic to the most relevant page, positively influencing organic search rankings and traffic.
Consolidating PageRank, Link Signals, and Authority
Canonical tags aid in consolidating the PageRank, link signals, and authority of similar or identical pages. This consolidation strengthens the signal for the chosen canonical URL, enhancing the overall authority of the page in the eyes of search engines.
This improved authority contributes to higher search engine rankings, subsequently attracting more organic traffic and increasing the likelihood of clicks.
Crawl Management
Efficient crawling is essential for search engines to index and rank web pages effectively. Canonical tags help streamline this process by guiding search engine bots to the preferred version of a page, reducing confusion, and ensuring that resources are allocated optimally.
A well-organized and easily crawlable site structure facilitates better visibility in search results, leading to increased organic traffic.
Canonical tags also play a crucial role in managing the crawl budget allocated by search engines. By consolidating signals and guiding crawlers efficiently, websites can ensure that their most important pages are crawled and indexed.
Enhancing User Experience
Canonical tags not only benefit search engines but also contribute to a positive user experience. By consolidating similar content and directing users to the preferred version, websites can reduce confusion and improve the overall navigation of the site.
A clear and concise website structure, aided by canonical tags, can lead to higher click-through rates as users find the content they're looking for more easily, resulting in increased satisfaction.
International SEO Considerations
For websites with multiple language versions or regional variations of content, canonical tags together with hreflang tags are crucial. They help specify the preferred version for search engines, ensuring that users are directed to the most relevant content based on their language or location.
Proper implementation of canonical tags for internationalization can positively influence rankings and organic traffic by presenting users with content tailored to their language or region.
8 ways to test canonical tags for SEO
1. Test updating the correct URL version (trailing slash vs non-trailing slash)
Canonical tags play a pivotal role in signaling to search engines which version of a page's URL should be considered the authoritative one.
Search engine crawlers rely on clear and consistent signals to understand a website's architecture. When canonical tags point to a URL version that contradicts the site's other signals, such as internal links, redirects, and sitemaps, crawlers may struggle to index and rank pages accurately. The consequence is a potential decrease in visibility and rankings for affected pages.
Recognizing and fixing this inconsistency can become complex and require a lot of technical resources. Testing enables us to gauge the effectiveness of aligning canonical tags with the preferred URL version before implementing widespread adjustments.
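Before setting up a test like this, you need to know which pages have a canonical in the wrong URL format. A minimal sketch, assuming you know whether your site's preferred format uses a trailing slash; `trailing_slash_mismatch` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def trailing_slash_mismatch(canonical_url, prefer_trailing_slash):
    """True if the canonical's path does not match the preferred slash style."""
    path = urlsplit(canonical_url).path or "/"
    if path == "/":
        return False  # the site root has no non-slash form
    return path.endswith("/") != prefer_trailing_slash


# Assumed preference for this example: no trailing slash
print(trailing_slash_mismatch("https://www.example.com/shoes/", False))
print(trailing_slash_mismatch("https://www.example.com/shoes", False))
```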
2. Test using just a canonical tag or noindex tag rather than both
Canonical tags and noindex tags are both essential tools for telling search engines how to treat duplicate or non-indexable content, but the best practice recommendation is that they shouldn’t be used together. Combining them can send conflicting signals, create indexing uncertainty, and dilute the effectiveness of both.
Implementing either a canonical tag or a noindex tag alone, rather than using both, will lead to a more consistent and clear signal for search engines, which may improve organic traffic to the page set as the canonical.
There’s a lot of conflicting advice on this topic, so a test like this can also validate whether the things Google says are best practices have the impact we expect, which approach is best for your individual website, and, importantly, whether resources should be put towards making this change across the site.
This is also an example where testing different approaches to find the most impactful result is a useful strategy. You can start with a test that removes the noindex tag and keeps only the canonical, then follow up with a test that removes the canonical and keeps only the noindex tag. In each case, we evaluate whether implementing only one of these options sends clearer signals to search engines and therefore increases organic traffic to the page that is currently set as the canonical.
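To find candidate pages for this test, you can scan each page's HTML for both signals. The sketch below uses Python's standard library and treats a page as conflicting when it has any rel="canonical" link together with a noindex directive in a robots or googlebot meta tag. It does not look at the X-Robots-Tag HTTP header, and the class and function names are our own.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Records whether a page declares a canonical and/or a noindex directive."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attr_map = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attr_map.get("rel", "").lower().split():
            self.canonical = attr_map.get("href") or None
        elif tag == "meta" and attr_map.get("name", "").lower() in ("robots", "googlebot"):
            directives = [d.strip() for d in attr_map.get("content", "").lower().split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_canonical_and_noindex(html):
    """True when the page carries both a canonical tag and a noindex tag."""
    signals = IndexingSignals()
    signals.feed(html)
    return signals.noindex and signals.canonical is not None


both = ('<head><link rel="canonical" href="https://www.example.com/a">'
        '<meta name="robots" content="noindex, follow"></head>')
print(has_canonical_and_noindex(both))
```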
3. Test adding canonical tags to paginated pages
Pagination introduces complexities that can lead to challenges in determining the most relevant page for search engine indexing. Unmanaged pagination can cause issues with ranking the correct page and lead users to unhelpful content.
Pagination also often introduces challenges associated with duplicate content. The addition of canonical tags seeks to address these concerns, signaling to search engines the primary pages for indexation and potentially mitigating duplicate content issues.
Canonical tags on paginated pages also help to consolidate SEO signals, providing a clear directive to search engines about the preferred version of content. This consolidation may lead to more accurate indexing and enhanced rankings.
Testing the addition of canonical tags to paginated pages enables us to measure how the tags influence traffic from organic search.
4. Test fixing broken canonical links
On large websites that are continuously updated, it's not uncommon to find canonical links that point to broken pages. This can hurt a website's SEO performance because it complicates Google's crawling and indexing, which can in turn harm rankings. In some instances, canonical links leading to broken pages might even trigger soft 404 errors, where a URL returns a success status code but search engines treat it as an error page.
Addressing broken canonical links can become a complex task, particularly for enterprise websites, given the volume of pages and the interconnected nature of the site. Before committing resources to fix this issue, it is important to assess the scale of the problem and prioritize efforts based on its impact.
When considering the impact of broken canonical links, the ultimate goal is to safeguard and enhance organic traffic. Testing becomes a strategic approach to ensure you can measure the real-world impact on organic search rankings and user experience. Conducting tests allows for a data-driven evaluation of the proposed changes before a widespread implementation. This not only aids in gauging the effectiveness of the fix but also provides crucial data to support decision-making.
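Sizing the problem usually starts with a crawl export. The sketch below assumes you already have two mappings from a crawl: each page's canonical target, and the HTTP status code observed for each URL. The function name and the example data are hypothetical.

```python
def broken_canonicals(canonical_by_page, status_by_url):
    """Return pages whose canonical target was not crawled or returned an error.

    The result maps page URL -> (canonical URL, status or None).
    """
    broken = {}
    for page, canonical in canonical_by_page.items():
        status = status_by_url.get(canonical)
        if status is None or status >= 400:
            broken[page] = (canonical, status)
    return broken


# Illustrative crawl data, not real results
canonicals = {
    "https://www.example.com/shoes?sort=price": "https://www.example.com/shoes",
    "https://www.example.com/boots?sort=price": "https://www.example.com/boots-old",
}
statuses = {
    "https://www.example.com/shoes": 200,
    "https://www.example.com/boots-old": 404,
}
print(broken_canonicals(canonicals, statuses))
```

The pages returned here would be natural candidates for the variant group of a test that fixes the canonical target.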
5. Test updating domain version (HTTPS vs. HTTP) even if they are redirecting
Addressing the use of HTTP pages in canonicals, especially in the context of large websites grappling with technical debt, is a crucial aspect of maintaining a trustworthy online presence. While it's common for HTTP pages to redirect to their HTTPS counterparts, the persistence of HTTP links, even within canonical tags, can introduce issues that affect the overall trustworthiness of a website.
Search engines may perceive such mixed signals as potential security vulnerabilities, leading to repercussions in terms of rankings. For large-scale websites with extensive technical debt, transitioning from HTTP to HTTPS across canonicals can be a large undertaking. However, the strategic benefits in terms of trustworthiness and SEO performance make it a worthwhile consideration.
Testing this change first enables us to evaluate the impact on key metrics before committing to a full-scale implementation, providing valuable insights into the actual improvements in trustworthiness, rankings, and organic traffic.
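The change itself is small. As a rough sketch, a variant could rewrite only the scheme of each canonical and leave everything else untouched; `https_canonical` is a hypothetical helper and assumes the HTTPS version of every URL exists at the same host and path.

```python
from urllib.parse import urlsplit, urlunsplit


def https_canonical(canonical_url):
    """Return the canonical with its scheme upgraded from http to https."""
    parts = urlsplit(canonical_url)
    if parts.scheme != "http":
        return canonical_url  # already https, or not a web URL
    return urlunsplit(parts._replace(scheme="https"))


print(https_canonical("http://www.example.com/shoes?size=9"))
```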
6. Test adding self-referential canonical tags
Self-referential canonical tags help eliminate variations in URLs that may arise due to parameters, case sensitivity, or differences in the use of www and non-www. By explicitly specifying the canonical URL on the page itself we can ensure that search engines understand the preferred version of the content, even when there are multiple ways to access it.
In a previous webmaster hangout, John Mueller recommended the use of self-referential canonicals saying:
"I recommend [using a] self-referential canonical because it really makes it clear to us which page you want to have indexed, or what the URL should be when it is indexed. Even if you have one page, sometimes there are different variations of the URL that can pull that page up. For example, with parameters in the end, perhaps with upper lower case or www and non-www. All of these things can be kind of cleaned up with a rel canonical tag."
Before implementing self-referential canonical tags across a website, it's advisable to conduct testing to understand the potential impact on SEO metrics. This can also help you to prove the impact of making this small but powerful change.
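As a sketch of what generating a self-referential canonical might involve, the example below collapses the kinds of variation mentioned in the quote (parameters, letter case, and www versus non-www) into one preferred URL. The rules are assumptions for illustration: it assumes the site serves lowercase paths, prefers the www host, and treats the listed tracking parameters as irrelevant to content. Your own rules will differ.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that do not change page content
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def preferred_url(url, use_www=True):
    """Collapse case, www, and tracking-parameter variants into one URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare = host[4:] if host.startswith("www.") else host
    host = "www." + bare if use_www else bare
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, host, parts.path.lower(), urlencode(kept), ""))


def canonical_tag(url):
    """Build the self-referential canonical element for a requested URL."""
    return '<link rel="canonical" href="%s" />' % escape(preferred_url(url), quote=True)


print(canonical_tag("https://Example.com/Shoes?utm_source=newsletter&size=9"))
```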
7. Test removing canonicals for near-identical content
Canonical tags are often applied between near-identical pages, which can complicate how search engines interpret and rank that content.
Near-identical content can arise in various forms, from product variations to regional duplicates. While canonical tags are designed to guide search engines to the preferred version, their application to content that is nearly identical can lead to ambiguity.
Search engines may face challenges in accurately ranking pages when near-identical content is marked with canonical tags, potentially impacting the visibility of specific variations. Canonical tags may lead to issues in determining the appropriate version to index, resulting in suboptimal representation in search results.
As canonical tags are only hints to Google, there are times when it may choose a different canonical from the one you specified to display in search results. This may lead to pages not being indexed or not appearing in search results, despite the intention being that they should.
An example of this is managing out-of-stock product pages by adding temporary canonical tags that point search engines to the category page instead of the temporarily unavailable product pages. Google advises against this and is likely to ignore such tags, and the approach can harm indexing and rankings.
8. Test removing multiple canonical tags
Instances where multiple canonical tags are present can introduce confusion, potentially impacting how search engines interpret and prioritize pages. This dilemma is particularly prominent on large websites with complex structures, where unintentional duplication of canonical tags can occur.
The presence of multiple canonical tags can have implications for indexing and rankings, as search engines may struggle to determine the primary version of a page, impacting the accurate indexation of content. Conflicting signals can also introduce uncertainty into search engine algorithms, potentially affecting the ranking positions of affected pages.
Before implementing changes on a larger scale, testing is essential to understand the impact of removing multiple canonical tags. It helps verify whether the removal positively influences key metrics including rankings and organic traffic. It’s important to prioritize testing on subsets of pages representing different content types or structures to ensure a comprehensive understanding of the impact.
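When choosing pages for this test, it helps to separate harmless duplicates from genuine conflicts. The sketch below assumes you have already extracted the href of every rel="canonical" link on a page into a list; the function name and category labels are our own.

```python
def classify_canonicals(hrefs):
    """Classify a page by the canonical hrefs found in its HTML."""
    if not hrefs:
        return "missing"
    if len(hrefs) == 1:
        return "single"
    # Several tags: the same URL repeated, or different URLs competing
    return "duplicate" if len(set(hrefs)) == 1 else "conflicting"


print(classify_canonicals([
    "https://www.example.com/shoes",
    "https://www.example.com/shoes?color=red",
]))
```

Pages in the "conflicting" group are the ones where search engines receive contradictory signals, so they are the most likely to show a measurable effect.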
SEO test case study examples
Utilizing redirects instead of canonical tags
Redirects and canonical tags address different aspects of content management. Redirects guide users and search engines to a new location, while canonical tags indicate the preferred version of a page's content. The decision to use one over the other depends on the specific scenario and desired outcome.
We know that Google sees canonical tags as a hint, and may choose to ignore the one that is set or select its own based on several factors, such as internal links and sitemaps. On the other hand, redirects are directives, meaning Google must follow the instructions given by the site. Adding a redirect will therefore ensure, without any confusion, that only one version of the page is indexed and that all signals are consolidated on the preferred URL.
Therefore, testing the use of redirects instead of canonicals, or canonicals instead of redirects is a useful way of determining the best approach for an individual website’s performance.
We have tested a similar change previously, with a customer that was looking at whether redirecting trailing slash URLs to the non-trailing slash equivalent had any impact on organic performance. The pages already had a canonical to the non-trailing slash URL, but were getting a small amount of traffic, indicating that some of the pages were still indexed and ranking.
Upon running this experiment, we saw an expected decrease in organic sessions to the pages that we redirected. While the non-trailing slash URLs did see a marginally positive impact, the result was not statistically significant. However, because the hypothesis behind the test was strong, removing duplicate pages is best practice, and the change was not trending negatively, our customer decided to take a default-to-deploy approach.
Although the result was not statistically significant, there was evidence that our customer benefited from adding 301 redirects from trailing slash URLs to non-trailing slash URLs site-wide.
Canonicalizing to more specific pages
For ecommerce websites in particular, it’s always a challenge to ensure all variations of a product are indexed and able to rank alongside pages that may appear to be very similar.
A recent test run by one of our ecommerce customers involved changing the canonical tag on a main product page from self-referential to pointing at one of the product variations. The hypothesis here was that the product variations would start getting indexed, while the main product page continued to perform well.
The way their site was structured was with one URL for the main product page, with separate URLs for each variation with a URL parameter to differentiate them. They had previously made a change to make all product variations indexable with self-referential canonical tags, including quantity and size variations. Unfortunately, those pages were not getting indexed consistently, and were not receiving as much organic traffic as had been hoped.
The impact of this test was measured in two different places: the main product pages, and the product variations. This allowed us to weigh up the impact for both the main and variation pages, to see what the net effect of this change would be.
For the main pages, we saw no negative impact, with the test being inconclusive overall. However, on the product variation pages, we saw a positive result, with the best estimate being a 22% uplift to organic traffic to those pages. As such, the overall result was positive, and this change was deployed to all relevant products.
How our SEO split tests work
The most important thing to know is that our case studies are based on controlled experiments with control and variant pages:
- By detecting changes in performance of the variant pages compared to the control, we know that the measured effect was not caused by seasonality, sitewide changes, Google algorithm updates, competitor changes, or any other external impact.
- The statistical analysis compares the actual outcome to a forecast, and comes with a confidence interval so we know how certain we are the effect is real.
- We measure the impact on organic traffic in order to capture changes to rankings and/or changes to clickthrough rate.
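To make the idea of control and variant pages concrete, here is a deliberately simplified sketch. It is not SearchPilot's actual bucketing or statistical model: the hash-based split, the `salt` value, and the session numbers are all assumptions made up for illustration.

```python
import hashlib


def bucket(url, salt="canonical-test"):
    """Deterministically assign a page to the control or variant group."""
    digest = hashlib.sha256((salt + url).encode("utf-8")).hexdigest()
    return "variant" if int(digest, 16) % 2 else "control"


def uplift(actual_sessions, forecast_sessions):
    """Relative difference between actual and forecast organic sessions."""
    return (actual_sessions - forecast_sessions) / forecast_sessions


pages = ["https://www.example.com/p/%d" % i for i in range(6)]
for page in pages:
    print(page, bucket(page))

# Hypothetical numbers: variant pages were forecast to get 1000 sessions and got 1220
print(round(uplift(1220, 1000), 2))
```

Hashing the URL keeps each page in the same group for the life of the test, and comparing against a forecast rather than a raw before-and-after figure is what lets seasonality and sitewide changes drop out. A real analysis would also report a confidence interval around the uplift.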
Read more about how SEO A/B testing works or get a demo of the SearchPilot platform.