Semrush
Online visibility management SaaS platform that has been used by over 10,000,000 marketers worldwide 🔥

Dec 14, 2021, 22 tweets

::: Canonicalisation - What, Why and How :::

Want to know about Canonicalisation?
How to handle Duplication?
What to avoid doing?

I'm Darth Autocrat, and we're going to look at Canonicalisation:
* What it is
* Why we need it
* How to do it

1/22
@darth_na #SEOThread

:: What is Canonicalisation?

Definition: the selection of one version out of several as the preferred or representative one.

For SEO, this typically means picking a URL out of 2+.
But there are different reasons, methods and types of canonicalisation!

2/22
@darth_na #SEOThread

:: Reasons for Canonicalisation

Canonicalisation is to help with two main types of issue:
Duplication and High Similarity.

Duplication can occur due to variant URLs.
High Similarity, and sometimes Duplication, may occur due to content/spam issues.

3/22
@darth_na #SEOThread

These can cause:

1. Split ranking value
(values spread across 2+ versions)

2. Fluctuations in the SERPs
(SEs may switch between different versions)

3. Crawling issues
(SEs may crawl dupes rather than new/edited content)

4/22
@darth_na #SEOThread

4. Potential Spam issues
(masses of nigh identical/cookie-cutter content may be perceived as spam)

5. User confusion/irritation
(encountering the same/highly similar pages, thinking they are new/different)

5/22
@darth_na #SEOThread

6. Complexity in tracking
(tracking SERP performance across multiple pages is a pain)

7. UX/CRO monitoring
(monitoring campaigns and interactions across multiple pages is a pain too)

8. 3rd parties ranking
(syndicated content may get SERP attention instead of yours)

6/22
@darth_na #SEOThread

:: Methods of Canonicalisation ::

There are 4 methods of canonicalisation:
* Hard
* Soft
* Hybrid
* Wrong

Hard and Soft have their own purposes.
Hybrid and Wrong should be avoided.

7/22
@darth_na #SEOThread

Hard canonicalisation:

When the non-canonical URL cannot be accessed;
the designated canonical is returned/displayed instead.

This is often done via 301 Redirects,
and used for things like TLDs, Domains and Subdomains etc.

8/22
@darth_na #SEOThread
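
A minimal Python sketch of the Hard approach described above, assuming a hypothetical preferred host (`www.example.com`) and HTTPS as the preferred scheme. The function name is illustrative, not from any framework, and it assumes it is only called for hosts you own:

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical preferred host; swap in your own.
CANONICAL_HOST = "www.example.com"

def hard_canonical_redirect(url: str):
    """Return (301, canonical_url) for a non-canonical scheme/host, else None."""
    parts = urlsplit(url)
    if parts.scheme == "https" and parts.netloc.lower() == CANONICAL_HOST:
        return None  # already canonical, serve the page as normal
    # Keep path and query, force the preferred scheme and host, drop any fragment.
    target = urlunsplit(("https", CANONICAL_HOST, parts.path or "/", parts.query, ""))
    return 301, target
```

A request for `http://example.com/shoes?size=9` would be answered with a 301 to the `https://www.example.com` version of the same path and query.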

Soft canonicalisation:

When the non-canonical URL(s) can be accessed (viewed/used),
but the preferred canonical is indicated, such as via a Canonical Link Element/Response or Sitemap.

This is used for variant pages (such as products by colour etc.).

9/22
@darth_na #SEOThread
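
As a rough illustration of the two Canonical Link forms mentioned above, here is a small Python sketch that builds the element (CLE, for the HTML `<head>`) and the response header (CLR). The helper names and example URLs are assumptions for illustration only:

```python
from html import escape

def canonical_link_element(canonical_url: str) -> str:
    """CLE: the tag placed in the <head> of the variant page."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'

def canonical_link_header(canonical_url: str) -> tuple[str, str]:
    """CLR: the HTTP response header, handy for non-HTML files such as PDFs."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')
```

A red-shirt variant page could emit either form pointing at the main shirt URL; the variant stays viewable, and the link is only a hint to SEs.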

Hybrid canonicalisation:
Is dodgy!

It is a last resort if you are using Soft and SEs like Google “get it wrong”.
(Ideally, you should alter other signals such as internal links etc.)

10/22
@darth_na #SEOThread

Wrong:

* Robots.txt/NoIndex/404s are not for canonicalisation
* Canonicalising different languages (use “hreflang”)
* Self-referencing Canonicals - every page referencing itself, including variants
* Multiple different canonicals on the same page

11/22
@darth_na #SEOThread
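
One of the mistakes above, multiple different canonicals on the same page, is easy to check for. A minimal sketch using Python's standard `html.parser`; the class and function names are hypothetical, and it only looks at `<link>` elements, not response headers:

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonicals.append(attr["href"])

def conflicting_canonicals(html: str) -> list[str]:
    """Return the distinct canonical targets if there is more than one, else []."""
    finder = CanonicalFinder()
    finder.feed(html)
    distinct = sorted(set(finder.canonicals))
    return distinct if len(distinct) > 1 else []
```

An empty list means the page has zero or one canonical target; anything else is worth fixing at the template level.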

:: Types of Canonicalisation ::

* Self canonical
* Cross Page canonical
* Cross Domain canonical

In most cases, when people talk about these things,
they are referencing a Canonical Link (CLE/R).
Remember, CLE/Rs are "soft" (suggestions).

12/22
@darth_na #SEOThread

Self Canonical:

The page uses a Canonical Link Element or Response (CLE/CLR),
and references its own URL.

This can help combat URL pollution/abuse (such as unusual parameters and values appearing in the URL, or parts of the path changing case).

13/22
@darth_na #SEOThread

Cross Page Canonical:

The normal use of a CLE - point from one or more variant pages to a single preferred URL.

* Category filters
* Variants of products
* Dupe content under multiple URLs
* URL variants (case/param order etc.)

14/22
@darth_na #SEOThread

Cross Domain Canonical:

Pointing to a URL on a different domain/site.
Ideal (should be used) for things like Syndicated content.

IMPORTANT:
G may ignore the CLE/CLR as it’s “Soft”;
the syndicated content may rank instead of/above your version!

15/22
@darth_na #SEOThread

::: Canonical Confusions :::

Canonical Links are Optional:
People sometimes don't understand that a CLE or CLR (canonical link element/response) is only a "suggestion", not a "directive".

SEs decide what to do with them.

16/22
@darth_na #SEOThread

This decision may be influenced by relevance to query,
(including user/browser preferences such as language, as well as “keywords”),
or popularity/prominence (volume/value of internal/inbound links).

17/22
@darth_na #SEOThread

Variant URLs:

Things such as:
* case (upper/lower/mixed)
* extra/non-functional parameters
* parameter order
can all result in the same page/content,
but Different URLs as far as Browsers/SEs are concerned.

/this?a=1&b=2
/ThiS?b=2&a=1&fake=76

18/22
@darth_na #SEOThread
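
To make the two example URLs above concrete, here is a hedged Python sketch that collapses such variants into one form. It assumes the site treats paths case-insensitively and that only the parameters `a` and `b` are functional; both are assumptions that will differ per site:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: the only parameters that change the content.
ALLOWED_PARAMS = {"a", "b"}

def normalise(url: str) -> str:
    """Lowercase the path, drop unknown parameters, sort the rest."""
    parts = urlsplit(url)
    params = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in ALLOWED_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path.lower(), urlencode(params), "")
    )

print(normalise("/ThiS?b=2&a=1&fake=76"))  # /this?a=1&b=2
```

The normalised value is what you would redirect to (Hard) or put in the Canonical Link (Soft).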

Not Alternatives:

1. Noindex - the page won’t be ranked. Any value will be lost.
2. Noindex & CLE/CLR - contradicts (either don’t rank it, or merge it)
3. Disallow - G can't see if the page is a dupe, nor see any CLE/CLR!
4. 404 - Value is lost

19/22
@darth_na #SEOThread

Dead-ends:

Don’t canonicalise to URLs that are “dead” (404, 410, Noindex).
This tells SEs that you prefer the non-canonicals to not be shown,
and instead prefer a page that won’t get indexed!

20/22
@darth_na #SEOThread
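
A small Python sketch of a dead-end audit, assuming you already have crawl data to hand: a map of page to canonical target, a map of URL to status code, and a set of noindexed URLs. All three inputs and the function name are hypothetical:

```python
DEAD_STATUSES = {404, 410}

def dead_end_canonicals(
    canonical_of: dict[str, str],
    status_of: dict[str, int],
    noindexed: set[str],
) -> dict[str, str]:
    """Map each page to the reason its canonical target is a dead end."""
    problems = {}
    for page, target in canonical_of.items():
        code = status_of.get(target)
        if code in DEAD_STATUSES:
            problems[page] = f"{target} returns {code}"
        elif target in noindexed:
            problems[page] = f"{target} is noindexed"
    return problems
```

Pages missing from the result point at targets that, as far as the supplied data shows, can still be indexed.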

Chaining:

You shouldn’t “chain” canonicals if you can avoid it.
This applies not only to 301 Redirects, but to CLE/CLR/Sitemaps too.

Wrong : A > B > C > D
Right : A > D + B > D + C > D

21/22
@darth_na #SEOThread
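
The Wrong/Right example above can be sketched as a flattening step in Python. This is a minimal illustration over a hypothetical map of URL to canonical (or redirect) target, not a crawler:

```python
def flatten_canonicals(mapping: dict[str, str]) -> dict[str, str]:
    """Point every URL straight at the end of its chain."""
    flat = {}
    for start in mapping:
        seen = {start}
        target = mapping[start]
        # Follow the chain until we hit a URL with no onward target
        # (or one that only references itself).
        while target in mapping and mapping[target] != target:
            if target in seen:
                raise ValueError(f"canonical loop involving {target}")
            seen.add(target)
            target = mapping[target]
        flat[start] = target
    return flat

print(flatten_canonicals({"A": "B", "B": "C", "C": "D"}))
# {'A': 'D', 'B': 'D', 'C': 'D'}
```

Loops (A > B > A) raise an error rather than being silently resolved, since there is no correct final target to pick.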

Relying on band-aids:

Though canonicalisation is needed in many cases (such as multiple domains etc.),
plenty of cases are caused by systems (site/code).

Instead, try to fix the system so the automatic duplicates don’t occur.

22/22
@darth_na #SEOThread
