::: Canonicalisation - What, Why and How :::
Want to know about Canonicalisation?
How to handle Duplication?
What to avoid doing?
I'm Darth Autocrat, and we're going to look at Canonicalisation:
* What it is
* Why we need it
* How to do it
1/22
@darth_na #SEOThread
:: What is Canonicalisation? ::
Definition: selecting one of multiple options as the preferred or representative version.
For SEO, this typically means picking a URL out of 2+.
But there are different reasons, methods and types of canonicalisation!
2/22
:: Reasons for Canonicalisation ::
Canonicalisation is to help with two main types of issue:
Duplication and High Similarity.
Duplication can occur due to variant URLs.
High Similarity, and sometimes Duplication, may occur due to content/spam issues.
3/22
These can cause:
1. Split ranking value
(values spread across 2+ versions)
2. Fluctuations in the SERPs
(SEs may switch between different versions)
3. Crawling issues
(SEs may crawl dupes rather than new/edited content)
4/22
4. Potential Spam issues
(masses of nigh identical/cookie-cutter content may be perceived as spam)
5. User confusion/irritation
(encountering the same/highly similar pages, thinking they are new/different)
5/22
6. Complexity in tracking
(tracking SERP performance across multiple pages is a pain)
7. UX/CRO monitoring
(tracking campaigns and interactions across multiple pages is a pain too)
8. 3rd parties ranking
(syndicated content may get SERP attention instead of yours)
6/22
:: Methods of Canonicalisation ::
There are 4 methods of canonicalisation:
* Hard
* Soft
* Hybrid
* Wrong
Hard and Soft have their own purposes.
Hybrid and Wrong should be avoided.
7/22
Hard canonicalisation:
The non-canonical URL cannot be accessed;
the designated canonical is returned/displayed instead.
This is often done via 301 Redirects,
and used for things like TLDs, Domains and Subdomains etc.
8/22
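A minimal sketch of a hard canonical via a 301, assuming the preferred version of the site is `https://www.example.com` (a placeholder host, not from the thread). The function name `hard_canonical_redirect` is made up for illustration; in practice this logic usually lives in the web server or CDN config.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed for illustration: the preferred scheme and host of the site.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.example.com"


def hard_canonical_redirect(requested_url):
    """Return (301, target) for a non-canonical scheme/host, else None."""
    parts = urlsplit(requested_url)
    if parts.scheme == CANONICAL_SCHEME and parts.netloc.lower() == CANONICAL_HOST:
        return None  # already on the canonical scheme and host
    # Keep path and query, swap in the canonical scheme and host.
    target = urlunsplit(
        (CANONICAL_SCHEME, CANONICAL_HOST, parts.path or "/", parts.query, "")
    )
    return 301, target
```

A request for `http://example.com/shoes?size=9` would be answered with a 301 to the same path and query on the canonical host, so the non-canonical URL is never displayed.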
Soft canonicalisation:
The non-canonical URL(s) can still be accessed (viewed/used),
but the preferred canonical is indicated, such as via a Canonical Link Element/Response (CLE/CLR) or Sitemap.
This is used for variant pages (such as products by colour etc.).
9/22
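As a rough illustration of the two soft signals, the sketch below builds a Canonical Link Element for the HTML head and the equivalent `Link` response header. The helper names and the example URL are assumptions made for this sketch.

```python
from html import escape


def canonical_link_element(canonical_url):
    """CLE: goes inside the <head> of the variant page."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


def canonical_link_header(canonical_url):
    """CLR: an HTTP response header, handy for non-HTML files such as PDFs."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


# A colour variant pointing at the main product page (hypothetical URLs).
print(canonical_link_element("https://www.example.com/t-shirt"))
print(canonical_link_header("https://www.example.com/t-shirt"))
```

The variant page stays accessible in both cases; the element or header only suggests which URL is preferred.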
Hybrid canonicalisation:
Is dodgy!
It is a last resort if you are using Soft and SEs like Google “get it wrong”.
(Ideally, you should alter other signals such as internal links etc.)
10/22
Wrong:
* Using Robots.txt/NoIndex/404s (they are not for canonicalisation)
* Canonicalising different languages (use “hreflang”)
* Blanket Self-referencing Canonicals - every page referencing itself, including variants
* Multiple different canonicals on the same page
11/22
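One of the mistakes above, multiple different canonicals on the same page, is easy to check for. This is a small sketch using Python's standard `html.parser`; the class and function names are invented for illustration, and it only looks at `<link>` elements, not response headers.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def canonical_hrefs(html):
    collector = CanonicalCollector()
    collector.feed(html)
    return collector.hrefs


def has_conflicting_canonicals(html):
    # More than one distinct target means the page contradicts itself.
    return len(set(canonical_hrefs(html))) > 1
```

Running `has_conflicting_canonicals` over rendered templates can catch cases where a plugin and a theme each add their own canonical.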
:: Types of Canonicalisation ::
* Self canonical
* Cross Page canonical
* Cross Domain canonical
In most cases, when people talk about these things,
they are referring to a Canonical Link (CLE/R).
Remember, CLE/Rs are "soft" (suggestions).
12/22
Self Canonical:
The page uses a Canonical Link Element or Response (CLE/CLR),
and references its own URL.
This can help combat URL pollution/abuse (such as unusual parameters and values appearing in the URL, or parts of the path changing case).
13/22
Cross Page Canonical:
The normal use of a CLE - point from one or more variant pages to a single preferred URL.
* Category filters
* Variants of products
* Dupe content under multiple URLs
* URL variants (case/param order etc.)
14/22
Cross Domain Canonical:
Pointing to a URL on a different domain/site.
Ideal (should be used) for things like Syndicated content.
IMPORTANT:
G may ignore the CLE/CLR as it’s “Soft”;
the syndicated content may rank instead of/above your version!
15/22
::: Canonical Confusions :::
Canonical Links are Optional:
People sometimes don't understand that a CLE or CLR (canonical link element/response) is only a "suggestion", not a "directive".
SEs decide what to do with them.
16/22
This decision may be influenced by relevance to the query
(including user/browser preferences such as language, as well as “keywords”),
or popularity/prominence (volume/value of internal/inbound links).
17/22
Variant URLs:
Things such as:
* case (upper/lower/mixed)
* extra/non-functional parameters
* parameter order
can all result in the same page/content,
but Different URLs as far as Browsers/SEs are concerned.
/this?a=1&b=2
/ThiS?b=2&a=1&fake=76
18/22
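The two example URLs above can be collapsed to one with a small normaliser. This is a sketch under two loud assumptions: the server treats paths case-insensitively (otherwise lowercasing is unsafe), and only `a` and `b` are functional parameters. The names `FUNCTIONAL_PARAMS` and `normalise_url` are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only these parameters change what the page shows.
FUNCTIONAL_PARAMS = {"a", "b"}


def normalise_url(url):
    parts = urlsplit(url)
    # Assumption: paths are case-insensitive on this site.
    path = parts.path.lower()
    # Drop non-functional parameters and put the rest in a fixed order.
    params = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in FUNCTIONAL_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), path, urlencode(params), "")
    )


print(normalise_url("/this?a=1&b=2"))          # /this?a=1&b=2
print(normalise_url("/ThiS?b=2&a=1&fake=76"))  # /this?a=1&b=2
```

The normalised value is what you would put in the CLE/CLR, or redirect to if you choose the hard route.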
Not Alternatives:
1. Noindex - the page won’t be ranked. Any value will be lost.
2. Noindex & CLE/CLR - contradictory (either don’t rank it, or merge it)
3. Disallow - G can't see if the page is a dupe, nor see any CLE/CLR!
4. 404 - Value is lost
19/22
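Point 3 can be checked mechanically. The sketch below uses Python's standard `urllib.robotparser` to list URLs that carry a canonical but are disallowed, so the canonical would never be seen. The function name, the example rules and the URLs are assumptions for illustration, and Python's parser may not match Googlebot's rule handling exactly.

```python
from urllib.robotparser import RobotFileParser


def hidden_canonicals(robots_txt, canonicals, agent="Googlebot"):
    """Return URLs whose canonical cannot be read because they are disallowed.

    `canonicals` maps a page URL to the canonical URL it declares.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in canonicals if not parser.can_fetch(agent, url)]


robots = "User-agent: *\nDisallow: /print/"
declared = {
    "https://www.example.com/print/guide": "https://www.example.com/guide",
    "https://www.example.com/guide?ref=nav": "https://www.example.com/guide",
}
print(hidden_canonicals(robots, declared))
```

Any URL in the result needs either the Disallow lifted or a different approach, since the two signals cannot work together.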
Dead-ends:
Don’t canonicalise to URLs that are “dead” (404, 410, Noindex).
This tells SEs that you prefer the non-canonicals to not be shown,
and instead prefer a page that won’t get indexed!
20/22
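A quick way to spot dead-end canonicals in crawl data might look like this. The `Page` record and the `pages` mapping are hypothetical structures for this sketch; a real crawler export will have its own field names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    status: int                      # HTTP status code from the crawl
    noindex: bool = False            # page carries a noindex
    canonical: Optional[str] = None  # canonical URL the page declares


def dead_end_canonicals(pages):
    """Return (page, target) pairs where the canonical target is dead."""
    problems = []
    for url, page in pages.items():
        target = page.canonical
        if target is None or target == url:
            continue  # no canonical, or a self canonical
        dest = pages.get(target)
        if dest is None:
            continue  # target was not crawled, so it cannot be judged here
        if dest.status in (404, 410) or dest.noindex:
            problems.append((url, target))
    return problems
```

Each pair returned is a page asking SEs to prefer a URL that cannot be indexed.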
Chaining:
You shouldn’t “chain” canonicals if you can avoid it.
This applies not only to 301 Redirects, but to CLE/CLR/Sitemaps too.
Wrong : A > B > C > D
Right : A > D + B > D + C > D
21/22
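The "Right" pattern above can be produced from a chained mapping with a short helper. This is a sketch: `flatten_canonicals` is an invented name, and the input is assumed to be a dict of source URL to declared target (redirect or canonical).

```python
def flatten_canonicals(mapping):
    """Point every source straight at the end of its chain."""
    flat = {}
    for start in mapping:
        seen = {start}
        target = mapping[start]
        # Follow the chain until a URL that is final (or self-canonical).
        while target in mapping and mapping[target] != target:
            if target in seen:
                raise ValueError(f"canonical loop involving {target}")
            seen.add(target)
            target = mapping[target]
        flat[start] = target
    return flat


print(flatten_canonicals({"A": "B", "B": "C", "C": "D"}))
# {'A': 'D', 'B': 'D', 'C': 'D'}
```

Loops (A > B > A) raise an error instead of being flattened, since there is no sensible final target to pick.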
Relying on band-aids:
Though canonicalisation is needed in many cases (such as multiple domains etc.),
plenty of cases are caused by systems (site/code).
Instead, try to fix the system so the automatic duplicates don’t occur.
22/22