The distinction is important, because a penalty is an applied negative.
Getting hit with a penalty for things like mass-cookie-cutter or stolen content is vastly different to G filtering out X of Y pages.
For external dupes - you may well be stuffed. You're better off altering your version (and learning Not to syndicate full/exact versions! Not even if they provide a link and use a CLE/CLR!)
Then there's nigh-identical and cookie-cutter,
again ... different!
There's seldom a good reason to have multiple versions of an informational piece.
Typically highly-similar/nigh-identical applies to "products" and "services".
Legitimately - you may have numerous versions
(such as different colour iPads, with different RAM etc.)
If done properly - these are canonicalised.
G will then attempt to show the most relevant out of the set to match a query.
(if someone searches for "car sales in X", G may show your X page, even if the designated canonical is the Y page!).
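As a rough illustration of "canonicalised" variants, here's a minimal Python sketch. It assumes (hypothetically) that the variants differ only by `colour` and `ram` query parameters on an example.com URL, and that the version without those parameters is the one you designate as canonical. The parameter names, URLs and helper functions are made up for the example; the only real piece is the `<link rel="canonical">` element itself.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: query parameters that only select a product variant.
VARIANT_PARAMS = {"colour", "ram"}


def canonical_url(url: str) -> str:
    """Drop variant-only parameters so every variant maps to one URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in VARIANT_PARAMS
    ]
    # The fragment is dropped as well; it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    """Build the canonical link element to place in the variant page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    print(canonical_tag("https://example.com/ipad?colour=red&ram=64"))
```

Remember the canonical is a hint about which version you prefer; as noted above, G may still show a different page from the set if it matches the query better.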
Cookie-cutter pages though are often in greater quantity, with only the "keyword" different.
Worse, the quality of the content is often lower ... G doesn't want it, and may penalise it, or the "quality" is marked down (lower crawl/index rates).
Google's reaction to "duplicates",
and the impact it can/does have on:
* indexing
* crawling
* ranking
can and does vary - depending on the nature, quantity and quality.
And the only way that's going to happen is if people use correct descriptions.
Things like "boilerplate" aren't duplication.
Highly similar/expanded mashups aren't duplication.
Even cookie-cutter-content isn't technically "duplication" (as they aren't identical).
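To make the difference between "identical" and "nigh-identical" concrete, here's a small Python sketch that compares two pieces of text using word shingles and Jaccard similarity. This is a generic textbook technique, not a description of how G does it. The 3-word shingle size, the 0.5 threshold and the label names are all assumptions picked for illustration.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word sequences in the text (lowercased, punctuation ignored)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Shared shingles divided by all shingles; 1.0 means the same shingle set."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def classify(a: str, b: str, near: float = 0.5) -> str:
    """Label a pair of texts; `near` is an arbitrary illustrative threshold."""
    score = jaccard(a, b)
    if score == 1.0:
        return "duplicate"  # identical after normalisation
    if score >= near:
        return "near-duplicate"  # e.g. cookie-cutter pages with one keyword swapped
    return "distinct"


if __name__ == "__main__":
    leeds = "We offer cheap car sales in Leeds with great finance deals for every budget"
    york = "We offer cheap car sales in York with great finance deals for every budget"
    print(classify(leeds, leeds), classify(leeds, york))
```

Swapping only the place name leaves most shingles shared, so the pair scores high without being identical. That's the cookie-cutter case, not true duplication.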
• • •
A topic that is often ignored,
despite the huge influence that reputation plays in marketing,
the impact it has on sales,
and what a PITA it is for SEO.
Proper (and continuous!) research should yield insights into motivation/cause, and locations.
You should be able to utilise "personas" (or demo-/firmo- and psycho-graphics) to find additional locations and probable channels/sources.
Failing that, use Search for questions!
Search for the same things your consumers do,
and you'll likely find where they go for info ... and where you should be!
(providing answers, running ads, providing sponsorships etc.)
Do searches for Product/Service without your brand, and see what comes up. Or add a competitor's brand!
Originally, Keywords were THE thing.
Meta Keywords and string matching.
Other SEs came along, things evolved, Meta-Keywords basically died.
Yet the term remained.
Though how they are used has evolved,
the way they are used for research hasn’t really.
As competition for “keywords” got harder,
new terms came to the fore:
* Head term
* Longtail
* And then Mid-tail joined in
As more businesses went online, and more sites, pages and content appeared - it became harder to rank for the shorter “keywords”.
+ When looking at TLDs for Domain Names, check for confusion points (same name, different TLD etc.)
+ Sort the HTTP > HTTPS out, and pick either www or non-www - then get the 301s sorted out from day one.
+ Own your Name! Make sure you own a domain with your Brand, and you have social profiles for it (same for unique product names etc.).
Same for Directories.
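The HTTP > HTTPS and www/non-www point above can be sketched as a single rule: every URL variant should 301 to one preferred scheme + host. Here's a minimal Python illustration of working out the redirect target. The function name, the `prefer_www` flag and the example.com URLs are hypothetical, and it ignores ports for simplicity; in practice you'd configure this in your web server or CDN, not in application code.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url: str, prefer_www: bool = False) -> Optional[str]:
    """Return the URL to 301 to, or None if the URL is already in the preferred form."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # note: any port is ignored here
    bare = host[4:] if host.startswith("www.") else host
    wanted_host = f"www.{bare}" if prefer_www else bare
    path = parts.path or "/"

    target = urlunsplit(("https", wanted_host, path, parts.query, ""))
    current = urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))
    return None if target == current else target


if __name__ == "__main__":
    for candidate in (
        "http://example.com/shoes",
        "http://www.example.com/shoes",
        "https://www.example.com/shoes",
        "https://example.com/shoes",
    ):
        print(candidate, "->", redirect_target(candidate))
```

All four variants end up at one address, in one hop, which is the point of sorting the 301s out from day one.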
* GA BR (Google Analytics Bounce Rate)
* GA ToP (Google Analytics Time on Page)
* GA ToS (Google Analytics Time on Site)
* SERP CTR (Search Engine Result Page Click Through Rate)
are NOT (direct/indirect) ranking factors!
For starters, only a % of sites use Google Analytics,
so there'd be a Huge data hole.
And each of them is ambiguous/noisy,
with various reasons for whatever value,
(a page with the weather - quick visit, leave - does not mean the page sucks or is irrelevant!)