- retention is useless
- don't work on your retention
or anything like that.
I'm making a CLEAR distinction between what CATCHES attention and what RETAINS attention.
Both are important but for 2 entirely different purposes.
First the basics: Correlation vs. Causation
Causation means one thing DIRECTLY causes another.
Correlation means things happen together in a specific context.
Example:
A "hot" sun ☀️ DIRECTLY causes sunburn & ice cream sales to increase 📈.
There's a correlation between increasing sunburns and increasing ice cream sales, BUT neither of them CAUSES the other.
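To make the sun example concrete, here is a minimal simulation sketch. The numbers (temperature range, slopes, noise) are made up for illustration: temperature is the only cause in the model, yet sunburns and ice cream sales still end up strongly correlated with each other.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical model: temperature is the ONLY cause of both outcomes.
temperature = [rng.uniform(15, 35) for _ in range(1000)]
sunburns = [2 * t + rng.gauss(0, 5) for t in temperature]
ice_cream_sales = [10 * t + rng.gauss(0, 20) for t in temperature]

# Neither list was computed from the other, but they move together.
r = pearson(sunburns, ice_cream_sales)
print(f"correlation(sunburns, ice cream sales) = {r:.2f}")
```

Banning ice cream in this model would not prevent a single sunburn, because the only arrow of causation starts at the temperature.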
So in the specific context of YouTube recommendations:
Does higher retention DIRECTLY cause more views?
In other words, does improving your retention over time mean you'll get more views from recommendations over time?
The answer is no and we will see exactly why.
The answer can be deduced by understanding these 3 fundamental concepts:
- Market Dynamics
- Zero-sum Games
- Emergence
1. Market Dynamics
If you're thirsty and come across a waterfall, you'll be able to drink a portion of it but the water will keep flowing, you can't drink it all.
The supply (water flowing) is way bigger than the demand (you drinking it).
Well, on YouTube it's the same with attention.
The attention available is WAY lower than the supply on the platform.
Let's do some quick math:
On YouTube, about 30,000 hours of new content are uploaded every hour (500 hours per minute), or:
- 82 years of content per day
- 30,000 years of content per year
It's safe to assume there have been at least 150,000 years of content on YouTube since 2005.
Now, the demand (2023 data):
- 2.7B monthly users
- 47 min per user daily on average
2.7B users x 47 min = 127B minutes = ~242,000 years of attention spent on YouTube per day (watch time), if every monthly user watches the daily average.
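If you want to check the quick math yourself, here is a small sketch. The upload rate of 500 hours per minute is an assumption (a commonly cited figure), chosen because it reproduces the 82 years per day quoted above. The user count and minutes per user come straight from the list.

```python
MINUTES_PER_YEAR = 60 * 24 * 365

# Supply: assumed upload rate of 500 hours of video per minute.
upload_hours_per_day = 500 * 60 * 24  # 720,000 hours
supply_years_per_day = upload_hours_per_day / (24 * 365)
supply_years_per_year = supply_years_per_day * 365

# Demand: users x average minutes watched, as in the text.
users = 2.7e9
minutes_per_user = 47
watch_minutes = users * minutes_per_user
demand_years = watch_minutes / MINUTES_PER_YEAR

print(f"supply: {supply_years_per_day:.0f} years/day, "
      f"{supply_years_per_year:,.0f} years/year")
print(f"demand: {watch_minutes / 1e9:.0f}B minutes = {demand_years:,.0f} years")
```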
Are you stupid wono? The demand is way higher than the supply!
You're right.
Except you didn't take into account the concept of "asymmetry."
Just like money in real life with products and services, attention is not equally distributed across all the content.
Let's see how asymmetry operates in an easy example.
Let's consider 2 videos of 10 minutes each:
- One is from MrBeast (100M views)
- One is an MC gameplay of a 13 yo (100 views)
That's asymmetry.
Both are 10 minutes long, but one captures more than 99.99% of the attention.
This asymmetry is called "the power law" where a minority has a lot, while the majority has very little.
Just like irl with money, a few people are very wealthy and the majority have little.
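Here is a rough sketch of how lopsided this gets. The first part uses the two videos above. The second part is a hypothetical catalog where views fall off as 1/rank, which is one simple power-law shape I'm assuming for illustration, not measured YouTube data.

```python
# The two-video example from the text.
views = {"MrBeast": 100_000_000, "13 yo gameplay": 100}
top_share = views["MrBeast"] / sum(views.values())
print(f"MrBeast share of views: {top_share:.4%}")

# Hypothetical catalog: the video at rank r gets views proportional to 1 / r.
n_videos = 10_000
weights = [1 / rank for rank in range(1, n_videos + 1)]
top_1_percent = sum(weights[: n_videos // 100]) / sum(weights)
print(f"top 1% of videos take {top_1_percent:.0%} of views")
```

Even with this mild assumed curve, the top 1% of videos end up with more than half of all views.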
Don't take my word for it, here's the data (only 0.77% of videos have 100k+ views):
This is why I said "attention available", not "attention": most of it is already vacuumed up by a minority of videos (0.77% of the supply). It's not evenly spread.
And if you compare the attention available to the supply, it tells a completely different story.
2. Zero-Sum Game
Competing for attention on YouTube is a zero-sum game because our attention is limited, we can't watch everything.
For example if you're reading this, you're not reading another tweet, you can't spend your attention in two places simultaneously.
While YT is not a STRICT zero-sum game, it can be considered as such since it is a highly complex version of it (involving market dynamics).
As you can see here from a Google paper, the algorithm's recommendation system RANKS videos against each other (zero-sum).
I could stop right here, and it would be enough to prove my point.
In a zero-sum game, beating the market requires something not everyone has access to (an edge).
"+Retention = more views" is mathematically impossible because for every winner, there MUST be losers.
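A toy sketch of the zero-sum point, under a big simplifying assumption: a fixed attention budget is split in proportion to a made-up "appeal" score per video. This is not how YouTube allocates anything, it only shows what happens to shares when the total is fixed.

```python
def attention_shares(scores, total_attention=1.0):
    """Split a fixed attention budget in proportion to each video's score."""
    total_score = sum(scores.values())
    return {name: total_attention * s / total_score for name, s in scores.items()}

# Hypothetical appeal scores for three competing videos.
before = {"video_a": 1.0, "video_b": 2.0, "video_c": 3.0}

# Case 1: everyone learns the same techniques and improves by 20%.
everyone_better = {name: s * 1.2 for name, s in before.items()}

# Case 2: only video_a finds an edge and doubles its score.
one_edge = dict(before, video_a=2.0)

shares_before = attention_shares(before)
shares_everyone = attention_shares(everyone_better)
shares_edge = attention_shares(one_edge)

for name in before:
    print(name, round(shares_before[name], 3),
          round(shares_everyone[name], 3), round(shares_edge[name], 3))
```

When everyone improves together, nobody's share moves. When one video gains, the gain comes out of the others.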
Anyone can spend a few months learning retention techniques & storytelling.
You can only get better at it as you consume more content and practice, it's a static target.
Catching attention however is a moving target, it's a dynamic process with rules CONSTANTLY changing.
NO ONE is safe, you must reinvent yourself continually.
You can be really good at it for a few months and, 2 years later, not understand why you're not getting views.
This leads to my last point, one that only a few people I've met truly understand at its core (with all the complex nuances it brings).
3. Emergence
100 threads wouldn't be enough to dive deep into this concept so I'll try to keep it simple.
Emergence can be defined as follows:
Small things with a very basic set of rules form very complex patterns (complexity emerging from simplicity).
Take a shoal of fish for example.
One could think "how can all these fish form such complex patterns in real time and so quickly?"
The answer is emergence.
From a very basic set of rules every fish must individually follow, complex patterns emerge.
Rules:
- Separation: Avoid crowding your neighbors
- Alignment: Steer towards the average heading of your neighbors
- Cohesion: Steer towards the average position of your neighbors
- Survive: Avoid predators
Very simple rules lead to very complex patterns.
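The first three rules are the ones used in the classic "boids" flocking model, and they fit in one short function. What follows is a minimal sketch, not a faithful fish simulation: the radii and weights are arbitrary assumptions, "heading" is approximated by average velocity, and the predator rule and any speed limit are left out to keep it short.

```python
import random

def step_flock(boids, radius=10.0, sep_dist=2.0,
               w_sep=0.05, w_ali=0.05, w_coh=0.01):
    """Advance every boid (x, y, vx, vy) one tick using the three local rules."""
    updated = []
    for i, (x, y, vx, vy) in enumerate(boids):
        neighbors = [b for j, b in enumerate(boids)
                     if j != i and (b[0] - x) ** 2 + (b[1] - y) ** 2 < radius ** 2]
        if neighbors:
            n = len(neighbors)
            # Cohesion: steer toward the neighbors' average position.
            cx = sum(b[0] for b in neighbors) / n - x
            cy = sum(b[1] for b in neighbors) / n - y
            # Alignment: steer toward the neighbors' average velocity.
            ax = sum(b[2] for b in neighbors) / n - vx
            ay = sum(b[3] for b in neighbors) / n - vy
            # Separation: move away from neighbors that are too close.
            close = [b for b in neighbors
                     if (b[0] - x) ** 2 + (b[1] - y) ** 2 < sep_dist ** 2]
            sx = sum(x - b[0] for b in close)
            sy = sum(y - b[1] for b in close)
            vx += w_coh * cx + w_ali * ax + w_sep * sx
            vy += w_coh * cy + w_ali * ay + w_sep * sy
        updated.append((x + vx, y + vy, vx, vy))
    return updated

# Usage: a small random flock, advanced for a few ticks.
rng = random.Random(1)
flock = [(rng.uniform(0, 20), rng.uniform(0, 20),
          rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(30)]
for _ in range(50):
    flock = step_flock(flock)
```

No fish in this code knows anything about the shape of the group. Each one only looks at its neighbors, and the group behavior comes out of that.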
There are many other examples:
𝗠𝗮𝗿𝗸𝗲𝘁 𝗗𝘆𝗻𝗮𝗺𝗶𝗰𝘀: Individuals' buying & selling decisions lead to complex macroeconomic trends.
𝗕𝗿𝗮𝗶𝗻: Neurons firing simple electrical signals result in the complex phenomena of consciousness and thought.
etc.
Here is a great visual example using a Wolfram rule.
Very basic rules:
Complex patterns emerging from them:
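If you want to reproduce this kind of pattern yourself, here is a minimal sketch of an elementary cellular automaton. Rule 30 is my assumption for the example (the thread doesn't say which Wolfram rule is pictured), and the wrap-around edges are a simplification.

```python
def step_automaton(cells, rule=30):
    """Apply an elementary cellular automaton rule once (wrap-around edges)."""
    n = len(cells)
    return [
        # The three cells (left, self, right) form a number 0-7;
        # that bit of `rule` is the cell's next state.
        (rule >> ((cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

# Usage: start from a single live cell and print a few generations.
width = 31
row = [0] * width
row[width // 2] = 1
for _ in range(15):
    print("".join("#" if c else "." for c in row))
    row = step_automaton(row)
```

The whole "physics" here is an 8-entry lookup table, yet the printed triangle never settles into a simple repeating pattern.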
Here is another from the Wolfram Physics Project that illustrates perfectly the idea:
Guess what is based on emergence and produces something similar?
YouTube. (User behavior + the recommendation algorithm)
Because many people don't understand emergence, they think the algorithm compares videos when it actually compares viewers.
Let me explain.
If you're in ADHD mode right now, refocus because it's the most important piece of the puzzle right there.
It's not "this video has a great CTR, AVD...(whatever metric)" therefore it should be recommended.
It's "many viewers choose to watch this video, therefore it should be recommended further to similar viewers".
Everything else is emergence.
Rules:
𝗖𝗿𝗲𝗮𝘁𝗼𝗿𝘀: Upload videos and aim to maximize attention to their content
𝗩𝗶𝗲𝘄𝗲𝗿𝘀: Watch videos, ignore videos, engage (like, comment..) based on their interests & preferences.
𝗔𝗹𝗴𝗼𝗿𝗶𝘁𝗵𝗺: Learns from viewer behavior to predict what each viewer might want to watch next.
And from these very basic and simple rules, very complex patterns emerge:
- the meta (red arrows, react, MrBeastification...)
- markets (niches)
etc.
I explain here why even developers working on the algorithm can't understand how a decision has been made (because deep learning is a process that involves emergence).
They can only code and adjust the basic rules to avoid undesirable outcomes and maximize desirable ones.
Because the algorithm is specifically optimized to predict the next content a viewer will watch (expected watch time), many different results will emerge from basic simple rules.
One of them is: "a simple function of expected watch time per impression"
While for many the conclusion of this is:
"get a higher retention because more retention = more watch-time = more expected watch-time = more views".
It's because they don't understand emergence and market dynamics.
What causes a video to get views is what emerges from "trying to predict what a viewer is likely to watch next".
And the answer is remarkability (standing out, curiosity, catch attention, anti-average... whatever you want to call it).
The reason is dead-simple:
1) What catches attention is more likely to be seen
2) What is seen is more likely to be watched
3) What is more watched is more likely to be recommended
When you understand that, it's perfectly logical for you to see these videos emerge and go viral:
Which video do you think will be more pushed:
- 30sec watch time per user, chosen 1,000,000 times (500,000 minutes watch time)
- 1h watch time per user chosen 10 times (600 minutes watch time)
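The totals in parentheses are easy to verify. A quick sketch using only the numbers from the two bullets (the function name is mine):

```python
def total_watch_minutes(seconds_per_viewer, times_chosen):
    """Total watch time in minutes for a video chosen `times_chosen` times."""
    return seconds_per_viewer * times_chosen / 60

video_a = total_watch_minutes(30, 1_000_000)  # 30 sec per viewer
video_b = total_watch_minutes(60 * 60, 10)    # 1 h per viewer

print(f"video A: {video_a:,.0f} minutes")
print(f"video B: {video_b:,.0f} minutes")
print(f"A generates {video_a / video_b:,.0f}x more total watch time")
```

Video B holds each viewer 120 times longer, but being chosen 100,000 times more often is what dominates the total.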
Yes, many videos with a lot of views have good retention, not because of retention but because the idea/concept is well executed.
You could make the best documentary in the world (retention-wise) about a topic no one cares about, good luck getting views.
This was my point with the Dodford tweet.
Anyone who has seen a @dannymcmahon video knows how excellent his videos are and how well executed retention-wise they are (storytelling, pacing, editing).
This is PRECISELY why the more viral a video, the lower the CTR/Retention will be.
Because a video that is more likely to be watched is more recommended to a colder and colder audience.
Recommendations go up, but a smaller share of people click, and those who do click watch for less time.
This phenomenon by itself is proof that retention is not what causes more views.
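A sketch of that dilution effect with made-up numbers. The three audience tiers, their sizes, and their click-through rates are all hypothetical, picked only to show the direction of the effect as a video is pushed to colder viewers.

```python
# Hypothetical audience tiers, warmest first:
# (label, impressions, click-through rate).
tiers = [
    ("subscribers", 10_000, 0.12),
    ("similar viewers", 100_000, 0.06),
    ("broad audience", 1_000_000, 0.02),
]

history = []  # (cumulative views, blended CTR) after each expansion
impressions_so_far = 0
views_so_far = 0.0
for label, impressions, ctr in tiers:
    impressions_so_far += impressions
    views_so_far += impressions * ctr
    blended_ctr = views_so_far / impressions_so_far
    history.append((views_so_far, blended_ctr))
    print(f"after {label}: {views_so_far:,.0f} views, "
          f"blended CTR {blended_ctr:.1%}")
```

Views climb at every step while the blended CTR drops, so in this toy setup the video with the most views is also the one with the worst-looking metric.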
YouTube has been at the top of the game because it's not dogmatic, it relies on emergence instead of specific metrics (neutral, lets the market decide).
It's bottom-up not top-down.
While psychology research helps draw an outline of human behaviors, human interactions remain highly chaotic and unpredictable at scale.
Viewers' behavior will always have more weight than specific metrics because that's what the algorithm is trained for.
What is retention important for then?
The answer lies within the word itself, it's to RETAIN the attention you've been able to catch (build an audience, sell a product/service etc.)
To conclude, if "more retention = more views" is true, then the universal laws of emergence and supply/demand are wrong.
Guess which side I've chosen ¯\_(ツ)_/¯.
It's your time to choose now.
To summarize:
1) Supply is infinite but human attention is limited.
2) Catching attention on YouTube is a zero-sum game. Videos are ranked against each other, every second spent watching one video is a second not spent on another.
3) The power law: a minority of videos gets the majority of views
4) The algorithm uses billions of data points in its obscure decision-making process. It’s like a vast digital ocean, with each viewer a wave. The algorithm just tries to predict where the wave will go next 🌊
Therefore, retention can't be the cause for more views, it's only a correlation in some specific context.
Hope this helps you understand YouTube better and spares you unnecessary scars in the future.
If you want to learn more about YouTube with me, you can apply to my private (but free) newsletter here:
Read this & press the like button to prove to others it’s not.
The video you see above flatlined for 6 months and blew up after switching to a better thumbnail & title.
(Screenshot from @viewstats)
What took this creator half a year to figure out (because sacrificing impressions is part of the process, science has a price) is now totally predictable.
Not only before uploading, but even before the video is produced thanks to the tool we've built: