What is 𝐒𝐭𝐚𝐧𝐝𝐚𝐫𝐝 𝐝𝐞𝐯𝐢𝐚𝐭𝐢𝐨𝐧? What's the intuition behind it? Does it have any real-world application?
It's going to be a long 🧵!
I give you 3 numbers — 3,5,7 and ask for their average/mean
“Duh.. 5” you say, while thinking “maybe this thread wasn’t such a good idea…”. On that, let me pull up another question and the standard of this thread
Can you?
However, you can guess: (3y,4y,8y), (2y,6y,7y) or (5y,5y,5y) are a few of the possibilities
Given what you know, they are all valid guesses to the question, and respectable ages for children!
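A quick way to check those guesses, as a minimal Python sketch (the variable names are mine, purely for illustration): all three sets land on the same mean, which is exactly why the average alone can't tell them apart.

```python
from statistics import mean

# The three guesses from above, ages in years
guesses = [(3, 4, 8), (2, 6, 7), (5, 5, 5)]

# Every guess has the same mean, so the mean alone cannot distinguish them
means = [mean(ages) for ages in guesses]
print(means)
```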
***The next time you hear people summarise huge sets of observations with just averages, remember that’s just the opening act of the story.***
That gets us to the next question- What can make our summary more representative of the underlying observations?
To answer that, we find ourselves on the ever friendly number line. Why?
“Can we not use ‘how far is each observation from its mean’ as some kind of measure to make our summary better? After all, there seems to be a different spread for each set, even though they have the same mean”
Right on!
E.g., for the set of ages (2,6,7), the average deviation from the mean would be
[(2-5)+(6-5)+(7-5)]/3 = (-3+1+2)/3 = 0. Huh?!
*Explanation in next tweet
Eg- When you say the average age of a class is 5 years, what you are saying is that there are some students who are less than, some more than, and some exactly 5 years of age
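This cancelling-out is not a quirk of one set of ages. Here's a small sketch (sets taken from this thread, helper name `total_deviation` is my own) showing that the plain deviations from the mean add up to zero every time:

```python
from statistics import mean

def total_deviation(ages):
    """Sum of (observation - mean); negatives and positives cancel out."""
    m = mean(ages)
    return sum(age - m for age in ages)

# Three sets of ages with the same mean of 5 years
age_sets = [(2, 6, 7), (3, 4, 8), (1, 1, 13)]
totals = [total_deviation(ages) for ages in age_sets]
print(totals)
```

Since the total is always zero, so is the average, which is why plain deviations can't be used as a measure of spread.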
Let’s call it the Average squared deviation from mean.
[(2-5)²+(6-5)²+(7-5)²]/3 = (9+1+4)/3 = around 4.67... but 4.67 what? What is the unit of Variance? Years? But remember, we squared the terms above, so it's 4.67 years²
***√Variance = Standard deviation***
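If you'd like to check the arithmetic yourself, here's a minimal sketch using Python's built-in `statistics` module. I'm assuming the "population" versions (`pvariance`, `pstdev`), since the calculation in this thread divides by the number of observations, 3, rather than by 2.

```python
from statistics import pvariance, pstdev

ages = [2, 6, 7]  # ages in years, mean is 5

# Average squared deviation from the mean (unit: years²)
variance = pvariance(ages)

# Square root of the variance, which brings the unit back to years
std_dev = pstdev(ages)

print(round(variance, 2), round(std_dev, 2))  # prints: 4.67 2.16
```

Taking the square root is what brings us from years² back to plain years.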
So, Standard deviation is simply saying- “For this set of observations, there is this much deviation from the typical behaviour”
Touch the tips of your index finger and thumb together. The distance between them denotes the spread/deviation from the typical behaviour
If you didn’t move your index finger, you did well!
What about children with ages 1y, 1y and 13y? Can you move the index finger to indicate the spread here?
Step 1 - Identify the typical behaviour of the observations- We know the mean is 5 years
Step 2 - Visualise the three observations around the mean
Just move your finger to some arbitrary spread for now (and remember how much)
Now, I am assuming that the spread you did for (4,5,6) was definitely less than that for (1,1,13), right?
Feel that hi-five coming your way too!
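To put numbers on the finger exercise, here's a rough sketch comparing the two sets. It again assumes the divide-by-n (population) standard deviation used in this thread, and the variable names are just for illustration.

```python
from statistics import mean, pstdev

tight_ages = [4, 5, 6]    # everyone close to the mean
wide_ages = [1, 1, 13]    # same mean, far more spread out

# Both sets share the same typical behaviour (mean of 5 years)
print(mean(tight_ages), mean(wide_ages))

# The standard deviation is what tells them apart
tight_sd = pstdev(tight_ages)
wide_sd = pstdev(wide_ages)
print(round(tight_sd, 2), round(wide_sd, 2))  # prints: 0.82 5.66
```

Same mean, but the second set needs a much wider gap between finger and thumb.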
Let’s talk about sports 🎾 and a practical application of why we care about “deviation from typical behaviour”
*Drop in a comment with your replies (from the previous tweet)
1. The 68–95–99.7 rule for Standard deviation - This is very interesting because with this rule, we can answer questions like...(next tweet)
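For the curious, here's where those percentages come from. A hedged sketch: the rule applies to data that follows a normal (bell-shaped) distribution, and for such data the share within k standard deviations of the mean can be computed with the error function from Python's `math` module. The helper name `share_within` is my own.

```python
from math import erf, sqrt

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))

# Percentage of observations within 1, 2 and 3 standard deviations
percentages = [round(share_within(k) * 100, 1) for k in (1, 2, 3)]
print(percentages)  # prints: [68.3, 95.4, 99.7]
```

For data that isn't roughly bell-shaped, these percentages won't hold.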
This might be a little too detailed for some, however, as decision makers who try to help our users, we owe it to ourselves to understand the very basics of being data informed
You can find the entire article in my blog at liminal.substack.com/p/developing-i…
👋






