Kareem Carr, Statistics Person
Sep 10, 2021 19 tweets 5 min read
How did I get this poll with almost 29k responses to balance perfectly? A thread. 👇
Assuming most people didn't secretly flip a coin, where's the randomness in the poll coming from? I think it comes from three sources:
1. Some folks were genuinely picking randomly
2. Based on the comments, even folks who used a system picked a method so unique to them that it was effectively random relative to other people
3. From the perspective of the Twitter algorithm, each new person that gets shown the poll is a toss up in terms of whether they favor heads or tails, much like flipping a coin. It doesn't matter if they picked non-randomly. From the perspective of the poll, they appear random.
So, now that we've established that the people answering the poll are probably going to act a little bit like coin flips, what does statistics have to say about flipping a coin 29k times?
Law of Large Numbers
The average of a large number of observations should get closer to a particular value as more observations are collected. This value is called the "expected value". If we code heads as 0 and tails as 1 then the expected value for a fair coin should be 0.5.
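Here's a minimal sketch of that idea, assuming a simulated fair coin stands in for the poll respondents. The function name `running_average_of_flips` and the seed are just illustrative choices, not anything from the actual poll.

```python
import random

def running_average_of_flips(n_flips, seed=0):
    """Flip a simulated fair coin (heads=0, tails=1) and
    return the running average after each flip."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    averages = []
    for i in range(1, n_flips + 1):
        total += rng.randint(0, 1)   # 0 = heads, 1 = tails
        averages.append(total / i)   # average of the first i flips
    return averages

averages = running_average_of_flips(29_000)
for n in (10, 100, 1_000, 10_000, 29_000):
    print(f"after {n:>6} flips: average = {averages[n - 1]:.4f}")
```

The early averages can bounce around quite a bit, but the later ones should settle near 0.5.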
Central Limit Theorem
The average of a large number of observations tends to cluster around the "expected value" in an increasingly tightly-clustered pattern that resembles a bell curve. We can't see the pattern with just one experiment but we do see it with lots of experiments.
You might be wondering what a bell curve is. It looks like this. The previous tweet is saying that most of the experiments will have averages that cluster around the center, with fewer and fewer experiments as the averages get more extreme.
If the bell curve feels a little abstract, don't worry. It's a lot more familiar than you might think. Men's heights are roughly distributed like a bell curve and so are the heights of women. So we've actually all been experiencing bell curves our whole lives.
You also might be wondering why I'm talking about "experiments". We only did one poll. Often statistics means thinking about the multiverse. We don't just think about our universe but every other universe where randomness would have caused our experiment to turn out differently.
Looking at our experiment in the context of the "multiverse" is what allows us to see that the results become more predictable as we get more observations.
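We can fake a small "multiverse" on a computer. This is a rough sketch, assuming every respondent behaves like an independent fair coin. The helper names `poll_average` and `spread_of_averages`, the 2,000 universes, and the seed are my own illustrative choices.

```python
import random
import statistics

def poll_average(n_responses, rng):
    """One simulated poll: the fraction of tails (coded 1) in n fair flips."""
    # Each of the n random bits is one flip; counting the 1s counts the tails.
    flips = rng.getrandbits(n_responses)
    return bin(flips).count("1") / n_responses

def spread_of_averages(n_responses, n_universes=2_000, seed=1):
    """Run the same poll in many simulated universes and summarize the averages."""
    rng = random.Random(seed)
    averages = [poll_average(n_responses, rng) for _ in range(n_universes)]
    return statistics.mean(averages), statistics.stdev(averages)

for n in (100, 1_000, 29_000):
    mean, sd = spread_of_averages(n)
    print(f"{n:>6} responses: mean of averages = {mean:.4f}, spread = {sd:.4f}")
```

The mean stays near 0.5 at every poll size, while the spread shrinks as the number of responses grows. That shrinking spread is the "increasingly tightly-clustered" part.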
If we assume our poll is like a fair coin, then using the math of the bell curve, we can figure out what kind of results we can expect to get after 29k answers. As you can see below, there was about a 95% chance that the proportion of heads would be between 0.494 and 0.506.
The precise proportion of heads in this poll was 0.504, which was well within the realm of possibility!
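If you want to check those numbers yourself, here's a short sketch using the normal (bell curve) approximation. It assumes exactly 29,000 responses, which is a round stand-in for the "almost 29k" in the real poll, and a perfectly fair coin.

```python
from math import sqrt
from statistics import NormalDist

n = 29_000        # assumed round number; the real poll had "almost 29k" responses
p = 0.5           # fair-coin assumption
observed = 0.504  # proportion of heads reported in the thread

# Standard error of a proportion under the fair-coin assumption
se = sqrt(p * (1 - p) / n)

# Cutoff that leaves 2.5% in each tail of the bell curve (about 1.96)
z = NormalDist().inv_cdf(0.975)

low, high = p - z * se, p + z * se
z_observed = (observed - p) / se

print(f"95% range: {low:.3f} to {high:.3f}")   # 95% range: 0.494 to 0.506
print(f"observed is {z_observed:.2f} standard errors from 0.5")  # 1.36
```

The observed 0.504 sits about 1.36 standard errors from 0.5, inside the 1.96 cutoff for the 95% range.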
The one thing I did get lucky on is that the preference for heads and tails seems to be symmetric in the polling population. So for every person that prefers heads, there's an equal and opposite person that likes tails, and vice versa.
This didn't have to be true, but when it is, selecting people randomly will tend to give a close to 50-50 split, even if their individual choices aren't random.
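Here's a toy version of that point. Every person in these made-up populations has a fixed preference, so the only randomness is in who gets picked. The population sizes, the 40-60 "lopsided" split, and the function name `sample_poll` are all hypothetical, and people are drawn with replacement to keep the sketch simple.

```python
import random

def sample_poll(population, n_responses, seed=2):
    """Pick respondents at random (with replacement) and
    return the proportion whose fixed preference is heads."""
    rng = random.Random(seed)
    picks = [rng.choice(population) for _ in range(n_responses)]
    return picks.count("heads") / n_responses

# Hypothetical populations: nobody flips a coin, everyone has a set answer.
symmetric = ["heads"] * 50_000 + ["tails"] * 50_000
lopsided = ["heads"] * 40_000 + ["tails"] * 60_000

print(f"symmetric population: {sample_poll(symmetric, 29_000):.3f} heads")
print(f"lopsided population:  {sample_poll(lopsided, 29_000):.3f} heads")
```

The symmetric population should land close to 0.5 and the lopsided one close to 0.4, which is why the symmetry matters and not the randomness of any one person's choice.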
The first time I tried this poll experiment, it was pretty biased. I think that was because there was a lack of symmetry in beliefs. My statistics-savvy audience thought that people would be more likely to select the first option, so they tried to "unbias" the poll by selecting the 2nd one.
My solution was just to tell them they were biased, which left them unsure about what would happen on this poll, and that unbiased them.
So there you go. That's the magic trick. I'm not a wizard. I'm just a statistician. 😏
Hope you enjoyed the thread. If you like this content and want to support it, please like and retweet the thread so others can enjoy it as well, and follow me to get more threads like this one in the future.
