How did I get this poll with almost 29k responses to balance perfectly? A thread. 👇
Assuming most people didn't secretly flip a coin, where's the randomness in the poll coming from? I think it comes from three sources:
1. Some folks were genuinely picking randomly
2. Based on the comments, even folks who used a system had a method that was unique to them and therefore effectively random relative to other people
3. From the perspective of the Twitter algorithm, each new person who gets shown the poll is a toss-up in terms of whether they favor heads or tails, much like flipping a coin. It doesn't matter if they picked non-randomly. From the perspective of the poll, they appear random.
So, now that we've established that the people answering the poll are probably going to act a little bit like a flipped coin, what does statistics have to say about flipping a coin 29k times?
Law of Large Numbers
The average of a large number of observations should get closer to a particular value as more observations are collected. This value is called the "expected value". If we code heads as 0 and tails as 1 then the expected value for a fair coin should be 0.5.
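Here's a minimal sketch of that idea in Python, assuming a simulated fair coin stands in for the poll. The function name `running_mean_of_flips` and the seed are made up for illustration. It tracks the running average of the flips (heads coded as 0, tails as 1) and prints it at a few checkpoints.

```python
import random

def running_mean_of_flips(n_flips, seed=0):
    """Return the running average after each of n_flips fair coin flips.

    Heads is coded as 0 and tails as 1, so the expected value is 0.5.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    means = []
    for i in range(1, n_flips + 1):
        total += rng.randint(0, 1)  # one simulated flip
        means.append(total / i)     # average of the first i flips
    return means

means = running_mean_of_flips(29_000)
for n in (10, 100, 1_000, 29_000):
    print(n, round(means[n - 1], 4))
```

The early averages can wander quite a bit, while the later ones should sit close to 0.5.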
Central Limit Theorem
The average of a large number of observations tends to cluster around the "expected value" in an increasingly tightly-clustered pattern that resembles a bell curve. We can't see the pattern with just one experiment but we do see it with lots of experiments.
You might be wondering what's a bell curve? It looks like this. The previous tweet is saying that most of the experiments will have averages that cluster around the center with fewer and fewer as the averages get more extreme.
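If a picture isn't handy, a rough text version of the bell curve can be sketched in Python. This assumes 5,000 simulated polls of 1,000 fair-coin answers each; those sizes, the seed, and the helper name `tails_in_poll` are arbitrary choices for illustration. Each poll's average is rounded to the nearest percent and the counts are drawn as bars.

```python
import random
from collections import Counter

def tails_in_poll(n_answers, rng):
    """Count tails in one simulated poll of n_answers fair coin flips."""
    # getrandbits(n) gives n independent fair bits; each 1 bit is a tails
    return bin(rng.getrandbits(n_answers)).count("1")

rng = random.Random(1)
n_answers = 1_000
n_polls = 5_000

# Bin each poll's average by its nearest whole percent of tails
counts = Counter(
    round(100 * tails_in_poll(n_answers, rng) / n_answers)
    for _ in range(n_polls)
)

# One bar per bin; each '#' stands for 25 simulated polls
for percent in sorted(counts):
    print(f"{percent:3d}% {'#' * (counts[percent] // 25)}")
```

The longest bars should be near 50%, with shorter and shorter bars toward the extremes.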
If the bell curve feels a little abstract, don't worry. It's a lot more familiar than you might think. Men's heights are roughly distributed like a bell curve and so are the heights of women. So we've actually all been experiencing bell curves our whole lives.
You also might be wondering why I'm talking about "experiments". We only did one poll. Often statistics means thinking about the multiverse. We don't just think about our universe but every other universe where randomness would have caused our experiment to turn out differently.
Looking at our experiment in the context of the "multiverse" is what allows us to see that the results become more predictable as we get more observations.
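The "multiverse" can be sketched as a simulation: run the same poll in many imaginary universes and see how much the averages vary. This is a hedged illustration assuming a fair coin; `simulate_poll_averages`, the 2,000 universes, and the poll sizes are hypothetical choices, not anything from the original poll.

```python
import random
import statistics

def simulate_poll_averages(n_answers, n_universes, seed=0):
    """Return the average answer (tails = 1) from each simulated universe."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_universes):
        # n_answers independent fair bits; counting the 1 bits counts tails
        tails = bin(rng.getrandbits(n_answers)).count("1")
        averages.append(tails / n_answers)
    return averages

spreads = {}
for n in (100, 1_000, 29_000):
    averages = simulate_poll_averages(n, n_universes=2_000)
    spreads[n] = statistics.stdev(averages)  # spread across universes
    print(n, round(statistics.mean(averages), 4), round(spreads[n], 4))
```

The mean across universes should stay near 0.5 for every poll size, while the spread should shrink as the number of answers grows.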
If we assume our poll is like a fair coin, then using the math of the bell curve we can figure out what kind of results to expect after 29k answers. As you can see below, there was about a 95% chance that the proportion of heads would be between 0.494 and 0.506.
The precise proportion of heads in this poll was 0.504, which was well within the realm of possibility!
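The interval can be reproduced with the usual normal approximation for a proportion. This is a sketch under assumptions: the thread says "almost 29k" responses, so `n = 29_000` is a round stand-in, and 1.96 is the standard multiplier for a 95% range on a bell curve.

```python
import math

n = 29_000        # assumed round number; the thread says "almost 29k"
p = 0.5           # fair coin
z_95 = 1.96       # bell-curve multiplier for a 95% range

# Standard error of a proportion: sqrt(p * (1 - p) / n)
standard_error = math.sqrt(p * (1 - p) / n)

low = p - z_95 * standard_error
high = p + z_95 * standard_error
print("95% range:", round(low, 3), "to", round(high, 3))

# How many standard errors away from 0.5 was the observed result?
observed = 0.504
z_score = (observed - p) / standard_error
print("observed z-score:", round(z_score, 2))
```

A z-score smaller than 1.96 in absolute value means the observed proportion falls inside the 95% range.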
The one thing I did get lucky on is that the preference for heads and tails seems to be symmetric in the polling population. So for every person that prefers heads, there's an equal and opposite person that likes tails, and vice versa.
This didn't have to be true, but when it is, selecting people at random will tend to give a close to 50-50 split, even if their individual choices aren't random.
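Here's a small sketch of that point, using made-up populations. Every simulated person has a fixed preference with no randomness in their choice; the only randomness is who gets shown the poll. The population sizes, the 40/60 split, and the name `sample_poll` are all hypothetical.

```python
import random

def sample_poll(population, n_respondents, rng):
    """Show the poll to a random subset and return the share choosing heads."""
    respondents = rng.sample(population, n_respondents)  # each person once
    return respondents.count("heads") / n_respondents

rng = random.Random(42)

# Fixed, non-random preferences: only the selection of people is random
symmetric = ["heads"] * 50_000 + ["tails"] * 50_000
lopsided = ["heads"] * 40_000 + ["tails"] * 60_000

symmetric_share = sample_poll(symmetric, 29_000, rng)
lopsided_share = sample_poll(lopsided, 29_000, rng)
print("symmetric population:", round(symmetric_share, 3))
print("lopsided population:", round(lopsided_share, 3))
```

The symmetric population should land near 0.5, while the lopsided one should land near 0.4, no matter how many people answer.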
The first time I tried this poll experiment, it was pretty biased. I think that was because there was a lack of symmetry in beliefs. My statistics-savvy audience thought that people would be more likely to select the first option, so they tried to "unbias" the poll by selecting the second one.
My solution was just to tell them they were biased, which left them unsure about what would happen on this poll, which in turn unbiased them.
So there you go. That's the magic trick. I'm not a wizard. I'm just a statistician. 😏
Hope you enjoyed the thread. If you like this content and want to support it, please like and retweet the thread so others can enjoy it as well, and follow me to get more threads like this one in the future.



Keep Current with 🔥 Kareem Carr 🔥


