Bob is a misogynist who subconsciously avoids intelligent women. Bob selects a random sample of the men and women he knows. He uses statistical best practices to analyze their cognitive abilities and concludes women are less intelligent than men.

Is Bob's analysis scientific?

Reasons to say Bob's analysis is scientific:

1. Anybody else analyzing the same data will get a similar answer.

2. The model is predictive of future data. Bob will continue to avoid intelligent women in the future, so the model will accurately predict his future experiences.

A second misogynist, Tom, replicates Bob's study using people he knows. Tom confirms Bob's findings.

Additionally, a group of 1000 scientifically inclined misogynists pool data on all of the people they know, and a third-party data scientist finds a similar result.
At this point, the finding is seemingly robust and highly replicable.

Should we now be more convinced than ever that Bob's study is objective and scientific?
If you object that this isn't science, what standard scientific norm could you apply to disqualify the finding?

Note that:
- the data (as collected) are accurate
- the models are predictive of future data
- the finding has been replicated multiple times by multiple groups
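
One way to see why replication does not rescue the finding is to simulate it. The Python sketch below is an illustration under stated assumptions, not anything taken from Bob's study: men and women are drawn from the same score distribution (mean 100, SD 15), and each observer keeps every man he meets but keeps a woman with a probability that shrinks as her score rises above the mean. The names `simulate_acquaintances` and `observed_gap` and the `avoidance` parameter are hypothetical.

```python
import random
import statistics


def simulate_acquaintances(n_population, avoidance, rng):
    """Return (men_scores, women_scores) for the people one observer knows.

    Assumption: both groups share the same N(100, 15) score distribution.
    Assumption: the observer keeps every man, but keeps a woman with a
    probability that falls as her score rises above the mean.
    """
    men, women = [], []
    for _ in range(n_population):
        score = rng.gauss(100, 15)
        if rng.random() < 0.5:
            men.append(score)
            continue
        if score <= 100:
            keep_prob = 1.0
        else:
            # Linear drop-off per standard deviation above the mean.
            keep_prob = max(0.0, 1.0 - avoidance * (score - 100) / 15)
        if rng.random() < keep_prob:
            women.append(score)
    return men, women


def observed_gap(men, women):
    """Mean score of known men minus mean score of known women."""
    return statistics.mean(men) - statistics.mean(women)


rng = random.Random(0)

# Twenty observers with the same bias each "replicate" the study.
gaps = [observed_gap(*simulate_acquaintances(5000, 0.5, rng)) for _ in range(20)]

# One observer with no avoidance, for comparison.
unbiased_gap = observed_gap(*simulate_acquaintances(5000, 0.0, rng))

print("biased observers, smallest gap:", round(min(gaps), 2))
print("biased observers, largest gap:", round(max(gaps), 2))
print("unbiased observer gap:", round(unbiased_gap, 2))
```

Under these assumptions, the biased observers should each report a gap even though none exists in the simulated population, and the gap should shrink toward zero when `avoidance` is set to 0. The replications agree with each other because they share the same selection mechanism, not because the finding is true of the population.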
