Levi
I explain Data Science on Grandma's level. Writing https://t.co/25jLCDRZms

Sep 17, 2023, 8 tweets

A surprising statistical result 🔽

You have tested positive for a disease.

- The test is 99% accurate: it gives the right answer for 99% of sick people and for 99% of healthy people.

- 1 out of 10,000 people has the disease.

What is the probability that you truly have the disease, given that you have tested positive?

Let's figure it out.

🧵

Look at a random group of 1 million people.

Fact 2 says 1 out of 10,000 people has the disease.

In our sample, 100 people have the disease, and 999,900 are healthy.

Run the test on the 100 sick people.

Fact 1 says the test is 99% accurate.

- 99 people will be diagnosed correctly as sick.

- 1 person will be misdiagnosed as healthy.

Now test the group of 999,900 healthy people.

The test is wrong 1% of the time.

1% of these 999,900 healthy people, that is 9,999 people, are misdiagnosed as sick.

Putting it all together:

- The total number of people who tested positive is 99 + 9,999 = 10,098.

- Out of these, only 99 are sick.

Therefore the probability that you have the disease, given a positive test, is 99/10,098 ≈ 0.0098

Less than 1%!
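
If you'd like to check the counting yourself, here is a minimal sketch in Python. It assumes "99% accurate" means the test is right for 99% of sick people and for 99% of healthy people, as in the steps above. The variable names are just for illustration, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Facts from the thread (the "accuracy" is assumed to apply to both groups)
population = 1_000_000
prevalence = Fraction(1, 10_000)   # 1 out of 10,000 people is sick
accuracy = Fraction(99, 100)       # test is right 99% of the time

# Split the population
sick = population * prevalence     # 100 people
healthy = population - sick        # 999,900 people

# Test the sick group
true_positives = sick * accuracy            # correctly diagnosed as sick
false_negatives = sick - true_positives     # misdiagnosed as healthy

# Test the healthy group
false_positives = healthy * (1 - accuracy)  # misdiagnosed as sick

# Everyone who tested positive, and the share of them who are truly sick
total_positives = true_positives + false_positives
p_sick_given_positive = true_positives / total_positives

print(f"Tested positive: {int(total_positives):,}")
print(f"Truly sick among them: {int(true_positives)}")
print(f"P(sick | positive) = {float(p_sick_given_positive):.4f}")
```

The fraction 99/10,098 simplifies to exactly 1/102, which is where the "less than 1%" comes from.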

Why do we have this surprising result?

People tend to focus on fact 1, the 99% accuracy.

But fact 2 is also crucial. 1 out of 10,000 means 0.01%.

The 1% error rate is much larger than the 0.01% disease rate.

In other words, the test's error rate is 100 times larger than the rate of being sick, so false positives far outnumber true positives.
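
To see how much the rarity of the disease matters, here is a rough sketch that applies Bayes' theorem directly. The function name `posterior` and the parameters `sensitivity` and `specificity` are my own labels, not from the thread. Both are set to 0.99 on the assumption that "99% accurate" covers sick and healthy people alike, and the prevalence values other than 1 in 10,000 are made up for comparison.

```python
def posterior(prevalence, sensitivity, specificity):
    """P(sick | positive test) by Bayes' theorem.

    prevalence  : share of the population that is sick
    sensitivity : P(positive | sick)
    specificity : P(negative | healthy)
    """
    true_pos = prevalence * sensitivity               # sick and tested positive
    false_pos = (1 - prevalence) * (1 - specificity)  # healthy but tested positive
    return true_pos / (true_pos + false_pos)


# Same test, different (hypothetical) disease rates
for prevalence in (0.0001, 0.001, 0.01, 0.1):
    p = posterior(prevalence, sensitivity=0.99, specificity=0.99)
    print(f"prevalence {prevalence:.2%} -> P(sick | positive) = {p:.1%}")
```

When the disease rate equals the 1% error rate, a positive test is a coin flip. Only when the disease is more common than the test's mistakes does a positive result become more likely right than wrong.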

That's it for today.

I hope you've found this thread helpful.

Like/Retweet the first tweet below for support and follow @levikul09 for more Data Science threads.

Thanks 😉

If you haven't already, join our newsletter DSBoost.

We share:

• Interviews

• Podcast notes

• Learning resources

• Interesting collections of content

dsboost.dev
