THINK LIKE A DATA SCIENTIST:

Probability is hard because counting is hard.

A thread. 👇
For a lot of people, mathematics is true in the same way that "Kermit The Frog and Miss Piggy are a couple" is true. It's true in an imaginary world where we have agreed upon rules. If that's how you think about math then it's pretty obvious that "2+2=4".
To me, "2+2=4" means that "2 things + 2 things will always be 4 things no matter what the things are". Turns out this is not technically true. You can create all kinds of mathematical systems and physical situations where 2 things + 2 things is not 4 things.
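On the "mathematical systems" side, one standard example is arithmetic modulo 3, where numbers wrap around after 2, like a clock with only three positions. Here's a minimal Python sketch of that. The function name `add_mod` is my own, purely for illustration.

```python
def add_mod(a: int, b: int, modulus: int) -> int:
    """Add two integers, wrapping around at `modulus` (clock-style arithmetic)."""
    return (a + b) % modulus

# Ordinary integer addition: 2 + 2 is 4.
ordinary = 2 + 2

# Addition modulo 3: the only values are 0, 1 and 2, so 2 + 2 wraps around to 1.
wrapped = add_mod(2, 2, 3)

print(ordinary, wrapped)  # prints: 4 1
```

Same symbols, different rules, different answer. Which rules apply is something you have to decide before you start adding.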
For instance, imagine you're sitting at a table. I give you 2 beakers with 2 cups of a clear fluid in each one. You add them both to a big measuring cup and find that it's actually less than 4 cups.
Turns out one beaker had ethanol in it and the other had water. When you mix those two fluids together, they interact in an unexpected way: the mixture takes up less volume than the two did separately.
You might argue that "2 cups of clear fluid plus 2 cups of clear fluid is always four cups of clear fluid" is false but "2 cups of water plus 2 cups of water is always 4 cups of water" isn't. This might actually be the case but I would argue that we don't know even this.
What if some new physical effect kicks in that we don't know about yet? Then we'd be back in the same boat as before, trying to come up with a way of redefining the situation so that "2+2=4" makes sense. We have to be physicists to figure out if we can trust our arithmetic.
The problem might be the mixing. Perhaps we should have a rule that as long as we don't let the volumes of liquid that we're adding up physically interact, then we'll have 4 cups of fluid, just not all in the same container.
So perhaps with the right rules about how to set things up, we can always make them line up. I notice we humans do that a lot with arithmetic. We make excuses for it. If things don't add up in the way we were expecting, we fiddle around with our definitions until it works out.
Imagine you're hiring four people to help cater a wedding. Two very trusted recruiters each send you a list of two excellent candidates so you move ahead with your plans, only to find out just before the wedding that "Elizabeth Jones" on list one is also "Beth Jones" on list two!
In this case, the "interaction" wasn't physical. It was conceptual. We intended to count how many people we had but instead we were just counting names. Because the same name can refer to multiple people & vice versa, we ended up in an unexpected situation.
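Here's a rough Python sketch of the catering mix-up. Only the two Jones names come from the story; "Candidate A", "Candidate B" and the `canonical_name` alias table are placeholders I made up. It also assumes we already know which names are aliases, which in real data is usually the hard part.

```python
# Hypothetical candidate lists from the two recruiters.
# Only the two "Jones" entries come from the story; the rest are placeholders.
list_one = ["Elizabeth Jones", "Candidate A"]
list_two = ["Beth Jones", "Candidate B"]

# Assumption: we already know which names are aliases for the same person.
canonical_name = {"Beth Jones": "Elizabeth Jones"}

def count_names(*lists):
    """Count list entries. This is what we actually did."""
    return sum(len(names) for names in lists)

def count_people(*lists):
    """Count distinct people after resolving known aliases. This is what we meant to do."""
    people = {canonical_name.get(name, name) for names in lists for name in names}
    return len(people)

print(count_names(list_one, list_two))   # 4 names
print(count_people(list_one, list_two))  # 3 people
```

Both functions are "counting" the same two lists. They just disagree about what a "thing" is.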
When you learn "2 apples + 2 apples = 4 apples" in grade school, they don't bring up these kinds of problems. A pair of apples can't be, for instance, a projection into our universe of a single apple that exists in a higher-dimensional space. I know I sound crazy right now but...
When I give examples like these, the main objection folks make is that I'm using "+" and "=" in a different way than they're meant to be used. When somebody makes this kind of objection, it's clear to me that they look at math in a Kermit-and-Miss-Piggy way.
For them, "2+2=4" is just a statement about rules in an imaginary world. It doesn't matter to them that it doesn't apply to some stuff in our world. "2+2=4" is always true because it lives in a perfect realm of abstract forms where it is untarnished by our fallen world.
To a Kermit-and-Miss-Piggy type person, we've all agreed that the symbols "2+2=4" must always refer to things in the context of that ideal world. Therefore, I'm breaking the social contract when I use "2+2=4" in the context of our world.
"Breaking the social contract" is another way of saying that I'm being an asshole. In case that's how you're feeling right now, let me try and defend myself a little.
One of the reasons I got obsessed with thinking about things like this is I'm a data analyst. A big part of analyzing data is just counting stuff. If you can't answer basic questions like how many things you have, it's really hard to figure out the probabilities of things.
Turns out how many "things" you have can be really dependent on your definition of a "thing", as I hope I've convinced you by now. Lots of data scientists work with huge datasets. Think about a company with billions of customers. It can take days or even weeks to count things.
This sort of thing makes you want to think really carefully about what you're counting because messing up is expensive. It's in this context that I learned an important lesson about definitions. How much of something I had depended a lot on my definitions.
This really important lesson transfers to probability because probability comes from counting how often stuff happens. One of the reasons people don't understand probability is they don't understand how incredibly sensitive both counting and probability are to their definitions.
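To make that concrete without getting tricky, here's a deliberately simple Python sketch. The records, IDs and function names are all made up for illustration. The same four rows give two different answers to "what's the probability a customer made a purchase?" depending on whether a "customer" is an account or a person.

```python
from fractions import Fraction

# Entirely hypothetical records: (account_id, person_id, made_purchase).
accounts = [
    ("acct-1", "person-1", True),
    ("acct-2", "person-1", True),   # same person, second account
    ("acct-3", "person-2", False),
    ("acct-4", "person-3", False),
]

def purchase_rate_by_account(rows):
    """Probability that a randomly chosen ACCOUNT made a purchase."""
    buyers = sum(1 for _, _, bought in rows if bought)
    return Fraction(buyers, len(rows))

def purchase_rate_by_person(rows):
    """Probability that a randomly chosen PERSON made a purchase on any account."""
    people = {person for _, person, _ in rows}
    buyers = {person for _, person, bought in rows if bought}
    return Fraction(len(buyers), len(people))

print(purchase_rate_by_account(accounts))  # 1/2
print(purchase_rate_by_person(accounts))   # 1/3
```

Neither number is wrong. They answer different questions, and the data alone doesn't tell you which question you meant to ask.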
I won't go into tricky examples with probability right now because this thread is long enough but I wanted to explain the basic challenges using something simple like "2+2=4" which is easier to understand. Hope you found this thread helpful.
If you enjoyed the thread, please like and retweet so others can enjoy it as well. I talk about this stuff all the time so give me a follow for more of the same.😁
P.S. My apologies for the anti-intellectual tone of some of the responses. Don't worry. I'm used to it. 😉
P.P.S. The thread was retweeted by some large troll accounts. It's mostly bad faith feedback by people that were looking to lash out before they even saw the thread. It's best to ignore it. 🙃

🔥 Kareem Carr 🔥 (@kareem_carr)

