P(A | B) is the probability of A given B. It is the probability that we will observe A given that we have already observed B.
P(A | do(B)) is the probability of A given do(B). It is the probability that we will observe A given that we have intervened to cause B to happen.
In this context, an intervention simply means to take an action of some kind. Therefore do(B) means to take an action which causes B to happen.
The expressions P(A | B) and P(A | do(B)) might seem very similar but they represent very different situations.
2. We can only learn P(A | B) from the data alone.
Bob has an extremely accurate weather app and is always very good about bringing his umbrella when it rains. We observe Bob over several years and find that whenever it rains, Bob has his umbrella, and he never brings his umbrella on days when it doesn't rain.
In the language of probability, we say P(Umbrella | Rain) = 1 and P(Rain | Umbrella) = 1 as well.
What we can learn from this data alone is how to predict whether it rains with 100% accuracy by checking whether Bob has an umbrella. We can also learn to predict with 100% accuracy whether Bob has an umbrella by checking whether it's going to rain.
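Here is a minimal sketch of what "learning from the data alone" looks like in Python. The rain probability of 0.3, the number of days, and the helper names `simulate_bob` and `conditional` are all made up for illustration. The only thing taken from the story is that Bob has his umbrella exactly when it rains.

```python
import random

def simulate_bob(days, p_rain=0.3, seed=0):
    """Return (rain, umbrella) records. p_rain is a made-up number."""
    rng = random.Random(seed)
    records = []
    for _ in range(days):
        rain = rng.random() < p_rain
        umbrella = rain  # Bob has his umbrella exactly when it rains
        records.append((rain, umbrella))
    return records

def conditional(records, event, given):
    """Estimate P(event | given) as a relative frequency in the records."""
    matching = [r for r in records if given(r)]
    return sum(1 for r in matching if event(r)) / len(matching)

records = simulate_bob(1000)
# P(Umbrella | Rain): among rainy days, how often does Bob have an umbrella?
p_umbrella_given_rain = conditional(records, lambda r: r[1], lambda r: r[0])
# P(Rain | Umbrella): among umbrella days, how often does it rain?
p_rain_given_umbrella = conditional(records, lambda r: r[0], lambda r: r[1])
print(p_umbrella_given_rain, p_rain_given_umbrella)
```

Both estimates come out as 1.0. Notice that nothing in `records` says which variable drives the other.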
What we cannot learn is what will happen if we give Bob an umbrella on a random day of our choosing. The answer to this question is P(Rain | do(Umbrella)), and it's unknowable from the data alone.
We need prior knowledge about how the world works to properly interpret the data we collected. We need to know that rain has an effect on Bob's behavior, but Bob's behavior has no effect on the rain.
Information about the effects of interventions is simply not available in raw data unless that data was collected by controlled experimental manipulation.
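To make the difference concrete, here is a small Python sketch of the Bob example with an intervention. It is only a sketch under assumptions: I am hand-coding the prior knowledge that rain affects the umbrella and not the other way around, and the rain probability of 0.3 and the names `bob_day` and `p_rain_given_umbrella` are hypothetical.

```python
import random

def bob_day(rng, p_rain=0.3, do_umbrella=None):
    """One day in Bob's life. Assumed causal story: rain -> umbrella."""
    rain = rng.random() < p_rain  # the weather does not care about Bob
    # With no intervention Bob follows the weather; do(Umbrella) overrides him.
    umbrella = rain if do_umbrella is None else do_umbrella
    return rain, umbrella

def p_rain_given_umbrella(days, do_umbrella=None, seed=1):
    """Fraction of umbrella days on which it rained."""
    rng = random.Random(seed)
    results = [bob_day(rng, do_umbrella=do_umbrella) for _ in range(days)]
    rain_on_umbrella_days = [rain for rain, umbrella in results if umbrella]
    return sum(rain_on_umbrella_days) / len(rain_on_umbrella_days)

observed = p_rain_given_umbrella(100_000)                      # P(Rain | Umbrella)
intervened = p_rain_given_umbrella(100_000, do_umbrella=True)  # P(Rain | do(Umbrella))
print(observed, intervened)
```

In this toy world, `observed` is exactly 1.0 while `intervened` lands near the assumed base rate of rain. Handing Bob an umbrella does nothing to the weather, but we only know that because we wrote the causal story into `bob_day` ourselves.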
3. Scientific Experiments work because they produce a very special kind of data.
You may have heard of what many people call a scientific experiment. Take a collection of objects, animals or people. Randomly split that collection into a control group and a treatment group. Apply your intervention to the treatment group while leaving the control group alone. If you observe any differences between the treatment group and the control group, it is logical to attribute these differences to the treatment. You can therefore say the differences were caused by the treatment.
In statistics, the procedure I just described is called a Randomized Controlled Trial. It is a procedure for generating a specific kind of data where P(A | B) = P(A | do(B)).
This is why traditional science experiments work. They are designed to capture causal information. This is not the case for the vast majority of data that we collect in society.
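Here is a rough Python simulation of why random assignment matters. Everything numeric in it is invented: a hidden trait `z` that makes people both more likely to take the treatment and more likely to do well, a true treatment effect of 0.3, and the function name `outcome_rate_gap`. It is a toy model, not a description of any real study.

```python
import random

def outcome_rate_gap(n, randomized, seed=2):
    """Difference in good-outcome rates: treated minus untreated."""
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}  # treated -> [good outcomes, people]
    for _ in range(n):
        z = rng.random() < 0.5  # hidden trait (hypothetical confounder)
        if randomized:
            treated = rng.random() < 0.5  # coin flip, unrelated to z
        else:
            treated = rng.random() < (0.8 if z else 0.2)  # z drives the choice
        # Assumed outcome model: treatment adds 0.3, the hidden trait adds 0.4.
        good = rng.random() < 0.2 + 0.3 * treated + 0.4 * z
        counts[treated][0] += good
        counts[treated][1] += 1
    rates = {t: good / total for t, (good, total) in counts.items()}
    return rates[True] - rates[False]

observational_gap = outcome_rate_gap(200_000, randomized=False)
rct_gap = outcome_rate_gap(200_000, randomized=True)
print(observational_gap, rct_gap)
```

With these made-up numbers, the randomized gap comes out close to the true effect of 0.3, while the observational gap is inflated to roughly 0.54 because the hidden trait is mixed into the comparison.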
Without human guidance or access to real world knowledge, statistical algorithms and artificial intelligences can only learn P(A | B) from the raw data. This is a fundamental mathematical limitation on the use of data alone.
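One way to see this limitation is to build two toy worlds that disagree about cause and effect but produce identical raw data. The Python sketch below does that for the Bob example. The worlds `world_a` and `world_b`, the base rate of 0.3, and the helper names are all hypothetical.

```python
import random
from collections import Counter

P_HYPOTHETICAL = 0.3  # made-up base rate shared by both worlds

def world_a(rng, do_umbrella=None):
    """World A: rain causes the umbrella."""
    rain = rng.random() < P_HYPOTHETICAL
    umbrella = rain if do_umbrella is None else do_umbrella
    return rain, umbrella

def world_b(rng, do_umbrella=None):
    """World B: the umbrella causes rain (absurd, but the data can't tell)."""
    if do_umbrella is None:
        umbrella = rng.random() < P_HYPOTHETICAL
    else:
        umbrella = do_umbrella
    rain = umbrella
    return rain, umbrella

def sample(world, days, do_umbrella=None, seed=3):
    """Count how often each (rain, umbrella) pair occurs."""
    rng = random.Random(seed)
    return Counter(world(rng, do_umbrella) for _ in range(days))

def rain_rate(counts):
    """Fraction of days with rain."""
    total = sum(counts.values())
    return sum(n for (rain, _), n in counts.items() if rain) / total

# Observational data: the two worlds are indistinguishable.
observed_a = sample(world_a, 50_000)
observed_b = sample(world_b, 50_000)
print(observed_a == observed_b)

# Under do(Umbrella) they give completely different answers.
rain_rate_a = rain_rate(sample(world_a, 50_000, do_umbrella=True))
rain_rate_b = rain_rate(sample(world_b, 50_000, do_umbrella=True))
print(rain_rate_a, rain_rate_b)
```

The observational counts match exactly, yet forcing the umbrella leaves the rain rate near the base rate in world A and pushes it to 1.0 in world B. An algorithm that only sees the counts has no way to pick between the two.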
That's it for now. This post is part of a series about causal inference. The posts are based on the content of The Book of Why by Judea Pearl, with lots of commentary from me.
Follow me (@kareem_carr) so you don't miss out on the next post.
Please show support by liking and retweeting the thread.