There are two facts which make correlations controlling for confounds uninformative about many (but not all) causal effects: 1) the R^2 of the mechanisms we understand is low, 2) our uncertainty about not well-understood mechanisms should be high. (1 / about 13-15)
I think fact 1) especially is often not appreciated by people who don't work regularly with data and believe that our understanding of what causes outcomes like mortality or wages is almost complete, when in fact the opposite is true. Its implications are also not appreciated.
Consider an association: drinking coffee every day lowers your annual mortality by 12% relative to if you don't drink coffee (livescience.com/59759-drinking…). Suppose we control for income, education, physical activity, smoking, fruit and veggie consumption and red meat consumption.
You might say, "Look at all of those controls! Any confounding story not dealt with by those controls is far-fetched." But this is crazy: the R^2 of those controls on mortality is likely tiny and would be tinier still if we computed it by randomizing each of those controls.
We know very little about what determines most variation in mortality. The fact that these controls are present tells us little. The next step is to enumerate other things that might be different about coffee drinkers that might impact mortality. It is not hard to list some:
Other dietary habits not controlled for above, tendency to socialize with other people, family status and demographics, household you grew up in, region where you live, etc...
You might be tempted to dismiss these and say, "I don't think any of those are very important." But be more thoughtful. Make assumptions about a 95% CI for the impact of each confound on coffee drinking and mortality. Do a simulation.
When you write bounds, take uncertainty seriously. You might feel confident that socializing regularly doesn't lower mortality by 10%. But you shouldn't. It's hard to predict the results of randomized experiments before they occur, even if there is high-quality existing evidence.
Try the following exercise: open a registry of randomized trials like the one here: socialscienceregistry.org. Try to write down 95% CIs for the effect size before looking at results. You can start to train yourself to see how much uncertainty you should have about the world.
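A minimal sketch of how you might score yourself on that exercise, assuming you record each guess as a (low, high) pair and later note the reported effect. The function name and the numbers in the usage example are hypothetical, not taken from any registry.

```python
def interval_coverage(intervals, outcomes):
    """Share of realized effects that fell inside the stated intervals.

    intervals: list of (low, high) guesses written before seeing results.
    outcomes:  list of reported effect sizes, in the same order.
    """
    if len(intervals) != len(outcomes) or not outcomes:
        raise ValueError("need one non-empty outcome list matching the intervals")
    hits = sum(low <= y <= high for (low, high), y in zip(intervals, outcomes))
    return hits / len(outcomes)


if __name__ == "__main__":
    # Hypothetical guesses and results, for illustration only.
    my_intervals = [(-0.05, 0.10), (0.00, 0.20), (-0.10, 0.05), (0.02, 0.08)]
    reported = [0.12, 0.07, -0.02, 0.15]
    print(interval_coverage(my_intervals, reported))
```

If your intervals are honest 95% CIs, the coverage should approach 0.95 over many trials. Coverage well below that suggests your intervals are too narrow.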
Now, given the confounds you listed (and assumptions about their correlations -- do you know those?), you can simulate how much uncertainty you should have about the estimate due to the confounds you enumerated. It will likely be far larger than any plausible range of true effects.
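Here is one rough way such a simulation could look. This is a sketch under strong simplifying assumptions that are mine, not part of the thread: the confounds are treated as uncorrelated, the outcome is linear in log mortality, each 95% CI is centered at zero and normal, and the bias from an omitted confound is its imbalance between coffee drinkers and non-drinkers (in standard deviations) times its effect on log mortality per standard deviation. All numbers are hypothetical.

```python
import math
import random


def simulate_confounding_bias(confounds, n_draws=20000, seed=0):
    """Monte Carlo 95% interval for the total bias from omitted confounds.

    confounds: list of (imbalance_ci, effect_ci) pairs, where
      imbalance_ci is the half-width of your 95% CI for how much the confound
        differs between coffee drinkers and non-drinkers (in SD units), and
      effect_ci is the half-width of your 95% CI for the confound's effect
        on log mortality per SD.
    Assumes independent, zero-centered normal beliefs (a simplification).
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        total = 0.0
        for imbalance_ci, effect_ci in confounds:
            imbalance = rng.gauss(0.0, imbalance_ci / 1.96)
            effect = rng.gauss(0.0, effect_ci / 1.96)
            total += imbalance * effect  # omitted-variable bias from this confound
        totals.append(total)
    totals.sort()
    return totals[int(0.025 * n_draws)], totals[int(0.975 * n_draws)]


if __name__ == "__main__":
    # Five hypothetical confounds, each with the same assumed CIs.
    low, high = simulate_confounding_bias([(0.5, 0.2)] * 5)
    observed = math.log(0.88)  # the 12% lower mortality, on the log scale
    print(f"bias interval: [{low:.3f}, {high:.3f}]  observed: {observed:.3f}")
```

Compare the printed bias interval with the observed association. Widening the CIs or lengthening the list of confounds widens the interval, and allowing correlated confounds would require drawing them jointly instead of independently.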
But guess what? That almost surely understates the uncertainty you should have. After all, even if you could somehow control for all those things we enumerated that we can't control for, your R^2 would still be low! Most of the mechanisms determining mortality are still unknown!
Chances are, you've enumerated only a small fraction of the possible confounds. If you could somehow enumerate all of those, you would see that in a Bayesian sense, the variation in estimates from confounding factors will dwarf any plausible variation in direct effect sizes.
In other words, inference by controlling for confounds is really hard unless you have a research design that circumscribes the set of possible confounds at the outset, or an unusual setting where the R^2 is well-understood (e.g. a treatment to prevent acute mortality).
I should mention there are methods that can be used in observational data that formalize related ideas: brown.edu/research/proje…. These methods correctly conclude that adding controls with little impact on R^2 teaches you little.
Keep Current with Jason Abaluck