There are two fundamental strategies for causal inference in any setting (experimental or observational): bias control, and bias avoidance. The two are not mutually exclusive.

However, they are not equally likely to produce replicable or useful scientific inference.
Bias control is the strategy where the researcher carefully selects, measures, models, and implements a statistical model which "controls for" confounding, selection biases, etc.

This is the strategy nearly exclusively favored by epidemiology in observational data.
Bias avoidance is the strategy where the researcher looks for (or creates) scenarios where controlling for things isn't necessary. That might be a "natural experiment," a clever use of rules or variables, etc.

This is the strategy nearly exclusively favored in econometrics.
I am counting randomized controlled trials (RCTs) as a bias avoidance strategy, but we can ignore them temporarily and focus exclusively on observational data.

While you certainly can make both strategies useful, they aren't equally likely to be so.
To get a flavor for why, let's walk through the biggest, baddest, and most fundamental assumptions required to make the "control" strategy work.

These are only the biggest, most existential study design threats. If even ONE of these fails, the whole model fails.
1) Ever "important" variable must be measured to a reasonable degree of accuracy and included.

If you miss even one, your model is likely to be severely biased. If you miss more than one, your model is likely to be severely biased in unknown directions (but never randomly).
2) Every variable you DO include must be modeled with the "right" functional form, and no one has any idea what the "right" form is for just about anything. This can sometimes be avoided with some methods, but not always.

Missing just one is likely to result in severe bias.
3) All of the variables you include must also not introduce bias in some other way (like actually being a collider, or being both a collider and a confounder) through any reasonable pathway outside the model.

Missing just one of these is likely to result in severe bias.
In other words, you must control for EVERYTHING, and do so nearly perfectly, for the control strategy to "work."

There are extremely limited circumstances in which this is reasonable and would produce anything other than noise. In humans, it's virtually non-existent.
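To make the fragility concrete, here is a minimal toy simulation (mine, not from the thread) of failure modes 1 and 3 above: omit a single confounder and the estimate is badly biased; "adjust" for a single collider and a spurious effect appears from nothing. In both scenarios the true causal effect of X on Y is zero, and the variable names and numbers are purely illustrative.

```python
# Illustrative sketch only: how one omitted confounder or one wrongly-included
# collider biases a regression. True effect of X on Y is 0 in both cases.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def ols_slope(x, y, *controls):
    """OLS coefficient on x from regressing y on [1, x, controls...]."""
    X = np.column_stack([np.ones_like(x), x, *controls])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return beta[1]

# 1) Omitted confounder: U causes both X and Y; leaving U out biases the X coefficient.
u = rng.normal(size=n)
x = u + rng.normal(size=n)
y = 2 * u + rng.normal(size=n)            # X has NO effect on Y
print("confounder omitted :", ols_slope(x, y))      # ~1.0, not 0
print("confounder included:", ols_slope(x, y, u))   # ~0.0

# 3) Collider: X and Y both cause C; "controlling for" C manufactures an association.
x2 = rng.normal(size=n)
y2 = rng.normal(size=n)                   # X2 and Y2 are independent
c = x2 + y2 + rng.normal(size=n)
print("collider excluded  :", ols_slope(x2, y2))      # ~0.0
print("collider 'adjusted':", ols_slope(x2, y2, c))   # ~ -0.5, spurious
```

And that is with everything measured perfectly and the functional form known exactly; one wrong choice among dozens of candidate variables is enough to move the estimate a long way.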
Meta-analysis doesn't save us here either. Just like the errors WITHIN the studies will virtually never cancel themselves out as if by magic, the bias BETWEEN studies won't do so either.

Which brings us to the "bias avoidance" strategy.

Which is also really really ridiculously difficult, and has a lot of the same problems as above, with two extremely important notable differences:
1) You only use the bias avoidance strategy when you have some plausible reason to believe you can do so. You may have a rule to exploit, or a natural experiment, or something else. But lacking that, you have nothing.
2) All the "bias control" rules still apply, but typically only to ONE relationship, which you specifically choose as being likely to pass. Good scenarios for this are rare, but not non-existent.

Bias avoidance is a strategy to reduce the problem to something manageable.
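For flavor, here is a toy sketch (again mine, not from the thread) of exploiting a rule: assume a hypothetical program assigned by a sharp score cutoff. The naive treated-vs-untreated comparison is confounded by the score, but comparing units just above and just below the cutoff avoids most of that confounding, leaving only one local relationship to defend.

```python
# Illustrative sketch only: a rule-based "bias avoidance" comparison.
# Assume a hypothetical program given to everyone with score >= 50.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

score = rng.uniform(0, 100, n)           # running variable (also drives the outcome)
treated = (score >= 50).astype(float)    # the rule we exploit
y = 0.05 * score + 2.0 * treated + rng.normal(size=n)   # true treatment effect = 2.0

# Naive comparison: badly biased because treated units have higher scores.
print("naive difference    :", y[treated == 1].mean() - y[treated == 0].mean())  # ~4.5

# Local comparison in a narrow window around the cutoff.
window = np.abs(score - 50) < 2
diff = y[window & (treated == 1)].mean() - y[window & (treated == 0)].mean()
print("near-cutoff estimate:", diff)     # ~2.1, close to the true 2.0
```

The point is not that this design is easy or always available; it is that the design itself, rather than a long list of adjustments, is doing the work.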
Often, you can't do either. No good bias avoidance strategies to exploit (although people often aren't trained to look for them, so a lot of low hanging fruit here), and bias control isn't likely to work.

Often, people believe that the problem is important enough to try anyway.
And trying anyway almost always means "control for everything."

That's the rough equivalent of Leeroy Jenkinsing into a problem where actual people's decisions are impacted.

It's often better to do nothing at all, and embrace that some questions can't be answered w/ stats.
Epidemiology is particularly guilty of this. And yes, I really do mean the field as a whole, although not even remotely equally distributed across epi researchers, subfields, institutions.

That's true even among many in the "causal revolution" epi crowd.
The general feeling is that these are important questions. We must make decisions, and we should do the best we can with what we've got.

And I agree with all of the above, but with a notable caveat: often the best we can do is either nothing or VASTLY more expensive.
The more important the question, the more CONSERVATIVE we need to be. The costs of being wrong are almost always greater than the benefits of being right when people's lives and livelihoods are on the line, and more so when you consider societal trust and resources.
If we allow ourselves to admit we just can't know the answers to any reasonable degree of actionable certainty, we allow ourselves to put our resources into the questions we CAN answer.

Inevitably, that will lead us mostly toward questions answerable by bias avoidance.
I am thinking a lot about this problem this week, in part due to the Duflo/Kremer/Banerjee prize announcement.

It took three truly brilliant superstars a few decades to get the field of economics to recognize its past failures, and pave a new way forward.
Does epi have a Duflo right now? There are lots of really fantastic folks on the rise in the epi world doing excellent reform work, no doubt.

But as far as I can tell, none are publicly questioning the fundamental beliefs and methods institutionalized in the field.
If you're an epidemiologist here at the bottom of this very long thread, troubled by this problem just like I am troubled, are YOU willing to risk it to be the next Duflo??

I'm an outsider so I don't really count. But maybe you do?
* also I am not even remotely as boss as those three.
Important revision/clarifications to this thread, as pointed out from some EXCELLENT discussion with @Epi_D_Nique @anecdatally and @robertwplatt:

I did NOT mean to imply either that there are no good epi studies (there are lots), nor that controlling for stuff never works.
@Epi_D_Nique @anecdatally @robertwplatt My claim here is that the control strategy is fragile to the point of futility when it is the primary and only strategy.

It can (and does) work great as a secondary strategy, i.e. when you are in a scenario in which the bias has already been avoided to manageable levels.