Tweetorial on going from regression to estimating causal effects with machine learning.

I get a lot of questions from students regarding how to think about this *conceptually*, so this is a beginner-friendly #causaltwitter high-level overview with additional references.
[Image: hand-drawn graphic of a regression formula E(Y|T,X)=\beta_0+…]

One thing to keep in mind is that a traditional parametric regression is estimating a conditional mean E(Y|T,X).

The bias-variance tradeoff is for that conditional mean, not the coefficients in front of T and X.
[Image: hand-drawn graphic of a regression formula E(Y|T,X)=\beta_0+…]

The next step to think about conceptually is that this conditional mean E(Y|T,X) can be estimated with other tools. Yes, standard parametric regression, but also machine learning tools like random forests.

It’s OK if this is a big conceptual leap for you! It is for many people!
[Image: hand-drawn graphic of the conditional mean E(Y|T,X) with red…]

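To make the "same target, different tools" idea concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the data are simulated, T is dropped so there is a single covariate X, the true conditional mean is an arbitrary nonlinear curve, and a simple bin-averaging estimator stands in for a machine learning tool like a random forest.

```python
import math
import random

rng = random.Random(1)

def true_mean(x):
    # Hypothetical "truth" for E(Y|X), chosen to be clearly nonlinear.
    return math.sin(2 * math.pi * x)

xs = [rng.random() for _ in range(2000)]
ys = [true_mean(x) + rng.gauss(0, 0.5) for x in xs]

# Tool 1: parametric regression, E(Y|X) = b0 + b1 * X, fit by least squares.
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar

def predict_linear(x):
    return b0 + b1 * x

# Tool 2: a flexible estimator that averages Y within equal-width bins of X
# (a crude stand-in for a tree-based learner).
N_BINS = 20

def bin_of(x):
    return min(int(x * N_BINS), N_BINS - 1)

totals = [0.0] * N_BINS
counts = [0] * N_BINS
for x, y in zip(xs, ys):
    totals[bin_of(x)] += y
    counts[bin_of(x)] += 1
bin_means = [s / c for s, c in zip(totals, counts)]

def predict_binned(x):
    return bin_means[bin_of(x)]

def mse_vs_truth(predict):
    # Average squared distance from the true conditional mean on a grid of X.
    grid = [(i + 0.5) / 200 for i in range(200)]
    return sum((predict(x) - true_mean(x)) ** 2 for x in grid) / len(grid)

mse_linear = mse_vs_truth(predict_linear)
mse_binned = mse_vs_truth(predict_binned)
print(f"linear: {mse_linear:.3f}  binned: {mse_binned:.3f}")
```

Both tools return an estimate of the same conditional mean; only the second one has no coefficients to read off.
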
But now you’re also worried. Where did the coefficients go?

I care about a treatment effect, and if I estimate E(Y|T,X) with some machine learning tools, the coefficients aren’t there.
[Image: graphic of a simple decision tree where the first split is…]

We can think about defining our parameters more flexibly outside the context of a parametric model!

We can write the average treatment effect as the contrast: E_X[E(Y|T=1,X)-E(Y|T=0,X)].
[Image: hand-drawn graphic of the average treatment effect: E_X[E(Y|…]

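A short worked derivation may help connect this contrast back to the regression coefficient. The main-terms linear form below is an assumption for illustration (the formula in the graphic is cut off), as are the symbols \beta_1, \beta_2, \beta_3.

```latex
\begin{align*}
E(Y \mid T, X) &= \beta_0 + \beta_1 T + \beta_2 X
  && \text{(assumed main-terms model)} \\
E(Y \mid T=1, X) - E(Y \mid T=0, X)
  &= (\beta_0 + \beta_1 + \beta_2 X) - (\beta_0 + \beta_2 X) = \beta_1 \\
E_X\!\left[ E(Y \mid T=1, X) - E(Y \mid T=0, X) \right] &= \beta_1 \\[6pt]
E(Y \mid T, X) &= \beta_0 + \beta_1 T + \beta_2 X + \beta_3 T X
  && \text{(assumed model with interaction)} \\
E_X\!\left[ E(Y \mid T=1, X) - E(Y \mid T=0, X) \right]
  &= \beta_1 + \beta_3 \, E(X)
\end{align*}
```

So under the main-terms model the contrast is exactly the coefficient on T, and once the model changes, the same contrast is still well defined even though no single coefficient equals it.
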
Now we can move to thinking about how to operationalize estimating that treatment effect with machine learning. Here is how we write down our estimator.

You can see the conditional means, except we need to have estimates under the setting that treatment is equal to 1 and 0.
[Image: hand-drawn formula for the treatment effect estimator: 1/n\s…]

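The formula in the graphic is truncated here, so the following is a reconstruction from the surrounding description rather than a copy of the original image. The symbol \hat{\psi} for the estimate and the hats on the conditional means are notation assumed for this sketch.

```latex
\hat{\psi} \;=\; \frac{1}{n} \sum_{i=1}^{n}
  \left[ \hat{E}(Y \mid T=1, X_i) \;-\; \hat{E}(Y \mid T=0, X_i) \right]
```

The outer expectation E_X is replaced by an average over the n observed covariate values X_i, and each conditional mean is replaced by its estimate.
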
This involves:

(1) Estimating E(Y|T,X) with our machine learning tool.

(2) Setting all observations to T=1 and using our fitted algorithm, held fixed (not refit), to obtain predicted values for each observation.

(3) Repeating (2) for T=0.

Now we can plug these values into the estimator!
[Image: hand-drawn formula for the treatment effect estimator: 1/n\s…]

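Steps (1) to (3) can be sketched in a few lines of standard-library Python. This is a toy illustration under stated assumptions, not the workflow from the linked papers: the data are simulated with a true effect of 2, the function names (`simulate`, `fit_outcome_model`, `ate_substitution`) are hypothetical, and averaging Y within (T, bin of X) cells stands in for a real machine learning tool.

```python
import random
from collections import defaultdict

def simulate(n, seed=0):
    """Toy data: treatment is more likely at high X; the true effect is 2."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        t = 1 if rng.random() < 0.2 + 0.6 * x else 0
        y = 2.0 * t + 3.0 * x * x + rng.gauss(0, 0.5)
        rows.append((y, t, x))
    return rows

def fit_outcome_model(rows, n_bins=10):
    """Step (1): estimate E(Y|T,X) by averaging Y within (T, bin of X) cells."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for y, t, x in rows:
        cell = (t, min(int(x * n_bins), n_bins - 1))
        sums[cell] += y
        counts[cell] += 1
    means = {cell: sums[cell] / counts[cell] for cell in sums}

    def predict(t, x):
        # Assumes every (T, bin) cell was observed in the data.
        return means[(t, min(int(x * n_bins), n_bins - 1))]

    return predict

def ate_substitution(rows, predict):
    """Steps (2) and (3): predict for everyone under T=1 and under T=0,
    then average the differences."""
    diffs = [predict(1, x) - predict(0, x) for _, _, x in rows]
    return sum(diffs) / len(diffs)

rows = simulate(5000)
q_hat = fit_outcome_model(rows)          # fit once, then hold fixed
ate_hat = ate_substitution(rows, q_hat)

# For comparison: the unadjusted difference in mean outcomes.
treated = [y for y, t, _ in rows if t == 1]
control = [y for y, t, _ in rows if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)

print(f"substitution estimate: {ate_hat:.2f}  unadjusted difference: {naive:.2f}")
```

In this simulation the substitution estimate should land near 2, while the unadjusted difference is pulled upward because treated observations tend to have larger X.
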
What I described is a machine learning-based substitution estimator of the g-formula.

There are other ML-based estimators for effects, including methods that use the propensity score or both the outcome regression and propensity score.

Read more: academic.oup.com/ije/advance-ar…
[Image: screen cap of the title and authors for the linked paper]

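As one example of a propensity-score-based alternative, here is a minimal sketch of inverse probability weighting. The thread does not say which propensity score estimators it has in mind, so the choice of this estimator is an assumption, as are the simulated data (true effect of 2), the hypothetical function names, and the bin-based propensity estimate standing in for a machine learning tool.

```python
import random
from collections import defaultdict

N_BINS = 10

def bin_of(x):
    return min(int(x * N_BINS), N_BINS - 1)

def simulate(n, seed=0):
    """Toy data: treatment is more likely at high X; the true effect is 2."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        t = 1 if rng.random() < 0.2 + 0.6 * x else 0
        y = 2.0 * t + 3.0 * x * x + rng.gauss(0, 0.5)
        rows.append((y, t, x))
    return rows

def fit_propensity(rows):
    """Estimate P(T=1|X) as the share of treated observations in each bin of X."""
    n_treated = defaultdict(int)
    n_total = defaultdict(int)
    for _, t, x in rows:
        n_total[bin_of(x)] += 1
        n_treated[bin_of(x)] += t
    share = {b: n_treated[b] / n_total[b] for b in n_total}
    return lambda x: share[bin_of(x)]

def ate_ipw(rows, g):
    """Weight each outcome by the inverse probability of the treatment received.
    Assumes 0 < g(x) < 1 in every bin, otherwise a weight is undefined."""
    terms = [t * y / g(x) - (1 - t) * y / (1 - g(x)) for y, t, x in rows]
    return sum(terms) / len(terms)

rows_ipw = simulate(5000)
g_hat = fit_propensity(rows_ipw)
ipw_hat = ate_ipw(rows_ipw, g_hat)
print(f"IPW estimate: {ipw_hat:.2f}")
```

Here the model is for treatment given X rather than for the outcome; estimators that use both models combine the two ingredients.
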
Now interpreting any of these effects as ***causal*** requires an additional set of assumptions.

The statistical model can be augmented with causal assumptions that allow an enriched interpretation of the treatment effect parameter.

Read more: journals.lww.com/epidem/Fulltex…
[Image: hand-drawn graphic of the formula E_X[E(Y|T=1,X)-E(Y|T=0,X)]]

I describe these steps from regression to machine learning for causal inference in more detail in my short courses (drsherrirose.org/short-courses), for example this workshop at UCSF: dropbox.com/s/wmgv51j21t3n… (starting slide 147).
[Image: screen cap of a slide from the mentioned UCSF course]

There are many books on causal inference (I have co-authored two). Our targeted learning books on machine learning for causal inference can be downloaded free if you have institutional access, and two of the introductory chapters are free on my website: drsherrirose.org/s/TLBCh4Ch5.pdf.
[Image that says "Causal Inference and Effect Estimation…"]

This targeted learning tutorial is free access: academic.oup.com/aje/article/18…. It has steps for double robust machine learning in causal inference and information on calculating standard errors as well as why we want the bias-variance tradeoff for the effect, not the conditional mean.
[Image: graphic from the linked paper that displays commonalities…]

Happy to answer questions or requests for further resources on machine learning for causal inference. ☺️

If you find this thread after my rotating curator week is over (October 30, 2020), I can be found at @sherrirose.
[Image: a chihuahua wearing a collar and tie with a cartoon…]
