A lot of my research is in the observational study space. This basically means that participants in the study were not randomly assigned treatments or exposures; rather, we just observe how a certain exposure affects an outcome.
♥️ For example: Is a diabetes drug associated with heart disease?
Instead of randomly giving some patients drug A and some drug B, we evaluated the electronic health records of patients who were already taking the drugs & assessed their health after.
There are some issues with this analysis - since we didn’t randomly assign patients to drug A and drug B, it is possible that doctors selected one drug over the other for certain reasons that reflect patient characteristics
4/
Perhaps healthier patients are often prescribed drug A – this could make it look like those who take drug B are more likely to have heart disease simply based on their pre-treatment characteristics
✨Propensity scores can help to adjust for these pre-treatment characteristics
5/
✨ A propensity score is the probability of being assigned to a certain treatment, conditional on pre-treatment (or baseline) characteristics
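To make that definition concrete, here is a minimal sketch. It assumes a single binary baseline characteristic, so the propensity score is just the fraction treated within each stratum. The function name and the toy data are hypothetical; in practice you would usually fit a model (for example, logistic regression) on many baseline covariates.

```python
from collections import defaultdict

def propensity_by_stratum(covariate, treated):
    """Estimate P(treated = 1 | covariate) as the treated fraction per stratum."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_treated, n_total]
    for x, z in zip(covariate, treated):
        counts[x][0] += z
        counts[x][1] += 1
    return {x: n_treated / n_total for x, (n_treated, n_total) in counts.items()}

# Hypothetical toy data: 1 = healthier at baseline, 0 = less healthy
healthy = [1, 1, 1, 1, 0, 0, 0, 0]
# 1 = prescribed drug A, 0 = prescribed drug B
drug_a = [1, 1, 1, 0, 1, 0, 0, 0]

scores = propensity_by_stratum(healthy, drug_a)
```

In this made-up example, healthier patients have a higher estimated probability of receiving drug A than less healthy patients, which is exactly the kind of pre-treatment imbalance described above.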
Here is a mirrored histogram of propensity scores for treatment (top) and control (bottom) groups
6/
Let's spend a second staring at the graph. Two things jump out to me:
☝️ More mass on the *right* in the treatment group (top) means that more people in that group had a higher probability of receiving treatment (makes sense!)
✌️ More people received the treatment vs control
7/
Ultimately, to make an apples-to-apples comparison, we want to make these two groups comparable. There are lots of ways to do this! That is where the *estimand* of interest comes in.
8/
We could estimate the *average treatment effect*. Here the target population is the *whole* population. To make these populations comparable I could *upweight* everyone based on their propensity score.
This graph overlays the pseudo-population created by doing this
Here, the light green distribution (the up-weighted treatment group) is pretty comparable to the blue distribution (the up-weighted control group). The weights are:
💊 treatment: 1 / propensity score
💨 control: 1 / (1 - propensity score)
Notice those weights can range from 1 to infinity! Yikes! If someone in the treatment group has a really small propensity score (or someone in the control group has a really large one), they could count a whole lot in our analysis. This can lead to finite-sample bias / variance issues (boo!)
11/
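A small sketch of the usual inverse probability weights for the average treatment effect, assuming the standard form (1 / propensity score for treated, 1 / (1 - propensity score) for control). The function name and the example scores are hypothetical, picked only to show how quickly the weights blow up.

```python
def ate_weight(propensity: float, treated: bool) -> float:
    """Inverse probability weight targeting the whole population (ATE)."""
    if not 0.0 < propensity < 1.0:
        raise ValueError("propensity score must be strictly between 0 and 1")
    return 1.0 / propensity if treated else 1.0 / (1.0 - propensity)

# Hypothetical propensity scores, chosen only for illustration
typical = ate_weight(0.5, treated=True)          # counts as about 2 people
extreme = ate_weight(0.01, treated=True)         # counts as about 100 people
extreme_control = ate_weight(0.9, treated=False) # counts as about 10 people
```

A treated person with a propensity score of 0.01 ends up standing in for roughly 100 people, which is where the instability comes from.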
Another estimand is the average treatment effect among the *treated*
💊 everyone in the treatment group gets a weight of 1
💨 control: propensity score / (1 - propensity score)
Notice the blue and green distributions still match! But they look different from the ATE graphs
12/
Because this particular example has more treated folks than control, we ended up having to upweight a bunch of the control arm to match -- again this can be unstable
13/
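The two weights listed above for the treated estimand can be sketched like this. The function name and the example propensity scores are hypothetical.

```python
def att_weight(propensity: float, treated: bool) -> float:
    """Weight targeting the treated population (ATT)."""
    if not 0.0 < propensity < 1.0:
        raise ValueError("propensity score must be strictly between 0 and 1")
    # Treated units stay as they are; controls are reweighted by their odds of treatment
    return 1.0 if treated else propensity / (1.0 - propensity)

# Hypothetical propensity scores, chosen only for illustration
treated_unit = att_weight(0.7, treated=True)
unlikely_control = att_weight(0.2, treated=False)  # down-weighted, about 0.25
likely_control = att_weight(0.9, treated=False)    # heavily up-weighted, about 9
```

Controls who look a lot like treated folks (high propensity score) get large weights, and there is no upper bound, so a few controls can still dominate.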
One of my *favorite* estimands is the average treatment effect among the overlap population. The weight is quite simple:
💊 treatment: 1 - propensity score
💨 control: propensity score
These weights are bounded by 0 and 1, so they have nice variance properties!
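A minimal sketch of overlap weights, assuming the standard form where each person is weighted by the probability of being in the *other* group (treated: 1 - propensity score, control: propensity score). The function name and example scores are hypothetical.

```python
def overlap_weight(propensity: float, treated: bool) -> float:
    """Weight targeting the overlap population (ATO)."""
    if not 0.0 < propensity < 1.0:
        raise ValueError("propensity score must be strictly between 0 and 1")
    return 1.0 - propensity if treated else propensity

# Hypothetical propensity scores, including the extreme ones that broke the ATE weights
examples = [
    overlap_weight(0.01, treated=True),
    overlap_weight(0.5, treated=True),
    overlap_weight(0.99, treated=False),
    overlap_weight(0.2, treated=False),
]
all_bounded = all(0.0 < w < 1.0 for w in examples)
```

Even the extreme scores that produced huge inverse probability weights stay between 0 and 1 here.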
Check out this amazing preprint by @noah_greifer & @Lizstuartdc on how to choose an estimand based on your question (and how that maps to particular weighting / matching choices)
👋 @LucyStats here! It's been a very exciting week for folks in Causal Inference with the Nobel Prize announcements. I thought it'd be neat to dive back into history to hear about a previous Nobel winner, Ronald Ross
1/
This topic is fun because it spans a whole myriad of my interests!
✔️We've got stats!
✔️We've got poetry!
✔️We've got infectious disease epidemiology!
Ronald Ross won the Nobel Prize in Physiology or Medicine in 1902 "for his work on malaria, by which he has shown how it enters the organism and thereby has laid the foundation for successful research on this disease and methods of combating it."
Today, I would like to share some resources on causal inference - a thread ⬇️
I came to this topic while working with clinicians who use IPW and matching on a daily basis (they are not familiar with doubly robust approaches). I don’t know about you, but I admire them so much, as they combine their work with patients with research to advance knowledge
Now, I would like to mention an R package, FactoMineR, that I use on a daily basis to explore and visualize heterogeneous data: quantitative, categorical, with group structures, (multiple) contingency tables.
At its core, SVD! (I am also an SVD fan, @daniela_witten ;-).
@daniela_witten Note it was also the case for the famous @SherlockpHolmes, a role model for reproducibility, who I admire both from a scientific and personal point of view.
Hello!
So today, I will share a few thoughts and advice I usually give to my PhD students. I hope this might be helpful for a wider audience, even if it is obvious and already stated by others. Anyway, as teachers, we know repetition is important ;) - a thread ⬇️
1) Ask questions
Ask questions
Ask questions
….
Ask questions!
I mean it: don't hesitate to ask questions in seminars (in France in particular, we don't dare enough). Be curious, don’t be shy.
2) If you are tired and can't work, just don’t. Take a break, take a walk if you can. I've never regretted it, although I've often regretted staying in front of my computer all day because I couldn't get anything done