Adam Sacarny @asacarny
a lil thread about the magic of randomization inference and blinded data analysis.

@ml_barnett @andrewolenski & i are doing a study where we already know our confidence intervals but have no idea what our point estimates are! and here’s why i think that’s so cool.
in 2015 CMS sent strong letters to prescribers of the antipsychotic Seroquel. we showed they reduced Seroquel prescribing by ~15%.

now we're studying if the letters had peer effects, i.e. did they affect doctors who worked with the ones who received a letter?
we had 2 goals, among others: (1) don't p-hack our specifications and (2) get our p-values right.

for (1), we loaded up our data and blanked out the true treatment status. this is called blinding. it’s hard (though not impossible) to p-hack when you can’t see the variable of interest.
but how to (2) get p-values right? with peer effects, the network structure of the data creates complicated patterns of dependence. good luck clustering! thankfully, randomization inference provides a super-easy solution. viz.:
we created 1,000 fake randomizations of physicians to the letter treatment.
then, we pretended these randomizations were real and estimated, 1,000 times, the supposed treatment effect of having a doctor peer get a letter.
under the 'sharp null' that the letter does nothing to anyone, the 1,000 fake estimates we get from these randomizations represent the distribution of the estimator. the middle 95% of estimates give what we're calling the 95% confidence interval: strictly, the range where the real estimate should land 95% of the time if the letters did nothing (see the red bars).
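a minimal python sketch of the fake-randomization steps above, on made-up data. everything specific here is an assumption for illustration, not the study's actual specification: the network of two-doctor practices, the difference-in-means estimator, and all the function names.

```python
import random
import statistics

def peer_exposed(treated, peers):
    """Flag each doctor who has at least one treated peer."""
    return [any(treated[j] for j in peers[i]) for i in range(len(treated))]

def diff_in_means(outcome, exposed):
    """Mean outcome of exposed doctors minus mean outcome of unexposed doctors."""
    on = [y for y, e in zip(outcome, exposed) if e]
    off = [y for y, e in zip(outcome, exposed) if not e]
    return statistics.fmean(on) - statistics.fmean(off)

def placebo_estimates(outcome, peers, n_treated, n_draws, rng):
    """Draw fake randomizations and re-estimate the peer effect under each one."""
    n = len(outcome)
    estimates = []
    for _ in range(n_draws):
        fake_ids = set(rng.sample(range(n), n_treated))
        fake_treated = [i in fake_ids for i in range(n)]
        estimates.append(diff_in_means(outcome, peer_exposed(fake_treated, peers)))
    return estimates

def middle_95(estimates):
    """Bounds left after trimming 2.5% of the placebo estimates from each tail."""
    s = sorted(estimates)
    k = len(s) * 25 // 1000
    return s[k], s[len(s) - 1 - k]

# toy data (made up): 200 doctors in practices of two, so each has exactly one peer
rng = random.Random(0)
n_doctors = 200
peers = [[i ^ 1] for i in range(n_doctors)]
outcome = [rng.gauss(0, 1) for _ in range(n_doctors)]

draws = placebo_estimates(outcome, peers, n_treated=100, n_draws=1000, rng=rng)
lo, hi = middle_95(draws)
print(f"placebo 95% range: [{lo:.3f}, {hi:.3f}]")
```

note that nothing in this sketch touches the real treatment assignment, which is why the bounds can be computed while the data are still blinded.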
even better, under the joint sharp null of no effect for anyone in either model, we now have 1,000 draws from the joint distribution of the two estimators. that gives us the distribution of the joint F-statistic of no effect.
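one way to sketch the joint test in python, assuming each fake randomization yields a pair of estimates (one per model). the quadratic-form statistic below is a Wald-style stand-in for the joint F-statistic, and centering on the placebo means is a simplifying choice; both are assumptions, as are the function names and the toy numbers.

```python
import random
import statistics

def wald_stat(b1, b2, v11, v12, v22):
    """Quadratic form b' V^{-1} b for a 2x2 covariance matrix V."""
    det = v11 * v22 - v12 ** 2
    return (v22 * b1 * b1 - 2 * v12 * b1 * b2 + v11 * b2 * b2) / det

def joint_p_value(placebo_pairs, actual_pair):
    """Share of placebo draws whose joint statistic is at least as large as the actual one."""
    xs = [p[0] for p in placebo_pairs]
    ys = [p[1] for p in placebo_pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    n = len(placebo_pairs)
    # covariance of the two estimators, estimated from the placebo draws themselves
    v11 = sum((x - mx) ** 2 for x in xs) / (n - 1)
    v22 = sum((y - my) ** 2 for y in ys) / (n - 1)
    v12 = sum((x - mx) * (y - my) for x, y in placebo_pairs) / (n - 1)
    placebo_stats = [wald_stat(x - mx, y - my, v11, v12, v22) for x, y in placebo_pairs]
    actual_stat = wald_stat(actual_pair[0] - mx, actual_pair[1] - my, v11, v12, v22)
    return sum(s >= actual_stat for s in placebo_stats) / n

# toy placebo draws (made up): two correlated estimators, 1,000 fake randomizations
rng = random.Random(1)
pairs = []
for _ in range(1000):
    x = rng.gauss(0, 1)
    pairs.append((x, 0.5 * x + rng.gauss(0, 1)))

p_far = joint_p_value(pairs, (10.0, 10.0))   # far outside the placebo cloud
p_near = joint_p_value(pairs, (0.0, 0.0))    # right in the middle of it
print(p_far, p_near)
```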
once the analysis plan is finished, we’ll swap in the true treatment status and re-run all the regressions. if the point estimates are outside the 95% CI bounds above, we have ourselves a significant effect.
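the unblinding step can be sketched in python like this. the function names and toy numbers are hypothetical, and the simple "share of placebo draws at least as extreme" p-value is one common convention, not necessarily the one the study will use.

```python
def ri_p_value(placebo_estimates, actual_estimate):
    """Two-sided randomization-inference p-value.

    Share of placebo estimates at least as far from zero as the actual one.
    """
    extreme = sum(abs(b) >= abs(actual_estimate) for b in placebo_estimates)
    return extreme / len(placebo_estimates)

def outside_middle_95(placebo_estimates, actual_estimate):
    """True if the actual estimate falls outside the central 95% of placebo estimates."""
    s = sorted(placebo_estimates)
    k = len(s) * 25 // 1000  # number of draws trimmed from each tail
    return actual_estimate < s[k] or actual_estimate > s[len(s) - 1 - k]

# toy placebo distribution (made up): 1,000 evenly spaced values
placebo = list(range(-500, 500))
p = ri_p_value(placebo, 490)
flag_big = outside_middle_95(placebo, 490)
flag_small = outside_middle_95(placebo, 10)
print(p, flag_big, flag_small)
```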

it’s weird to already know what we’re up against.
also: by analyzing the data in this way, we learned that adding lots of statistical controls didn’t raise our power (shrink our CIs) at all. so we pre-specified a simple, parsimonious model without worrying that we left statistical power on the table.
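a rough python sketch of how that kind of power check can run on blinded data: compare the width of the placebo 95% range with and without a control. to stay short it makes several assumptions that are not from the study: a single made-up control, direct (not peer) treatment, and partialling the control out of the outcome before a difference in means as a stand-in for a regression with controls. in this toy data the control is built to matter, so the range shrinks; with the real data it didn't.

```python
import random
import statistics

def residualize(y, x):
    """Residuals from a one-variable OLS of y on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

def placebo_width(outcome, n_treated, n_draws, rng):
    """Width of the central 95% of placebo difference-in-means estimates."""
    n = len(outcome)
    total = sum(outcome)
    draws = []
    for _ in range(n_draws):
        fake_ids = rng.sample(range(n), n_treated)
        treated_sum = sum(outcome[i] for i in fake_ids)
        draws.append(treated_sum / n_treated - (total - treated_sum) / (n - n_treated))
    draws.sort()
    k = n_draws * 25 // 1000  # draws trimmed from each tail
    return draws[n_draws - 1 - k] - draws[k]

# toy data (made up): the control explains most of the outcome variance
rng = random.Random(2)
control = [rng.gauss(0, 1) for _ in range(200)]
outcome = [2 * c + rng.gauss(0, 1) for c in control]

width_plain = placebo_width(outcome, 100, 1000, rng)
width_controls = placebo_width(residualize(outcome, control), 100, 1000, rng)
print(f"no controls: {width_plain:.3f}   with control: {width_controls:.3f}")
```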
but we didn’t bother with a more formal specification search. even with treatment status blinded, we could still get ourselves into trouble by estimating models until we got one that spuriously fit the data well – overfitting.
machine learning could help with optimal selection of power-raising controls without overfitting, though i haven’t seen it paired with randomization inference. (this seems like something @jakewbowers & @sherrirose would have thoughts about?!)
if this all seems cool, i can't recommend enough Alwyn Young's work w/ randomization inference re-analyzing major economics RCTs. turns out we're often not that close to the asymptopia that makes clustered standard errors valid personal.lse.ac.uk/YoungA/Channel…