🚨Hello #EconTwitter! I am very happy that my paper with Brantly Callaway, "Difference-in-Differences with multiple time periods", is now forthcoming at the Journal of Econometrics. sciencedirect.com/science/articl…

What are the main takeaways? I will ask my daughter to help me out.

1/n
Our main goal here is to explain how one can transparently use DiD procedures in setups with (a) multiple time periods, (b) variation in treatment timing (staggered adoption), and (c) a parallel trends assumption that is plausible only after conditioning on covariates.

2/n
But why should one care? Don't we all know all these things already? Why can't we just use TWFE regressions and move on?

3/n
To answer this important question, we can run a simple simulation.

Consider 4 treatment cohorts, where units are assigned to these cohorts completely randomly.

Think of a perfect DGP for DiD that looks like this 👇

4/n
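The thread's figure is not reproduced here, but a minimal stand-in DGP in this spirit (my own sketch, not the paper's exact simulation) could look like this: random cohort assignment, a common trend, and dynamic treatment effects that differ across cohorts.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_units, periods = 1000, 8
g_of = rng.choice([2, 3, 4, 5], size=n_units)   # first-treatment period g, assigned at random
unit_fe = rng.normal(size=n_units)

rows = []
for i in range(n_units):
    for t in range(1, periods + 1):
        e = t - g_of[i]                          # event time
        # dynamic effect that grows with exposure and differs across cohorts
        effect = (e + 1) * g_of[i] / 4 if e >= 0 else 0.0
        rows.append((i, g_of[i], t, e, unit_fe[i] + t + effect + rng.normal()))

panel = pd.DataFrame(rows, columns=["id", "g", "t", "e", "y"])
print(panel.shape)                               # → (8000, 5)
```

Because cohort assignment is completely random, any DiD comparison that respects the timing structure should recover the true effects here.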
If you are interested in treatment effect dynamics, the "status quo" procedure in the literature is a two-way fixed-effects (TWFE) linear regression with leads and lags (and potentially binned event times).

If you do it here, you get substantial bias! 😱😱😱😱

5/n
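To see the problem concretely, here is a stand-in version of that exercise (my own sketch, not the paper's simulation): simulate staggered data with cohort-heterogeneous effects, add a never-treated group so the lead/lag dummies are identified, and fit the TWFE leads-and-lags regression by least squares. When effects differ across cohorts, these coefficients mix effects from different cohorts and need not recover the true dynamics.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_units, periods = 200, 8
g_of = rng.choice([2, 3, 4, 5, 0], size=n_units)   # 0 = never treated (added for identification)

rows = []
for i in range(n_units):
    for t in range(1, periods + 1):
        e = t - g_of[i] if g_of[i] > 0 else None   # event time; None if never treated
        effect = (e + 1) * g_of[i] / 4 if (e is not None and e >= 0) else 0.0
        rows.append((i, t, e, t + effect + rng.normal()))
df = pd.DataFrame(rows, columns=["id", "t", "e", "y"])

# Design matrix: unit dummies + time dummies + event-time dummies (e = -1 omitted
# as the reference period; never-treated units get all event dummies = 0)
X = pd.get_dummies(df["id"], prefix="u", dtype=float)
X = pd.concat([X, pd.get_dummies(df["t"], prefix="t", drop_first=True, dtype=float)], axis=1)
event = pd.get_dummies(df["e"].fillna(-1).astype(int), prefix="e", dtype=float).drop(columns="e_-1")
X = pd.concat([X, event], axis=1)

beta, *_ = np.linalg.lstsq(X.to_numpy(), df["y"].to_numpy(), rcond=None)
coefs = dict(zip(X.columns, beta))
print({k: round(v, 2) for k, v in coefs.items() if k.startswith("e_")})
```

The printed lead/lag coefficients are what the "status quo" event-study plot would report for this DGP.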
However, this is not a problem of the *design*! It is a problem of the estimation procedure!

Indeed, if one uses the DiD procedure that Brant and I propose in our paper, you get very precise event-study estimates!

This is why we insist that *TWFE IS NOT DiD*!

6/n
Now, time to explain *HOW* we do this, so our goal of making DiD tools truly accessible is closer to being achieved.

7/n
With multiple time periods and variation in treatment timing, the three main points a DiD procedure needs to address are:
1) What is the parallel trends assumption (PTA)?
2) Is treatment anticipation something we need to worry about?
3) What are the parameters of interest?
8/n
By answering these 3 questions, we specify a) which units should be used in your comparison group (e.g., never-treated or not-yet-treated), b) which pre-treatment period should serve as the baseline comparison, and c) how you can use the data to answer your scientific question.
9/n
The main building block of our proposal is ATT(g,t): the average treatment effect at time t among the units first treated in period g.

This is the parameter we aim to identify by making a PTA and a limited treatment anticipation assumption.

10/n
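Concretely, with a never-treated comparison group (the paper also allows not-yet-treated comparisons) and period g-1 as the baseline, ATT(g,t) reduces to a simple 2x2 difference of means. A self-contained sketch with a made-up DGP, where the true ATT(2,3) is 2:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n_units = 4000
g_of = rng.choice([2, 3, 4, 0], size=n_units)    # 0 = never treated

def outcome(g, t):
    # common trend t, plus an effect of (t - g + 1) once treated
    effect = (t - g + 1) if (g > 0 and t >= g) else 0.0
    return t + effect + rng.normal()

rows = [(i, g_of[i], t, outcome(g_of[i], t)) for i in range(n_units) for t in range(1, 5)]
df = pd.DataFrame(rows, columns=["id", "g", "t", "y"])

def att_gt(df, g, t):
    """2x2 DiD: change for cohort g minus change for never-treated, period t vs g-1."""
    treated = df[df["g"] == g]
    control = df[df["g"] == 0]
    d_treat = treated.loc[treated["t"] == t, "y"].mean() - treated.loc[treated["t"] == g - 1, "y"].mean()
    d_ctrl = control.loc[control["t"] == t, "y"].mean() - control.loc[control["t"] == g - 1, "y"].mean()
    return d_treat - d_ctrl

print(att_gt(df, 2, 3))   # should be close to 2, the true effect in this DGP
```

The function names and DGP here are mine, purely for illustration; the `did` package linked later in the thread implements the actual estimators.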
Once you are set on these assumptions, we show that you can identify the family of ATT(g,t)'s using different estimands based on a) outcome regression (OR), b) inverse probability weighting (IPW), and c) a doubly robust (DR) approach.

11/n
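For intuition, in the canonical two-period, two-group case the DR estimand (from Sant'Anna and Zhao's companion DR DiD work; the notation here is mine, with $\pi(X)$ the propensity score, $\Delta Y$ the outcome change, and $\mu_{0,\Delta}(X)$ the outcome-regression model for the comparison group's change) looks like:

```latex
\tau^{dr} \;=\;
\mathbb{E}\!\left[
  \left(
    \frac{D}{\mathbb{E}[D]}
    \;-\;
    \frac{\dfrac{\pi(X)(1-D)}{1-\pi(X)}}
         {\mathbb{E}\!\left[\dfrac{\pi(X)(1-D)}{1-\pi(X)}\right]}
  \right)
  \bigl(\Delta Y - \mu_{0,\Delta}(X)\bigr)
\right]
```

It is doubly robust in the sense that it recovers the ATT if either $\pi(X)$ or $\mu_{0,\Delta}(X)$ is correctly specified, not necessarily both.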
The OR, IPW and DR estimands identify *the same* parameters, but suggest different estimation procedures.

We like the DR one because it is less "demanding" on the modelling assumptions, but you are free to choose your favorite.

12/n
Regardless of which approach you choose, the estimation procedure always relies on subsetting your data to construct "comparisons of means". This is really the "magical" part of our approach: subset the data to respect your assumptions. That's it! Simple, right?

13/n
Now, you may be worried that you are subsetting the data "too much" and losing precision. We hear you. That is why we also propose aggregation schemes to construct summary measures of the treatment effect, such as the event-study-type parameters we talked about a while back!

14/n
In the paper, we propose several different aggregation schemes that can highlight heterogeneity in different dimensions. We like this quite a bit because it allows researchers to answer different questions of interest.

And the question always comes first!

15/n
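As one example of such a scheme, an event-study parameter averages ATT(g, g+e) across cohorts at the same event time e, weighting by cohort size. A stylized sketch, with made-up ATT(g,t) values and cohort sizes (not from the paper):

```python
# Made-up cohort-time effects ATT(g,t) and cohort sizes, for illustration only
att = {(2, 2): 1.0, (2, 3): 2.1, (2, 4): 2.9,
       (3, 3): 1.2, (3, 4): 1.9,
       (4, 4): 0.8}
cohort_size = {2: 300, 3: 500, 4: 200}

def event_study(att, cohort_size, e):
    """theta(e): size-weighted average of ATT(g, g+e) over cohorts observed at event time e."""
    pairs = [(g, g + e) for g in cohort_size if (g, g + e) in att]
    total = sum(cohort_size[g] for g, _ in pairs)
    return sum(cohort_size[g] / total * att[(g, g + e)] for g, _ in pairs)

print(round(event_study(att, cohort_size, 0), 3))   # → 1.06
```

Other aggregations (by cohort, by calendar time, an overall ATT) follow the same weighted-average logic, just grouping the ATT(g,t)'s differently.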
To conduct valid inference, we also propose and justify a computationally attractive, fast bootstrap procedure. Importantly, it does not rely on resampling the data, so we do not run into the problem of having "few" observations per group.

16/n
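The idea of a multiplier bootstrap can be sketched as follows (my own stand-in: the influence-function values are simulated, not computed from the paper's formulas, and I use Rademacher multipliers where other mean-zero, variance-one weights would also work): instead of resampling units, perturb each unit's influence-function contribution with random weights.

```python
import numpy as np

rng = np.random.default_rng(3)
psi = rng.normal(size=500)                      # influence-function values (simulated stand-ins)
theta_hat = psi.mean()                          # the point estimate

B = 2000
draws = np.empty(B)
for b in range(B):
    v = rng.choice([-1.0, 1.0], size=psi.size)  # Rademacher multipliers, one per unit
    draws[b] = (v * (psi - theta_hat)).mean()   # perturbed deviation from the estimate

se = draws.std()                                # bootstrap standard error
lo = theta_hat + np.quantile(draws, 0.025)      # percentile-type confidence interval
hi = theta_hat + np.quantile(draws, 0.975)
print(se, (lo, hi))
```

Each bootstrap draw reuses the same data and only redraws the cheap multiplier weights, which is why this scales well and sidesteps small resampled cells.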
Final question: How can you actually use the tools we proposed in this paper?

Surprise, surprise! We have an R package with everything ready to go: bcallaway11.github.io/did/

We have examples, guidelines, documentation, etc.

We did our best to make it easy to use!
#RStats

17/n
In the paper, we also have an empirical application. We also discuss how to use these tools with repeated cross-section data, etc.

I will stop here, because Sofia wants to go back to play!

Thanks everyone, and please feel free to DM or email me if you have questions!

n/n
You can download the paper using this link (until February 2021) authors.elsevier.com/a/1cFzc15Dji4p…
