At which level should one cluster standard errors? What's good empirical practice? Have you ever heard about placebo regressions in this context? Here is a very useful guide that was recently published in the JoE doi.org/10.1016/j.jeco…. 🧵 with a short summary. #EconTwitter 1/9
The general idea is of course that we divide the sample into clusters and that we allow for heteroskedasticity/dependence within clusters while assuming independence across clusters. 2/9
This works asymptotically (Section 2 of the paper), but it turns out that in finite samples, inference may not be reliable. In that case, bootstrap methods may work better. 3/9
The purpose of the paper is to guide empirical practice. In Section 3, the paper discusses when to use cluster-robust inference. It also discusses the role of cluster fixed effects and describes several procedures for deciding the level at which to cluster. 4/9
Section 4 then talks about inference, in particular factors that determine how reliable, or unreliable, these inferences are likely to be in practice. Sections 5 and 6 then go into detail. 5/9
One tool that seems particularly useful is the so-called placebo regression (Section 3.5), a simple way to check the level at which to cluster. 6/9
Placebo regressions are constructed, roughly speaking, so that when one clusters at the right level, one obtains the "right" rejection rates, i.e., rejection rates close to the nominal size of the test. 7/9
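The intuition behind placebo regressions can be sketched in a small simulation. This is not the paper's procedure, just a minimal illustration of the underlying logic: regress the outcome on a randomly assigned "placebo" treatment many times; since the placebo has no true effect, a 5%-level test should reject about 5% of the time. When the placebo varies only at the cluster level and errors are correlated within clusters, non-clustered standard errors over-reject, while cluster-robust standard errors stay near the nominal rate. All names and parameter values below are my own choices for the sketch.

```python
import numpy as np

def ols_with_se(y, x, clusters):
    """OLS of y on x (with intercept).
    Returns the slope, its cluster-robust SE, and its plain HC0 (non-clustered) SE."""
    X = np.column_stack([np.ones_like(x), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    # heteroskedasticity-robust (HC0) "meat": sum of per-observation score outer products
    meat_hc = (X * (u ** 2)[:, None]).T @ X
    se_hc = np.sqrt((XtX_inv @ meat_hc @ XtX_inv)[1, 1])
    # cluster-robust "meat": sum of within-cluster score outer products
    meat_cl = np.zeros((2, 2))
    for g in np.unique(clusters):
        s = X[clusters == g].T @ u[clusters == g]
        meat_cl += np.outer(s, s)
    se_cl = np.sqrt((XtX_inv @ meat_cl @ XtX_inv)[1, 1])
    return beta[1], se_cl, se_hc

rng = np.random.default_rng(0)
G, n, reps = 50, 20, 500          # 50 clusters of 20 observations, 500 placebo draws
clusters = np.repeat(np.arange(G), n)
rej_cl = rej_hc = 0
for _ in range(reps):
    # outcome: cluster random effect + idiosyncratic noise (no treatment effect at all)
    y = np.repeat(rng.normal(size=G), n) + rng.normal(size=G * n)
    # placebo "treatment", assigned at the cluster level
    x = np.repeat(rng.normal(size=G), n)
    b, se_cl, se_hc = ols_with_se(y, x, clusters)
    rej_cl += abs(b / se_cl) > 1.96
    rej_hc += abs(b / se_hc) > 1.96
print(f"cluster-robust rejection rate: {rej_cl / reps:.2f}")   # close to the nominal 0.05
print(f"non-clustered rejection rate:  {rej_hc / reps:.2f}")   # far above 0.05
```

Clustering at the cluster level (the "right" level here, since both the random effect and the placebo vary by cluster) recovers roughly the nominal 5% rejection rate; the non-clustered standard errors reject far too often, which is exactly the symptom a placebo check is meant to expose.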
The paper ends with a useful summary in 5 points. 8/9
The last two points are things to keep in mind. This is a very rich paper and I hope much of what is discussed will become the usual practice in applied work. 9/9
We have a new working paper on the dependence of search result quality on the amount of user-generated data. This picture here shows the main result. Let me explain. 🧵1/14
The central question we answer is: Do some search engines produce better search results because their algorithm is better, or because they have access to more data from past searches? 2/14
If the reason for the better search results is that they have access to more data, mandatory data sharing, a policy that is currently discussed, could trigger innovation and would benefit all users of search engines. The idea is that it's unfair that firms like Google 3/14
In the following mini 🧵 I summarize his 5 ✅ principles
The 5 principles are:
✅ show your variation with descriptive analysis
✅ use the descriptive analysis to provide preliminary evidence
✅ use the descriptive analysis to guide choices of what you model (and not model)
✅ clearly articulate the value added of the model and …
✅ choose parameters of interest and counterfactuals that are informed by your variation
#EconTwitter:
I wish I had seen this one-pager 📄 on how to write excellent papers while I was doing my Ph.D.: scholar.harvard.edu/files/shapiro/…
Jesse Shapiro suggests doing this in 4 steps.
Here is a summary. 🧵
Step 1:
dream up a somewhat realistic introduction with a description of results and so on
if it does not excite you, abandon the project (very important advice!)
Step 2:
do the research
start with whatever is least clear to you
use the introduction as your compass