Jeffrey Wooldridge

@jmwooldridge

10 Mar

In 2018 I was invited to give a talk at SOCHER in Chile, to give my opinions about using spatial methods for policy analysis. I like the idea of putting in spatial lags of policy variables to measure spillovers. Use fixed effects with panel data, compute fully robust ses.

For the life of me, I couldn't figure out how putting in spatial lags of Y had any value. After preparing a course in July 2020, I was even more negative about this practice. It seems an unnecessary complication developed by theorists.

As far as I can tell, when spatial lags in Y are used, one always computes the effects of own policy changes and neighbor policy changes, anyway, by solving out. This is done much more robustly and much more easily modeling spillovers directly without spatial lags in Y.
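To make the comparison concrete, here is a sketch in notation that is not from the thread: d_it is the policy variable, w_ij are known spatial weights with w_ii = 0, c_i is a unit fixed effect, and W is the matrix of weights. The first line models spillovers directly with a spatial lag of the policy. The second adds a spatial lag of Y, and the third is the reduced form one would have to solve out to recover own and neighbor policy effects, assuming I - rho W is invertible.

```latex
\begin{align*}
% Spillovers modeled directly through a spatial lag of the policy variable
y_{it} &= \beta d_{it} + \gamma \sum_{j \neq i} w_{ij} d_{jt} + c_i + u_{it} \\
% Spatial lag in Y, stacked over units at time t
\mathbf{y}_t &= \rho W \mathbf{y}_t + \beta \mathbf{d}_t + \mathbf{c} + \mathbf{u}_t \\
% Solving out for y_t to obtain the policy effects
\mathbf{y}_t &= (I - \rho W)^{-1} \left( \beta \mathbf{d}_t + \mathbf{c} + \mathbf{u}_t \right)
\end{align*}
```

In the first equation the own effect is beta and the neighbor effect is gamma w_ij, read straight off the coefficients. In the third, every effect runs through (I - rho W)^{-1}, so it depends on the spatial lag structure being right.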

Jeffrey Wooldridge

@jmwooldridge

7 Mar

I think frequentists and Bayesians are not yet on the same page, and it has little to do with philosophy. It seems some Bayesians think a proper response to clustering standard errors is to specify an HLM. But in the linear case, HLM leads to GLS, not OLS.

#metricstotheface

Moreover, a Bayesian would take the HLM structure seriously in all respects: variance and correlation structure and distribution. I'm happy to use an HLM to improve efficiency over pooled estimation, but I would cluster my standard errors, anyway. A Bayesian would not.

There still seems to be a general confusion that fully specifying everything and using a GLS or joint MLE is a costless alternative to pooled methods that use few assumptions. And the Bayesian approach is particularly unfair to pooled methods.
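A sketch of the point in formulas, using a random-intercept model as the simplest linear HLM. The notation is an assumption for illustration: g indexes clusters, X_g stacks the rows x_ig for cluster g, and Omega_g is the variance matrix the HLM implies for the composite error a_g + e_ig.

```latex
\begin{align*}
% Random-intercept (HLM) model for unit i in cluster g
y_{ig} &= x_{ig}\beta + a_g + e_{ig}, \qquad
\Omega_g = \sigma_e^2 I_{n_g} + \sigma_a^2 \mathbf{j}_{n_g}\mathbf{j}_{n_g}' \\
% Pooled OLS ignores the structure when estimating beta
\hat\beta_{POLS} &= \Big(\sum_g X_g' X_g\Big)^{-1} \sum_g X_g' y_g \\
% The HLM structure leads to (feasible) GLS
\hat\beta_{GLS} &= \Big(\sum_g X_g' \hat\Omega_g^{-1} X_g\Big)^{-1} \sum_g X_g' \hat\Omega_g^{-1} y_g \\
% Cluster-robust variance for GLS: valid even if Omega_g is misspecified
\widehat{\mathrm{Avar}}(\hat\beta_{GLS}) &= \hat A^{-1}
\Big(\sum_g X_g' \hat\Omega_g^{-1} \hat u_g \hat u_g' \hat\Omega_g^{-1} X_g\Big) \hat A^{-1},
\qquad \hat A = \sum_g X_g' \hat\Omega_g^{-1} X_g
\end{align*}
```

The last line is what "use the HLM for efficiency but cluster anyway" amounts to: the middle term uses the GLS residuals rather than trusting the assumed Omega_g.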

Jeffrey Wooldridge

@jmwooldridge

6 Mar

What about the control function approach to estimation? It's a powerful approach for both cross section and panel applications. I'm a fan for sure.

However, the CF approach can impose more assumptions than approaches that use generated IVs.

#metricstotheface

In such cases, we have a clear tradeoff between consistency and efficiency.

In models additive in endogenous explanatory variables with constant coefficients, CF reduces to 2SLS or FE2SLS -- which is neat. Of course, the proof uses Frisch-Waugh.

The equivalence between CF and 2SLS implies a simple, robust specification test of the null that the EEVs are actually exogenous. One can use "robust" or Newey-West or "cluster robust" very easily. The usual Hausman test is not robust, and suffers from degeneracies.
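The CF and 2SLS equivalence can be checked numerically. The following is a minimal sketch, not the author's code: it simulates one endogenous regressor x with one instrument z (all parameter values and names are made up for illustration), runs the first stage, and compares 2SLS with the control function regression that adds the first-stage residual vhat. It uses only the Python standard library, with a small hypothetical helper `ols` that solves the normal equations.

```python
import random

def ols(X, y):
    """Return OLS coefficients by solving the normal equations X'X b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[r][k] for r in range(k)]

# Simulated data (illustrative values only): x is endogenous because the
# structural error u is correlated with the first-stage error v.
rng = random.Random(0)
n = 500
z = [rng.gauss(0, 1) for _ in range(n)]
v = [rng.gauss(0, 1) for _ in range(n)]
u = [0.8 * vi + rng.gauss(0, 1) for vi in v]
x = [1.0 + 0.7 * zi + vi for zi, vi in zip(z, v)]
y = [2.0 + 1.5 * xi + ui for xi, ui in zip(x, u)]

# First stage: regress x on (1, z); keep fitted values and residuals.
pi_hat = ols([[1.0, zi] for zi in z], x)
xhat = [pi_hat[0] + pi_hat[1] * zi for zi in z]
vhat = [xi - xh for xi, xh in zip(x, xhat)]

# 2SLS: regress y on (1, xhat).
b_2sls = ols([[1.0, xh] for xh in xhat], y)

# Control function: regress y on (1, x, vhat).
b_cf = ols([[1.0, xi, vh] for xi, vh in zip(x, vhat)], y)

print("2SLS slope:", b_2sls[1])
print("CF slope:  ", b_cf[1])
print("CF coefficient on vhat:", b_cf[2])
```

The two slopes agree up to rounding error. The coefficient on vhat is the one the exogeneity test is about: a t test of the null that it is zero, computed with whichever robust standard error fits the data structure. The sketch does not compute that standard error.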

Jeffrey Wooldridge

@jmwooldridge

6 Mar

If you teach prob/stats to first-year PhD students, and you want to prepare them to really understand regression, go light on measure theory, counting, combinatorics, distributions. Emphasize conditional expectations, linear projections, convergence results.

#metricstotheface

This means, of course, law of iterated expectations, law of total variance, best MSE properties of CEs and LPs. How to manipulate Op(1) and op(1). Slutsky's theorem. Convergence in distribution. Asymptotic equivalence lemma. And as much matrix algebra as I know.
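For reference, a few of the results named above written out, as a sketch in standard notation that is not from the thread. Here x is a 1 x k row vector that includes a constant, and L(y|x) denotes the linear projection.

```latex
\begin{align*}
% Law of iterated expectations
E(y) &= E\left[ E(y \mid x) \right] \\
% Law of total variance
\mathrm{Var}(y) &= E\left[ \mathrm{Var}(y \mid x) \right] + \mathrm{Var}\left[ E(y \mid x) \right] \\
% Best MSE property of the conditional expectation
E(y \mid x) &= \arg\min_{m(\cdot)} E\left[ (y - m(x))^2 \right] \\
% Linear projection: best MSE among linear functions of x
L(y \mid x) &= x\beta, \qquad \beta = \left[ E(x'x) \right]^{-1} E(x'y) \\
% Manipulating stochastic order symbols
o_p(1) \cdot O_p(1) &= o_p(1), \qquad o_p(1) + O_p(1) = O_p(1) \\
% Slutsky's theorem: if X_n -> X in distribution and Y_n -> c in probability
X_n + Y_n &\xrightarrow{d} X + c, \qquad Y_n X_n \xrightarrow{d} cX
\end{align*}
```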

If you're like me -- and barely understand basic combinatorics -- you'll also be happier. I get the birthday problem and examples of the law of very large numbers -- and that's about it.

Jeffrey Wooldridge

@jmwooldridge

6 Mar

When I teach regression with time series I emphasize that even if we use GLS (say, Prais-Winsten), we should make standard errors robust to serial correlation (and heteroskedasticity). Just like with weighted least squares.

#metricstotheface

I like the phrase "quasi-GLS" to emphasize that, in all contexts, we shouldn't take our imposed structure literally. In Stata, it would be nice to allow this:

prais y x1 x2 ... xK, vce(hac nw 4)

vce(robust) is allowed, but it's not enough. The above would be easy to add.

To its credit, Stata does allow

reg y x1 ... xK [aweight = 1/hhat], vce(robust)

to allow our model of heteroskedasticity, as captured by hhat, to be wrong. I've pushed this view in my introductory econometrics book.
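In formulas, the weighted regression with robust standard errors can be sketched as follows. The notation is assumed for illustration: x_i is a 1 x k row vector, hhat_i is the fitted variance function, and degrees-of-freedom adjustments are ignored.

```latex
\begin{align*}
% WLS estimator using the estimated variance function as weights
\hat\beta_{WLS} &= \Big(\sum_{i=1}^n x_i' x_i / \hat h_i\Big)^{-1} \sum_{i=1}^n x_i' y_i / \hat h_i \\
% WLS residuals
\hat u_i &= y_i - x_i \hat\beta_{WLS} \\
% Heteroskedasticity-robust (sandwich) variance: valid even if h is misspecified
\widehat{\mathrm{Avar}}(\hat\beta_{WLS}) &= \hat A^{-1}
\Big(\sum_{i=1}^n \hat u_i^2 \, x_i' x_i / \hat h_i^2\Big) \hat A^{-1},
\qquad \hat A = \sum_{i=1}^n x_i' x_i / \hat h_i
\end{align*}
```

If the variance model is correct, the middle term is proportional to A-hat in large samples and the sandwich collapses to the usual WLS variance. If it is wrong, only the sandwich form remains valid.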

Jeffrey Wooldridge

@jmwooldridge

3 Mar

I've often wondered why many econometricians seem to have an aversion to row vectors, even when using a row vector simplifies notation.

#metricstotheface

Probably the most common way to write the linear model for a single observation is

y(i) = x(i)'b + u(i)

for a column vector x(i). To me, the prime muddies the waters. For several reasons, I prefer

y(i) = x(i)b + u(i)

for x(i) a 1 x k row vector.

It's natural to define x(i) to be the ith row of the data matrix X, especially when visualizing how data are stored.

Plus, insisting x(i) is a column leads to this inelegant formula, where the primes are in different locations:

X'X = Sum(x(i)x(i)')

I feel bad for row vectors.
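With x(i) defined as the ith row of X, the same identity reads X'X = Sum(x(i)'x(i)), with the prime in the same place on both sides. A small sketch that checks this on a made-up 3 x 2 data matrix, in plain Python with lists of rows (the numbers are illustrative only):

```python
# Data matrix X: n = 3 observations (rows), k = 2 regressors (columns).
X = [[1.0, 2.0],
     [1.0, 5.0],
     [1.0, 7.0]]
k = len(X[0])

def outer(row):
    """Return x_i' x_i, the k x k outer product of a 1 x k row with itself."""
    return [[a * b for b in row] for a in row]

# Accumulate Sum(x(i)'x(i)) one row of the data matrix at a time.
sum_outer = [[0.0] * k for _ in range(k)]
for row in X:
    o = outer(row)
    for a in range(k):
        for b in range(k):
            sum_outer[a][b] += o[a][b]

# X'X computed directly as inner products of the columns of X.
XtX = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]

print(sum_outer)  # [[3.0, 14.0], [14.0, 78.0]]
print(XtX)        # [[3.0, 14.0], [14.0, 78.0]]
```

Looping over rows of X matches how the data are stored, which is the visualization argument made above.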
